So let’s come to the fun part.
Here is what you need to have available on your workstation:
Botium Speech Processing comes with a reasonable default configuration for a voice platform.
Both of them are free and open source, a good match for getting started with voice technologies, and without a doubt among the best free voice tools available.
Launching it can be done with a few command line calls.
$ git clone https://github.com/codeforequity-at/botium-speech-processing.git
$ cd botium-speech-processing
$ docker-compose up -d
Depending on network speed and hardware this step can take a while.
Pointing your browser to http://localhost will show the API explorer for Botium Speech Processing.
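Once the containers are up, the endpoints listed in the API explorer can also be called from code. The following is a minimal sketch, assuming the local instance exposes text-to-speech under /api/tts/<language> with text and voice query parameters, mirroring the hosted URL used in the webservice code further down. The helper name buildTtsUrl is made up for illustration and is not part of Botium Speech Processing.

```javascript
// Hypothetical helper: builds the URL for a text-to-speech request.
// Assumption: the local instance mirrors the hosted path /api/tts/<lang>.
function buildTtsUrl (baseUrl, lang, text, voice) {
  const url = new URL(`/api/tts/${encodeURIComponent(lang)}`, baseUrl)
  // URLSearchParams takes care of escaping the text to be spoken
  url.searchParams.set('text', text)
  if (voice) url.searchParams.set('voice', voice)
  return url.toString()
}

const ttsUrl = buildTtsUrl('http://localhost', 'en', 'Hello world', 'dfki-poppy-hsmm')
console.log(ttsUrl)
```

The resulting URL can then be requested with an HTTP client such as axios, using responseType 'arraybuffer' to receive the audio data.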
This GitHub repository includes sample webservice code which adds Speech-To-Text and Text-To-Speech capabilities to SAP Conversational AI.
First, clone the repository (if you have not done so already) and install the prerequisites:
$ git clone https://github.com/codeforequity-at/botium-speech-processing.git
$ cd botium-speech-processing/connectors/sapcai/server
$ npm install
Now you can launch the webservice with another command line call. Replace my-sap-cai-token with your bot token:
$ SAPCAI_TOKEN=my-sap-cai-token npm start
Point your browser to http://localhost:5005 to bring up a minimal text-only chat interface and check whether the connection to your SAP Conversational AI bot is working.
$ git clone https://github.com/codeforequity-at/botium-voice-interface.git
$ cd botium-voice-interface
$ npm install
$ npm run serve
// Excerpt: axios, nanoid, socket and SAPCAI_TOKEN are defined by the surrounding server code
socket.on('user_uttered', async (msg) => {
  if (msg && msg.message) {
    let textInput = msg.message
    // Voice input arrives as a base64 data URI; plain text is passed through as-is
    if (msg.message.startsWith('data:')) {
      const base64Data = msg.message.substring(msg.message.indexOf(',') + 1)
      const audioData = Buffer.from(base64Data, 'base64')

      // Step 1: convert the recorded audio to a mono WAV file
      const wavToMonoWavRequestOptions = {
        method: 'POST',
        url: 'https://speech.botiumbox.com/api/convert/WAVTOMONOWAV',
        data: audioData,
        headers: {
          'content-type': 'audio/wav'
        },
        responseType: 'arraybuffer'
      }
      const wavToMonoWavResponse = await axios(wavToMonoWavRequestOptions)

      // Step 2: Speech-To-Text on the converted audio
      const sttRequestOptions = {
        method: 'POST',
        url: 'https://speech.botiumbox.com/api/stt/en',
        data: wavToMonoWavResponse.data,
        headers: {
          'content-type': 'audio/wav'
        },
        responseType: 'json'
      }
      const sttResponse = await axios(sttRequestOptions)
      textInput = sttResponse.data.text
    }

    // Step 3: send the text to the SAP Conversational AI dialog endpoint
    const requestOptions = {
      method: 'POST',
      url: 'https://api.cai.tools.sap/build/v1/dialog',
      headers: {
        Authorization: `Token ${SAPCAI_TOKEN}`
      },
      data: {
        message: {
          type: 'text',
          content: textInput
        },
        conversation_id: msg.session_id || nanoid()
      }
    }
    try {
      const response = await axios(requestOptions)
      for (const message of response.data.results.messages.filter(t => t.type === 'text')) {
        const botUttered = {
          text: message.content
        }

        // Step 4: Text-To-Speech for each text reply of the bot
        const ttsRequestOptions = {
          method: 'GET',
          url: 'https://speech.botiumbox.com/api/tts/en',
          params: {
            text: message.content,
            voice: 'dfki-poppy-hsmm'
          },
          responseType: 'arraybuffer'
        }
        const ttsResponse = await axios(ttsRequestOptions)
        // Return the audio to the client as a base64 data URI
        botUttered.link = 'data:audio/wav;base64,' + Buffer.from(ttsResponse.data, 'binary').toString('base64')
        socket.emit('bot_uttered', botUttered)
      }
    } catch (err) {
      console.log(err.message)
    }
  }
})
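The handler above moves audio between browser and server as base64 data URIs in both directions. As a minimal sketch of that encoding step in isolation, the two helpers below decode and encode such a URI. The names dataUriToBuffer and bufferToDataUri are hypothetical and not part of the sample repository.

```javascript
// Hypothetical helpers mirroring the data URI handling in the handler above.

// Extracts the raw bytes from a base64 data URI such as 'data:audio/wav;base64,AAAA'.
function dataUriToBuffer (dataUri) {
  const commaIndex = dataUri.indexOf(',')
  if (!dataUri.startsWith('data:') || commaIndex < 0) {
    throw new Error('Not a data URI')
  }
  return Buffer.from(dataUri.substring(commaIndex + 1), 'base64')
}

// Wraps raw bytes into a base64 data URI that a browser can play.
function bufferToDataUri (buffer, mimeType = 'audio/wav') {
  return `data:${mimeType};base64,${Buffer.from(buffer).toString('base64')}`
}

// Round trip: encoding and decoding again yields the original bytes
const original = Buffer.from('RIFF')
const roundTripped = dataUriToBuffer(bufferToDataUri(original))
console.log(roundTripped.equals(original))
```

Keeping the conversion in small functions like these makes it easy to check the encoding without calling the speech services.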