How can I deploy Botium Speech Processing on a free cloud like Okteto?

I’m running Botium Speech Processing on Docker in my local environment, using only the frontend, STT, TTS and nginx services via docker-compose. Many thanks to Florian Treml for the awesome blog post on Medium. I’m short on memory locally, so I’d like to run these four services in the cloud. It’s just for fun and experiments, so I want to deploy to the free cloud provider Okteto. How can I achieve this? I guess I also need to write a Dockerfile.

I am pretty sure that free services do not offer enough memory for Kaldi and MaryTTS.

They provide a total of 8 GB of memory per namespace and 3 GB per pod.
My per-pod consumption is under 3 GB: I’m only using one worker for Kaldi, MaryTTS consumes around 2.5 GB, and the frontend and nginx take a lot less.
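
As a rough sanity check of those numbers, here is a minimal Python sketch that compares per-service memory against the quoted limits (8 GB per namespace, 3 GB per pod). The Kaldi and MaryTTS figures are the ones reported in this thread; the frontend and nginx figures are placeholder assumptions, since the thread only says they take "a lot less". It also assumes each service runs in its own pod.

```python
NAMESPACE_LIMIT_GB = 8.0  # limit quoted in this thread
POD_LIMIT_GB = 3.0        # limit quoted in this thread

usage_gb = {
    "stt (kaldi, 1 worker)": 1.8,  # reported in this thread
    "tts (marytts)": 2.5,          # reported in this thread
    "frontend": 0.3,               # assumed placeholder, not measured
    "nginx": 0.1,                  # assumed placeholder, not measured
}


def check_budget(usage, pod_limit, namespace_limit):
    """Return (pods over the per-pod limit, total usage, fits in namespace)."""
    over_pod = [name for name, gb in usage.items() if gb > pod_limit]
    total = sum(usage.values())
    return over_pod, total, total <= namespace_limit


over_pod, total, fits = check_budget(usage_gb, POD_LIMIT_GB, NAMESPACE_LIMIT_GB)
print(f"pods over the per-pod limit: {over_pod}")
print(f"total: {total:.1f} GB, fits in namespace: {fits}")
```

Replace the placeholder values with what `docker stats` reports locally before relying on the result.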

Botium Speech Processing comes with Dockerfiles, docker-compose templates and build scripts, or you can use the Docker images from Dockerhub. But for Kubernetes, you will additionally need the kubectl resource configuration files, and they are not part of the distribution (yet).

There are helm charts available here: https://github.com/codeforequity-at/botium-box-helm/tree/master/botium-speech-processing

Thanks for providing the Helm charts, your support is really appreciated. I will try to deploy it. One more thing: for Kaldi I’m using one recognition worker, which uses 1.8 GB of memory, and likewise on the TTS side I’m using picotts. How can I make it take less RAM? For STT there was the supervisor config, but for TTS I’m still figuring out where and what I should comment out; the further I go, the more I get lost in the code.

For TTS, pico requires way less memory than marytts, and if you are using pico, then you can completely remove the marytts container - the helm chart has the logic to only launch the marytts workload when needed.

Hi Florian, I tried to deploy Botium Speech Processing on Okteto using this Helm repository index: https://raw.githubusercontent.com/Shubhamjugran/botium-box-helm/master/index.yaml.
I’ve been in touch with the Okteto team, and this is what they said:

Have you tried to install the Helm chart in a vanilla Kubernetes cluster to see if that issue also happens there? The error seems related to permissions, but we don’t modify anything in the case of a Helm release. For Helm charts we just execute helm install, so maybe some configuration is missing in the Helm chart definition? The main issue is the usage of a hosted volume in the Docker image.

Below is the image for reference.

Logs in the window:

2022-02-23 12:08:56.00 UTC frontend-58c89f5556-tvmw9 [pod-event] Pulling image "botium/botium-speech-frontend:1.3.0"

2022-02-23 12:11:36.00 UTC frontend-58c89f5556-tvmw9 [pod-event] Back-off restarting failed container

2022-02-24 05:45:24.70 UTC frontend-58c89f5556-tvmw9 frontend > botium-speech-processing-frontend@1.0.0 start-dist /app

2022-02-24 05:45:24.70 UTC frontend-58c89f5556-tvmw9 frontend > node -r dotenv-flow/config ./src/server.js

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_API_TOKENS" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_PROVIDER_TTS" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_PROVIDER_STT" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_IBM_STT_APIKEY" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_IBM_STT_SERVICEURL" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_IBM_TTS_APIKEY" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_IBM_TTS_SERVICEURL" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AZURE_SUBSCRIPTION_KEY" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AZURE_REGION" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AWS_REGION" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AWS_ACCESS_KEY_ID" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.77 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AWS_SECRET_ACCESS_KEY" is already defined in process.env and will not be overwritten

2022-02-24 05:45:24.78 UTC frontend-58c89f5556-tvmw9 frontend dotenv-flow: "BOTIUM_SPEECH_AWS_S3_BUCKET" is already defined in process.env and will not be overwritten

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend /app/node_modules/mkdirp/lib/mkdirp-native.js:35

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     throw er

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     ^

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend Error: EACCES: permission denied, mkdir '/app/resources/.cache/stt'

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Object.mkdirSync (fs.js:987:3)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at mkdirpNativeSync (/app/node_modules/mkdirp/lib/mkdirp-native.js:29:10)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Function.mkdirpSync [as sync] (/app/node_modules/mkdirp/index.js:21:7)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Object.<anonymous> (/app/src/routes.js:30:26)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Module._compile (internal/modules/cjs/loader.js:1063:30)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Module.load (internal/modules/cjs/loader.js:928:32)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Function.Module._load (internal/modules/cjs/loader.js:769:14)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at Module.require (internal/modules/cjs/loader.js:952:19)

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend     at require (internal/modules/cjs/helpers.js:88:18) {

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend   errno: -13,

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend   syscall: 'mkdir',

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend   code: 'EACCES',

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend   path: '/app/resources/.cache/stt'

2022-02-24 05:45:25.75 UTC frontend-58c89f5556-tvmw9 frontend }

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! code ELIFECYCLE

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! errno 1

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! botium-speech-processing-frontend@1.0.0 start-dist: node -r dotenv-flow/config ./src/server.js

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! Exit status 1

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR!

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! Failed at the botium-speech-processing-frontend@1.0.0 start-dist script.

2022-02-24 05:45:25.76 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

2022-02-24 05:45:25.77 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! A complete log of this run can be found in:

2022-02-24 05:45:25.77 UTC frontend-58c89f5556-tvmw9 frontend npm ERR! /home/node/.npm/_logs/2022-02-24T05_45_25_769Z-debug.log

I have given them the GitHub link to the chart. The same thing happened when I tried the Docker deployment.

The volume mounted at /app/resources/ has to be writable. You can see the “permission denied” in the log output.
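
To check this outside the frontend, the same failure can be reproduced with a few lines of Python run as the container's user. This is a minimal sketch, not part of Botium: `check_cache_dir` is a hypothetical helper that tries to create the directory named in the log (`/app/resources/.cache/stt`) and then write a file into it, which is roughly what the frontend does on startup.

```python
import os
import tempfile


def check_cache_dir(path):
    """Try to create `path` (and its parents) and write a file into it.

    Returns (True, None) on success, or (False, message) when the current
    user lacks permission, which is the EACCES case shown in the log.
    """
    try:
        os.makedirs(path, exist_ok=True)
        # Creating a temporary file proves the directory itself is writable.
        with tempfile.TemporaryFile(dir=path):
            pass
        return True, None
    except PermissionError as err:
        return False, f"uid {os.getuid()} cannot write to {path}: {err}"


if __name__ == "__main__":
    # Path taken from the error message in the log output.
    ok, message = check_cache_dir("/app/resources/.cache/stt")
    print("writable" if ok else message)
```

If this reports a permission error inside the pod, the mounted volume is owned by a different user than the one the image runs as.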

This is what they said:

The issue seems specific to the app, at least with the information we have. Have you tried that Helm chart in any Kubernetes cluster? minikube, GKE, EKS or something similar?

The problem is that the frontend image defines a host volume for the path “/app/resources” owned by a user with certain permissions, but then a volume is mounted in the frontend with other permissions, so it cannot access the path “/app/resources”.

Update:

Yeah, that is the issue. In the Docker image, the volume is created by a specific user with certain permissions, but the volume in the frontend deployment is not mounted with the same permissions, so it cannot write. Either the volume in the image has to be built with the default user, or the volume has to be mounted in the deployment with the image user. Otherwise, the volume is mounted with root access only.

OK, I see what the issue is. I need a few days to resolve this.

ok boss

It seems to be resolved now. With the latest 1.4.0 I was able to install it on my own K8S cluster, and on Okteto as well (but with docker-compose, not as helm chart).

I forked your latest GitHub master branch, so I guess I’m using your latest 1.4.0 release.
I have deployed it with docker-compose on Okteto, but the endpoint is not resolving and returns 502 Bad Gateway. Is it working for you? What is that IP, though? Is that the Swagger UI?

[error] 30#30: *14 frontend could not be resolved (110: Operation timed out), client: 10.8.18.24, server: , request: "GET /favicon.ico HTTP/1.1", host: "nginx-botium-shubhamjugran.cloud.okteto.net", referrer: "https://nginx-botium-shubhamjugran.cloud.okteto.net/"

[error] 30#30: send() failed (111: Connection refused) while resolving, resolver: 127.0.0.11:53

I asked them about it. They said the following:

The nginx config is not right. It seems it is trying to connect to the IP 127.0.0.11, which probably doesn’t belong to any of the deployed services.

Yes, this happens for me as well. The nginx configuration in docker-compose is tailored for Docker installations (127.0.0.11 is a Docker-internal IP address, the embedded DNS server), and Okteto's mapping to Kubernetes obviously doesn’t work in this case.
I updated the nginx.conf in the repository, please try it again.
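
For anyone hitting the same error elsewhere: 127.0.0.11 only answers DNS queries inside Docker networks, while a Kubernetes pod gets its DNS server through `/etc/resolv.conf`. The following is a minimal sketch, not necessarily what was changed in the repository's nginx.conf, of deriving an nginx `resolver` line from the pod's own resolver configuration. The function names and the sample address are made up for illustration.

```python
def first_nameserver(resolv_conf_text):
    """Return the first `nameserver` address in resolv.conf-style text."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None


def resolver_directive(resolv_conf_text, fallback="127.0.0.11"):
    """Build an nginx `resolver` line, falling back to Docker's embedded DNS."""
    address = first_nameserver(resolv_conf_text) or fallback
    return f"resolver {address};"


if __name__ == "__main__":
    # Inside a container, this prints the DNS server nginx should use.
    with open("/etc/resolv.conf") as handle:
        print(resolver_directive(handle.read()))
```

The printed line could be templated into nginx.conf at container start. IPv6 nameservers would need square brackets in nginx, which this sketch does not handle.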

Amazing, it works! I made some tweaks and will now try to make it more secure. The small delay I’m seeing is probably on my end. Thank you so much, you are a lifesaver.

Awesome! It would be nice if you posted your amendments here or made a pull request on GitHub.

Sir, is there any research paper you have written on Botium Speech Processing? If so, do let me know so that I can cite it.

No, sorry, I’ve never written a “research paper” on this 😀

I initially wrote about the motivation for this project in my blog.