(Continuous) speech-recognition of limited words in the web-browser

开发者 (Developer) https://www.devze.com 2023-04-11 10:01 Source: the web

Is there a speech recognition solution that:

only needs to recognize a few words (2 is enough, 10 would be cool, 100 would be awesome; more isn't needed);
runs on mobile browsers too (is it possible to use Flash, rather than Java, for this?);
can be installed on your own server, preferably with PHP + MySQL (if server-side code is required)?

I tried searching, but I only found full transcription services (like Google Voice Search for Android).

An example of such a solution is touchless-timer, which is based on pocketsphinx.js (also mentioned in Nikolay Shmyrev's post). To answer your bullet points:

it supports a simple alarm-clock grammar with ~60 words (phrases like "wake me up in five minutes");
I've managed to run it in Chrome Beta 32.0.1700.99 on Android 4.1.2 (Samsung Galaxy S2); it requires a modern JavaScript engine but does not require Flash;
it does not require a server, because speech recognition is done offline in JavaScript, and all the required files can be cached using ApplicationCache.

For this application, the grammar was written in Grammatical Framework and automatically converted to the finite-state model and dictionary required by pocketsphinx.js. For a simple "MP3 play/pause" grammar you can easily write the finite-state automaton (FSA) directly.
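To make the "write the FSA directly" idea concrete, here is a minimal sketch of a hand-written command grammar. The object shape (numStates, start, end, and transitions with from/to/word) and the [word, pronunciation] word list follow the format described in the pocketsphinx.js README as I understand it, so check both against the version you use. The three commands, the phoneme strings, and the helper functions accepts and missingWords are my own illustrative assumptions, not part of touchless-timer or pocketsphinx.js.

```javascript
// Hypothetical word list: [word, pronunciation] pairs with CMUdict-style
// phonemes. The pronunciations are assumptions; verify them against the
// dictionary that ships with your acoustic model.
const wordList = [
  ["PLAY", "P L EY"],
  ["PAUSE", "P AO Z"],
  ["STOP", "S T AA P"],
  ["MUSIC", "M Y UW Z IH K"],
];

// Hand-written FSA for "PLAY MUSIC", "PAUSE MUSIC" and "STOP MUSIC".
// State 0 is the start, state 1 means a command word was heard,
// state 2 is the single accepting state.
const grammar = {
  numStates: 3,
  start: 0,
  end: 2,
  transitions: [
    { from: 0, to: 1, word: "PLAY" },
    { from: 0, to: 1, word: "PAUSE" },
    { from: 0, to: 1, word: "STOP" },
    { from: 1, to: 2, word: "MUSIC" },
  ],
};

// Illustrative helper: does the FSA accept this exact word sequence?
// Tracks a set of states so it also works for nondeterministic grammars.
function accepts(fsa, words) {
  let states = new Set([fsa.start]);
  for (const word of words) {
    const next = new Set();
    for (const t of fsa.transitions) {
      if (states.has(t.from) && t.word === word) next.add(t.to);
    }
    states = next;
  }
  return states.has(fsa.end);
}

// Illustrative helper: grammar words that have no dictionary entry.
function missingWords(fsa, words) {
  const known = new Set(words.map(([w]) => w));
  const used = new Set(fsa.transitions.map((t) => t.word));
  return [...used].filter((w) => !known.has(w));
}

console.log(accepts(grammar, ["PAUSE", "MUSIC"]));
console.log(missingWords(grammar, wordList));
```

Checking the grammar offline like this catches typos and missing dictionary entries before anything is loaded into the recognizer. The README describes addWords and addGrammar commands for passing such objects to the recognizer worker; treat the exact message format as something to confirm there.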

The English acoustic models in this app are not very good, e.g. they might get confused by the MP3 playing in the background. You might be able to improve on that by training better models. However, better models might be larger (e.g. > 20 MB in JavaScript) and either no longer fit into memory or make the app load and run very slowly.

[Screenshot of the app running on mobile]

These days you don't even need a server to run speech recognition; you only need a browser that supports the Web Audio API (recent versions of both Firefox and Chrome do). CMUSphinx can now run in JavaScript in your browser.
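A rough way to check whether a given browser meets these prerequisites is plain feature detection. The sketch below is an assumption-laden illustration: only the Web Audio API requirement comes from the text above, while the getUserMedia check (microphone capture) and the Web Workers check (pocketsphinx.js runs its recognizer in a worker, as far as I know) are my additions. The function name missingFeatures is hypothetical.

```javascript
// Returns the names of browser features that appear to be missing from
// `env` (normally `window`). An empty array means the basic prerequisites
// for in-browser recognition seem to be present.
function missingFeatures(env) {
  const missing = [];

  // Web Audio API, including the older WebKit-prefixed constructor.
  if (!(env.AudioContext || env.webkitAudioContext)) {
    missing.push("Web Audio API");
  }

  // Microphone capture (assumption: needed to feed audio to the recognizer).
  const nav = env.navigator || {};
  const hasMic =
    (nav.mediaDevices && nav.mediaDevices.getUserMedia) ||
    nav.getUserMedia ||
    nav.webkitGetUserMedia ||
    nav.mozGetUserMedia;
  if (!hasMic) missing.push("getUserMedia");

  // Web Workers (assumption: the recognizer runs off the main thread).
  if (!env.Worker) missing.push("Web Workers");

  return missing;
}

// In a browser, pass the global object; outside a browser this is skipped.
if (typeof window !== "undefined") {
  console.log(missingFeatures(window));
}
```

Passing the environment in as a parameter, rather than reading window directly, keeps the check easy to exercise with mock objects.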

For more details see

https://github.com/syl22-00/pocketsphinx.js

http://cmusphinx.sourceforge.net/2013/06/voice-enable-your-website-with-cmusphinx/