What speech recognition application are you most looking forward to?

With the rising popularity of speech recognition in cars and mobile devices, it’s not hard to see that we’re on the cusp of making speech recognition a first-class input method across our devices.
However, it shouldn’t be forgotten that what we’re seeing in our smartphones or laptops today is merely the beginning. I am convinced that we will see much more interesting applications of speech recognition technologies in the future.

So today, I wanted to ask: What application of speech recognition technology are you looking forward to the most?

For me personally, I honestly wouldn’t know where to begin.
First, I’d probably go for a virtual assistant. Yes, there’s Google Now and Siri already, but those are still obviously not as good as an actual assistant. Siri in particular also suffers from being constrained to the same interaction method as an actual assistant. A virtual assistant can arguably be of much greater value when it takes a more proactive role, exploiting the vast amount of information it has access to in order to become more like, e.g., Iron Man’s JARVIS.
Secondly, there is the domain of automatic, simultaneous translation that I think is fascinating. While early implementations already exist from industry greats like Microsoft and Google, there is obviously a lot of room to grow.
And of course from computer-aided memory of real-life conversations to finally understanding the announcements from PA systems in trains – everything is up for grabs.

So given an infinite budget: what speech recognition application would you pick? Please let me know what you think in the comments!


Peter Grasch


  1. MMDAgent: http://www.youtube.com/watch?v=H6NzzTyglEw a voice interaction system
    Instead of the notification system, and as a personal assistant.
    There’s also the prospect of Simon doing common actions in response to non-verbal sounds (click your tongue to open Kicker…)

    • Cool idea and it’s even utilizing open source speech recognition 🙂

  2. for instance, speech-to-write would be nice,
    the ability to open $filetype $filename w/o having to navigate to the respective folder
    but of course answering questions like what will be the weather tomorrow, when is the next train leaving to $city, how much is the ticket, buy the ticket etc etc etc.

    I hope you stumbled upon an infinite budget to realize all wishes : )

    • This!

      A dictation tool for LibreOffice, or whatever word processor you use, would be very nice. I understand that it is pretty difficult to achieve; however, that would be awesome!

    • No budget, sorry. But hey, if someone has a wad of cash lying around, I wouldn’t say no 😛

  3. For disabled people

    Collecting new ideas for Simon? 🙂

    A speech recognition tool for stutterers would be awesome. I know it’s quite a big task. But maybe you can start with a person-specific tool that knows the person’s form of stuttering beforehand (pauses, repeating certain syllables, etc.), and then you can make it universal by involving some learning there.

    • Interesting idea. Does this stem from a concrete need?
      Stuttering is a form of speech disfluency. While of course more prominent in stutterers, disfluent speech is absolutely normal in everyday conversations, meaning that conventional speech recognition systems already account for such speech patterns. For well-trained language models, speech disfluency has actually been shown not to affect speech recognition accuracy adversely (1995 Language Modeling Workshop at Johns Hopkins).

      If you know someone who could benefit from a speech recognition system for stutterers: encourage him / her to try out conventional systems – they might just work fine.

      • Yeah, I actually know a guy. I will ask him to try it out. 🙂

  4. For my thesis, I would have really needed transcription software for my interviews. It could also have other uses, e.g. for creating subtitles. It would be really nice to see that, but I guess at the moment this is out of reach, especially since it might be more difficult to train voice recognition for voices that differ from one interviewed person to the next.

    I am not into that virtual assistant stuff at all. I deleted the Samsung version of it from my Android phone. I believe the main use for digital assistants is for disabled people and for people who like a toy.

  5. Something like Dragon NaturallySpeaking! I have customers who do their daily business with it. I would like to switch them to Linux, but that is not possible because there is no speech-to-text software. There is a big gap on the business side of the Linux desktop.

  6. HAL and pals

    Why aim for less than C3PO, R2D2 and HAL 9000? Natural language and interaction is what I would like to see.

    • Intermediate

      Being able to execute commands like in the CLI, with bash auto-complete as a dictionary, as an intermediate step?

  7. Something that can convert a recorded interview into a rough draft of a transcription (I’m a journalist!)

  8. Transcription, yeah. Kind of like dictation but not quite. But there’s a nice synergy there so any work done on dictation today will aid the development of a transcription tool greatly.
