Simon on OS X

Let's make this short and sweet: Starting today, the Simon development version officially supports Mac OS X (10.6+).


Conf.KDE.In 2014: A New Generation

About three weeks ago, I was lucky enough to attend Conf.KDE.In at the DA-IICT in Gandhinagar, India. Thank you, KDE e.V., for making this possible.
Much has already been said about the excellent conference in the dot article and in personal reports from fellow KDE hackers, so I will skip repeating how eager and engaged the students were, how impeccably the event was organized, and even how good the food was.

What I want to talk about instead is something else: The evolution of our community.


Open Academy

Back in 2012, Facebook and Stanford University introduced their "Open Academy" program. The aim was and still is simple: Give university students an opportunity to work on real open source projects in exchange for university credit - and a ton of valuable experience.
This year, KDE has joined as a mentoring organization with a total of 11 students assigned to work on 3 different projects. One of those projects is Simon's upcoming natural language dialog manager: A system building on the current "Dialog" plugin to enable the creation of advanced spoken dialogs like the ones made popular by Apple's Siri.


ReComment: A speech-based Recommender System

Most of you will probably know that as my "day job", I am a student currently pursuing my master's degree in computer science. This, of course, also entails some original research.
In this blog post, I will describe both one of these efforts and a practical use case of Simon's upcoming dictation features, all conveniently rolled up into one project: ReComment.


Launching the Open Speech Initiative

Over the course of the summer, I have been working on bringing dictation capabilities to Simon. Now, I'm trying to build up a network of developers and researchers who work together to build high-accuracy, large-vocabulary speech recognition systems for a variety of domains (desktop dictation being just one of them).


Open Source Dictation: Demo Time

Over the last couple of weeks, I've been working towards a demo of open source speech recognition. I reviewed the existing resources and managed to improve both the acoustic model and the language model. That left turning Simon into a real dictation system.


Open Source Dictation: Acoustic Model

After working a bit on the language model last week, I spent some time improving the acoustic model, which, simply put, is the representation of how spoken words actually sound.


Open Source Dictation: Language Model

A language model assigns probabilities to word sequences. For example, "now a daze" and "nowadays" are pronounced exactly the same, but from context we know that "Now a daze I have a smartphone" is far less likely than "Nowadays I have a smartphone". To model such contextual information, speech recognition systems usually use an n-gram model, which captures how likely a specific word is given the words that precede it.
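
To make the idea concrete, here is a minimal sketch of a bigram (2-gram) model in Python. It is not the model used for Simon: the tiny training corpus, the function names `train_bigrams` and `log_prob`, and the add-one smoothing are all assumptions made purely for illustration.

```python
from collections import Counter
import math

# Toy training corpus, invented for this illustration only.
CORPUS = [
    "nowadays i have a smartphone",
    "nowadays i have a laptop",
    "i have a smartphone now",
    "now i am in a daze",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs, with sentence boundary markers."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log probability of a sentence under the bigram model.

    Add-one smoothing (an assumption of this sketch) keeps unseen
    word pairs from getting a probability of exactly zero.
    """
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(words, words[1:]):
        pair_count = bigrams[(prev, word)]  # Counter returns 0 for unseen pairs
        total += math.log((pair_count + 1) / (unigrams[prev] + vocab_size))
    return total


unigrams, bigrams = train_bigrams(CORPUS)
likely = log_prob("Nowadays I have a smartphone", unigrams, bigrams)
unlikely = log_prob("Now a daze I have a smartphone", unigrams, bigrams)
print(likely > unlikely)
```

Because pairs such as "now a" and "daze i" never occur in the toy corpus, the second sentence receives a lower score, which is the kind of evidence a recognizer can use to choose between two transcriptions that sound the same. A real dictation system would use longer contexts, a far larger corpus, and better smoothing.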


Open Source Dictation: Scoping out the Problem

Today I want to start with the first "process story" of creating a prototype of an open source dictation system.

Project scope

Given around a week's worth of time, I'll build a demonstrative prototype of a continuous speech recognition system for the task of dictating texts such as emails, chat messages or reports, using only open resources and technologies.

Dictation systems are usually developed for a target user group and then adapted to a single user (the one who'll be using the system). For this prototype, the target user group is "English speaking techies" and I myself will be the end-user to whom the system will be adapted. The software to process and handle the recognition result will be Simon. Any additions or modifications to the software will be made public.


