Your voice is its command

At last. Ever since I first touched a personal computer 25 years ago, I've been wondering how long it would take before I could write an article without touching the keyboard — and without paying someone to transcribe dictation tapes.

That day is here.

In recent weeks, I've put two speech-recognition programs to the test: Microsoft's Windows Speech Recognition and Nuance's Dragon NaturallySpeaking 9.0. I found that with a little preliminary work to get the software used to my way of talking, both programs were easily up to the task.

No, they're not perfect. With only the initial training, I found I had to make about a dozen corrections per page of double-spaced text.

But even that is plenty good enough to relieve my fingers of a whole lot of drudgery.

And these solutions are not just for dictation. Both programs can also be used to control your computer — open applications and files, add and edit data, and perform virtually any other operation — a capability especially welcome for those with certain disabilities.

In fact, recognizing spoken commands is a far easier feat than dictation because the program needs only to match what the user says against a small set of possibilities. I found both solutions to be nearly flawless right off the bat.
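
For readers curious why a small command set is so forgiving, here is a minimal Python sketch of the idea. It is not how either product works internally: the command list is hypothetical (only "delete word" comes from the Windows training session), and it compares text strings using the standard library's difflib rather than audio. The point is simply that with few candidates, even a slightly misheard phrase lands on the right one.

```python
import difflib

# Hypothetical command list for illustration; a real product's list
# would be different and far longer.
COMMANDS = ["open file", "close window", "delete word", "start dictation"]

def match_command(heard, cutoff=0.6):
    """Return the known command closest to what was heard, or None."""
    matches = difflib.get_close_matches(heard.lower(), COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(match_command("delete ward"))  # a near miss still finds "delete word"
print(match_command("xyzzy"))        # nothing close enough, so None
```

Dictation has no such short list to fall back on, which is why it takes more work to get right.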

Training it, training yourself

Voices and speech patterns are unique to individuals, which makes translating speech to text especially challenging for software. To do the job, these programs use complex algorithms to analyze what you say. The analysis even includes the context of what you say.

A program can't, for example, differentiate the way most people say "corps" from the way they say "core." So it looks to nearby words to determine the best spelling.

If the word before it is "apple," the program will spell it "core." If the preceding word is "Marine," "corps" will appear.
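
The "apple core" versus "Marine Corps" choice can be sketched in a few lines of Python. This is only an illustration of the general idea of using a neighboring word to pick a spelling. The word-pair counts below are invented, and the table and function names are hypothetical; neither Microsoft nor Nuance has confirmed that its software works this way.

```python
# Invented counts of how often each spelling follows a given word.
# These numbers are assumptions for illustration only.
PAIR_COUNTS = {
    ("apple", "core"): 40,
    ("apple", "corps"): 0,
    ("marine", "corps"): 55,
    ("marine", "core"): 1,
}

# Words that sound alike and so cannot be told apart by ear.
SOUND_ALIKES = {
    "core": ["core", "corps"],
    "corps": ["core", "corps"],
}

def choose_spelling(previous_word, heard):
    """Pick the sound-alike spelling seen most often after previous_word."""
    candidates = SOUND_ALIKES.get(heard, [heard])
    prev = previous_word.lower()
    return max(candidates, key=lambda w: PAIR_COUNTS.get((prev, w), 0))

print(choose_spelling("apple", "core"))   # core
print(choose_spelling("Marine", "core"))  # corps
```

A real recognizer would presumably weigh far more context than one preceding word, but the principle of letting nearby words break the tie is the same.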

To get a head start on accurate recognition, Windows Speech Recognition and Dragon NaturallySpeaking suggest users undertake a short (10- to 15-minute) training session during which they read text the programs offer. That gives the software a basic understanding of how a user pronounces known words.

The training sessions are remarkably effective, greatly reducing the number of corrections you'll have to make.

Even better, once you've finished training the software, these programs keep learning. Every time you use the program and make corrections to recognized speech, the software notes your correction and adjusts.

Bear in mind that training is a two-way street. If you make minor adjustments to the way you speak when dictating, you'll find the results are significantly cleaner.

That means trying to speak clearly and avoiding "ummms" and "ahhhhs." And you'll want to avoid speaking too quickly.

Not to worry — most users will find it a snap to make these adjustments. Many will also find it quickly becomes second nature to pronounce punctuation while dictating.

Microsoft Windows Speech Recognition

We'll have to see whether Microsoft's bundling of speech-recognition software into the Vista operating system results in more antitrust actions. In the meantime, it's a boon to end users looking for general speech-recognition tools.

I was impressed with the ease of getting Windows Speech Recognition up and running. After plugging in a USB headset/microphone, which was automatically recognized by Windows, the software led me through configuring the headset volume, then moved into the training routine.

Microsoft takes advantage of the training session — which consists of reading text displayed on the screen until it is accurately recognized — to familiarize the user with the capabilities of the software. It teaches that you can correct recognition mistakes verbally, for example, by using such simple commands as "delete word."

The training session also familiarizes users with some commands for opening and working with applications in Windows.

Frankly, I didn't think I'd even make use of this capability. But once I had the headset on for a dictation session, I soon found I was relying on the software for command and control purposes.

It's no surprise Windows Speech Recognition integrates extremely well with Windows applications. I didn't find a single application where it didn't work seamlessly. It could even be used effectively to control non-Microsoft Web browsers.

After going through the first brief training session and a few weeks of use, I found Windows Speech Recognition was almost as accurate as Dragon NaturallySpeaking.

On the downside, I was disappointed by the paucity of documentation.

I searched in vain, for example, for information about where to find the user profile file so that if I move to a new computer I won't have to go through all the training again. I'm still waiting to hear from Microsoft on that.

You can get more details on Windows Speech Recognition at

Dragon NaturallySpeaking 9.0

OK, let's start with the downside: You have to pay for Dragon NaturallySpeaking. Specifically, the standard version of the program lists for $199.99.

NaturallySpeaking offers an experience very similar to that of Windows Speech Recognition. Training is a simple matter of reading text to the computer for about 15 minutes. The software also allows you to dictate text and control applications.

So why pay for NaturallySpeaking when Windows Speech Recognition is free?

For starters, Windows Speech Recognition is free only with Vista, so those with earlier versions of Windows don't have that option.

Second, NaturallySpeaking is, in my experience, marginally more accurate, at least in the initial stages of using the dictation tools.

Third, Nuance Communications supports several different editions of NaturallySpeaking. There are professional and legal editions, not to mention two different medical editions. These editions offer specialized vocabularies that increase recognition accuracy, as well as utilities for creating custom voice commands.

You can get more details on Dragon NaturallySpeaking at

The bottom line

Vista users with basic dictation and command-and-control needs will definitely want to try Windows Speech Recognition before investing in another solution. If your needs are somewhat more challenging, however, Dragon NaturallySpeaking is more likely to answer the call.

Either way, we have reached the point where, with a little time spent training the software, accurate dictation is an attainable reality.

Patrick Marshall writes the weekly Q&A column in Personal Technology.
