Vol. 80, No. 9, October 2007
Anyone who has ever seen 2001: A Space Odyssey will easily recall how the computer named Hal could understand spoken commands and interact with the ship's crew. It seemed pure fantasy when the film was made in 1968. What about today? While technology doesn't yet approach the level of sophistication required to allow direct and unfettered interaction between man and machine, it has made significant strides in the area of speech recognition.
Speech recognition is the technology whereby a computer converts a person's speech into text using specialized software. Take note that this is not the same as digital dictation, which should be thought of as the replacement for traditional tape-based transcription units. Speech recognition is literally the scenario of "I talk and the computer types" - although the software isn't actually recognizing a specific user's voice, but rather the user's words (making the term "voice recognition" a misnomer). As Wikipedia defines it, speech recognition is "the process of converting a speech signal to a set of words, by means of an algorithm implemented as a computer program."
Nerino J. Petro Jr., Northern Illinois 1988, is the advisor to the State Bar of Wisconsin Law Office Management Assistance Program (Practice411™). He assists lawyers in improving their efficiency in delivering legal services and in implementing systems and controls to reduce risk and improve client relations. Visit the Law Practice Management area at www.wisbar.org regularly for practice management guidance. You can reach Petro at (800) 444-9404, ext. 6012.
This article originally appeared in Law Practice, Vol. 33, Issue 2, March 2007, published by the American Bar Association. Used with permission.
Speech recognition technology for the masses was first introduced in the 1990s, but it always seemed to promise more than it actually delivered. The initial systems were "discrete word" systems that required a clear pause between each and every spoken word. While these systems could capture speech, pausing between each word is not the way in which we interact with each other on a day-to-day basis. These early systems were onerous and cumbersome to use. Discrete word recognition systems never truly caught on with the public or professionals, and work continued on software that could recognize continuous speech.
Finally, in the past 12 to 18 months, speech recognition has made significant improvements to the point where we can consider it ready for prime time. But is it feasible for lawyers to use this technology in their everyday practices? Let's take a closer look.
Improvements and Pitfalls in the State of the Art
The new speech recognition products not only let you dictate documents directly into word processing programs such as Microsoft Word, but also work in programs such as Microsoft Outlook, Excel, and more. Beyond dictation, the new products can also be used to control the computer via spoken commands.
However, even the latest speech recognition software can be affected by numerous factors, including these:
- difficulty identifying the correct word in the context of the sentence in which it's being used when there is another word that sounds similar (for example, whether the user means "wear" versus "where");
- environmental variables such as background noise levels, machinery, and other noises present in the workplace;
- speaker variables including stress, emotion, speech quality, and health;
- differences such as accents and dialects.
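The first factor, choosing between sound-alike words from context, can be illustrated with a toy sketch. It assumes a simple approach of picking whichever candidate most often follows the previous word; the function name `pick_word` and the counts are invented for illustration and do not describe how any commercial product works.

```python
# Toy illustration only: these tables are invented for the example,
# not taken from any real product or body of text.
HOMOPHONES = {
    "wear": {"wear", "where"},
    "where": {"wear", "where"},
}

# Hypothetical counts of how often a word follows the previous word.
BIGRAM_COUNTS = {
    ("to", "wear"): 40,
    ("to", "where"): 2,
    ("know", "where"): 35,
    ("know", "wear"): 1,
}


def pick_word(previous_word, heard_word):
    """Return the sound-alike candidate seen most often after previous_word."""
    candidates = HOMOPHONES.get(heard_word, {heard_word})
    # Sorting makes the choice deterministic when counts are tied.
    return max(
        sorted(candidates),
        key=lambda word: BIGRAM_COUNTS.get((previous_word, word), 0),
    )


print(pick_word("to", "where"))   # "what to wear"
print(pick_word("know", "wear"))  # "I know where"
```

With only one word of context, a scheme like this will still guess wrong in many sentences, which is one reason similar-sounding words remain a source of errors.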
What these factors really come down to is that the quieter the work environment, the better the microphone, and the clearer the speech, the better the recognition results will be. Better-quality microphones can overcome some background noise issues, and an extremely quiet environment can compensate for poor microphones. The speaker's ability to clearly enunciate words is also critical. So then, what about the question of whether speech recognition software is ready for the law office?
Until June 2006, my answer would have been a resounding "No." However, at that time I obtained a copy of Dragon NaturallySpeaking Preferred version 8 from Nuance (www.nuance.com/dragon). The improvements over the prior versions were significant, to say the least. Recognition accuracy was noticeably better and the software's ability to recognize speech was also greatly improved. Then in August 2006, I upgraded to the newly released Dragon NaturallySpeaking Preferred version 9, which offers even greater accuracy. Let's walk through how it works.
Dragon NaturallySpeaking 9 in Action
If you already dictate using a handheld recorder or other type of transcription unit, there's really not that much of an adjustment to make when switching to Dragon NaturallySpeaking. Just as you normally use your handheld transcriber, you dictate your document including all punctuation and symbols to be included.
You can also talk your way through the menu structure. To insert a date, for instance, you can either dictate the entire date such as "January second two thousand seven" resulting in "January 2, 2007" in the text, or you can use the command sequence "Insert date," "Move down 2," "OK." This sequence of commands opens the Date and Time drop-down list under the Insert menu. The command to "Move down 2" selects the format of "January 2, 2007" and "OK" clicks the OK button, inserting the date into your document.
In addition, you can navigate throughout your document using voice commands such as "New paragraph" to insert a new paragraph; "Go to end of line" to move to the end of the existing line; or "Select paragraph" to select the entire paragraph. You can also use menu items such as File, Edit, Insert and the like by stating their name and then the "move up" or "move down" commands. A user can master the basic navigational commands in a very short time. Note that not all Word commands are available through NaturallySpeaking, but the most important commands are all built in. A full listing is set out in the Command Browser inside NaturallySpeaking.
Although it is not overly difficult, dictating commands does take some getting used to - especially when words that you are attempting to put on paper are interpreted as commands. In these instances, you must speak with a pause between each word. With a reasonable amount of training you can overcome this potential obstacle so that you can dictate entire documents without using your keyboard or mouse. You can use speech commands to select a line, a specific number of words, a paragraph or an entire document. You can bold, underline, and italicize text; copy, cut, and paste; enter Roman numerals; and otherwise format text. You can also spell out words that you wish to train the system to recognize. Commands are also available within Microsoft Excel, Outlook, Corel WordPerfect, Internet Explorer, and Mozilla Firefox.
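To see why dictated words can be mistaken for commands, consider a deliberately simplified sketch. It assumes each utterance is either an exact command phrase or plain text; the `transcribe` function is hypothetical, handles only the "New paragraph" command mentioned above, and is not a description of how NaturallySpeaking is actually built.

```python
def transcribe(utterances):
    """Build document text, treating the exact phrase "new paragraph" as a command."""
    parts = []
    for utterance in utterances:
        spoken = utterance.strip()
        if spoken.lower() == "new paragraph":
            # Command: start a new paragraph instead of typing the words.
            parts.append("\n\n")
        else:
            # Dictation: separate consecutive phrases with a single space.
            if parts and not parts[-1].endswith("\n"):
                parts.append(" ")
            parts.append(spoken)
    return "".join(parts)


letter = transcribe(["Dear Ms. Smith,", "New paragraph", "Thank you for your letter."])
print(letter)
```

In a scheme like this, the words "new paragraph" can never appear in the document itself, because they are always treated as an instruction. That is the kind of conflict the pause-between-words technique is meant to resolve.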
While Dragon NaturallySpeaking Preferred 9 can be used out-of-the-box without any training, I strongly recommend that you do the initial training and then continue to train on the software each time you use it. This will give you a higher degree of accuracy and save you time in the long run. Training entails reading a script that is displayed on your screen.
Another nice feature that enhances recognition is that the program will look to your Microsoft Outlook or Outlook Express and any documents that you have in your My Documents folder to recognize your writing patterns.
Ultimately, accomplished typists may not notice much improvement in production by using speech recognition - but for those of us who are keyboard-challenged, speech recognition technology is a vast improvement over hunt-and-peck typing.
What Your Bucks Will Buy You
While there are several smaller companies providing speech recognition software (even Microsoft has added the capability to its Windows XP and newly released Vista operating systems), the current champ for speech recognition in the law office is Dragon NaturallySpeaking. The program comes in several different versions, progressing from the least-expensive Standard to the mid-priced Preferred and the top-of-the-line Professional version. You can purchase the Professional version with a specialized, profession-specific dictionary for legal practices, too.
The Professional version also gives you the ability to create complex macros, provides for a "roaming user," which allows someone else on a different computer to play your speech file and edit the transcription, and more. However, the Professional version is pricey at $800 or more. For most lawyers who are looking to experiment with speech recognition, Dragon NaturallySpeaking's Preferred version is the place to start - it currently sells for around $199.
When you purchase the Dragon NaturallySpeaking Preferred software, you also receive a headset that plugs into the sound card on your computer. Generally, though, with speech recognition software, I recommend that you purchase a better headset that connects to your computer via a USB port - this bypasses your sound card entirely and removes any sound card-compatibility issues from the equation. Headsets from Parrott, Plantronics, Sennheiser, VXI Corp., GN Netcom, and Andrea Electronics are approved for use and will generally provide better recognition and accuracy. Philips SpeechMike microphones can also be used with the program and have received high ratings for accuracy from Nuance. Dragon NaturallySpeaking 9 also supports Bluetooth and other wireless-technology-enabled microphones. You can even use your handheld digital recorder to transfer dictation that you take on the road into NaturallySpeaking.
You should also check out other types of third-party products that are available for use with Dragon NaturallySpeaking, including specialized libraries and command sets. One such provider is KnowBrainer (www.knowbrainer.com), whose products include advanced scripting additions, a large number of new commands, legal and medical dictionaries, and microphones and headsets as well. You can also find third-party Dragon NaturallySpeaking command files for Time Matters practice management software at Premier Software (www.premiersoftware.com).
These kinds of "add-ons" can give you a leg up in maximizing your use of Dragon NaturallySpeaking, and hopefully there are more of them to come as speech recognition technology improves still further in accuracy and ease of use. For now, the bottom line is that speech recognition is definitely worth considering for lawyers who are not good typists, or where physical challenges like carpal tunnel syndrome are at issue. With a minimal amount of training, you can begin using it in a way that reduces the amount of actual typing needed, thereby saving time (and aggravation) and allowing for greater productivity.