A User's Guide to Speech Recognition Technology

When the radiologists at Chestnut Hill Hospital advocated the complete automation of their department 3 years ago, the administrators and clinicians were skeptical. The radiologists had identified three basic categories of clinician concerns: report turnaround time, access to films, and patient scheduling. The department then developed a plan and road map to address each of these issues.

Speech recognition (SR) was implemented to improve report turnaround. Systems were evaluated and the project was planned during 1998. SR went online in December 1998. The next year was spent evaluating picture archiving and communication systems (PACS). PACS with computed radiography (CR) was implemented in January 2000. The two systems were integrated in May 2000, resulting in a filmless department, with the exception of mammography, and a paperless reading room. The addition of SR and PACS has eliminated the issues of report turnaround time and film access. An outpatient imaging center is currently being planned to address the last remaining concern, patient scheduling.

SR has been available for many years for radiology reporting. Until recently, however, speed and accuracy limitations have been a major barrier to acceptance. Recent advances in hardware and software, including language model speech recognition and decreasing memory costs, have allowed natural and rapid dictation with high levels of accuracy. The acquisition of these systems is still somewhat controversial among radiologists, however [1]. The main benefits of SR are decreased report turnaround time and rapid return on investment [2]. Transcription cost savings can reach six figures, depending on department size. These benefits, however, are felt only indirectly by the radiologists. The disadvantages of SR include the potential for increased dictation time as well as distraction from image interpretation. These are shouldered entirely by the radiologists. For this reason, a number of radiology groups have resisted the implementation of these systems.

The central question then becomes: How can the time penalty and potential distraction be minimized or even eliminated? Attention to a number of details can yield excellent results.

Plan for Success

The first month or so of system use is often a difficult one for the radiologists and the entire department. Work-flow habits acquired over many years are changing and there is often variable acceptance of the new technology. Some radiologists, frequently the older ones, may be intimidated by computer applications. Plans should be made for increased staffing during this period.

All SR systems have a built-in medical vocabulary, but user-specific training of the system is necessary during initial use. The extra time initially spent actively correcting problem words and phrases will bring increased satisfaction during later use. Active correction of words is a feature that is especially useful for non-native speakers. For those radiologists who pronounce words consistently but differently from native speakers, using the active correction feature can quickly improve accuracy.

Not only are the radiologists training the system, but the system also is training the radiologists. Users who have the most success with SR have a consistent dictation style. The speech system learns and remembers not only words but also language patterns. After months of use, it will more easily choose the correct word or phrase if it has heard this particular structure repeatedly. For these radiologists, the system can be a joy to use. For those who ramble and whose dictations are unstructured, there may be more difficulty. The system encourages a very structured and consistent reporting style, something that the clinicians at Chestnut Hill Hospital appreciate.

There are often a handful of words that present a persistent problem. These are specific and different for each user. Active correction can help the system learn these words, but some mistakes tend to recur. Some of these words require a slightly different pronunciation. For others, the system simply has repeated difficulty in recognition. In these cases it is easier to “walk around” these problem words than to wrestle with the technology. Meeting the system partway and using an alternate pronunciation or a synonym, when possible, for these few words will yield better results.

Macros and Templates


The creative use of macros and templates will greatly decrease dictation times. Macros are reports stored within the memory of the system. Templates are simply macros with one or more blank spaces for words or numbers to be inserted during dictation. The radiologist can activate a macro by voice command or mouse click. There is no limit to the number of macros that can be utilized. Each user can customize these to his or her specific preferences. Radiologists can also import a macro or template from any other user of the system. These macros save time in two ways. The radiologist need not dictate the entire report but may simply wish to modify one or more sentences or paragraphs within an existing macro. Another often-overlooked advantage is the time saved in reading and correcting macro reports. Once a macro is chosen, if no changes need to be made, the report need not be read for accuracy.
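The distinction between a macro and a template can be illustrated with a minimal Python sketch. This is not the vendor's implementation; the macro names, the report wording, and the `expand` helper are all assumptions made up for illustration. A macro is returned as stored, while a template has named blanks that must be filled during dictation.

```python
from string import Template

# Hypothetical macro library: names and report text are illustrative only.
MACROS = {
    "normal chest": "The lungs are clear. The heart size is normal.",
}

# A template is a macro with named blanks to be filled during dictation.
TEMPLATES = {
    "renal stone": Template("There is a $size mm calculus in the $side kidney."),
}


def expand(name, **blanks):
    """Return the stored text of a macro, or a template with its blanks filled."""
    if name in MACROS:
        return MACROS[name]
    # substitute() raises KeyError when a blank is left unfilled, so an
    # incomplete template cannot slip through unnoticed.
    return TEMPLATES[name].substitute(**blanks)


print(expand("normal chest"))
print(expand("renal stone", size=4, side="left"))
```

Because the macro text is stored once and reused verbatim, only the filled-in blanks need to be checked, which mirrors the proofreading time saved in practice.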

The customizable user preference features have become more sophisticated since the early years of SR. The system can be configured to specifically match a particular radiologist’s work flow. The changes are stored within each user profile and activate at sign-in, following the user from workstation to workstation. A particularly helpful user preference is the programmable button feature on the microphone. Seven programmable microphone buttons can be mapped to virtually any keyboard function. This greatly simplifies navigation, sign-off, and other commands. With the use of these enhancements, correction time and other tasks can be greatly diminished.

Careful attention to microphone and sound card quality is very important. The radiologists at Chestnut Hill have found that the use of a headset microphone with second-generation noise canceling technology (NCAT II) has greatly improved their accuracy. These microphones are inexpensive and readily available at office supply stores. The headset is likely superior because of the constant position of the microphone with relation to the mouth as the head is turned while viewing images. The headset microphone can be used without disabling the very helpful programmable handheld microphone buttons. A number of radiologists have reported distaste for being “tethered” to the workstation with a wired headset. Wireless microphones are becoming available and should improve acceptance.

Many departments, especially older ones, have had reading stations placed almost at random through the years as more radiologists were added to the department. Careful attention to the surrounding environment in terms of noise and distraction will help ensure higher acceptance of the system.

Integration is Key

Figure 1. Once the initial case in a work list is opened, the only tasks performed by the radiologist are those represented by the red ovals.

The enhancement that has had the greatest impact by far on radiologist work flow at Chestnut Hill Hospital is the integration of SR and PACS [3]. This feature was developed and tested beginning in May 2000 and is now commercially available.

Before integration, both of the systems ran on separate workstations with a separate keyboard and mouse. The radiologists opened and closed images within PACS and separately entered the demographic information into the SR system. After dictation, the report was signed off within SR using keyboard, mouse, voice, or microphone button commands. The PACS case was closed separately and the cycle was then repeated for the next case. The radiologists reported varying degrees of distraction by the multiple screens and keyboards, particularly during initial use.

Integration of the two applications eliminated this complaint. The speech program is now embedded within PACS and one keyboard and mouse controls both systems. The SR screen appears on the PACS monitor. The screen can be quickly toggled to appear in front or in back of the images by the radiologist. The window can be resized and moved to any position. These features are necessary to ensure that the dictation screen does not interfere with image viewing. User preference settings allow versatile and automated configuration of the position and appearance of the dictation screen.

With integration, when a case is opened within PACS, the specific accession number for that case is automatically sent to the SR system and the appropriate dictation screen instantly appears with demographics obtained from the RIS. The types of demographics displayed in the SR header are system configurable. The SR/PACS messaging is achieved without any active intervention by the radiologist. No bar coding is needed, and therefore no paper is necessary at the workstation. No demographics need be manually entered at the keyboard, eliminating human error.

After the reader finishes dictation and proofreads the report, the case can be signed off with a single programmable microphone button. SR then sends a message to PACS to close the current case and open the next one available on the work list. PACS messages SR to open the appropriate dictation screen and the reading cycle continues. The sequence is summarized in Figure 1.
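The reading cycle described above can be sketched in Python as a simple loop. This is only an illustration under stated assumptions: the `SpeechSystem` class, the `read_worklist` function, the in-memory `RIS` dictionary, and the sample accession numbers are hypothetical and do not represent the actual messaging protocol between the commercial products.

```python
from collections import deque

# Hypothetical RIS lookup keyed by accession number (illustrative data only).
RIS = {
    "A1001": {"patient": "DOE, JANE", "exam": "CHEST 2 VIEWS"},
    "A1002": {"patient": "ROE, JOHN", "exam": "CT ABDOMEN"},
}


class SpeechSystem:
    """Stand-in for the SR application embedded within PACS."""

    def __init__(self, ris):
        self.ris = ris
        self.signed = []

    def open_dictation(self, accession):
        # PACS sends only the accession number; demographics come from the RIS.
        return {"accession": accession, **self.ris[accession]}

    def sign_off(self, screen, text):
        # Corresponds to the single programmable microphone button.
        self.signed.append((screen["accession"], text))


def read_worklist(worklist, sr, dictate):
    """Run the open, dictate, sign-off cycle until the work list is empty."""
    queue = deque(worklist)
    while queue:
        accession = queue.popleft()             # PACS opens the next case
        screen = sr.open_dictation(accession)   # PACS -> SR message
        sr.sign_off(screen, dictate(screen))    # radiologist dictates and signs
        # SR -> PACS: close the case; the loop then opens the next one.
    return sr.signed


reports = read_worklist(
    ["A1001", "A1002"],
    SpeechSystem(RIS),
    dictate=lambda screen: f"Report for {screen['exam']}",
)
print(reports)
```

The point of the sketch is that the radiologist supplies only the `dictate` step; opening the case, fetching demographics, and advancing the work list all happen without manual entry.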

PACS/SR integration has removed one of the last obstacles for radiologists considering speech recognition. The entire set of commands previously necessary to open, dictate, and sign off a case has been compressed into one button.

Interface Issues

A number of issues needed to be solved during the initial clinical trial of the integration software. With shared applications, the two programs were sometimes in conflict. For example, using a particular function within PACS disabled similar functions within SR. These growing pains were quickly corrected with subsequent software versions. The main outstanding issue is twofold: certain PACS functions, such as scrolling through a stack of images, are very memory intensive; likewise, speech recognition requires a high percentage of central processing unit capacity. The simultaneous use of these two features can cause slowdowns and deterioration of recognition accuracy. Once this limitation is recognized, it is relatively easy to avoid dictating while scrolling, and doing so eventually becomes second nature.

Another issue reflects the inherent instability of the Windows NT platform. The combination of PACS and SR in the same application seems to exacerbate this problem. During heavy use, it is helpful to reboot PACS and SR once daily. In spite of these limitations, the addition of integration has been markedly positive. Work-flow efficiency has increased to the point where human fatigue is the only remaining limiting factor. The purchase of custom-fitted ergonomic and adjustable chairs has relieved this problem to some extent.

The results of these hardware and software innovations have been markedly positive for the radiologists as well as the entire health care system. The addition of SR resulted in a decrease in report turnaround time from 72 hours to less than 1 day. Return on investment was achieved in just over 12 months, and savings have been approximately $100,000 per year. All radiology reports are now dictated using SR, and no transcriptionists are necessary.

One feature that is often neglected is the speech/radiology information system (RIS) interface. This interface is more complex than PACS/RIS because it is necessarily bidirectional. SR must receive demographics from the RIS. In addition, the finished text must be sent back to the RIS and routed to the proper printer or fax machine. The interface is a critical component to successful implementation, and detailed attention should be paid to site-specific work flow issues in interface design.
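A minimal Python sketch may help show why the interface has to be bidirectional. Everything here is an assumption for illustration: the `Order` fields, the `RisInterface` class, and the sample destinations are hypothetical, and a real interface would exchange messages between systems rather than call methods on an in-memory object.

```python
from dataclasses import dataclass


@dataclass
class Order:
    accession: str
    patient: str
    delivery: str      # hypothetical routing field: "fax" or "printer"
    destination: str   # hypothetical fax number or printer name


class RisInterface:
    """Stand-in for the two directions of the SR/RIS interface."""

    def __init__(self, orders):
        self._orders = {order.accession: order for order in orders}
        self.outbox = []

    def demographics(self, accession):
        # Direction 1 (RIS -> SR): demographics for the dictation header.
        order = self._orders[accession]
        return {"accession": order.accession, "patient": order.patient}

    def file_report(self, accession, text):
        # Direction 2 (SR -> RIS): finished text, routed to its destination.
        order = self._orders[accession]
        self.outbox.append((order.delivery, order.destination, text))
        return order.delivery


ris = RisInterface([
    Order("A1001", "DOE, JANE", "fax", "555-0100"),
    Order("A1002", "ROE, JOHN", "printer", "ER-PRINTER-1"),
])
print(ris.demographics("A1001"))
print(ris.file_report("A1001", "No acute disease."))
```

The routing decision lives with the order rather than with the radiologist, which is one reason site-specific work flow has to be settled when the interface is designed.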

Service Leapt Forward

With the installation of PACS, service improvements took another great leap forward. In the past, the radiologists were at the mercy of the file room personnel for sorting and preparing cases for review and dictation. This would often occur during the late afternoon, leaving many cases to be read in the evening. PACS allows more rapid case review. Currently, daytime cases are read in real time. With this, combined with instant transcription, clinicians are often receiving typed, faxed reports before their patients arrive back at their offices. If a report is not received by a clinician within 20 minutes or so of test completion, it is not uncommon to receive a phone call inquiring what is wrong today. This has certainly raised the bar at Chestnut Hill Hospital for radiology service expectations. The mean turnaround time for all studies is now less than 10 hours and most daytime cases are completed within 30 minutes. The difference reflects the fact that nighttime cases and mammograms are not read immediately.
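The gap between a mean of under 10 hours and a typical daytime turnaround of 30 minutes follows from simple weighted averaging. The short Python sketch below uses invented numbers purely to show the arithmetic; the case counts and the 24-hour figure for delayed studies are assumptions, not data from Chestnut Hill Hospital.

```python
def mean_turnaround(groups):
    """Weighted mean turnaround in hours for (case_count, hours) pairs."""
    total_cases = sum(count for count, _ in groups)
    return sum(count * hours for count, hours in groups) / total_cases


# Hypothetical mix: 75 daytime cases read in 0.5 hour each, and
# 25 nighttime or mammography cases read after 24 hours each.
groups = [(75, 0.5), (25, 24.0)]
print(mean_turnaround(groups))  # 6.375
```

Even when three quarters of the cases are finished in half an hour, a minority of delayed studies pulls the mean up to several hours, so the mean alone understates how quickly most daytime reports arrive.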

Another positive result of SR and PACS has been an eventual shortening of the day for radiologists. While SR initially resulted in increased reading times, once the learning curve was completed and with the addition of PACS, the work list typically consists of only a few cases at any point during the day. In contrast, prior to PACS there was never a time when the workday could be considered “complete.”

With the addition of the integrated SR/PACS, efficiency improved further. With the automatic sharing of accession numbers between PACS and SR, all paper requisitions within the reading room were eliminated. Entering patient demographics no longer distracts the radiologists. Their eyes and minds are now back solely on the images where they belong. Navigation commands, including sign-off functions, can be achieved by voice command or microphone button manipulation without the necessity of visual keyboard search or other distractions.

Clearly, SR/PACS integration has virtually eliminated the distractions of reading and dictating in soft copy. This removes perhaps the last obstacle preventing many radiologists from embracing this technology. Clinicians and administrators at Chestnut Hill Hospital have become true believers. Cost savings and service improvements have been tremendous. The ultimate question was recently put to the radiologists at Chestnut Hill. Would you go back to reading conventional film or dictating without speech recognition? The answer by all was instant and emphatic: No!

David Weiss, MD, is chairman of the Department of Radiology, Chestnut Hill Hospital, Philadelphia.

References:

1. Houston JD, Rupp FW. Experience with implementation of a radiology speech recognition system. J Digit Imaging. 2000;13(suppl 1):124-128.
2. Antiles S, Couris J, Schweitzer A, Rosenthal D, Da Silva RQ. Project planning, training, measurement and sustainment: the successful implementation of voice recognition. Radiol Manage. 2000;22(1):18-36.
3. Weiss DL, Hoffman J, Kustas G. Integrated voice recognition and picture archiving and communication system: development and early experience. J Digit Imaging. 2001;14(suppl 1):233-235.
