The experience of a major Pennsylvania insurer suggests that targeted educational feedback — unlike negative financial incentives — can improve quality and reduce inappropriate utilization.

Radiologists and others have long believed that a significant number of diagnostic imaging studies are either inappropriate or noncontributory. In a multistate analysis, the authors present data that suggest the volume of such studies approaches 30% to 40% of all studies performed. Of paramount importance is the fact that inappropriate studies frequently generate ambiguous and/or noncontributory results that lead to further inappropriate studies and inadequate treatment. This is the finding of the first study of its kind conducted by Highmark Blue Cross Blue Shield and National Imaging Associates using a systematic method to determine the appropriateness of imaging studies. Acknowledging and correcting this problem benefits all parties concerned. The member/subscriber has less exposure to radiation and/or false-positive results and has a better chance of an accurate diagnosis, appropriate treatment, and better outcome. The provider is able to work more quickly and effectively and is better positioned to achieve the best outcomes, and the implicit decreased cost pleases the purchaser. The accompanying data demonstrate that there can be a significant educational and performance impact from the systematic application of traditional utilization management tools, but the next incremental improvement will likely depend on the breadth and depth of education achievable only through the use of a Web-enabled approach.

INTRODUCTION

The era of managed care has forced the medical profession to look at appropriate utilization of imaging studies from the broad definition of quality of care, which must include both the risk to the patient, and the cost-effectiveness of the care. With the single exception of the public debate over the use of screening mammography, the risk-to-benefit ratio of utilization has largely been ignored.

Knowledge of the importance of demand management in health care is a relatively recent phenomenon. It is estimated that up to one-third of all health-care expenditures in the United States are consumed by the worried well (1), with pharmaceuticals, mental health treatment, and diagnostic imaging constituting the most significant components of this expenditure. Frustrated physicians and misinformed enrollees, an unfortunate combination, frequently seek inordinate reassurance for themselves through the inappropriate use of medical imaging technology. With that fact in mind, it should be apparent that the most effective utilization management approach is one that relies on a multi-constituent educational effort.

Consideration of these facts led Highmark Blue Cross Blue Shield, the eighth-largest health care insurer in the United States, to embark on a program that would ensure that medical imaging is used in a high-quality, cost-effective manner. This narrative deals with that health plan’s endeavors to attain the most appropriate studies at the right time with accurate interpretation through its partnership with National Imaging Associates Inc (NIA), a radiology utilization management company. Highmark Blue Cross Blue Shield began its approach to the problem of medical imaging management through a rigorous process of privileging. This initiative was designed to minimize self-referral abuse and ensure that only the highest quality providers render diagnostic imaging services to the plan’s enrollees. Additionally, Highmark Blue Cross Blue Shield instituted a program of preauthorization for all non-emergency outpatient CT, MRI, nuclear cardiac, and bone density studies and implemented a system of profiling as demonstrated throughout this presentation. This report concludes that physician and enrollee use of diagnostic imaging can be significantly improved through a systematic, education-oriented, utilization management program such as the one commenced in July 1998 by Highmark Blue Cross Blue Shield.

METHODS

An appropriateness measurement tool was built around the link between the working diagnosis code expressed in the standard International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) and the procedure code expressed in Current Procedural Terminology, Fourth Edition (CPT-4). Using the relationship between these codes as reported by the rendering physicians on their claims abstracts, a binary yes-no appropriateness table was constructed. Nearly 23 million life-years of enrollee data generated 22.15 million imaging-related claims. From this data, 162,000 unique code pairs were identified. Each of these code pairs was considered to be either appropriate or inappropriate by two separate radiologist-observers, with all pairs reviewed for appropriateness using available clinical guidelines and literature in addition to the clinical experience of the reviewers. When opinions differed, resolution included discussion by the two radiologist-observers and a third-party internist. The appropriateness of each individual examination was determined across geographically disparate populations. An example of this method is presented in Exhibit 1.
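The code-pair lookup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the table entries are placeholder codes rather than real ICD-9-CM or CPT-4 values, the function names are hypothetical, and the choice to leave unreviewed pairs out of the rate is an assumption.

```python
from typing import Dict, Iterable, Optional, Tuple

# (diagnosis code, procedure code); placeholder strings, not real codes
CodePair = Tuple[str, str]

# Hypothetical binary yes-no table; real entries would come from radiologist review
APPROPRIATENESS_TABLE: Dict[CodePair, bool] = {
    ("DX-A", "PROC-1"): True,
    ("DX-A", "PROC-2"): False,
    ("DX-B", "PROC-1"): True,
}


def resolve_reviews(reviewer_1: bool, reviewer_2: bool,
                    adjudicated: Optional[bool] = None) -> bool:
    """Combine two independent reviews; a disagreement needs an adjudicated result."""
    if reviewer_1 == reviewer_2:
        return reviewer_1
    if adjudicated is None:
        raise ValueError("reviewers disagree; an adjudicated decision is required")
    return adjudicated


def is_appropriate(diagnosis: str, procedure: str,
                   table: Dict[CodePair, bool]) -> Optional[bool]:
    """Return the table's judgment, or None when the pair was never reviewed."""
    return table.get((diagnosis, procedure))


def appropriateness_rate(claims: Iterable[CodePair],
                         table: Dict[CodePair, bool]) -> float:
    """Share of reviewed claims judged appropriate (unreviewed pairs are skipped)."""
    judged = [table[pair] for pair in claims if pair in table]
    if not judged:
        raise ValueError("no claims matched a reviewed code pair")
    return sum(judged) / len(judged)
```

Scoring a batch of claims then reduces to calling `appropriateness_rate(claims, APPROPRIATENESS_TABLE)`, where each claim supplies the diagnosis and procedure codes reported by the rendering physician.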

It is recognized that the utility of this tool depends on the integrity of the coding by the rendering physician and/or office staff. To estimate the accuracy of the claims information, an audit was performed comparing claims to medical records. A random sample of nonconsecutive claims from geographically separate data sets revealed a 2% overstatement of appropriateness. This is not surprising as experienced physicians and their business operations can reasonably be expected to bias coding for expeditious payment. This finding is consistent across the geographically separate data sets.

To further test the validity of the code pairs, the result of the imaging examination, expressed as either positive or negative, was obtained for each of 450,000 examinations. The supposition was that inappropriately ordered examinations would have a low positive yield; if the results demonstrated the converse, the code pairs would be assumed to be invalid. Exhibit 2 demonstrates examination results plotted against appropriateness, as determined by the code-pair table. Only providers with a cell size of 100 or more (N≥100) cases were plotted. Exhibit 2 suggests validation of the appropriateness criteria.
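The validation step can be sketched as follows, assuming each examination record carries a provider identifier, the code-pair judgment, and a positive or negative result. The record layout and function names are hypothetical; the sketch only shows how per-provider points for a plot like Exhibit 2 and the positive yield of each group might be computed.

```python
from collections import defaultdict
from typing import Dict, NamedTuple, Sequence, Tuple


class Exam(NamedTuple):
    provider_id: str   # hypothetical identifier for the provider being profiled
    appropriate: bool  # judgment from the code-pair table
    positive: bool     # examination result


MIN_CASES = 100  # providers with fewer cases are not plotted


def provider_points(exams: Sequence[Exam],
                    min_cases: int = MIN_CASES) -> Dict[str, Tuple[float, float]]:
    """Per provider: (appropriateness rate, positive-result rate)."""
    counts = defaultdict(lambda: [0, 0, 0])  # total, appropriate, positive
    for exam in exams:
        tally = counts[exam.provider_id]
        tally[0] += 1
        tally[1] += exam.appropriate
        tally[2] += exam.positive
    return {
        pid: (n_ok / total, n_pos / total)
        for pid, (total, n_ok, n_pos) in counts.items()
        if total >= min_cases
    }


def positive_yield(exams: Sequence[Exam], appropriate: bool) -> float:
    """Positive-result rate among exams with the given appropriateness judgment."""
    group = [exam.positive for exam in exams if exam.appropriate == appropriate]
    if not group:
        raise ValueError("no examinations in the requested group")
    return sum(group) / len(group)
```

Under the supposition in the text, `positive_yield(exams, False)` should come out well below `positive_yield(exams, True)`; the opposite pattern would cast doubt on the code-pair table.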

The experience of earlier attempts at imaging management was flawed and met considerable resistance because in their model, negative results were equated with inappropriateness while positive results were generally considered appropriate. Exhibit 2 shows that supposition to be incorrect and that, when ordered appropriately, imaging examinations generate a relatively even distribution between positive and negative results. Note the paucity of data points in the right lower quadrant, which appears to validate the clinical reasonableness of the code-pair determinations.

DISCUSSION

Referring provider specialty identification is not consistently collected by health plans. The Health Care Financing Administration (HCFA) 1500 Form has required the provision of Unique Provider Identification Numbers (UPINs) on all billing forms for the past several years. However, commercial carriers have not been consistent in their request for similar information; some are diligent in their collection of this important data, while others do not collect it at all or collect it but discard it in their claims processing system. This is an important flaw to note because the evaluation of the appropriate use of, and education related to the use of, most imaging studies should be targeted at the decision-maker, the referring physician.

In contrast, the rendering provider specialty information is nearly always collected and can be correlated with provider profiles in virtually all health plan data sets. An audit showed this information to be more than 99% accurate. Exhibit 3 therefore demonstrates the appropriateness scores of imaging services when actually delivered by varying specialties. The reader should note that, with the exception of radiologists, these are nearly always representative of self-referrals and that, while some suspicions of significant self-referral abuse appear confirmed, others are mitigated.

With this tool, the appropriate use of diagnostic imaging demonstrated in each of these plans is relatively consistent and approximates 60%. This means, across widely separated and unmanaged populations, nearly 40% of diagnostic imaging can be judged as inappropriate or, at best, noncontributory.

A comprehensive literature search was conducted in conjunction with this analysis and no previous or comparable study was found. The American College of Radiology has published an excellent two-volume set of appropriateness criteria, implicitly recognizing that misuse of imaging is a problem, but the publication does not speak to the extent or prevalence of inappropriate studies.

Exhibits 4 through 8 represent an analysis of the 22.15 million imaging studies according to the referring providers’ specialty performance, rendering providers’ specialty performance, and examination category. When applied to individual referring providers, a bell-shaped curve (Exhibit 4) validates the methodology and demonstrates its variability among individual providers. In this exhibit and the ones that follow, Exhibits 5 and 6, individual provider ordering patterns were limited to those demonstrating at least 100 events during the 6-month time frame used in each example.

The data in Exhibit 5 appear to confirm the conventional wisdom that specialties with a relatively high degree of consensus (such as obstetrics and gynecology) demonstrate a similar high degree of appropriateness as evidenced by a considerably compressed curve. Appropriateness performance for the broader specialty of internal medicine demonstrates a comparatively high degree of inappropriate use and a relatively wide curve as demonstrated in Exhibit 6. Profiles of all recognized specialties have been developed and are currently being analyzed to determine the acceptability of unique performance patterns based on the nature of the specialty.
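The provider profiling behind these curves can be sketched as below. This is an assumption-laden illustration rather than the authors' method: the input layout and function names are hypothetical, and the population standard deviation stands in as one possible measure of how compressed or wide a specialty's curve is.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple

# (referring provider id, specialty, appropriate per the code-pair table)
Order = Tuple[str, str, bool]


def provider_scores(orders: Iterable[Order],
                    min_events: int = 100) -> Dict[str, Tuple[str, float]]:
    """Appropriateness score per provider, keeping only those with enough events."""
    tally = defaultdict(lambda: [0, 0])  # total orders, appropriate orders
    specialty_of: Dict[str, str] = {}
    for provider_id, specialty, appropriate in orders:
        tally[provider_id][0] += 1
        tally[provider_id][1] += appropriate
        specialty_of[provider_id] = specialty
    return {
        pid: (specialty_of[pid], n_ok / total)
        for pid, (total, n_ok) in tally.items()
        if total >= min_events
    }


def specialty_spread(scores: Dict[str, Tuple[str, float]]) -> Dict[str, Tuple[float, float]]:
    """Per specialty: (mean provider score, spread of provider scores)."""
    by_specialty = defaultdict(list)
    for specialty, score in scores.values():
        by_specialty[specialty].append(score)
    return {spec: (mean(vals), pstdev(vals)) for spec, vals in by_specialty.items()}
```

A specialty with strong consensus would show a small spread around a high mean, while a broader specialty would show a larger spread, mirroring the compressed and wide curves described above.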

Exhibit 7 demonstrates aggregate appropriateness across various specialties when a provider acts as both the referring and the rendering provider. With the exception of the radiology specialty, all of these are subject to significant self-referral overuse. For this study, hospitals are considered to be a specialty as a result of a nuance in the Highmark Blue Cross Blue Shield claims system that identifies hospital/specialty/clinic-generated referrals simply as hospital.

The volume attributable to each specialty is relatively low, but in the aggregate, non-radiology imaging represented 38.25% by volume, with the largest representation by primary care physicians at 16%. Office-based imaging by obstetrician/gynecologists and urologists was nearly entirely related to sonography. The 74% appropriateness rating by the radiology group represented a range from slightly above 50% to more than 90%. Scrutinizing the sites of delivery by radiologists, the study shows that low rates of appropriateness tend to originate from (professionally understaffed) freestanding centers, while the higher rates reside among hospital-based practices. This is likely a reflection of the degree of interaction between the radiologist and referring physician in the hospital setting, which provides a greater opportunity for consultation.

Exhibit 8 reflects appropriateness by diagnostic imaging category. Mammography is virtually always used as a screening procedure; it is considered to be appropriate in all circumstances. In descending order of appropriateness, obstetrical ultrasound and general ultrasound appear to be used most appropriately. Bone density is considered to be inappropriate if the interval is less than 24 months unless special circumstances exist. This explains the 19% inappropriate rate in this frequently performed procedure. The data also indicate that two highly technical procedures, CT and MRI, are frequently used in an inappropriate or noncontributory manner. Angiography and interventional procedures account for a very low volume (less than 2%) and are subject to considerable coding error, which would explain the relatively low appropriateness rating. In nearly all cases, when subjected to chart review, angiography and interventional procedures are supported by appropriate clinical indications. The lowest appropriateness score was found in nuclear studies, which are frequently in-office procedures and subject to considerable self-referral abuse.

The results presented in the following exhibits represent the data-driven application of proven utilization management techniques, including privileging, preauthorization, and profiling. Exhibits 9 and 10 demonstrate a gradual but significant increase in the appropriate use of imaging studies as evidenced by the Highmark Blue Cross Blue Shield data. Each of these techniques has, at its underpinnings, a strong educational imperative. Only a targeted analysis-education-feedback approach, unlike negative financial incentives such as capitation, can both improve quality and reduce inappropriate utilization.

A rise in the appropriateness of examinations ordered was clearly documented, from 57% in July 1998 to 70% in December 1998. It should be noted that the claim counts for November and December are considerably less than the earlier months due to the delayed claim submission at the time of analysis. Subsequent submissions did not measurably change the appropriateness percentage.

None would deny that advances in technology related to diagnostic imaging have had an enormous positive impact on the ability to discover and diagnose disease at earlier and more treatable stages. For instance, until the practical use of MRI was available, multiple sclerosis (MS) was a disease of exclusion. But now, because of MRI, MS can be diagnosed more accurately, leading to more timely and more efficient treatment. However, the increased complexity of this technology requires a concomitant degree of understanding of its diagnostic value by both the potential patient and the referring practitioner. We believe that the data presented in this article support the value of the systematic application of proven utilization management tools. Since the essential ingredient to success in this endeavor has been varying forms of education, we believe that if we are to achieve significant further improvement, we must find a more efficient method to communicate/educate.

The current and near-term strategy of Highmark Blue Cross Blue Shield and National Imaging Associates is to Web-enable all of our constituents: enrollees, referring physicians, and rendering providers. This method of information transfer, developed in partnership with a software developer, currently enables specified rendering physicians to obtain, in a secure environment, authorization numbers and pertinent patient information, both clinical and demographic, prior to their examinations. Referring physicians will be able to access our Virtual Call Center without the choke points inherent in a telephonic system, and perhaps most important, the site provides clinical information to the enrollee/patient.

The provision of balanced clinical information is a key element in the process of a demand management program directed at enrollee education. Clinical content including radiology-specific chat rooms promises a sound alternative to the hyperbole and misinformation ever present in the popular press. Explanations and diagrams pertinent to planned examinations can be downloaded by referring or rendering physicians and presented to patients as an adjunct to their care plan. The challenge of patient education is not a simple matter of explanation; it has become a matter of overcoming misinformation and false expectations.

CONCLUSION

Inappropriate medical imaging is a serious quality/economic issue. Data presented in this article suggest that inappropriate medical imaging is in the range of 30-40%. The authors readily admit that these results present a challenge to determine what the right number should be. Is this similar to the old metaphor of the appendix? If you are not removing some normal appendices, then you are missing abnormals. Likewise, the physician who always orders appropriate examinations that are always positive raises the question of underutilization. On the basis of continuous feedback from participating physicians as well as ongoing health plan oversight in this case report, we are confident that examinations deemed inappropriate can be removed from delivery to the population without compromising its health.

In considering the goal of expanding individual physicians’ use of this technology, the working hypothesis has been that the right number may not be 100% but is certainly higher than 60%. As the data indicate, a significant change has been effected for the Highmark Blue Cross Blue Shield population under management. It is expected that improvement will continue but that the challenge will stiffen because the substrate of inappropriate examinations will change from easily explained, obvious misuse to subtle variations from accepted diagnostic approaches. With this realization, it is clear that further management requires a strong technology interface.

Thomas G. Dehn, MD, FACR is executive vice president and chief medical officer, National Imaging Associates Inc, Upper Saddle River, NJ.

Brent O’Connell, MD, MHSA, is vice president of provider relations and senior medical officer, Highmark Blue Cross Blue Shield, Camp Hill, PA.

R. Norton Hall, MD, is medical director, Highmark Blue Cross Blue Shield, Erie, PA.

Timothy Moulton, MS, is senior vice president and chief information officer, National Imaging Associates Inc. The URL for the Web site described above is www.RadMD.com.