Measuring and Improving Surgical Quality

November / December 2009

Over the past decade, the number of quality measurement programs — both mandatory and voluntary — has grown exponentially as hospitals respond to public and government demands for greater transparency, accountability, and improved patient care. Many on the front lines of hospital quality improvement efforts may find it difficult to tell which measurement programs are having the greatest impact on patient care. At the end of the day, we all want to know: Is our quality of care improving?

To the public, it may seem quality and patient safety only came into the spotlight in the past decade with the release of the Institute of Medicine’s report To Err Is Human (2000). In fact, nearly a century ago Massachusetts physician Ernest Codman recognized the importance of tracking patient “end results” in order to improve care for future patients. Yet even Codman recognized that it would likely take “several generations” to achieve the quality improvement he envisioned.

Today, we are on the brink of significant change in our healthcare system. While quality of care has been an important focus in recent years, the spotlight has only grown brighter in recent months. Medical professionals, public and private health officials, and research scientists don’t always agree, but they do agree on one thing: If we are to improve healthcare by establishing quality surgical practices and clinical guidelines, we need to measure surgical quality using sound scientific evidence and appropriate and reliable data.

Quality Improvement Requires Quality Data
The first attempts to measure the quality of surgical care focused primarily on in-patient mortality and morbidity rates, based on administrative data. But as patients and payers began to demand more information from surgeons and hospitals about quality of medical care, developing new and better measurement tools took on a new urgency, and a range of quality improvement programs were developed to address this need. Approaches to measuring surgical quality tend to fit into one of three categories described by the Donabedian model (1966): structure, or the attributes of healthcare systems that are organized to deliver care; process, or what is done to and for patients; and outcomes, or changes in patients’ health status that may be linked to the healthcare process (Daley, 2002).

Structural measures cover a broad group of variables that reflect the setting in which surgical care is delivered, such as procedure volume, subspecialty training, nurse-to-bed ratios, or the presence of specific amenities, such as “closed” intensive care units or certain types of technology or equipment.

Structural measures are used frequently because they are often quickly, easily, and inexpensively obtained from administrative databases. For instance, the volume-outcome relationship has been used as a proxy for quality and as the basis for surgical quality initiatives. But for certain procedures, only a few hospitals generate enough volume to meet the definition of “high volume.” While the relationship of volume to outcomes has been demonstrated in several procedures, it is unclear how significant the relationship is. The feasibility of implementation also remains unclear in our current healthcare system.

Structural measures also may lead to increased disparities, since selective referral would likely allow only insured patients to go to high-volume centers, leaving the disenfranchised to go where they can — thus fueling disparities even further (Liu, 2006). In addition, favoring high-volume hospitals may overwhelm some hospitals that are already operating at capacity, while inhibiting the chances of lower-volume hospitals to improve (Daley, 2002).

Process measures assess the activities performed when healthcare professionals provide care to patients and are routinely used as quality indicators in nonsurgical specialties. Such measures address whether “good” medical care has been provided, such as completeness of clinical history, physical examination, and diagnostic testing; justification of diagnosis and therapy; technical competence in performing diagnostic and therapeutic procedures; evidence of preventive management; coordination and continuity of care; and acceptability of care to the recipient (Webb, 2008).

Until recently, the Centers for Medicare and Medicaid Services (CMS) and others have focused their quality improvement efforts on process measures, on the theory that by establishing best practices, outcomes will be optimized. One such program is the Surgical Care Improvement Project (SCIP).

The advantages of process measures are that they are actionable and often lead to increased adherence to the identified processes. A large number of perioperative care practices, including those aimed at preventing surgical site infection, venous thromboembolism, cardiac complications, and ventilator-associated pneumonia, have a high level of evidence supporting their effectiveness.

But it is difficult to identify meaningful process measures that have a demonstrated link to improved outcomes. Data linking processes and outcomes in randomized controlled trials remain very limited. At the same time, most outcomes have many processes associated with them, so a provider may perfect one or a few processes without those processes resulting in improved outcomes. In fact, recent studies have shown a lack of correlation between some of the SCIP process measures and risk-adjusted outcomes (Ingraham, 2009).

Furthermore, process measures may drive providers to “study to the test.” That is, providers improve their results on specific performance measures, but may not focus on non-performance measure issues that also impact outcomes. This unintended consequence was demonstrated in a recent study of a pay-for-performance program in the United Kingdom. Physicians whose pay was tied to certain performance measures improved their results on those measures, but quality declined in areas not tied to incentives. And, once the quality targets were reached, the rate of quality improvement slowed (Campbell et al., 2009).

The Importance of Risk-Adjusted, Clinical Data in Measuring Outcomes

Today, the focus is increasingly on outcomes measures. Outcome measures assess the effect of care on the health of patients and populations, such as 30-day mortality and morbidity rates, length of stay, readmission rates, patient satisfaction, health-related quality of life, cost-effectiveness, and resource use. Given that improving patient outcomes is the provider’s main goal, these measures have validity with surgeons and are likely to get the greatest buy-in from them. Patients better understand what outcomes are, while they may not understand the importance of processes. And studies have shown that measuring outcomes, in and of itself, routinely leads to improved outcomes.

Past attempts to measure performance outcomes have run into opposition from some physicians, hospitals, and patient advocacy groups that are concerned the programs are designed to steer patients to the least expensive, not the best, physicians (Hensley, 2007), because:

  • the evidence was not risk-adjusted—that is, it did not take into consideration the health risks posed by the condition of the patient, and
  • the evidence used came from billing data gathered through an administrative process, not from the medical chart data gathered by clinically trained personnel, and therefore it had limited value for quality measurement and improvement.

Claims data are often inadequate because they are intended to function as bills, not as clinical records, and administrative data are limited, inconsistent, and subject to misinterpretation when used to measure performance (Hall et al., 2007). Specifically, administrative data do a poor job of assessing quality because they routinely lack the clinical detail needed to evaluate and measure quality of care, regardless of whether the measure is a process or an outcomes measure. And although administrative data can be useful in measuring mortality rates, they are a weak tool for measuring morbidity and complications occurring within hospitals, and especially complications arising after the patient leaves the hospital — and studies show the majority of complications occur within 30 days after discharge.

We all know the phrase, “garbage in, garbage out.” Without quality data, it is difficult to truly measure and improve quality of care. The most effective approach to measuring performance is to use risk-adjusted data that is gathered by clinically trained personnel from medical charts.

One program that takes this approach is the National Surgical Quality Improvement Program (NSQIP), a nationally validated, risk-adjusted, outcomes-based approach to measure and improve the quality of surgical care. NSQIP employs a prospective, peer-controlled, validated database to quantify 30-day, risk-adjusted surgical outcomes, which provide a valid comparison of outcomes among all hospitals in the program.

The NSQIP’s primary objective is to gather data of the highest quality. Rather than relying on administrative data, NSQIP requires a trained surgical clinical reviewer to collect and validate data on a wide range of variables based on a patient’s condition and risk factors before, during and after surgery. The reviewer submits the data to a secure Web site where they are subject to built-in software checks and routine audits.

Improving Quality through Quality Data: History of NSQIP
The NSQIP dates back to the mid-1980s, when the Department of Veterans Affairs (VA) came under public scrutiny over the quality of surgical care in its 133 hospitals. In 1986, Congress mandated that the VA report its surgical outcomes annually and compare them with the national average. Given that VA patients were, on average, older and sicker than patients in private hospitals, it became apparent that it would be necessary to develop a way to adjust these outcomes based on the severity of a patient’s illness or condition.

The challenge in complying with this mandate was that there were no national averages, nor were there any risk-adjustment models for the various surgical specialties. As a result, the surgeons at the VA built a statistically reliable database of patients’ pre-operative risk factors and post-operative outcomes, and then created methods for accurate risk adjustment. Using clinical data collected on pre-operative, intra-operative and 30-day outcome variables from a total of more than 117,000 major operations, the VA surgeons developed risk models for 30-day mortality and morbidity in nine surgical specialties.
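Risk models of this kind relate a patient’s pre-operative characteristics to the probability of a 30-day outcome. As a rough illustration only — with made-up coefficients and hypothetical variable names, not the actual NSQIP models, which are fit per specialty to large clinical datasets — a logistic risk model can be sketched like this:

```python
import math

# Illustrative sketch only: a logistic risk model of the general form used
# for surgical risk adjustment. All coefficients and variables here are
# hypothetical; real NSQIP models are fit per specialty to clinical data.
COEFFS = {
    "intercept": -7.0,    # baseline log-odds (hypothetical)
    "age_decade": 0.35,   # effect per decade of patient age
    "asa_class": 0.60,    # effect per ASA physical status class
    "albumin_low": 0.80,  # indicator: low serum albumin
}

def predicted_mortality(age_decade, asa_class, albumin_low):
    """Return a model-predicted 30-day mortality probability for one patient."""
    z = (COEFFS["intercept"]
         + COEFFS["age_decade"] * age_decade
         + COEFFS["asa_class"] * asa_class
         + COEFFS["albumin_low"] * albumin_low)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function maps log-odds to (0, 1)

# Example: a patient in their 70s (7th decade), ASA class 3, low albumin.
p = predicted_mortality(age_decade=7, asa_class=3, albumin_low=1)
```

Summing such predicted probabilities over a hospital’s patients yields the “expected” event count that makes severity-adjusted comparison between hospitals possible.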

In 1994, the VA established the NSQIP for monitoring and improving the quality of surgical care across all VA medical centers performing major surgery. Between 1991, when systematic data collection began, and 2006, 30-day post-operative mortality at VA hospitals declined by 47% and morbidity by 43% — a significant improvement (Khuri et al., 2007).

Researchers concluded that NSQIP was a significant factor in the VA’s improvement in outcomes, suggesting that “the private sector has a lot to learn from a system that was early to adopt the electronic medical record, physician order entry, and other processes of care facilitated by its more centralized infrastructure” (Hutter et al., 2007). Other researchers noted that by providing individualized reports to participating hospitals — including site-specific feedback on risk-adjusted outcomes highlighting areas for quality improvement, as well as information on best practices from high-performing hospitals (Main et al., 2007) — NSQIP delivered the kind of feedback shown to be effective in changing physician behavior (Grimshaw et al., 2001).

Because of NSQIP’s success in the VA, private sector hospitals became interested in the program. After a 1999 pilot study established NSQIP’s feasibility in private hospitals, the American College of Surgeons (ACS), the world’s largest professional organization of surgeons, expanded the program into the private sector as ACS NSQIP, with support from the Agency for Healthcare Research and Quality (AHRQ). Because the data are clinical and collected by a third party specifically trained and certified to collect them, there is great buy-in from surgeons — which is absolutely necessary to effect quality improvement in surgery.

ACS NSQIP now collects data on about 140 variables, including preoperative risk factors, intra-operative variables, and 30-day postoperative mortality and morbidity outcomes for patients undergoing major surgical procedures in both the inpatient and outpatient setting. Risk-adjusted 30-day morbidity and mortality outcomes are computed for each participating hospital. Outcomes are reported as observed versus expected (O/E) ratios and are distributed in a semi-annual report. Best practices from hospitals with superior performance are identified and shared among participants.
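The O/E computation itself is straightforward: a hospital’s expected event count is the sum of its patients’ model-predicted probabilities, and the ratio compares observed events with that expectation. A minimal sketch, using entirely hypothetical patient probabilities:

```python
# Minimal sketch of an observed-vs-expected (O/E) ratio — not the actual
# NSQIP computation. The predicted probabilities below are hypothetical.

def oe_ratio(observed_events, predicted_probs):
    """O/E = observed event count / sum of model-predicted probabilities.
    Above 1.0: more events than the case mix predicts; below 1.0: fewer."""
    expected = sum(predicted_probs)
    if expected == 0:
        raise ValueError("expected count is zero; O/E is undefined")
    return observed_events / expected

# Hypothetical hospital: 150 patients whose predicted complication
# probabilities sum to 22.5 expected events, with 12 events observed.
predicted = [0.05, 0.10, 0.30] * 50
ratio = oe_ratio(12, predicted)  # about 0.53: fewer events than expected
```

Because the expected count is built from each hospital’s own case mix, an O/E ratio near 1.0 means performance in line with what that hospital’s patient risk profile predicts, which is what allows fair comparison across hospitals with very different patient populations.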

Soon, ACS NSQIP will be further streamlined. Rather than randomly sampling cases, participants will target the cases that tend to be associated with complications. If the bottom line is to identify areas where we can improve quality, it is logical to look at the cases that tend to have the complications. ACS NSQIP will soon require less data collection as well. While the program was adopted from the VA with about 140 variables, advanced statistical techniques now allow risk and case-mix adjustment with far fewer variables — hence, the amount of data collection will be decreased by roughly 50%. Data will continue to be audited to ensure validity and data quality.

Knowing you have poor outcomes doesn’t automatically give you the answer on how to improve. Thus, ACS NSQIP also provides Best Practice Guidelines and case studies to assist hospitals in their quality improvement efforts. Many hospitals have also found that working together in a collaborative effort fosters a common spirit of quality improvement and helps hospitals learn from each other. Local collaboratives can jointly address common and costly complications, such as surgical-site infections, by collectively reviewing their outcomes and developing shared goals.

Some argue that if outcomes are going to be measured, then the poorer performing providers might start to refuse treatment for high-severity, complex patients, thus potentially limiting access to care. However, looking at it from another angle, it could actually be a good thing to refer such patients to the providers with better results.

Quality Is Local: The Value of Clinical, Outcomes-Based Data
It’s been said that if you’ve seen one hospital, you’ve seen one hospital. Quality improvement is local. Quality data will point to the issues that need to be addressed, but in order to improve performance, each hospital must evaluate its own outcomes. Since not all hospitals have the same problems — the same cultures, the same patient populations, or the same resources — they might not have the same fixes. And those fixes are not a one-time effort, but a continuous, iterative process.

A key advantage to ACS NSQIP is that each hospital can use its own analytical tools, or those provided by ACS NSQIP, to drill down into its own NSQIP data to search for areas to improve. This way it can determine, for example, if sub-par outcomes are due to the hospital’s processes or the way a treatment is affected by a disease or condition. Quality improvement takes a team effort, and most hospitals find quality issues must be addressed both inside and outside the operating room, with all members of the operating room team.

Although it may be more time-intensive than other quality measurement programs based solely on claims data, clinical studies have concluded that ACS NSQIP not only provides a highly reliable data system to compare risk-adjusted outcomes, but it also provides robust data to allow for intensive quality improvement efforts at the local level (Rowell, 2006). Examples include:

  • St. John Hospital and Medical Center, an 800-bed tertiary care center in Detroit, which used NSQIP data to develop a risk assessment tool and a preoperative pulmonary rehabilitation program that enabled it to reduce the number of patients on ventilation post-operation.
  • Cuyuna Regional Medical Center, a 32-bed multi-specialty hospital in rural Minnesota that was able to reduce stroke incidence from above national average to below average by using NSQIP to implement a new anesthesia plan and standardized anticoagulation orders and protocols. At the same time, the initiative also improved the overall Quality Improvement (QI) process at the hospital, including communication between surgeons and nurses, and QI documentation and monitoring/follow-through.
  • Decatur General Hospital, a 178-bed, non-profit community hospital in Alabama, focused on reducing its rate of urinary tract infections (UTI), a common infection that is on the CMS list of “never events.” After using NSQIP to find that its UTI rate was three times that of its peers, Decatur General reduced its UTI rate from 2.6% to 0.8%, saving $42,000 in the first quarter alone. That also led to a reduction in direct care costs, length of stay, and hospital readmissions — and will help them avoid losing reimbursements for “never” events.

Quality Is an Action Sport
It’s not enough to just join a quality improvement program. Hospitals considering quality improvement initiatives must remember that quality improvement does not come without action. Too often hospitals begin a quality program and receive the results, only to have those results sit on the shelf or in a computer file. No hospital is perfect across the board. Even the best hospitals, those that show above-average outcomes, have areas for improvement. Nine of the top 10 hospitals on the U.S. News & World Report rankings participate in ACS NSQIP, demonstrating that even “top” hospitals are focused on quality improvement.

And among ACS NSQIP participants, even “good” hospitals were able to improve their quality. A study in the September 2009 issue of Annals of Surgery found 82% of ACS NSQIP hospitals improved their morbidity rates and 66% improved their mortality rates. Hospitals in the study prevented 250 to 500 complications per hospital per year, and they achieved 11 to 17% improvement in quality each year. The study found that all types of hospitals — large or small, rural or urban — were able to improve their quality, with those that were poorer performers when they enrolled in ACS NSQIP realizing the greatest quality improvements; even high performers saw significant improvements (Hall et al., 2009).

Surgical Quality Measurement: Looking Ahead
Healthcare reform discussions have brought a greater focus on measurement and public reporting. While there are many types of measurements, it will be important as we move ahead to focus on measurement efforts that are proven to improve quality and outcomes, and better consolidate the numerous measurement efforts in which hospitals must or choose to participate.

We can anticipate these changes in the coming years. CMS, which has previously emphasized process measures through such programs as SCIP, is now looking closely at risk-adjusted outcomes measures based on clinical registry data. The ACS recently completed a project with CMS to develop a vascular surgery outcomes measure that was developed using NSQIP variables and data. The measure was recently endorsed by the National Quality Forum.

As we look ahead, we’ll likely see an increasing focus on risk-adjusted outcomes measures based on clinical data. ACS NSQIP has demonstrated that quality improvement can start with outcomes and then work back to the individual processes that hospitals need to fix in order to improve. And even good hospitals will be able to improve. The ability to collect the highest quality data will be further enhanced by the implementation of health information technology, including electronic medical records, so that we have easier access to quality data.

And as our population ages, quality improvement efforts will only become more important and more complex. Improving the quality of care for elderly patients requires a multidisciplinary team approach and strong communication that will further expand quality improvement efforts.
In the end, these efforts will allow us to do what we desired to do when we entered medicine—improve patient care.

Clifford Ko is director of the Division of Research and Optimal Patient Care of the American College of Surgeons. Ko is a practicing surgeon and serves as professor of surgery and health services at UCLA Schools of Medicine and Public Health, director of UCLA’s Center for Surgical Outcomes and Quality, and a research scientist at RAND Corporation. He holds a medical degree, a BA in biology, and an MS in biological/medical ethics from the University of Chicago, and an MSHS in health service research from the University of California. Ko may be contacted at

Campbell, S.M., et al. (2009). Effects of pay for performance on the quality of primary care in England. New England Journal of Medicine, 361, 368-371.

Daley, J. (2002). Quality of care and the volume-outcome relationship—What’s next for surgery? Surgery 131(1), 16-18.

Donabedian, A. (1966). Evaluating the quality of medical care. Milbank Memorial Fund Quarterly 44, 166–206.

Grimshaw, J. M., Shirran, L., Thomas, R., et al. (2001). Changing provider behavior: An overview of systematic reviews of interventions. Medical Care 39 (Supplement 2), 112–145.

Hall, B. L., et al. (2007). Comparison of mortality risk adjustment using a clinical data algorithm (American College of Surgeons National Surgical Quality Improvement Program) and an administrative data algorithm (Solucient) at the case level within a single institution. Journal of the American College of Surgeons, 205, 767-777.

Hall, B.L., Hamilton, B. H., et al. (2009). Does surgical quality improve in the American College of Surgeons National Surgical Quality Improvement Program. Annals of Surgery 250, 1-14.

Hensley, S. (2007, October 18). NY AG Cuomo on warpath over insurers’ doctor ratings. Wall Street Journal Health Blog. Available at

Hutter, M., et al. (2007). Vascular surgical operations in men. Journal of the American College of Surgeons, 204, 1115-1126.

Ingraham, A. M. (2009). Measuring the quality of surgical care: Process versus outcomes. ACS NSQIP Semiannual Report, 14.

Institute of Medicine. Committee on Quality of Health Care in America. (2000). To err is human: Building a safer health system. Kohn, L. T., Corrigan, J. M., Donaldson, M. S. (Eds.). Washington, DC: National Academy Press.

Khuri, S. F., et al. (2007). The patient safety surgery study. Journal of the American College of Surgeons, 204, 1087-1088.

Liu, J. H., et al. (2006). Disparities in the utilization of high-volume hospitals for complex surgery. Journal of the American Medical Association, 296(16), 2026-2027.

Main, D. S., et al. (2007). Relationships of processes and structures of care in general surgery to post-operative outcomes: A descriptive analysis. Journal of the American College of Surgeons, 204, 1157-1165.

Roland, M. (2004). Linking physicians’ pay to quality of care — A major experiment in the United Kingdom. New England Journal of Medicine, 351, 1448-1454.

Rowell, K. S., et al. (2006). Use of National Surgical Quality Improvement Program data as a catalyst for quality improvement. Journal of the American College of Surgeons 204, 1293-1300.

Webb, A., et al. (2008, February). Approaches to assessing surgical quality of care. Hospital Physician, 29-37.