Patient Satisfaction Standard: Meaningless If Not Done Right

by Robert Nielsen

This column is the second of a two-part series.

Last week's commentary discussed Gallup's stance on the proposed initiative by the Centers for Medicare and Medicaid Services (CMS) to mandate a national standard for measuring patient satisfaction (see "National Patient Satisfaction Standards: A Leading -- or Misleading -- Edge?" in Related Items). We question the virtue of this mandate for ideological as well as methodological reasons. From an ideological standpoint, the potential outcomes are troubling: hospitals that fail to meet the rating standards for patient satisfaction could see reduced reimbursement from Medicare, Medicaid or employer health plans. The process appears to include little effort to improve low-performing hospitals, while carrying significantly negative potential consequences for those hospitals and their patients.

This week, I will focus on more fundamental issues regarding the research methodology for the proposed CMS mandate. Patient satisfaction surveys will be conducted with recent hospital patients. The methodology and questions for this poll will be the responsibility of the Agency for Healthcare Research and Quality (AHRQ), another agency within the U.S. Department of Health and Human Services.

Gallup's 67 years of experience as the world's best-known polling organization have given us considerable expertise in the science of collecting and reporting on public opinion. Much of our concern about the CMS patient poll stems from an assessment of the current California patient satisfaction survey initiative conducted by the National Research Corporation/Picker, called the PEP-C (Patients' Evaluation of Performance in California) survey. Given the apparent alliance between NRC/Picker and CMS, PEP-C may ultimately serve as a model for the nationwide CMS-mandated surveys.

High-quality survey science requires extensive research on the potential value of each question to be included, the way in which those questions should be asked, and the procedure for analyzing the resulting data. We feel that many such scientific issues have not been adequately addressed by the PEP-C project. Several specific concerns are discussed below.

Response Rates -- If 600 patients are given the opportunity to respond to a mail survey, as is the case for most of the hospitals participating in the PEP-C survey, and only 200 actually respond (a 33% response rate), there is no certainty that the attitudes of those who responded are representative of all patients. When response rates are consistently below 60%, responsible research demands ongoing tests of the potential bias of non-respondents -- i.e., to determine if they may be different from those who did respond in any consistent and relevant ways. For its own surveys, the federal government requires non-response research for surveys with response rates below 70%. Past research has shown that mail surveys tend to systematically under-represent men, young adults and minorities and over-represent women, older adults and Caucasians.
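The arithmetic above can be sketched in a few lines. This is a toy illustration, not the PEP-C method: the figures for the share of men in the patient population versus among respondents, and the five-point flagging threshold, are hypothetical values chosen to show the kind of non-response check the paragraph calls for.

```python
# Toy non-response check: compute the response rate, then compare a
# respondent demographic against the full patient frame to flag
# possible non-response bias. All figures are hypothetical.
mailed = 600
returned = 200
response_rate = returned / mailed
print(f"response rate: {response_rate:.0%}")

# Hypothetical share of men among all surveyed patients vs. among respondents.
frame_pct_male = 0.48
respondent_pct_male = 0.38

# Arbitrary 5-percentage-point threshold for flagging a mismatch.
if abs(respondent_pct_male - frame_pct_male) > 0.05:
    print("respondents may under-represent men; consider weighting or follow-up")
```

In practice, a weighting adjustment or targeted follow-up of non-respondents would follow such a flag; the point here is only that the check requires knowing the composition of the full sample frame, not just of those who mailed the survey back.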

Validity and Reliability -- Internal validity refers to whether links exist between overall patient satisfaction and the individual items on the survey, ensuring that survey questions are relevant to the overall goal of improving general satisfaction. Survey reliability refers to the consistency of the scores across different samples. In other words, are findings on overall patient satisfaction consistent in their predictive value when the same questions are asked of different samples of respondents? Researchers often find that surveys with low response rates may appear to have validity, but little or no reliability. The PEP-C project does not report validity or reliability data, nor does it appear to include non-response bias analysis.

Linkage to Outcomes -- Assuming that a survey methodology can be executed with adequate internal validity and reliability, the questions themselves must also be linked to external criteria -- outcomes that are separate from the survey itself, such as reduced medical errors, lower infection rates or superior clinical results. The PEP-C survey specifically states that no linkages to clinical outcomes have been discovered. Without establishing such linkages, it's possible that consumers will be misled about the meaning of the results. The PEP-C surveys use a three-star rating system: What are the statistical demarcations that differentiate the one-star hospitals from three-star hospitals? What are the expected ranges of error? As a patient, would it be better to choose a hospital by its star rating rather than by a physician's recommendation? These are all details that should be made clear in any presentation of the data.
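The "expected ranges of error" question can be made concrete with a back-of-the-envelope confidence-interval calculation. The two hospitals, their mean scores, standard deviation and sample sizes below are hypothetical, not drawn from PEP-C; the sketch simply shows how overlapping intervals can make a rating gap statistically meaningless.

```python
import math

def mean_ci(mean, sd, n, z=1.96):
    """95% confidence interval for a sample mean (normal approximation)."""
    half = z * sd / math.sqrt(n)
    return mean - half, mean + half

# Two hypothetical hospitals on a 0-100 satisfaction scale.
a_lo, a_hi = mean_ci(mean=82.0, sd=15.0, n=200)
b_lo, b_hi = mean_ci(mean=79.0, sd=15.0, n=200)
print(f"Hospital A: {a_lo:.1f}-{a_hi:.1f}")
print(f"Hospital B: {b_lo:.1f}-{b_hi:.1f}")

# If the intervals overlap, a star rating that ranks A above B may not
# reflect a real statistical difference between the two hospitals.
print("intervals overlap:", a_lo < b_hi and b_lo < a_hi)
```

A rating system that assigns different star counts to hospitals whose confidence intervals overlap is drawing distinctions the data cannot support, which is precisely why published results should state sample sizes and error ranges alongside the stars.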

If the CMS patient assessment is to truly represent patient attitudes, we believe the following issues must be addressed:

  1. Upfront research must be conducted to determine if the survey questions have more than "face" validity -- i.e., they appear on the surface to be appropriate questions. The questions should be shown to reliably predict patient satisfaction and be linked to important outcomes. Currently, AHRQ requests that survey vendors submit questions that fit into eight categorical constructs, all but one of which were developed by Picker for a failed assessment initiative in Massachusetts, and are currently being used in California. The categories appear to have only face validity with no basis in empirical research. AHRQ should test the validity and reliability of both the questions and the categorical constructs.
  2. A rigorous sampling method must be used so that survey results accurately represent the larger population of patients. In support of full disclosure, every result published for every hospital rated should include sample size, response rates, validity and reliability measures, data collection period and expected ranges of error.
  3. If the response rates are less than 70%, AHRQ should mandate a process to analyze potential non-response bias.
  4. If a rating system is established, it must be based upon a statistically defined measure. For example, if a "star" system is used, there should be a proven statistical difference between one, two or three stars.
  5. All published results should identify the linkages that have been established between the ratings and outcomes that will be important to consumers. Results should also include a statement about what has not been established if consumers are likely to presume linkages that don't exist.

Gallup has committed millions of dollars to research and development of its programs for measuring patient satisfaction for its hospital clients. We hope that CMS and AHRQ will be just as diligent in researching and developing their processes. There is a lot at stake for CMS, America's hospitals and the nation's healthcare consumers.


Gallup https://news.gallup.com/poll/6781/Patient-Satisfaction-Standard-Meaningless-Done-Right.aspx