St George's, University of London, London, United Kingdom
Correspondence and requests for reprints should be addressed to Paul W. Jones, Ph.D., F.R.C.P., Respiratory Medicine, St George's, University of London, Cranmer Terrace, London SW17 0RE, UK. E-mail: pjones{at}sgul.ac.uk
ABSTRACT
Disturbances to health can be divided broadly into limitations of basic activities of daily living (washing, dressing, etc.) that are common to all patients and other limitations that depend on individual circumstances. A distinction should be drawn between health status and health-related quality of life. Health status questionnaires are standardized for "typical patients," and this should be borne in mind when interpreting such scores from studies that give an average result using population-based measurements. Such studies give a good indication to payers of the average effects of a treatment, but provide no indication other than probability of benefit to individuals. This applies even more with health improvements that manifest uniquely in each patient. Two widely used disease-specific questionnaires in chronic obstructive pulmonary disease, the Chronic Respiratory Questionnaire and the St. George's Respiratory Questionnaire, are health status, rather than quality of life, instruments. Health status scores from questionnaires provide measures of the effects of disease, not measures of the disease itself. The relationship between a high health status score and increased risk of dying is due to the fact that both reflect underlying disease activity. In therapeutic studies, health status improves because the underlying disease activity has been moderated by therapy. Improved health does not improve mortality or morbidity per se. It is also important to appreciate that the impact of any measured change in health may also be determined by the patient's baseline state.
Key Words: chronic obstructive pulmonary disease; measurement; quality of life; questionnaires
At first sight, there are obvious differences between patients and payers: one receives and the other delivers medical care. This situation, however, is changing and the role of the patient is transforming from being a passive recipient to a participant in a partnership, or, even more radically, patients are becoming the active commissioners of their care. There are other differences in perception or viewpoint (Table 1). For example, individual patients are concerned about their illness, how it affects their life, and the efficacy of their specific treatments. By contrast, payers must deal with the generality of the disease and large groups of patients. A major area of agreement is that payers and patients wish, respectively, to provide or receive care for which there is evidence of efficacy. Regardless, even here there are two different perspectives. The payer wishes to know the average effect of a treatment, whereas patients want to know how well it will work in them.
Most evidence for treatment efficacy comes from randomized controlled clinical trials (RCTs). These are treated as a "gold standard" method of collecting evidence for efficacy, although the precise reasons for this are less well understood. The RCT design is a method for reducing bias in the estimate of a treatment's effect, but only certain biases are reduced. Such biases are due to a lack of concealment of treatment allocation (blinding). Lack of blinding can alter behaviors on the part of the patient or the investigator. Lack of blinding can lead to an overestimation of a treatment's effect (1). Randomization, the other major component of the RCT, is designed to ensure that patients in all treatment arms are matched as closely as possible at the start of the trial.
There are other sources of bias, however, that do reflect the differing objectives and requirements of patient and payer. These are often explicit in terms of the inclusion and exclusion criteria. In clinical trials with new therapy for chronic obstructive pulmonary disease (COPD), for example, those patients with a history of asthma are excluded because there is a concern that, because asthma is a disease of greater reversibility in airway obstruction, inclusion of any patients with any "asthmatic tendency" will provide an overestimate of the treatment efficacy. Indeed, tight requirements concerning evidence of reversibility have restricted recruitment of patients to those in whom a small degree of response was to be expected. This is paradoxical behavior because most of the treatments are directed toward improving airway function.
A less obvious, but equally explicit, source of bias has been the choice of outcome measure used to quantify the effects of treatment. These are frequently set by the agencies responsible for the registration of new drugs for COPD, and they all require evidence of improved airway function measured as the FEV1. This has important implications for the design, execution, and analysis of the trial because investigators will invest more time, resources, and care into the accurate collection of the "primary outcome" data than other "secondary outcomes." Furthermore, the study size will be determined by calculations about the minimum numbers of patients that need to be recruited to each arm of the study to ensure that it is powered sufficiently to detect a planned size of treatment effect.
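The dependence of study size on the chosen primary outcome can be illustrated with the standard normal-approximation formula for comparing two means. The sketch below is illustrative only. The standard deviations (300 ml for the FEV1 and 16 units for an SGRQ-like score), the 5% two-sided significance level, and the 80% power are assumptions chosen for the example, not values taken from any trial. The target differences of 100 ml and 4 units are the thresholds discussed later in this article.

```python
import math
from statistics import NormalDist


def patients_per_arm(difference, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm comparison of means.

    Uses n = 2 * (z_(1-alpha/2) + z_power)^2 * (sd / difference)^2,
    the usual normal approximation for a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    n = 2.0 * (z_alpha + z_power) ** 2 * (sd / difference) ** 2
    return math.ceil(n)


# Hypothetical inputs: both standard deviations are assumptions.
n_fev1 = patients_per_arm(difference=100.0, sd=300.0)  # FEV1 in ml
n_sgrq = patients_per_arm(difference=4.0, sd=16.0)     # score in units

print("patients per arm, FEV1 as primary outcome:", n_fev1)
print("patients per arm, health status as primary outcome:", n_sgrq)
```

Under these assumed inputs, a trial sized to detect the FEV1 difference would recruit fewer patients than one sized for the health status difference, which suggests how a secondary outcome can end up less precisely estimated than the primary one.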
The choice of trial endpoint is the point at which the needs of patients and payers begin to diverge most markedly. For patients, the outcomes that are important are the ones that they experience directly: their symptoms, disability due to lack of exercise tolerance, malaise and fatigue, exacerbations, and impaired quality of life. Loss of FEV1 is not something that they experience. They suffer from the consequences of impaired lung function, of which the FEV1 is just one marker. Although there are good pathophysiologic grounds for using the FEV1 as a measure of expiratory airflow limitation, its use as a primary outcome measure for clinical trials in COPD has also been conditioned by factors such as ease of measurement, standardization, repeatability, and reliability across investigative sites. The issue of choice of outcome is broader than just selecting alternatives to the FEV1.
ENDPOINTS IN CLINICAL TRIALS
Clinical trial endpoints serve a number of separate purposes. Primarily, they should be used to assess benefit in terms of patient-experienced clinical outcome; however, it is exceedingly rare for clinical trials in COPD to specify a clinical outcome as a primary endpoint of the trial. There are a number of reasons for this, not just familiarity with an established trial endpoint (e.g., the FEV1), together with custom and practice. Some outcomes are difficult to measure directly (e.g., restriction of daily activity), so markers of the outcome are used. These are surrogate measurements that are known to be related to the clinical outcome of interest. For example, laboratory measurement of exercise capacity is a marker for restriction of daily activity. Quite often, an outcome's principal marker is complex or difficult to measure reliably (e.g., cardiopulmonary exercise tests), so a surrogate marker may be used in place of the principal marker. Measures of lung function are used in this way, which is another reason why selection processes for clinical trial outcomes frequently lead back to the FEV1.
Another reason for including markers as clinical trial outcomes is the need to understand the mechanism of a treatment's action. It is important to demonstrate that the therapy was effective through its expected mode of action and that there is a clear biological "audit trail" from basic mechanism within an organ (e.g., bronchodilatation) to secondary effects within that same organ (e.g., a reduction in resting lung volumes and less dynamic hyperinflation upon exercise). These have benefits on whole body function (e.g., exercise capacity), which in turn lead to an improvement in disability. Only the last of these is a clinical outcome as experienced directly by the patient. The correlation between a marker and its outcome may not be very tight, and there may be other factors that determine the outcome. In the context of this example, skeletal muscle weakness is one such factor. Muscle weakness modifies (indirectly) the relationship between the marker (airway function) and its outcome (disability). Although markers give important insights into mechanisms, it is also necessary to quantify the clinical outcome directly if at all possible.
It is important to note that this example of an outcome and its markers uses a well-characterized pathway: lung function, exercise, and disability. In part, our understanding of this pathway has developed because most pharmacologic therapies for COPD have previously been directed toward improved airway function. That situation is changing. The role of exacerbation as an important outcome in COPD is now clearly established (2–4), as is its selection as a therapeutic target (5, 6). Indeed, several recent, large clinical trials have, at the outset, specified exacerbations as being a key endpoint (7–9). Future developments in therapies for COPD may change the requirements for markers considerably. For example, there are currently no well-validated markers of the symptoms of chronic bronchitis: cough and sputum production. This clinical outcome is very troublesome for patients and it is possible that the lack of adequate markers has hampered research in this area.
A further important requirement of endpoints in a clinical trial is that they provide an overall estimate of the treatment's efficacy. COPD is a highly complex disease with multiple mechanisms. There are also multiple potential sites of action for drugs. Markers of treatment effects need to have high specificity if they are to be used to quantify very specific pathobiological effects, as is needed to provide clear evidence for a drug's mechanism of action, for example. When there are multiple effects (for example, improved lung function leading to better exercise capacity and less sleep disturbance), there is a strong case for measuring both, using specific techniques, although this approach will not provide an overall measure of the treatment's effect. This becomes more complex if the therapy has multiple mechanisms of action that are unrelated. For example, some beneficial effects on lung function and exacerbations may occur through entirely separate pathways. There is often a need for integrative measures that sum up all of the treatment effects. Cardiopulmonary exercise tests provide a very useful role in integrating the function of lung, heart, and skeletal muscles into a single high-level marker, but they do not address factors such as sleep, exacerbations, cough, mood, and so on. The only measures that have the potential to address all areas of impairment due to COPD are health status questionnaires because, in theory at least, these can be designed to cover every aspect of COPD.
QUALITY OF LIFE IN COPD
COPD causes major restrictions on patients' exercise tolerance that frequently have a major impact on the level of daily activity that they can sustain and, therefore, reduce quality of life. The level of impact of exercise intolerance will depend on the individual patient's social and domestic circumstances as well as his or her employment and hobbies and the demands that these place on his or her physical capacity. Many other factors are important. For example, effects of COPD on sleep, exacerbations, and mood are relatively easy to identify and quantify, but others, such as embarrassment due to symptoms, social isolation, and stigmatization, are not. Perhaps the most difficult area to assess is expectations: what patients wish to achieve, what they think that they can achieve for themselves, and what the impact of the disease is on the way they perceive themselves. It is reasonable to assume that patients in general want to experience good health, with minimal symptoms and limitation in their daily lives, together with a high sense of well-being. This should minimize the impact of their disease on their quality of life. Having a given disease is not necessarily the major determinant of impaired quality of life, however, even in patients with very serious disease (10). Furthermore, the relative contribution of health-related factors to overall quality of life may change after treatment (11).
Although for patients the clinical trial endpoint that is of greatest importance is the treatment's effect on their health-related quality of life, this outcome is very difficult to assess due to the highly individual nature of people, their lives, and their circumstances. Clinical trial results have to be quantifiable, so methods have to be found that permit quantification of quality of life to produce a valid numeric estimate. The challenge for those interested in measuring health-related quality of life lies not with the methods of turning patients' responses to questionnaires into numbers (there is a well-established science and suite of tools with which to do that) but with identification of the items to include in such a questionnaire.
Measurement implies standardization; in terms of questionnaire development, that means giving all patients the same questions to answer. The application of modern methods of measurement, such as item response theory, to questionnaire design and scoring may allow inclusion of different items within a questionnaire for different, well-characterized groups of individuals (for example, men versus women), but generally all patients have to respond to all items. That means that all items have to be at least potentially applicable to all patients to whom the questionnaire would be administered. The resulting questionnaire is therefore composed of a set of core or common items that apply generically to all patients with COPD; that is, individuality is removed. This may be illustrated by the example of playing with grandchildren. Most patients with COPD are of an age at which they could have grandchildren. For some of them, playing with their grandchildren is an activity that is an important contributor to their quality of life; however, not all of them have grandchildren of an age who would wish to play with their grandparents, or live close enough to them. Indeed, not all patients with COPD would want to play with children! It is possible to ask the question: "If you had grandchildren, and if you could play physically active games with them, would you want to do so?" but that question contains too many hypothetical conditions for it to be reliable. This specific example shows why questionnaires have to be designed to address general effects of COPD rather than the very detailed way in which the disease can affect the lives of individual people. It is for this reason that questionnaires such as the Chronic Respiratory Disease Questionnaire (CRQ) (12) and the St. George's Respiratory Questionnaire (SGRQ) (13) are more accurately termed health status questionnaires, rather than health-related quality-of-life questionnaires.
HEALTH STATUS MEASUREMENT
There are three broad types of health status questionnaire (Table 2). Each has its purpose. Disease-specific measures are targeted at patients with COPD directly, so they should be able to identify small differences between levels of disease severity (discriminative properties) and be sensitive to clinically worthwhile changes with therapy (evaluative properties). Generic questionnaires may have reasonably good discriminative properties in COPD, but may not be very sensitive to change (14). Utility instruments come from a slightly different perspective than the other two. They are generic questionnaires, but their design is driven from a health economic perspective in which death is included as a health state. They may have discriminative properties in COPD, but will be weak at evaluating changes. Although most of the evidence for treatment efficacy comes from disease-specific measures, utility estimates are being applied increasingly in the context of health-economic treatment evaluation by agencies such as the National Institute for Clinical Excellence in the United Kingdom. These instruments use methods of weighting for the importance of each item with techniques that are based largely on a societal perspective rather than the perspectives of patients. A pattern can be seen to be emerging: individual quality of life cannot be measured directly, so a marker is used: a disease-specific health status score. COPD is only one of many diseases, so comparisons between diseases are made using questionnaires that have limited sensitivity in COPD. In other words, decisions about availability of new treatments for COPD are being made on the basis of instruments that are insensitive in this disorder and that view the disease from the perspective of society more generally, not from that of patients with COPD in particular. Payers and patients agree that health is the important outcome of treatment, but they have different perspectives on the way in which this is measured.
Health status questionnaires have been used increasingly in clinical trials over the last few years and much is being learned about their application in this setting. Perhaps their most important role is to assess the overall effect of treatment, but this application has highlighted some important biases that occur during a clinical trial that may have a significant impact on the conclusions to be drawn from it (18).
Health status measurements may be particularly sensitive to some of these biases because they are designed to draw together a range of manifestations of disease. One potential bias is the use of relatively short run-in periods, the implications of which emerged with the observation that SGRQ scores continued to recover for many months after an exacerbation (4). For example, between 4 and 26 wk after an exacerbation, the SGRQ score improved by 7 units (almost twice the threshold of clinical significance). This slow recovery period has taken on a greater significance now that a history of at least one exacerbation in the preceding year is included as a criterion for entry into a trial. This may contribute in part to the small improvement in health usually seen in the placebo arm of clinical trials over the first few months of a study. Another factor contributing to this improvement may be a "Hawthorne effect" (derived from an industrial study in which work productivity improvements were found to reflect the attention of the researcher, not interventions such as improved lighting). That is, the behavior of the patients and their clinician coupled with closer medical attention on entry to the study could produce small improvements in health.
Clinical trials have tight inclusion and exclusion criteria, but by the time the trial has ended many patients will have withdrawn. This process of trial dropout starts during the run-in period, which usually consists of a period of 2 to 6 wk in which treatments of a similar class to those to be included in the trial are withdrawn. A number of patients may then have exacerbations and not even reach the point at which they would be randomized to treatment (15). This is important because there is evidence that patients with more severe disease are more likely to have exacerbations (16). After randomization, patients characteristically withdraw more from the placebo arm of a study than the active treatment arm (7–9, 17). There is also evidence that the patients who withdraw have more severe COPD than those who remain in the study (18). Because patients from the placebo arm are more likely to withdraw, this "healthy survivor effect" will be greatest in the placebo group. Methods of imputing missing results are used in clinical trials, the most common being the technique of "last observation carried forward." This may overcome the problem to a degree, but it cannot take into account changes that would have occurred in patients after they withdrew. It is known that health status declines progressively, even in patients who are on treatment (3), and patients who drop out of trials not only have more severe disease but also have health status that is declining faster (Figure 1) (18). Given that the trajectory of their deterioration is steeper than that of the patients who remain in the study, it is likely that the "last observation carried forward" approach underestimates the size of the difference between the active treatment and placebo groups.
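The direction of this bias can be shown with a small deterministic sketch. Every number in it is hypothetical and chosen only for illustration: a score on which higher means worse health, two subgroups that deteriorate at different rates, a constant treatment benefit of 4 units, and withdrawal at 3 months by 40% of the fast decliners on placebo but only 20% of those on active treatment. It is a simplified model under these assumptions, not a reanalysis of any trial.

```python
def arm_mean(effect, dropout_frac, locf,
             baseline=50.0, slow=1.0, fast=6.0,
             frac_fast=0.5, t_end=1.0, t_drop=0.25):
    """Mean end-of-study score for one arm (higher score = worse health).

    slow, fast   : worsening in units per year for the two subgroups
    frac_fast    : fraction of patients who are fast decliners
    dropout_frac : fraction of fast decliners who withdraw at t_drop
    locf         : if True, withdrawn patients keep their score at t_drop;
                   if False, use the score they would have had at t_end
    """
    slow_score = baseline + slow * t_end - effect
    fast_score = baseline + fast * t_end - effect
    if locf:
        dropout_score = baseline + fast * t_drop - effect
    else:
        dropout_score = fast_score
    return ((1.0 - frac_fast) * slow_score
            + frac_fast * (1.0 - dropout_frac) * fast_score
            + frac_fast * dropout_frac * dropout_score)


def placebo_minus_active(locf):
    # Assumed withdrawal rates: more withdrawals in the placebo arm.
    placebo = arm_mean(effect=0.0, dropout_frac=0.4, locf=locf)
    active = arm_mean(effect=4.0, dropout_frac=0.2, locf=locf)
    return placebo - active


true_difference = placebo_minus_active(locf=False)
locf_difference = placebo_minus_active(locf=True)

print("difference if every patient were followed to the end:", true_difference)
print("difference with last observation carried forward:", locf_difference)
```

In this model the carried-forward scores of the withdrawn patients look better than their eventual scores would have been, and because more of them are in the placebo arm, the apparent treatment difference shrinks.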
HEALTH STATUS IN ROUTINE PRACTICE
Clear differences emerge in the assessment of health status required by patients and payers. From the payers' perspective, there may be three principal reasons for the use of health status measurement:
Health status scores provide an estimate of the overall effect of COPD, so they should be especially useful for measuring the impact of comprehensive treatment programs such as pulmonary rehabilitation, both for validation and audit. The CRQ and SGRQ are both used for this purpose, and their published minimal clinically important differences (MCIDs) provide a benchmark minimum standard for improvement in patients who enter the programs (19, 20). Furthermore, data from large randomized trials (21) and meta-analyses (22) provide a further method of benchmarking rehabilitation programs. It is noteworthy that there are few published data about the use of health status data to measure the health gain provided by a whole service. This approach would be suitable for auditing comprehensive COPD disease-management programs, but the limiting step is the time required for the prospective data collection needed for such audit purposes. For such an approach to be successful, it would be necessary to demonstrate that investment in the necessary data capture and analysis systems would be repaid in terms of greater quality of patient care or cost-effectiveness of care delivery.
One area of disease-management programs in which health status scores may appear to have useful application is in aiding treatment decisions. An example of this is in judging whether to continue maintenance therapy for a patient after a therapeutic trial. Recent clinical trials show that SGRQ scores may improve within a few weeks of the start of treatment (8, 9, 16); however, when using the MCID of a health status score as a treatment criterion in individual patients, the repeatability of the measurement should be taken into account. The repeatability of the SGRQ on repeated testing, measured using the intraclass correlation coefficient, is 0.92 (13). This is an excellent value but is still too low, relative to the 4-unit MCID, to permit its use in individual patients. This is illustrated in Figure 2, which shows that approximately 50% of patients with stable COPD may have a change in SGRQ greater than ± 4 units on repeated testing. For this reason, health status scores are not suitable for detecting significant change within individual patients. It should be appreciated that this characteristic is not unique to the SGRQ. The American Thoracic Society's definition of bronchodilator reversibility includes a requirement for the FEV1 to change by more than 200 ml. This is because the within-patient repeatability is approximately 180 ml. No MCID has been established for the FEV1, but 100 ml is often used, and the mean improvement in FEV1 in bronchodilator trials in COPD is typically in the range 120 to 150 ml; that is, within the limits of day-to-day repeatability. Neither the SGRQ nor the FEV1 can be used reliably to identify patients who have had a worthwhile response to treatment.
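A rough calculation suggests why an intraclass correlation coefficient of 0.92 is still compatible with about half of stable patients changing by more than 4 units. The sketch below assumes that test-retest differences are normally distributed around zero and that the between-patient standard deviation of the score is 16 units. Both are assumptions made for illustration and are not taken from the study behind Figure 2.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fraction_exceeding(threshold, icc, between_sd):
    """Expected fraction of stable patients whose retest score differs
    from the first score by more than `threshold` in either direction.

    Standard error of measurement: SEM = between_sd * sqrt(1 - icc).
    SD of the difference of two measurements: SEM * sqrt(2).
    """
    sem = between_sd * math.sqrt(1.0 - icc)
    sd_diff = sem * math.sqrt(2.0)
    if sd_diff == 0.0:
        return 0.0  # perfectly repeatable measurement
    return 2.0 * (1.0 - normal_cdf(threshold / sd_diff))


# ICC and MCID are from the text; between_sd = 16 is an assumed value.
share = fraction_exceeding(threshold=4.0, icc=0.92, between_sd=16.0)
print(f"fraction changing by more than 4 units: {share:.2f}")
```

With these assumed inputs the result is close to one half, in line with the proportion described above. A smaller assumed standard deviation would give a smaller fraction, so the figure should be read as an order-of-magnitude check rather than a reproduction of the data.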
CONCLUSIONS
Patients and payers have a shared perspective about the importance of health status and health-related quality of life. Payers are concerned with the health of groups of patients and with treatment effects measured in standardized ways in standardized clinical trials. Such measurements treat all patients as if they were identical. COPD affects each patient in a uniquely individual manner. Health status measurements provide a method for quantifying the average effect of treatment and treatment programs. Assessing benefit in individual patients requires the clinician to ask the right questions of the patient, listen to the answers, and believe them.
FOOTNOTES
Conflict of Interest Statement: P.W.J. received $3,500 consultation fees from AstraZeneca during 2005. Over the period 2003–2005 he received fees for attending advisory boards for AstraZeneca ($5,000), GlaxoSmithKline (GSK) ($14,5000), and Boehringer Ingelheim/Pfizer ($6,000). Over the years 2003–2005, he received lecture fees for speaking at conferences sponsored by AstraZeneca ($13,000) and GSK ($9,000). He has received research grants from GSK over the period 2003–2005 totaling $230,000 and from Boehringer Ingelheim/Pfizer over the period 2004–2005 totaling $200,000.
(Received in original form December 13, 2005; accepted in final form January 11, 2006)