Proceedings of the American Thoracic Society Email Content Delivery
The Proceedings of the American Thoracic Society 4:347-349 (2007)
© 2007 The American Thoracic Society
doi: 10.1513/pats.200701-014HT


A Critical Discussion of Computer Analysis in Medical Imaging

Michael L. Goris¹, Hongyun J. Zhu¹, and Terry E. Robinson²

Divisions of ¹Nuclear Medicine/Radiology and ²Pediatric Pulmonary Medicine/Pediatrics, Stanford University, Stanford, California

Correspondence and requests for reprints should be addressed to M. L. Goris, M.D., Ph.D., Division of Nuclear Medicine, H0101, Stanford University School of Medicine, Stanford, CA 94305-5281. E-mail: mlgoris@stanford.edu

ABSTRACT

Medical imaging has increasingly provided surrogate endpoints in therapeutic trials. This use assumes that the interpretation of the images can be unbiased and reproducible and that the image attributes included in the interpretation are relevant to the mechanism of the trial. The principal motivation for computer analysis is to evaluate an attribute of the image as a metric in an algorithmic manner, independent of observer bias or variability. The metric is expected to reflect change in rough proportion with at least one aspect of the degree of disease or the effectiveness of the therapeutic intervention. If either condition is satisfied, the measure is quantitative. Visual interpretation explicitly or implicitly tends to be based on multiple image attributes. Explicit combination of multiple attributes yields composite scores. Such scores are useful for evaluating the risk or probability of disease, but their components can be combined only if they are mathematically isomorphic. For the evaluation of interventions, they are less useful because the effect on one component may be obscured by the lack of effect on other components. This article reviews quantification of air trapping in cystic fibrosis and quantification in general. Validation of any computer analysis can rely on agreement with visual interpreters (on average), on derivation from first principles, or on agreement with an alternative method that measures the pathophysiological mechanism directly (e.g., xenon washout for air trapping). However, in the context of trials, the validation may come from a superior ability to detect objective change and to discriminate between affected and unaffected individuals.

Key Words: computer analysis • medical images • composite scores

Regional or focal air trapping is part of the pathophysiology of a number of pulmonary disorders, including bronchiolitis obliterans (1, 2), reactive airway disease (3), chronic bronchitis (4), atypical pneumonia (5), bronchiectasis (6), emphysema (7), sarcoidosis (8), eosinophilic granuloma (9), and cystic fibrosis (CF) (10–20).

If we can accept that air trapping can easily be quantified on the basis of density distributions in pulmonary computed tomography (CT) scans, we also have to consider what it means, in this context and in general, to quantify. Close analysis will demonstrate that air trapping is reflected in densities, but densities do not always reflect air trapping. The simple metric of air trapping does not define the patient but is useful to measure specific interventions. Validation is not based on a defining test but on the ability to discriminate and measure small effects. The measure is simple and incomplete. Why? In this article, we define the value and limitations of quantitative measures, in the particular context of CF, with examples from cardiology.

DISCUSSION

Quantitative Metric versus Physiological Attribute
A quantitative measure or metric should ideally be singly and directly related to the attribute it tracks, but that is not necessarily the case. Sometimes, the metric is a derived value.

A typical example is the assumption that, with CT, we measure lung density, and lung density in the expiratory scan is the basis for the metric that estimates the degree of air trapping, an important component of the physiopathology of CF. However, this lung density is an artifact of the low resolution of the CT in relation to the pulmonary structures. (More precisely, it is an artifact related to the degree of granularity of the analysis.) Schematically, the lung is composed of structures with tissue density (alveoli, blood vessels, blood) and air. In expiration, none of the components changes in density, but, per cubic centimeter, the structures with tissue density become more prevalent. On average, at the resolution level of the CT, the density increases. In some other cases, the metric is indirect. If we imagine a CT scan with a much higher resolution, voxel values could not be used individually to define lung density (i.e., air trapping) but averaging n voxel values would be necessary.
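The averaging argument above can be sketched numerically. This is a minimal illustration, not the authors' method: the two density constants are round stand-ins (air is about -1000 HU on the CT scale; 0 HU is used here for "tissue density"), and the tissue fractions are hypothetical.

```python
# Illustrative values only: air is about -1000 HU on the CT scale;
# 0 HU (water) is an assumed stand-in for "tissue density".
AIR_HU = -1000.0
TISSUE_HU = 0.0


def voxel_density(tissue_fraction: float) -> float:
    """Mean density of a voxel that mixes tissue and air (partial-volume average)."""
    if not 0.0 <= tissue_fraction <= 1.0:
        raise ValueError("tissue_fraction must be between 0 and 1")
    return tissue_fraction * TISSUE_HU + (1.0 - tissue_fraction) * AIR_HU


def mean_density(voxels: list[float]) -> float:
    """Average n voxel values, as a much higher-resolution scan would require."""
    if not voxels:
        raise ValueError("no voxels supplied")
    return sum(voxels) / len(voxels)


# Hypothetical tissue fractions: neither component changes density on
# expiration, but tissue becomes more prevalent per cubic centimeter.
inspiration = voxel_density(0.10)
expiration = voxel_density(0.25)

# At very high resolution a voxel is either all tissue or all air, so only
# the average over n voxels recovers a "lung density".
fine_voxels = [TISSUE_HU] * 1 + [AIR_HU] * 9

print(inspiration, expiration, mean_density(fine_voxels))
```

In this sketch the ten fine voxels average to the same value as the single coarse inspiratory voxel, which is the sense in which "lung density" depends on the granularity of the analysis.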

In a recent article, infarcted myocardium was repopulated with marrow stem cells (21). The metric was the ejection fraction. That may have been a mistake, because the infarct is regional, and the ejection fraction is global. There is no evidence that a decrease of ejection fraction is proportional, let alone related in a linear fashion, to the size of the infarct. This analysis failed to show any benefit (in terms of the ejection fraction). A change in the infarct size, the immediate effect of repopulation, was not measured. In terms of endpoints, using the ejection fraction (as a predictor or surrogate for patient benefit) may have been correct, but as a proof of concept, nothing was contributed.

Algorithmic Analysis Is Reproducible, but Data Acquisition Is Not Necessarily So
If the quantification is algorithmically defined, the reproducibility (precision) is high (if not perfect), but the measure is apt to be influenced by unrelated factors: for example, variations in methodology (imaging protocols and technical parameters); interpatient variation in lung density, even in the inspiratory images; the degree of expiration at the time of imaging; and the selection of the density threshold defining air trapping. Elsewhere in this symposium (pp. 310–315), Robinson discusses the influence of the degree of the inspiratory and expiratory effort on the estimation of air trapping from densities.
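A small sketch may make the distinction concrete. It assumes a simple threshold rule (voxels below a density threshold on the expiratory scan count as air trapping); the voxel values, the thresholds, and the uniform 40 HU shift used to mimic a weaker expiratory effort are all hypothetical.

```python
def air_trapping_fraction(expiratory_hu: list[float], threshold_hu: float) -> float:
    """Fraction of lung voxels whose expiratory density is below the threshold."""
    if not expiratory_hu:
        raise ValueError("no voxels supplied")
    below = sum(1 for hu in expiratory_hu if hu < threshold_hu)
    return below / len(expiratory_hu)


# Hypothetical expiratory densities (HU) for ten lung voxels.
voxels = [-920, -900, -880, -860, -840, -800, -760, -720, -700, -650]

# Same data and same algorithm always give the same number (precision),
# but the number moves with the chosen threshold...
by_threshold = {t: air_trapping_fraction(voxels, t) for t in (-900, -850, -800)}
print(by_threshold)

# ...and with the degree of expiration. A weaker effort is modeled here,
# crudely, as every voxel staying 40 HU less dense.
weak_effort = [hu - 40 for hu in voxels]
print(air_trapping_fraction(weak_effort, -850))
```

The algorithm is identical in every call; only the threshold or the acquisition conditions differ, and the estimate changes accordingly.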

Myocardial perfusion studies have been quantified for decades (22–25). Yet, at a recent unpublished panel discussion, three originators of those methods admitted to disbelieving the results in 15% of the cases. The assumption is that the analysis is reproducible (for any given dataset), but that unrelated influences (patient movement, body habitus, imaging settings, or intervention variability) may disturb any of the datasets.

Heuristic Analysis Is More Resistant to Unrelated Variability Factors
Heuristic (e.g., visual) quantification is less reproducible (the precision is lower) but more resistant to the influence of unrelated attributes, because the observer can more easily modulate his or her response by context (e.g., a decreased expiratory effort in the case of air trapping in CF), for example by relying more on contrast than on absolute values. Some unrelated attributes that would influence the metric can be predicted (like the effect of expiratory effort on lung densities) and can be taken into account to some extent. In the following formula:

Formula
defining a threshold for air trapping, the third term supposedly compensates for the degree of expiration (26). D denotes the displacement of the 90th percentile in density between inspiration and expiration. In contrast with the other parameters of the frequency function describing the density distribution during inspiration and expiration (mode, median, mean), the 90th percentile displacement seems to be affected not by disease (because, one assumes, at least 10% of the lung is normal) but only by the expiratory effort (Table 1).
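The behavior of D can be illustrated with a short sketch. It does not reproduce the threshold formula of reference 26; it only computes percentile displacements for two hypothetical cases, using a linear-interpolation percentile that is an assumption of this example. All density values are invented for illustration.

```python
def percentile(values: list[float], q: float) -> float:
    """q-th percentile (0-100) by linear interpolation between sorted values."""
    if not values:
        raise ValueError("no values supplied")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def displacement(inspiratory_hu: list[float], expiratory_hu: list[float],
                 q: float = 90.0) -> float:
    """Shift of the q-th density percentile from inspiration to expiration."""
    return percentile(expiratory_hu, q) - percentile(inspiratory_hu, q)


# Hypothetical inspiratory densities (HU), shared by both cases.
inspiratory = [-950, -940, -930, -920, -910, -900, -890, -880, -870, -860, -850]

# Case A: every region empties normally (+150 HU on expiration).
normal_exp = [hu + 150 for hu in inspiratory]

# Case B: the six least dense regions trap air (+5 HU); the rest empty normally.
trapped_exp = [hu + 5 for hu in inspiratory[:6]] + [hu + 150 for hu in inspiratory[6:]]

for label, exp in (("normal", normal_exp), ("air trapping", trapped_exp)):
    print(label,
          "median shift:", displacement(inspiratory, exp, q=50.0),
          "90th percentile shift:", displacement(inspiratory, exp, q=90.0))
```

In this toy example the median shift collapses when air is trapped, while the 90th percentile shift stays the same as long as the densest part of the lung still empties normally, which is the assumption stated in the text.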


TABLE 1. DISPLACEMENT OF THE FREQUENCY DISTRIBUTION OF DENSITIES BETWEEN INSPIRATION AND EXPIRATION IN A NORMAL CASE (CASE 1) AND A CASE WITH ADVANCED AIR TRAPPING (CASE 2)

 
Combining Metrics into Scores to Define Patient Status
The combination of the evaluation of multiple attributes or metrics in a "score" is perfectly valid to define the status of a patient but dangerous in evaluating the response to a specific intervention (because the response of one attribute can be obscured by the lack of response in the others). The principle of isomorphism applies: the meaning of progression or regression should be the same regardless of which score component changes by one unit. Otherwise two equal scores could mean different things. Scores are not mathematically valid, nor are they necessarily clinically valid. The latter is the case if one component of the score is a later manifestation of the disease, replacing another one that is more prominent in early disease (e.g., air trapping and bronchiectasis). Even so, scores have demonstrated clinical utility.
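Both hazards can be shown with a toy composite. The scoring scheme below (an unweighted sum of sub-scores on a 0 to 3 scale) and the component names are hypothetical and do not correspond to any published CF score.

```python
def composite_score(components: dict[str, int]) -> int:
    """Unweighted sum of component sub-scores (a hypothetical scoring scheme)."""
    return sum(components.values())


# Hypothetical sub-scores on a 0-3 scale; names are for illustration only.
before = {"air_trapping": 3, "bronchiectasis": 3,
          "mucus_plugging": 3, "wall_thickening": 3}
after = dict(before, air_trapping=1)  # the intervention acts on air trapping only

component_change = (before["air_trapping"] - after["air_trapping"]) / before["air_trapping"]
score_change = (composite_score(before) - composite_score(after)) / composite_score(before)
print(f"air trapping: {component_change:.0%}, composite: {score_change:.0%}")

# Lack of isomorphism: equal totals, different disease.
early = {"air_trapping": 3, "bronchiectasis": 0}
late = {"air_trapping": 0, "bronchiectasis": 3}
print(composite_score(early) == composite_score(late))
```

A two-thirds improvement in the one component the treatment targets appears as a one-sixth change in the composite, and the two patients at the end are indistinguishable by total score alone.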

In cardiology, a strange multiplicative score has been introduced by Hachamovitch and colleagues (27). In myocardial perfusion studies, one observes two attributes of perfusion abnormalities: the local degree of hypoperfusion, and the extent of the area of hypoperfusion. Hachamovitch and colleagues evaluate the degree of hypoperfusion (on a scale from 0 to 4) in 17 myocardial segments. The score is the sum of the segmental values. Four segments with a value of 4 have the same score as eight segments with a value of 2. This conflates size and severity, or degree and frequency. Nevertheless, this score seems very predictive of distant cardiac events.
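The arithmetic of the example can be checked directly. The sketch below follows the description in the text (17 segments, grades 0 to 4, score equals the sum); the function and variable names are hypothetical.

```python
N_SEGMENTS = 17   # number of myocardial segments, as described in the text
MAX_GRADE = 4     # degree of hypoperfusion graded from 0 to 4


def summed_score(segment_grades: list[int]) -> int:
    """Sum of segmental hypoperfusion grades."""
    if len(segment_grades) != N_SEGMENTS:
        raise ValueError(f"expected {N_SEGMENTS} segments")
    if any(not 0 <= g <= MAX_GRADE for g in segment_grades):
        raise ValueError(f"grades must be between 0 and {MAX_GRADE}")
    return sum(segment_grades)


def extent(segment_grades: list[int]) -> int:
    """Number of segments with any hypoperfusion."""
    return sum(1 for g in segment_grades if g > 0)


four_severe = [4] * 4 + [0] * 13   # small but severe defect
eight_mild = [2] * 8 + [0] * 9     # larger but milder defect

for name, grades in (("four severe", four_severe), ("eight mild", eight_mild)):
    print(name, "score:", summed_score(grades),
          "extent:", extent(grades), "worst grade:", max(grades))
```

Both patterns yield a score of 16 although they differ in extent (4 versus 8 segments) and in worst grade (4 versus 2).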

One potentially fruitful approach is histogram analysis, in which the x axis is the degree and the y axis represents frequency. Existing approaches based on the histogram end up with a threshold, hence conflating degree and frequency. A universal comparison between frequency distributions has not been proposed.
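The loss of information caused by thresholding a histogram can be sketched as follows. The density values, the 50 HU bin width, and the -850 HU threshold are hypothetical choices made for this illustration only.

```python
from collections import Counter


def histogram(values: list[int], bin_width: int = 50) -> Counter:
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter((v // bin_width) * bin_width for v in values)


def fraction_below(values: list[int], threshold: int) -> float:
    """Frequency of values below the threshold, ignoring how far below they are."""
    if not values:
        raise ValueError("no values supplied")
    return sum(1 for v in values if v < threshold) / len(values)


# Two hypothetical sets of expiratory densities (HU).
case_a = [-860] * 4 + [-700] * 6   # abnormal voxels just below the threshold
case_b = [-980] * 4 + [-700] * 6   # abnormal voxels far below the threshold

THRESHOLD = -850
print(fraction_below(case_a, THRESHOLD), fraction_below(case_b, THRESHOLD))
print(sorted(histogram(case_a).items()))
print(sorted(histogram(case_b).items()))
```

The thresholded metric is identical for the two cases, whereas the histograms differ in the degree of the abnormality; a comparison of the full frequency distributions would retain that difference.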

Spatial Distribution versus Global Attributes
If the attribute that is tracked is heterogeneous in space, sentinel lesions must be identified and tracked. In oncology, the effectiveness of a drug is judged by the response of the tumors that can be measured. The decrease in volume of one lesion (tumor), with an increase in another location, would indicate (at least partial) failure. For radiotherapy, which is local, that is not the case. The inference is that, for a local intervention, the metric should be a local metric. In the case of the infarct treatment with bone marrow stem cells, that rule was broken. This brings us to the distinction between real endpoints (quality of life, fewer hospitalizations, increased life expectancy) and surrogate endpoints, which is discussed elsewhere in this symposium and is beyond the scope of this presentation. However, for proof of concept and early effect evaluation, the best metrics are noncomplex, not composite, and regional where they need to be.

Validation
Finally, how to validate? Comparisons with previous methods, in which validation means "works the same as," do not provide progress, except that the new method may be a time saver. In the absence of a defining (nosological) test (e.g., histopathology for tumors, angiography for pulmonary emboli), we looked at the automated quantification of air trapping by asking if the metric discriminated between healthy subjects and those with minimal disease, and if treatment effects could be followed. We have applied this analytical method to a group of 25 patients with mild CF lung disease and 10 age-matched control subjects using six anatomically matched high-resolution CT spirometry-triggered slice pairs to evaluate the discriminating power (26). This method was also applied to data from a randomized, double-blind, placebo-controlled 1-year trial of dornase alpha in 25 children and adolescents with mild CF disease; these data were used to test the sensitivity to small changes (28). The validation in both cases was that our method performed better than all the other methods applied to the same cases, both in discrimination between "normal" and "abnormal," and in tracking progression under treatment. The validation is sometimes purely functional (e.g., predicting outcome) and not necessarily based on ground truth.
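One common way to express discriminating power without a ground-truth test is the area under the ROC curve, computed here as the probability that a randomly chosen patient has a higher metric value than a randomly chosen control. This is an illustrative choice, not necessarily the statistic used in references 26 and 28, and all values below are invented.

```python
def auc(controls: list[float], patients: list[float]) -> float:
    """Probability that a random patient scores higher than a random control.

    Ties count as one half. Equivalent to the area under the ROC curve.
    """
    if not controls or not patients:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in patients:
        for c in controls:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patients) * len(controls))


# Hypothetical values of two competing metrics measured on the same subjects.
metric_a = {"controls": [1, 2, 3, 4], "patients": [5, 6, 7, 8]}
metric_b = {"controls": [1, 3, 5, 7], "patients": [2, 4, 6, 8]}

auc_a = auc(metric_a["controls"], metric_a["patients"])
auc_b = auc(metric_b["controls"], metric_b["patients"])
print(auc_a, auc_b)
```

Under this criterion metric A would be preferred over metric B for the same cases, even though neither has been compared with a defining test.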

CONCLUSIONS

Analysis based on images has a putative advantage in diseases that have heterogeneous or localized expressions in organs. Global organ function measurements will not necessarily reflect significant local changes.

Visual image analysis is complex and often includes a heuristic element, making it subject to intra- and interobserver variability. In addition, if the evaluation is global (implicitly or explicitly by scoring), the result does not necessarily reflect significant changes in the progression of the disease accurately. One of the great advantages of visual analysis is its greater tolerance for methodological variations in image acquisition or display.

Algorithmic analysis of single image attributes is reproducible by definition, but more dependent on data integrity (expected acquisition parameters). Finally, whereas single attribute metrics do not necessarily predict the ultimate desired study endpoint, they are very useful for the testing of treatment mechanisms (proof of concept).

FOOTNOTES

Conflict of Interest Statement: M.L.G. is a co-investigator in a grant sponsored by Novartis and the Cystic Fibrosis Foundation (CFF). H.J.Z. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. T.E.R. is currently the principal investigator on a Novartis and CFF Therapeutic Development Network grant.

(Received in original form January 10, 2007; accepted in final form March 4, 2007)

REFERENCES

  1. Miller WT Jr, Kotloff RM, Blumenthal NP, Aronchick JM, Gefter WB, Miller WT. Utility of high resolution computed tomography in predicting bronchiolitis obliterans syndrome following lung transplantation: preliminary findings. J Thorac Imaging 2001;16:76–80.
  2. Stern EJ, Samples TL. Dynamic ultrafast high resolution CT findings in a case of Swyer-James syndrome. Pediatr Radiol 1992;22:350–352.
  3. Laurent F, Latrabe V, Raherison C, Marthan R, Tunon-de-Lara JM. Functional significance of air trapping detected in moderate asthma. Eur Radiol 2000;10:1404–1410.
  4. Franquet T, Stern EJ. Bronchiolar inflammatory diseases: high-resolution CT findings with histologic correlation. Eur Radiol 1999;9:1290–1303.
  5. Kubo K, Yamazaki Y, Imasubuchi T, Takamizawa A, Yamamoto H, Koizumi T, Fujimoto K, Matsuzawa Y, Honda T, Hasegawa M, et al. Pulmonary infection with Mycobacterium avium-intracellulare leads to air trapping distal to the small airways. Am J Respir Crit Care Med 1998;158:979–984.
  6. McGuinness G, Naidich DP. CT of airways disease and bronchiectasis. Radiol Clin North Am 2002;40:1–19.
  7. Miniati M, Filippi E, Falaschi F, Carozzi L, Milne EN, Sostman HD, Pistolesi M. Radiologic evaluation of emphysema in patients with chronic obstructive pulmonary disease: chest radiography versus high resolution computed tomography. Am J Respir Crit Care Med 1995;151:1359–1367.
  8. Bartz RR, Stern EJ. Airways obstruction in patients with sarcoidosis: expiratory CT scan findings. J Thorac Imaging 2000;15:285–289.
  9. Stern EJ, Webb WR, Golden JA, Gamsu G. Cystic lung disease associated with eosinophilic granuloma and tuberous sclerosis: air trapping at dynamic ultrafast high-resolution CT. Radiology 1992;182:325–329.
  10. Shah RM, Sexauer W, Ostrum BJ, Fiel SB, Friedman AC. High-resolution CT in the acute exacerbation of cystic fibrosis: evaluation of acute findings, reversibility of those findings, and clinical correlation. AJR Am J Roentgenol 1997;169:375–380.
  11. Marchant JM, Masel JP, Dickinson FL, Masters IB, Chang AB. Application of chest high-resolution computer tomography in young children with cystic fibrosis. Pediatr Pulmonol 2001;31:24–29.
  12. Chung MH, Edinburgh KJ, Webb EM, McCowin M, Webb WR. Mixed infiltrative and obstructive disease on high-resolution CT: differential diagnosis and functional correlates in a consecutive series. J Thorac Imaging 2001;16:69–75.
  13. Bhalla M, Turcios N, Aponte V, Jenkins M, Leitman BS, McCauley DI, Naidich DP. Cystic fibrosis: scoring system with thin-section CT. Radiology 1991;179:783–788.
  14. Maffessanti M, Candusso M, Brizzi F, Piovesana F. Cystic fibrosis in children: HRCT findings and distribution of disease. J Thorac Imaging 1996;11:27–38.
  15. Brody AS, Molina PL, Klein JS, Rothman BS, Ramagopal M, Swartz DR. High-resolution computed tomography of the chest in children with cystic fibrosis: support for use as an outcome surrogate. Pediatr Radiol 1999;29:731–735.
  16. Long FR, Castile RG, Brody AS, Hogan MJ, Flucke RL, Filbrun DA, McCoy KS. Lungs in infants and young children: improved thin-section CT with a noninvasive controlled-ventilation technique—initial experience. Radiology 1999;212:588–593.
  17. Castile RG, Long FR, Flucke RL, Hayes JR, McCoy KS. Correlation of structural and functional abnormalities in the lungs of infants with cystic fibrosis. Pediatr Pulmonol 2000;20(Suppl):427.
  18. Robinson TE, Leung AN, Northway WH, Blankenberg FG, Bloch DA, Oehlert JW, Al-Dabbagh H, Hubli S, Moss RB. Spirometer-triggered high-resolution computed tomography and pulmonary function measurements during an acute exacerbation in patients with cystic fibrosis. J Pediatr 2001;138:553–559.
  19. Kauczor HU, Hast J, Heussel CP, Schlegel J, Mildenberger P, Thelen M. Focal airtrapping at expiratory high-resolution CT: comparison with pulmonary function tests. Eur Radiol 2000;10:1539–1546.
  20. Stern EJ, Webb WR, Gamsu G. Dynamic quantitative computed tomography: a predictor of pulmonary function in obstructive lung diseases. Invest Radiol 1994;29:564–569.
  21. Schächinger V, Erbs S, Elsässer A, Haberbosch W, Hambrecht R, Hölschermann H, Yu J, Corti R, Mathey DG, Hamm CW, et al., for the REPAIR-AMI Investigators. Intracoronary bone marrow–derived progenitor cells in acute myocardial infarction. N Engl J Med 2006;355:1210–1221.
  22. Garcia E, Maddahi J, Berman D, Waxman A. Space/time quantitation of thallium-201 myocardial scintigraphy. J Nucl Med 1981;22:309–317.
  23. Garcia EV, DePuey EG, Sonnemaker RE, Neely HR, DePasquale EE, Robbins WL, Moore WH, Heo J, Iskandrian AS, Campbell J. Quantification of the reversibility of stress-induced thallium myocardial perfusion defects: a multicenter trial using bull's eye polar maps and standard normal limits. J Nucl Med 1990;31:1761–1765.
  24. Van Train KF, Maddahi J, Berman DS, Kiat H, Areeda J, Prigent F, Friedman J, and participants of the multicenter trial. Quantitative analysis of stress thallium-201 myocardial scintigrams: a multicenter trial. J Nucl Med 1986;27:17–25.
  25. Goris ML, Hotz B, Thirion J-P, Similon P. Factors affecting and computation of myocardial perfusion reference images. Nucl Med Commun 1999;20:627–635.
  26. Goris ML, Zhu HJ, Blankenberg F, Chan F, Robinson TE. An automated approach to quantitative air trapping measurements in obstructive lung disease using cystic fibrosis. Chest 2003;123:1655–1663.
  27. Hachamovitch R, Berman DS, Shaw LJ, Kiat H, Cohen I, Cabico JA, Friedman J, Diamond GA. Incremental prognostic value of myocardial perfusion single photon emission computed tomography for the prediction of cardiac death: differential stratification for risk of cardiac death and myocardial infarction. Circulation 1998;97:535–543.
  28. Robinson TE, Goris ML, Zhu HJ, Chen X, Bhise P, Sathi A, Sheikh F, Moss RB. Dornase alpha reduces air trapping in children with mild CF lung disease: a quantitative analysis. Chest 2005;128:2327–2335.

Related articles in Proceedings of the American Thoracic Society:

Computed Tomography Scanning Techniques for the Evaluation of Cystic Fibrosis Lung Disease
Terry E. Robinson
Proceedings of the American Thoracic Society 2007;4:310–315.



