Original Research
1 Department of Radiology, Leiden University Medical Center, Albinusdreef 2,
Leiden, The Netherlands 2333 ZA.
2 Department of Medical Statistics, Leiden University Medical Center, Leiden,
The Netherlands 2333 ZA.
Received August 12, 2004; accepted after revision October 19, 2004.
Address correspondence to L. J. M. Kroft
(l.j.m.kroft{at}lumc.nl).
OBJECTIVE. In a short period, a variety of technically different digital radiography chest systems have become available for clinical use. Our purpose was to assess the diagnostic performance of eight different digital radiography chest systems for detection of simulated chest disease under clinical conditions.
MATERIALS AND METHODS. Assessed were four different flat-panel detector systems, two different charge-coupled device systems, one selenium-coated drum, and one storage phosphor system. For each system, 10 chest images of an anthropomorphic chest phantom were obtained. Each image contained one to 12 simulated chest lesions. Eight radiologists performed soft-copy interpretations. Entrance dose was measured and effective dose calculated. A semiparametric logistic regression model was used for statistical analysis.
RESULTS. Statistically significant differences were found in the diagnostic performance of the eight digital chest systems (p = 0.01). The best performance was observed with the charge-coupled device system with slot-scan technology, which detected 132 of 288 lesions (sensitivity, 46%). The performance of three flat-panel detectors and the selenium-drum system was not significantly different from that of the slot-scan charge-coupled device system. Fewer lesions were detected with the storage phosphor system than with all other digital technologies: 99 of 288 lesions (sensitivity, 34%; slot-scan charge-coupled device system versus storage phosphor system, p < 0.001). The effective dose varied among the digital systems.
CONCLUSION. We found differences in diagnostic performance among the eight different digital chest systems. Differences in detection rates are predominantly explained by detector design.
Digital radiography has become an important imaging technique for chest radiography. The good diagnostic quality and easy distribution of and access to digital images are the main reasons for replacing film-screen radiography systems with digital systems in many radiology departments. Computed radiography (CR) was the first technique that became available for digital chest radiography. CR systems are based on cassettes containing photostimulable storage phosphors. Dedicated units are required for the readout of these cassettes. Consequently, cassette handling with CR is similar to film-screen radiography. Cassette handling is regarded as a disadvantage, but it makes CR particularly well suited, for example, for digital bedside chest radiography [1]. Direct readout systems have become available. With these systems, images are obtained using a detector that forms an integral part of the system, and the digital images become available almost instantly after acquisition. In a short period, many digital direct readout chest systems have become available for clinical use. These systems have significant technical differences [2].
The detector material in direct readout systems contains either a photoconductor that converts X-ray photons into an electric charge immediately (selenium [Se] photoconductor flat-panel detector [FPD] systems and the Se-coated drum technique) or a scintillator consisting of either cesium iodide (CsI) or gadolinium-oxide sulfide (Gd2O2S, often shortened to GOS) converting X-ray photons into visible light. This light is converted into a charge by a matrix of photodiodes (scintillator FPD systems) or an arrangement of charge-coupled devices (CCDs). With CCD-based systems, designs are adapted to encompass the entire chest field. One CCD-based system resolves this by slot scanning with a small array of CCDs [3]. Other CCD systems use lenses or fiberoptic tapers to project the relatively large latent images on much smaller CCDs [4].
Acquisition of a raw image data matrix with direct-readout systems is either dynamic or static. Dynamic raw-data collection can be achieved by scanning a detector along the region of interest during the exposure (slot-scanning CCD) or by scanning the detector with microelectrometers after the exposure (rotating Se-drum). Static raw-data collection does not involve detector movement. In addition, the systems differ significantly with regard to pixel size and postprocessing.
The first technique that was introduced in clinical practice for digital direct-readout chest radiography was the Se-drum-based technique. Early evaluations of digital images have shown superior performance of the Se-drum detectors compared with CR for detecting simulated lesions in an anthropomorphic study [5], for solitary lung nodule detection in patients [6], and for visualization of structures and assessment of image quality with observer preference studies [7, 8]. Later, FPDs were introduced. FPDs have been found superior to CR in observers' preference studies [9–12] and in detection of simulated pulmonary nodules on an anthropomorphic chest phantom [13]. Differences in digital image quality among various direct-readout digital systems can also be expected, but it is not clear yet whether these differences will have significance in actual clinical use, such as in detection of lesions in the chest.
Our purpose was to assess the diagnostic performance of eight different digital radiography chest systems for the detection of simulated chest pathology under clinical conditions.
Materials and Methods
Four FPDs (two CsI-FPDs, one GOS-FPD, and one Se-FPD); two CCD systems (one slot-scan CCD and one lens-coupled CCD system); one Se-coated drum system; and one storage-phosphor (CR) system were included in the study. Further information and technical specifications are summarized in Table 1.
Data Acquisition
Simulated lesions and distribution. An anthropomorphic chest
phantom (PBU-S-3, Kaguku Company) was used for the study. The anthropomorphic
chest phantom was divided into two areas, the lungs and the mediastinum,
because it was expected that differences in lesion detection could occur
depending on the variation of X-ray attenuation in these areas. Lesions
projecting over the heart, the aorta, the spine, and the central pulmonary
vessels and lesions projecting over the lower portions of the lungs that
projected over the diaphragm were considered located in the mediastinum. The
remaining inner-chest area was defined as lungs. Three types of lesions were
created. We simulated chest tumors by nodules of synthetic modeling clay
(SES). These tumors varied in size and were either moderately opaque (flat
shaped) with diameters of 1.0 and 2.0 cm or opaque (round shaped) with a
1.5-cm diameter. Either untwisted rope or coconut fiber soaked in iodinated
contrast agent simulated interstitial-linear disease. Interstitial-nodular
disease was simulated by birdseed.
Sample size calculations were performed aiming at detection of a 10% (or more) difference in lesion-detection probability between any two techniques, at a 5% significance level and with power equal to 80%, assuming that the typical detection probability is 75%. This would require evaluation of 270 lesions for each technique. However, the post hoc power, assuming a typical detection level of 40%, which fits better with the observed rates of detection, was 65% (calculated for a sample size of 270 lesions).
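The post hoc power figure can be approximated with a short calculation. The sketch below is only illustrative: it assumes an unpooled normal approximation for a two-sided comparison of two independent proportions and takes the "10% difference" to mean 40% versus 50%. The formula actually used in the study is not stated, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided z-test for two proportions.

    Assumes n independent observations per group and an unpooled
    standard error (an assumption; not taken from the article).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    shift = abs(p1 - p2) / se
    # Probability of rejecting in either tail under the alternative.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Assumed scenario: detection levels of 40% vs 50%, 270 lesions per technique.
post_hoc = power_two_proportions(0.40, 0.50, 270)
print(round(post_hoc, 2))  # about 0.65
```

Under these assumptions the result is close to the reported post hoc power of 65%. The same function is not guaranteed to reproduce the a priori figure of 270 lesions, which may rest on a different formula or correction.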
The simulated lesions were fixed on sheets that could be attached on the back of the anthropomorphic phantom for imaging. Ten sheets with different configurations of lesions were made. Each of the six simulated lesions (three tumor variations, two interstitial-linear variations, and one interstitial-nodular lesion type) was used 12 times (yielding a total of 72 lesions in the 10 configurations), with equal distribution for the lungs and mediastinum. The number of the various simulated lesions per configuration varied between one and 12, and the distribution was randomized (Excel 2000, Microsoft). An example of the appearance of the simulated lesions on the chest phantom is shown in Figure 1.
Visual markers on the phantom allowed for precise and reproducible alignment of the sheets containing simulated lesions. After each configuration, metal objects encoded for the lesion type were placed onto the lesions, and imaging was repeated to provide standard-of-reference radiographs. The images were anonymized so that the chest system could not be recognized on them. All images were stored as DICOM files on compact discs and printed on film (Ektascan 2180, Eastman Kodak).
Image interpretation. The digital images were presented to eight observers, all senior radiologists, from the six hospitals where images were acquired. Each observer had clinical experience with one (four observers) or two (four observers) of the included digital imaging systems. The observers worked independently in their own working environments. The images were presented as soft copies on a 2,048 x 1,530 pixel workstation (Dome, Type 3C, Planar Imaging Systems) using eFilm (Merge eFilm). During interpretation, the complete phantom chest radiograph was displayed on the screen. The observers were allowed to alter the window width and window level. Observers were not aware of the number of different types of simulated lesions, the total number of lesions, or the number of lesions within each image. Image interpretation was performed in one session.
The interpretation session started with training, in which three radiographs of the anthropomorphic phantom with the different lesion types were shown on the workstation. Next to the workstation, the corresponding hard-copy radiographs and their matching standard-of-reference radiographs were shown on a film-viewing box. This allowed the observers to become accustomed to images of the phantom, the appearance of the lesions, and the annotation of lesions. These training examples were not included in the evaluation. Each observer judged 40 images. This was randomized in such a way that each digital system was equally judged, with five configurations interpreted per system and per observer. The order of image interpretation was randomized and unique for each system and each observer (Excel 2000). The observers could determine their own interpretation speed.
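The reading design implies the denominators reported in the results. The following sketch works through that arithmetic; it assumes that the 10 configurations were read equally often for each system (a balanced allocation, which is an assumption here), and all variable names are illustrative.

```python
# Design figures taken from the text.
observers = 8
systems = 8
configurations = 10
lesions_in_all_configurations = 6 * 12  # six lesion variants, each used 12 times
images_per_observer = 40

# Each observer reads the same number of configurations for every system.
configs_per_system_per_observer = images_per_observer // systems  # 5

# Readings of any configuration, per system, summed over observers.
readings_per_system = observers * configs_per_system_per_observer  # 40

# Assumption: balanced allocation, so each configuration is read equally often.
reads_per_configuration = readings_per_system // configurations  # 4

lesion_readings_per_system = lesions_in_all_configurations * reads_per_configuration
lesion_readings_total = lesion_readings_per_system * systems

print(lesion_readings_per_system)  # 288
print(lesion_readings_total)       # 2304
```

Under this assumption each system has 288 lesion readings, which matches the denominator reported for the per-system sensitivities, and the overall total of 2,304 splits evenly into 1,152 readings each for the lungs and the mediastinum.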
The observers were instructed to make their decisions based on soft-copy interpretation and precisely draw outer contours of each lesion they detected on the corresponding hard copies to produce annotated images for comparison with the standard-of-reference radiographs. Because observers were able to alter the window width and window level and thus had the ability to enhance subtle differences in contrast, we assumed that soft-copy interpretation was of better quality than hard-copy interpretation and that hard copies did not aid the observers [14]. One image was observed at a time, each image was viewed only once per session, and observers were not allowed to review previous images.
Dose assessment. Entrance skin dose (excluding backscatter) was measured on the anthropomorphic chest phantom during each image acquisition with all digital chest systems (WD 10, Wellhöfer Dosimetrie). Subsequently, effective dose was calculated using a Monte Carlo simulation program, PCXMC (STUK) [15], using, in addition to the measured entrance skin dose, the actual radiation quality (tube voltage and total filtration) and geometry (focus-skin distance, field size). Effective dose was calculated for a standard-sized patient (adult, height 174 cm, weight 71.1 kg).
Data Evaluation and Statistical Analysis
Two outcome reviewers evaluated the annotated images by placing the
matching standard-of-reference radiographs over the annotated radiographs. The
diagnostic performance of the imaging system was determined by the probability
of detecting a lesion by the observer and the probability that the judgment of
the observer was correct. Outcome scores for lesions were therefore rated
detected (true-positive) or not detected (false-negative). Annotated areas not
containing simulated lesions were scored as false-positive.
Possible differences between the digital imaging systems with respect to the probability of detecting simulated lesions were analyzed using a random-effects logistic regression model [16]. This is a model for binary outcomes with normally distributed random effects. It takes between-observer variability into account and models the probability of detecting lesions. We performed this model analysis by using the PROC NLMIXED function (TS level 01M0) in the SAS software version 8.01 (SAS Institute). Our analysis was corrected for location of lesions, lesion type, and image (sequence) number by first computing a model with parametric terms for these effects only. The analysis then proceeded in two steps. First, we evaluated the primary and a priori hypothesis of no difference in detection probabilities between techniques by expanding the first model to include the technique effect and calculating the likelihood ratio test (LR) with respect to the first model [17]. If we were able to reject the primary hypothesis, we proceeded with the logical subsequent a posteriori step of investigating which differences were most likely to explain rejection of the primary hypothesis. This secondary analysis is necessarily of a more exploratory nature and could be affected by both data-dredging and multiple-testing problems. Our results from this secondary analysis must therefore be cautiously interpreted as indications of where the largest differences may be. A Bonferroni correction has to be applied to the a posteriori results to adjust for multiple testing. To implement this correction, we report p values as calculated by the model but advise the application of a more stringent threshold of 0.007 instead of the more usual 0.05. The above approach is standard within classical statistics. For the evaluation of false-positive annotations, the same logistic regression modeling procedure was used, with corrections for image number and observer.
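The stricter threshold follows from dividing the usual significance level by the number of a posteriori comparisons. The sketch below assumes seven comparisons (each of the seven other systems against a single reference system); that count is inferred from the stated threshold of 0.007 and is not given explicitly in the text.

```python
family_alpha = 0.05   # conventional overall significance level
n_comparisons = 7     # assumption: seven systems compared with one reference

# Bonferroni: each individual test is held to alpha divided by the number of tests.
bonferroni_threshold = family_alpha / n_comparisons


def is_significant(p_value, threshold=bonferroni_threshold):
    """Return True if a reported p value passes the adjusted threshold."""
    return p_value < threshold


print(round(bonferroni_threshold, 3))  # 0.007
print(is_significant(0.0005))          # True
print(is_significant(0.03))            # False: below 0.05 but not below 0.007
```

Reporting unadjusted p values and tightening the threshold, as done here, is equivalent to multiplying each p value by the number of comparisons and keeping the threshold at 0.05.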
Results
Table 1 shows technical specifications and image acquisition parameters for the eight digital radiography chest systems. The acquisition parameters reflect normal clinical practice at the included hospitals.
Detection of Simulated Lesions
For the total group of imaging systems, statistically significant
differences were found concerning detection of simulated lesions. The LR for
the general null hypothesis of no difference between the imaging systems was
18.40 with 7 degrees of freedom (Df) and p = 0.010.
No statistically significant differences could be found among imaging systems with respect to potentially different detection probability according to location of the simulated lesions (lungs or mediastinum) (LR = 11.7, Df = 7, p = 0.111). Likewise, no evidence could be found of such differences among systems with respect to the lesion type (LR = 13.5, Df = 14, p = 0.510).
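The p values for the likelihood ratio statistics can be checked against the chi-square distribution, which is the usual large-sample reference distribution for this test. The sketch below is a standard-library approximation check rather than the SAS procedure used in the study; the helper name `chi2_sf` is hypothetical, and it handles integer degrees of freedom only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    z = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-z)                 # Q(1, z)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(z))      # Q(1/2, z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


p_systems = chi2_sf(18.40, 7)   # overall difference among imaging systems
p_location = chi2_sf(11.7, 7)   # system-by-location differences
print(round(p_systems, 3))      # 0.01
print(round(p_location, 3))     # 0.111
```

Small deviations from published p values are to be expected, because the LR statistics are reported to only one or two decimals.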
The final best-fitting model for the present study thus distinguishes detection probability according to machine type, in addition to the correction factors of location, lesion type, and image sequence number. Table 2 shows estimated effects from this final model for 11 parameters, of which eight describe differences among machines with respect to the reference condition plus three parameters for location and lesion-type differences. The effects for image sequence difference corrections are not shown. The reference category was chosen as "detection of a tumor in the lungs with the imaging system for which most lesions were detected." The choice of the reference condition is rather arbitrary, does not affect the fit of the model in any way, and does not affect any of the above-mentioned LR testing procedures. Negative estimate values indicate that the odds, and hence the probability, of detection decrease relative to the reference condition. Positive values indicate that the odds will increase.
The frequencies of simulated lesions detected with the different imaging techniques are shown in Table 3. The best performance was shown by the slot-scan CCD system, which detected 132 of 288 lesions (sensitivity, 46%). Fewer lesions were detected with the CR system than with the other digital technologies: the CR system detected 99 of 288 lesions (sensitivity, 34%). The Wald test for specific digital system comparisons with respect to the chosen slot-scan CCD reference condition is shown in Table 2. The difference between the CR system and the slot-scan CCD system has a p < 0.001, which is below our Bonferroni-adjusted a posteriori threshold of 0.007. In addition, the detection of simulated chest lesions was significantly worse with the lens-coupled CCD system and with the GOS-FPD system compared with the slot-scan CCD reference condition (Bonferroni correction, both p < 0.007). A statistically significant difference was not observed in the sensitivity (detection) of the CsI-FPD-1, Se-FPD, Se-drum, or CsI-FPD-2 systems compared with the slot-scan CCD reference condition (all p > 0.007).
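To illustrate how a negative model estimate corresponds to a lower detection probability, the sketch below computes sensitivities and a crude log odds ratio from the reported counts for the CR and slot-scan CCD systems. This is an unadjusted illustration only: it ignores the observer random effect and the corrections for location, lesion type, and image sequence, so its value should not be read as the estimate in Table 2.

```python
import math

# Reported counts (detected lesions out of 288 readings per system).
detected_ccd, total_ccd = 132, 288   # slot-scan CCD (reference condition)
detected_cr, total_cr = 99, 288      # storage phosphor (CR)

sens_ccd = detected_ccd / total_ccd
sens_cr = detected_cr / total_cr

# Odds = detected / not detected.
odds_ccd = detected_ccd / (total_ccd - detected_ccd)
odds_cr = detected_cr / (total_cr - detected_cr)

# Crude (unadjusted) odds ratio of CR relative to the reference system.
odds_ratio = odds_cr / odds_ccd
log_odds_ratio = math.log(odds_ratio)

print(round(sens_ccd * 100), round(sens_cr * 100))  # 46 34
print(round(odds_ratio, 3))                         # 0.619
print(log_odds_ratio < 0)                           # True: lower odds than reference
```

A log odds ratio below zero means the odds of detection, and therefore the detection probability, are lower than for the reference system, which is the sign convention used for the model estimates.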
False-Positive Interpretations
The frequency and percentages of false-positive interpretations are shown
in Table 3. No difference was
found in the frequency of false-positive interpretations among the eight
digital radiography systems (LR = 0, Df = 7, p > 0.999).
Additional Information
Additional statistically significant findings derived from the
semiparametric logistic regression model, which did not differ among the
imaging systems, are worth mentioning. Lesions in the mediastinum were
detected less frequently than lesions in the lungs. For all imaging systems
combined, 265 of 1,152 lesions (sensitivity, 23%) were detected in the
mediastinum versus 664 of 1,152 lesions (sensitivity, 58%) in the lungs
(p < 0.001). Interstitial-linear lesions were detected more frequently
than tumors, and interstitial-nodular lesions were detected less frequently
than tumors. The latter two effects can be explained by the differences in
structure and opacity of the materials used for lesion creation.
Radiation Exposure
The patient exposure for the different imaging techniques is shown in
Table 4. Patient dose among the
different digital imaging systems was rather variable and under these
conditions not related to diagnostic performance
(Fig. 2). The observed
variation was larger for the entrance skin dose than for the effective dose.
With CsI-FPD-1 and CsI-FPD-2, relatively low effective doses were calculated,
whereas the sensitivities (detection performance) of these systems were high.
With CR, a relatively high effective dose was calculated for imaging; however,
the sensitivity of the CR system remained low.
Discussion
In the present study, the diagnostic performance of eight different digital radiography systems has been assessed for the detection of simulated chest disease. The study was designed to evaluate these various systems as used in routine clinical practice. Hence, the parameter settings used for image acquisition in the study were the same as in daily clinical practice for imaging patients. This included routine preprocessing setups. We consequently did not evaluate the systems with matched effective dose or detector dose but allowed for optimization of image quality and patient dose at each unit. In general, digital chest system manufacturers recommend dose levels that provide good imaging quality. Accordingly, the practical starting point of the present study was perceived good clinical image quality for each digital chest system, with a consequent variation of patient exposure at each digital system.
Little is known about the relationship between dose and clinical image quality for digital chest systems, but this relationship may be weaker than between physical image quality parameters and dose. It is a documented phenomenon that imaging doses may vary significantly without evident effect on diagnostic detection performance. This has been shown for dose reductions up to 65% compared with a 100% standard screen-film dose. Images made with only 35% of the standard dose resulted in increased visible noise but without significant effect on diagnostic quality [18, 19]. The same has been found for dose-increasing studies. Doubling the dose for CR bedside chest radiography did not result in improved diagnostic efficiency [20].
For statistical analysis, we chose logistic regression modeling of effect because the methodology is firmly based on principles of statistical inference and explicitly models and estimates an effect measurement. Furthermore, it has several advantages over receiver operating characteristic (ROC)-type approaches. One of these is that between-observer variability can be accounted for by introducing a random effect in the model. This random effect accounts for effects that would occur through implicit use of distinct between-observer thresholds, for example. Another advantage is that the method allows for the application of factorial design principles, which implies we no longer have to run the experiment at all combinations of factor levels (such as location, lesion types, and so on). Neither is there any requirement for each observer to view each configuration or image. Also, several distinct lesion types can be offered simultaneously at multiple and arbitrary locations on a single chest image (as occurs in clinical patient imaging as well) without the need for any grid or other device to steer the evaluation process. This allowed the observers to view and evaluate the images in conditions that are as close as possible to those in usual clinical practice.
The principal limiting factors for image quality in a digital radiographic system are noise and sharpness [7]. For digital systems, noise sources are mainly quantum noise and electronic noise [21]. Spatial resolution (sharpness) is expressed by the modulation transfer function and is inherently limited by pixel size. A generally accepted parameter describing the efficiency of a digital detector is the detective quantum efficiency (DQE). The DQE expresses the performance of a digital detector relative to a fictive ideal detector. The DQE is expressed as a function of spatial frequency, detector dose, and radiation quality. A higher DQE implies better performance [22]. The DQE is usually reported without taking into account the effect of a grid. The combination of grid and digital detector reduces the DQE of the imaging system by a factor that is correlated to the Bucky factor (the ratio of radiation incident on the grid to the transmitted radiation). Besides differences in detector technology, differences in image postprocessing used by the digital radiography systems could be of great importance for the performance of these systems [23, 24]. Quantitative comparison of the DQE of detectors is complicated since not enough standardized data are available.
The highest sensitivity for lesion detection in the present study was shown by the slot-scan CCD system. The DQE of this system is comparable to CR, and therefore much lower than that of comparable indirect FPDs. However, the system compensates for this by a remarkable reduction of scattered radiation, leading to a high effective DQE of the system with high image quality [25]. With this slot-scan CCD system, better contrast-detail performance and improved lesion detection performance have been found compared with film-screen radiography [3]. In the present study, at an intermediate effective dose, this slot-scan CCD system showed the best performance when compared with other digital radiography systems as well.
The detection performance with the two Se-based direct-conversion detectors (Se-FPD and Se-drum) and the two indirect CsI conversion detectors (CsI-FPD-1 and CsI-FPD-2) was not significantly different from the slot-scan CCD technique. Good detection performance of Se-based direct-conversion detectors has been attributed to the conversion process in Se, which is virtually free of intrinsic noise sources [7]. The relatively high DQE for CsI scintillators can be explained by the high atomic number and density for CsI [26–28] and by greatly reduced light spreading in the needlelike structured phosphor. In addition, a thick phosphor layer may be used, increasing the potential DQE of a CsI detector [2].
Three systems showed significantly poorer detection performance than the slot-scan CCD system: the CR, lens-coupled CCD, and GOS-FPD systems. Previous studies have shown better observer-preference performance and detection performance for direct-readout systems (Se-drum detectors and FPDs) compared with CR [5–13]. The lower DQE for CR systems has been attributed to internally generated noise [7, 9, 29]. With lens-coupled CCD systems, a lens is needed to project a certain imaging field onto a smaller CCD chip. The process of demagnification results in a low optical coupling efficiency with consequently relatively low DQEs [2, 4, 30, 31]. The image quality performance of a granular GOS phosphor screen is lower than can be reached with a structured CsI phosphor screen with similar thickness, with much lower DQE for the granular screen under similar circumstances [32].
The observers interpreted all images in their usual environment but all used the same state-of-the-art workstation (Dome, Type 3C, Planar Imaging Systems). To standardize the method of image interpretation as much as possible, and because the simulated lesions were rather large, the observers were not allowed to magnify the images. With this method, the matrix of the monitor limited the spatial resolution to a pixel size of 200 µm. Digital radiography simulation studies have shown that for the human observer, pixel sizes of 100 and 200 µm give similar results for detection of objects ranging in size from 0.1 to 20 mm. Accordingly, for digital radiography with low-contrast objects, a pixel size of 200 µm has been found sufficient [33]. Because pixel sizes for all detector systems were within the range of 139 to 200 µm and taking into account the rather large size of the simulated lesions, it is unlikely that the results for lesion detection in our study would have been different had magnification been allowed.
Study Limitations
Our study may have had some limitations. A single chest phantom was used
for imaging, whereas the sheets containing the various lesions changed. The
observers might thus have learned the characteristics of the phantom chest as
a constant background against which only the changes stood out. This may
explain the relatively low number of false-positive interpretations.
Nevertheless, we doubt that the use of a single chest phantom had a
substantial influence on the main study outcome. A possible learning effect
was the same for all observers and was corrected for by the study design.
Since the parameter settings for individual systems were the same as in daily clinical practice for imaging patients, comparison was not done under identical conditions. Consequently, we cannot rule out that the performance per system may have been substantially affected by dose. Furthermore, we cannot ensure that the differences observed in the present study are not influenced by possible inappropriate setup parameters. In fact, guidelines for optimal parameter setup have not been defined yet.
Differences among digital chest systems with regard to technique and clinical implementation manifest in their diagnostic performance. This study shows significant differences in simulated lesion detection among various digital imaging systems in clinical practice. Differences in detection rate are predominantly explained by detector design.
Acknowledgments
We gratefully acknowledge the following institutions and physicists for their cooperation and the participation of the radiologists on the panel of observers: from the Erasmus Medical Center, Rotterdam, The Netherlands: F. van der Meer, N.S. Renken, M.G.J. Thomeer. From the VU Medical Center, Amsterdam, The Netherlands: M. Hofman, R.A. Manoliu, E.A. Heilbron. From the Medical Center Leeuwarden, Leeuwarden, The Netherlands: W. Hummel, J.J. Riemersma. From Bronovo Hospital, The Hague, The Netherlands: W. Stam, N.J.M. Aarts. From Maxima Medical Center, Veldhoven, The Netherlands: J.R. van der Tillaart, J.P.G. Weerdenburg. From the Leiden University Medical Center, Leiden, The Netherlands: A. de Roos.
References