Original Research
1 Department of Radiology, Pitié-Salpêtrière Hospital,
Assistance Publique—Hôpitaux de Paris, University Pierre et Marie
Curie, Paris VI, 47-83 bd de L'Hôpital, 75651 Paris, Cedex 13,
France.
2 R2 Technology, Inc. (now Hologic), Santa Clara, CA.
3 Present address: Vital Images, Minnetonka, MN.
Received October 10, 2006;
accepted after revision May 12, 2007.
Address correspondence to P. A. Grenier
(philippe.grenier{at}psl.aphp.fr).
Abstract
≥ 4 mm on pairs of MDCT
chest screening examinations using a computer-aided detection (CAD)
system.
MATERIALS AND METHODS. Of 54 pairs of low-dose MDCT chest
examinations (1.25-mm collimation), two chest radiologists in consensus
established that 25 examinations contained 52 nodules ≥ 4 mm. All paired
examinations were interpreted on the CAD workstation—first without and
then with CAD input—for the detection and tracking of lung nodules. A
subset of 33 examination pairs was later read on the clinical workstation used
in daily practice, and the results were compared for reading time with those
on the CAD workstation.
RESULTS. After CAD input, the sensitivity for nodule detection
increased statistically significantly for both readers (9.6% and 23%;
p ≤ 0.025). One cancer initially missed by one radiologist was
correctly identified with CAD input. The overall reading time on the CAD
workstation and clinical workstation was comparable for both radiologists. On
average, readers spent 4–5 minutes per case to read the paired
examinations on the CAD workstation and 6–8 seconds per CAD mark. The
CAD system successfully matched 91.3% of nodules detected in both
examinations. The overall rate of available CAD growth assessment was 54.9% of
all nodule pairs.
CONCLUSION. In the context of temporal comparison of MDCT screening
examinations, the sensitivity of radiologists for detecting lung nodules
≥ 4 mm increased significantly (p ≤ 0.025) with CAD input without
compromising reading time.
Keywords: computer-aided detection (CAD), follow-up CT, high-resolution MDCT, lung nodules, reading time
Because most nodules identified on CT scans in a screening population prove ultimately to be benign, monitoring nodule growth on follow-up CT scans is a widely accepted approach to evaluate indeterminate nodules measuring 8 mm or less in diameter [20]. On the basis of improved registration of images, automated computer assessment provides the opportunity to rapidly identify and compare nodules over time to detect changes in size. CAD also provides more accurate and reproducible measurement of nodules than does human observation [21–25]. This has clear implications for the use of CT to monitor the growth of small nodules in particular.
The objective of this study is to assess the performance of a commercially available automated CAD system for lung nodules on follow-up MDCT examinations in terms of nodule detection, tracking changes in size, and reading time.
Four subjects underwent surgical resection due to suspicious nodular findings. One had a focal persistent ground-glass opacity with a greatest axial diameter of 19 mm, which turned out to be benign (desquamative interstitial pneumonitis with respiratory bronchiolitis and atypical adenomatous hyperplasia). Adenocarcinomas were found in the other three patients. One patient, a 77-year-old man, had an ill-defined nonspecific ground-glass density adjacent to a peripheral pulmonary vessel in the right upper lobe on the baseline examination, which 2 years later became a solid nodule with irregular margins containing air bronchograms with a doubling time consistent with a malignant lesion (cancer # 1) (Fig. 1A, 1B, 1C). The second patient, a 58-year-old man, had, in retrospect on the baseline examination, nonspecific thickening of the wall of a right upper lobe emphysematous bulla, which became a 9-mm solid nodule with irregular margins on the surveillance CT scan obtained 15 months later (cancer # 2) (Fig. 2A, 2B, 2C). The third patient, a 72-year-old woman, had a left lower air-space consolidation with a 12-mm nodular component. Persistence of the nodule after antibiotic therapy combined with positive findings on dynamic contrast-enhanced CT led to surgical resection (cancer # 3).
CAD System
The CAD system used in this study (ImageChecker CT CAD System, V 2.0, R2
Technology, Inc. [now Hologic]) is designed to detect solid lung nodules from
4 to 30 mm in diameter on MDCT chest examinations. Optimal CAD performance
requires collimation ≤ 3 mm, constant slice interval spacing, and dose
≥ 10 mAs. The method of use and performance of the automated detection aspects
of the CAD system have been previously described [7, 8, 10].
In addition to automatic nodule detection, this CAD system also provides two additional features designed to track nodule growth over time. First, unidimensional, bidimensional, and tridimensional (volumetric) measurements are automatically provided for all CAD-identified nodules. For nodules identified by CAD but not marked for the reader, measurements can be displayed with the "probe" tool. For nodules identified by the reader but not by CAD, the nodule must be outlined on the axial slice showing its largest size, the calculated volume being assumed to be a sphere. Second, a temporal comparison tool provides automatic matching of the nodule on the current examination with the nodule (if present) on the prior examination. In addition, the sizes of the matched nodules are computed to provide an assessment of interval change in size, if any.
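As an illustration only, the spherical assumption for a manually outlined nodule can be sketched as follows. The source does not describe how the workstation converts the outlined area into a volume; the sketch assumes the outline is reduced to an area-equivalent diameter, and both function names are hypothetical.

```python
import math


def equivalent_diameter_mm(area_mm2: float) -> float:
    """Diameter of the circle whose area equals the outlined area (assumption)."""
    return 2.0 * math.sqrt(area_mm2 / math.pi)


def sphere_volume_mm3(diameter_mm: float) -> float:
    """Volume of a sphere with the given diameter: V = pi * d^3 / 6."""
    return math.pi * diameter_mm ** 3 / 6.0


# Hypothetical outline: a circle 5 mm in diameter on the largest axial slice.
outlined_area = math.pi * 2.5 ** 2
diameter = equivalent_diameter_mm(outlined_area)
volume = sphere_volume_mm3(diameter)
print(f"diameter {diameter:.1f} mm, estimated volume {volume:.1f} mm^3")
```

Because the volume scales with the cube of the diameter, a small error in the outlined diameter of an irregular nodule is amplified in the estimated volume, which is one reason such an estimate is coarser than an automated 3D segmentation.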
Interpretations using CAD input were performed on the CAD system's dedicated workstation. The major portion of the screen is devoted to the axial images of the examination, which are read in the usual fashion to allow an unbiased initial reading of the examination. CAD marks are not displayed until requested by the reader. Volumetric measurements of CAD-detected nodules, diameter, average density in Hounsfield units, interval changes in nodule size over time, and percentage value of volumetric growth are also provided.
In addition to the standard 2D axial images, two other views are provided on two smaller screens. In the upper left corner, a lung map corresponding to a maximum intensity projection (MIP) view of the coronal images references the location of the CAD marks. In the lower left corner, an interactive 3D view of the CAD-detected nodule (highlighted in green) is displayed together with the surrounding pulmonary vasculature, which can be rotated in real time to permit the reader to decide if the candidate nodule is truly a real nodule and, if so, to determine if the CAD segmentation correctly includes only that portion of the nodule without adjacent nonnodular structures, such as adjacent vessels and pleural surfaces.
Reader Study
One senior thoracic radiologist (O1) and one radiology resident with 5
years of experience, including 1 year in thoracic radiology (O2),
independently read all 54 examination pairs (current and prior) on the CAD
workstation. The readers were asked to detect all lung nodules greater than 2
mm on the current examination as defined on the screening protocol and then to
evaluate any interval change in size since the most recent prior examination.
They were encouraged to read the cases as if they were in the environment of a
busy clinical practice. We recorded the time spent by both observers to read
both examinations and complete the follow-up examination. An independent
observer timed the initial loading of the current case until completion of the
reading of both current and prior cases by the radiologist (including reading
CAD marks when applicable).
The readers first read the current examination with CAD turned off but with
use of all workstation tools available, including the probe tool and MIP of
5-mm thickness if necessary. All nodules > 2 mm detected by each reader
were probed, and if the nodule had been identified by CAD, all automated
measurements were automatically added to the user list. If not, the user could
manually outline the margins on the central slice of the nodule and obtain the
area and estimated volumetric measurement. Once the reading without CAD input
was completed, CAD was turned on. This CAD system automatically scrolls to the
axial slice that contains the largest area for each CAD-identified candidate
nodule, at which time the reader can accept or dismiss the CAD finding.
Once the reading of the current scan was completed, the temporal comparison tool was activated. The radiologist proceeded by reading all CAD findings detected on the most recent prior examination, checking for nodules that could have disappeared or been missed in the current examination. If a nodule was automatically matched in both current and prior examinations, then the synchronization was judged optimal and no manual correction was necessary. For all other situations, we noted the precision of the CAD image synchronization (registration) tool by recording the number of slices of manual correction necessary to reach the nodule's central slice in the prior examination.
The 33 most recent examination pairs acquired at times (t, t–1) that
had already been evaluated on the CAD workstation were subsequently read on
our clinical workstation (Advantage Windows AW 4.2, GE Healthcare) by the same
radiologists (O1, O2) 3 weeks later to minimize any memory bias. These
interpretations served as our reference standard. The reading sessions
attempted to emulate our routine practice as follows. Reading of the current
examination was performed with the help of MIP if needed. Each nodule was
reported, including its size (assessed according to the RECIST criteria
[27], i.e., the greatest axis
of the nodule, using an electronic caliper), location (lobar distribution),
distance from the pleura, and density (solid, calcified, nonsolid). After
reading the current examination, the most recent prior examination was loaded,
and synchronization of every identified nodule on the current examination was
done by manually scrolling to the equivalent anatomic location on the prior
examination. The corresponding image number and nodule size in the prior
examination were then manually recorded. We chose a CAD volumetric growth
≥ 26% or a doubling time ≤ 500 days to be two confident thresholds indicative
of malignant growth [25, 28, 29].
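The two growth measures can be sketched as below. The source does not state the formula used by the CAD software; this sketch assumes the usual exponential-growth model for doubling time, and it reads the thresholds as growth of at least 26% and doubling time of at most 500 days, which is an assumption about the direction of each comparison. All names are hypothetical.

```python
import math

GROWTH_THRESHOLD_PCT = 26.0           # threshold quoted in the text
DOUBLING_TIME_THRESHOLD_DAYS = 500.0  # threshold quoted in the text


def volumetric_growth_pct(v_prior: float, v_current: float) -> float:
    """Relative change in volume between two examinations, in percent."""
    return 100.0 * (v_current - v_prior) / v_prior


def doubling_time_days(v_prior: float, v_current: float, interval_days: float) -> float:
    """Volume doubling time under an assumed exponential-growth model."""
    if v_current <= v_prior:
        return math.inf  # no growth: the doubling time is unbounded
    return interval_days * math.log(2.0) / math.log(v_current / v_prior)


def meets_growth_threshold(v_prior: float, v_current: float) -> bool:
    """Assumed reading: growth of at least 26% is suspicious."""
    return volumetric_growth_pct(v_prior, v_current) >= GROWTH_THRESHOLD_PCT


def meets_doubling_threshold(v_prior: float, v_current: float, interval_days: float) -> bool:
    """Assumed reading: a doubling time of at most 500 days is suspicious."""
    return doubling_time_days(v_prior, v_current, interval_days) <= DOUBLING_TIME_THRESHOLD_DAYS


# Hypothetical nodule: 100 mm^3 at baseline, 200 mm^3 one year later.
print(volumetric_growth_pct(100.0, 200.0), doubling_time_days(100.0, 200.0, 365.0))
```

Note that a 26% increase in volume corresponds to an increase of only about 8% in diameter for a sphere (the cube root of 1.26 is about 1.08), which is difficult to appreciate with a caliper on a 4- to 6-mm nodule.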
It is important to note that the matching done for interpretations on the clinical workstation was performed on only those nodules found in the current examination (i.e., comparison of current to prior), whereas matching done for interpretations on the CAD workstation was performed on nodules found on both examinations (i.e., comparison of current to prior and then of prior to current).
Ground Truth and Consensus
The criteria for the diagnosis of a pulmonary nodule were defined as a
well-demarcated, solid, spherical, ellipsoid (length ≤ 3 times width), or
more irregular and complex opacity. Only nodules with an average diameter
larger than 2 mm were included in the study as required by our screening
protocol for high-risk patients. Ground-glass opacities (part solid,
nonsolid), which are not detected by the CAD software, were excluded from
consideration.
During a dedicated session, observers O1 and O2 jointly reviewed all nodules detected in their independent reading sessions, the candidate nodules marked by CAD, and the nodules identified in the official written clinical report. All nodules validated by consensus of O1 and O2 defined the ground truth.
Nodule location was defined as follows: juxtapleural (a portion of the
nodule's circumference abutted a pleural surface, i.e., chest wall, diaphragm,
mediastinum, and fissure), peripheral (distance < 20 mm from pleural
surfaces), and central (distance ≥ 20 mm from pleura).
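The location rule above can be written as a small decision function. This is a sketch only: the function name and its inputs are hypothetical, and it assumes that the distance to the nearest pleural surface and the contact with the pleura have already been determined by the reader.

```python
def classify_location(distance_to_pleura_mm: float, abuts_pleura: bool = False) -> str:
    """Assign a nodule to one of the three location classes used in the study."""
    if abuts_pleura:
        # Part of the circumference touches chest wall, diaphragm,
        # mediastinum, or a fissure.
        return "juxtapleural"
    if distance_to_pleura_mm < 20.0:
        return "peripheral"
    return "central"


# Hypothetical nodules.
for distance, abuts in [(0.0, True), (12.5, False), (31.0, False)]:
    print(distance, classify_location(distance, abuts))
```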
Statistical Analysis
The statistical analysis consisted mainly of McNemar's test for
the sensitivity analysis and the paired Student's t test for the
reading time analysis [30].
Statistics were computed using a statistical software package (JMP 6.0, SAS).
All tests were performed two-tailed with p values less than 0.05
indicating statistical significance.
2 mm were
identified (Fig. 3). However,
because the CAD system was designed to detect solid lung nodules ≥ 4 mm in
size, all results related to detection included only those 52 nodules in the
54 current cases that met this size criterion (range, 4.0–11.9 mm in
diameter; mean, 5.2 ± 1.6 mm [SD]; median, 4.6 mm). No nodules ≥ 4
mm were present in 29 (53.7%) of the 54 examinations. The remaining 25
examinations contained 52 nodules (average, 2.1; range, 1–7 nodules per
case), of which 38.5% (20/52) were juxtapleural, 36.5% (19/52) were
peripheral, and 25.0% (13/52) were central.
Nodule Detection on the Current Examinations
CAD standalone sensitivity for the 52 solid nodules ≥ 4 mm that were
present in the 25 current examinations was 65.4% (34/52). The 18 nodules
missed by CAD ranged between 4.0 and 11.9 mm in size and their locations were
juxtapleural (n = 4), peripheral (n = 9), and central
(n = 5). As is customary when evaluating CAD algorithm performance,
the CAD false-marker rate is determined on normal cases. In our study, there
were 29 current examinations that contained no nodules ≥ 4 mm. The
false-marker rate for these cases was 3.4 false marks per examination (100
false marks per 29 examinations). Of note, two examinations alone were
responsible for 19% (19/100) of all false marks—a penalty of about 0.6
false mark per examination due to severe leaking of chest wall
segmentation.
Sensitivities of the two readers (O1 and O2) before CAD input (Table 1) compared with the standalone CAD sensitivity of 65.4% were comparable for O1 (57.7%) and inferior for O2 (46.2%, p = 0.03). Sixteen nodules, ranging in size from 4.0 to 6.4 mm in diameter, were missed by both O1 and O2. Their locations were juxtapleural (n = 7), peripheral (n = 5), and central (n = 4). During unaided reading, the readers identified six nodules that were not detected by CAD but that were automatically segmented using the probe tool.
However, after CAD input the readers' (O1 and O2) sensitivities increased by 9.6% and 23%, respectively, which is statistically significant for both readers (p = 0.025 and p = 0.0005, respectively). Ten nodules (10/52, 19.2%) ranging from 4.0 to 6.4 mm were identified by CAD but missed by both O1 and O2. As a result, CAD has potential value as a third reader—that is, CAD input further increased the sensitivity of simulated independent double reading, assuming that the combined sensitivity for O1 plus O2 of 69.2% (36/52) increased to 86.5% (45/52) for O1 plus O2 plus CAD, which is statistically significant (p = 0.003). The sensitivity of both O1 and O2 without CAD input was studied on 33 temporal examinations and found comparable on both the CAD workstation and the clinical workstation (Table 2).
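For readers who wish to check the arithmetic, McNemar's test on paired detections can be sketched as follows. The discordant counts are not tabulated in the text; they are inferred here from the reported sensitivities (30/52 to 35/52 for O1, 24/52 to 36/52 for O2, and 36/52 to 45/52 for the combined reading), assuming that no nodule found without CAD was lost after CAD input and that no continuity correction was applied. Both points are assumptions, and the function name is hypothetical.

```python
import math


def mcnemar_chi2(gained: int, lost: int) -> tuple:
    """McNemar chi-square statistic (no continuity correction) and two-sided p value.

    gained: nodules detected only after CAD input (discordant pairs, one direction)
    lost:   nodules detected only before CAD input (discordant pairs, other direction)
    """
    n = gained + lost
    if n == 0:
        return 0.0, 1.0
    chi2 = (gained - lost) ** 2 / n
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Inferred discordant counts (assumption: nothing was lost after CAD input).
for label, gained in [("O1", 5), ("O2", 12), ("O1 + O2 + CAD", 9)]:
    chi2, p = mcnemar_chi2(gained, 0)
    print(f"{label}: chi2 = {chi2:.1f}, p = {p:.4f}")
```

Under these assumptions the computed p values are approximately 0.025, 0.0005, and 0.003, in line with the values reported above. An exact binomial version of the test would give larger p values for such small discordant counts.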
Both observers and CAD correctly detected cancer # 1 at times t and t–1. Cancer # 2 was missed at examination t–1 by O1 during reading on the CAD workstation (without CAD) but correctly identified with CAD input. Of interest, the radiologist who generated the official clinical report also missed this cancer. Cancer # 3 was also correctly identified by both observers at times t and t–1 but was missed by CAD at time t because of a nonspecific alveolar consolidation abutting the nodule.
Performance of the Temporal Comparison Tool
Fifty-one of the 52 nodules were present on both the current and most
recent prior examinations. The CAD system automatically detected and
successfully matched 21 paired nodules (21/51, 41.2%) on both examinations
of the pair (Table 3). In addition,
CAD identified two nodules that were matched to a wrong nodule in the prior
examination. As a result, for those 23 nodule pairs detected by CAD, correct
nodule matching was achieved for 91% (21/23).
The overall rate of available CAD growth assessment was 54.9% (28/51) if
one includes seven nodules not marked by CAD but which were probed by the
readers. In five of these 28 examination pairs, the automated CAD assessment
of interval change in nodule size was suggestive of a malignant lesion, as
previously defined [25, 28, 29]. We chose a CAD volumetric growth ≥ 26% or a
doubling time ≤ 500 days to be two confident thresholds indicative of
malignant growth [25, 28, 29].
Three nodules were biopsied and proven to be lung cancer. Because no significant change in 2D manual measurements had been noted in the clinical report, the remaining two nodules were not sampled, so no data as to their etiology are available. Of the remaining 23 examination pairs, two nodules were matched to a wrong nodule in the prior examination (2/51, 3.9%). Twenty-one nodules were not detected on one of the two paired examinations (21/51, 41.2%) and were juxtapleural (7/51, 13.7%) and juxtavascular (14/51, 27.5%) in location. In both situations, CAD assessment of nodule growth was not possible. In most instances (n = 19), these nodules were not differentiated from surrounding vessels because of their small size, their oblong or irregular shape, and/or their position in a vessel bifurcation or trifurcation. The CAD classifier rejected two other nodules because of segmentation leaking into the fissures.
Reading Time
Reading time was assessed considering all nodules greater than 2 mm as is
done in clinical practice. The overall reading time for the readers (O1 and
O2) on each of the workstations (CAD workstation and clinical workstation) is
reported in Table 4. On
average, observers O1 and O2 spent 4–5 minutes per case to read the
paired examinations. Note that, in this study, the readers only had to search
for nodules. In clinical practice, radiologists have to deal with a more
complex and time-consuming task; they need to search as well for other
abnormalities in the lung, mediastinum, and chest wall including abdominal
organs and other structures. No statistical difference in reading times among
readers or workstations (clinical workstation vs CAD workstation, first
without and then with CAD input) was observed. More specifically, the time
spent per CAD mark (CAD false-positive marks plus additional CAD true-positive
marks on nodules not initially detected by the reader) was distributed as
follows: O1 and O2 spent, on average, 7.3 seconds and 6.0 seconds per CAD
mark, respectively. The time required to assess all CAD marks per case was, on
average, 39 seconds (Table 5).
≥ 4 mm was 57.7% and 46.2%, respectively. This low
sensitivity might be explained in part by the large proportion of small
nodules in our screening database (mean size, 5.2 mm). As extensively pointed
out in the literature [1],
interpretation mistakes, complex anatomic areas such as hilar regions, lack of
concentration, disturbances, and fatigue are major sources of errors.
The CAD standalone sensitivity was 65.4% (34/52) for nodules ≥ 4 mm,
including three nodules in three examinations (3/54, 5.6%) not identified by
either observer. These results are consistent with those reported by Yuan et
al. [10] using the same CAD
system in a lung cancer screening program. In that study, CAD detected 72.6%
(456/628) of nodules ≥ 4 mm and detected nodules in six (4%) of 150
examinations that were not prospectively identified by radiologists, changing
the imaging follow-up protocol of those subjects.
When used as a second reader, CAD input increased the sensitivity of both
readers by 9.6% and 23%, respectively, which was statistically significant
(p = 0.025 and p = 0.0005, respectively)
(Table 1). In our study, the
improved performance of each reader using CAD was comparable to the simulated
improved performance when the findings of both readers were
combined (that is, human double reading; p = 0.76). This
differs from the statistically significant improvement for CAD as a second
reader over human double reading noted by others [9, 31].
The CAD system used in this study was trained to detect actionable lung nodules—that is, nodules that radiologists interpret as warranting surveillance or intervention. A few retrospective studies using databases of missed cancers have shown the potential of CAD to increase the detection of missed malignant lesions. Armato et al. [3] showed that CAD, in the context of a screening program, identified 84% of 38 missed cancers, clearly supporting its benefit as a second reader. Li et al. [32] reported that the area under the receiver operating characteristic (ROC) curve (Az) value for all radiologists improved significantly from 0.763 to 0.854 (p = 0.002) with the aid of the CAD scheme. In our study, one of the three malignant lesions (cancer # 2) was missed during the course of the reader study by the most experienced radiologist (O1). However, this oversight was corrected by use of the CAD system, thus positively changing the follow-up management for the patient.
Although CAD input increases reader sensitivity for nodule detection, the impact of CAD on reader workflow and productivity needs to be addressed. This is especially true in the context of chest CT follow-up examinations. Temporal comparison of the current examination with the most recent prior examination is a time-consuming and tedious task for which a computerized automated system has the potential to improve efficiency. The CAD system used in our study also provides automatic image registration (or synchronization), nodule matching, and interval change in size assessment, all of which are usually done manually.
In our evaluation of image synchronization on 120 nodules distributed over 54 CT image pairs, the 3D global registration using an affine transformation was considered acceptable and robust in most cases, with an average manual correction of 8.2 ± 11.4 [SD] axial slices (median, 4 slices). However, this type of modeling fails to capture nonlinear and nonuniform deformations of the lungs. Shen et al. [33] studied 16 CT image pairs using adjacent anatomic structures to generate the most likely corresponding location and found that the registration accuracy improved by a factor of two. Given a nodule's location on an initial scan, the real-time automated nodule detection system could then predict the precise location of this same nodule on follow-up studies within five 1-mm axial slices 88.2% of the time. For the sake of comparison, our (global) rigid registration procedure proved to be within 22 1-mm axial slices 88% of the time. Another factor to consider in the quality of image registration is the presence of severe lung diseases or severe low-dose streak artifacts that may affect chest wall segmentation or localization of key anatomic landmarks such as the carina. Overall, we did not note any statistical differences in nodule registration accuracy between the apices, the middle portion of the lungs, and the lung bases or any impact of patient tilt in the population studied.
Performance of nodule matching is closely related to the quality of image registration. In our study, nodule matching was achieved for 91% (21/23) of those nodules detected by CAD, which is consistent with other evaluations of automatic nodule tracking systems: 100% [34], 97% [35], 95% [36], and 81% [15]. We did not study performance of nodule matching in cancer screening examinations other than lung cancer, for example in cases with pulmonary metastases, which may have greater numbers of nodules and pose a more complex problem for automated nodule matching.
The overall rate of available volumetric assessment was only 55% (28/51), which contrasts with a previous report of 86% (76/88) with the same system [34]. It is explained in large part by the CAD failure to detect and differentiate small 4-mm nodules that have contact with adjacent anatomic structures. Full reliance on this CAD system's detection to perform nodule growth assessment may not be appropriate in situations such as screening examinations, given the system's 4-mm nodule threshold.
Most radiologists currently rely on manual unidimensional and bidimensional
measurements with electronic calipers to assess nodule size and interval
change in size over serial examinations. However, measurement errors with
handheld and electronic calipers lead to large interobserver variations
[21, 22]. Automated measurements
not only reduce such interobserver variability but also produce more accurate
measurements [22]. Further,
there is increasing evidence that volumetric measurements for assessing nodule
growth are more accurate than standard 2D methods [23, 24]. Revel et al.
[29] reported that
software-calculated volumetric doubling times greater than 500 days computed
for 63 nodules scanned at 1.25-mm collimation had a 98% negative predictive
value for the diagnosis of solid malignant pulmonary nodules. We chose a
volumetric growth ≥ 26% and a doubling time ≤ 500 days to be two
confident thresholds indicative of malignant growth. In our study, five of 28
nodules that were correctly matched showed an automated volumetric growth or
doubling time compatible with a malignant process—of these five nodules,
three were sampled and showed cancer. Note that manual measurements conducted
under the screening protocol (RECIST) did not reveal any significant changes
in size for the remaining two nodules.
In our study, there were 3.4 CAD false marks per normal case, which is comparable to results reported in the literature [1, 2]. Most CAD false marks were readily dismissed (Table 5). The average reading time per CAD mark was between 6 and 8 seconds, which is in accordance with the times recommended by Rubin et al. [9]. Reading the CAD marks added, on average, 39 seconds per examination, which is comparable with the 49 seconds reported by Wormanns et al. [18]. Notably, 19% (19/100) of all CAD false marks were found in 7% (2/29) of normal examinations, due primarily to severe chest wall segmentation errors. The influence of low-dose CT acquisitions on the detection and segmentation of pulmonary nodules has already been studied [37, 38], but further work is needed to evaluate the impact of low-dose streak artifacts on the robustness of CAD chest wall segmentation, which would contribute to the CAD false marker rate.
The overall time to review an examination pair (first read the current examination, then read the CAD marks on the current examination, followed by comparison of the current examination findings with the prior examination, and nodule matching and size assessment) was comparable on both the CAD and the clinical workstations. Note that the reading on the clinical workstation did not involve reading CAD marks and that matching and measurement were not done the same way on the two workstations—that is, they were done manually on the clinical workstation and automatically on the CAD workstation.
This result is not surprising given the small number of nodules > 2 mm
per case (< 4 nodules on average). In addition, on the CAD workstation, CAD
marks were not only read on the current scan and matched with the
corresponding nodule on the prior scan, but CAD marks on the prior scan were
also independently assessed and matched with nodules on the current scan in
case these nodules had not been detected or were not present on the current
examination. This additional effort would contribute to penalizing the reading
time on the CAD workstation compared with the reading time on the clinical
workstation. Although the reading time was similar on both workstations, one
has to consider the added value of CAD, which not only contributes to
increased nodule detection but also provides automatic volumetric measurements
versus only unidimensional or bidimensional measurements for the clinical
workstation in our study. Further investigation is needed to assess whether
the CAD system with automatic nodule matching can improve the reading time
when applied on a different patient population with a greater nodule incidence
(e.g., oncology patients).
Our study had several limitations. First, we limited the nodule definition
to noncalcified solid nodules ≥ 4 mm in size. It is important to know that
this CAD software is not designed to detect ground-glass opacities. Among the
33 subjects, seven had one ground-glass opacity greater than 4 mm. Three
ground-glass opacities (
6.1 mm in diameter) required a follow-up and one
part-solid ground-glass opacity measuring 19 mm in diameter underwent surgical
resection but turned out to be benign. As expected, CAD detected none of these
ground-glass opacities. Second, the database used in this study was composed of
a limited number of paired temporal examinations (n = 54), which in
turn contained a small number of nodules ≥ 4 mm (n = 52). Despite
this limitation, the impact of CAD detection was statistically
significant.
In conclusion, the sensitivity of both observers for detecting solid
nodules ≥ 4 mm on MDCT chest examinations increased by 9.6% and 23% with
CAD input, which was statistically significant (p = 0.025 and
p = 0.0005, respectively). This increase in sensitivity was in the
context of temporal comparison of current and prior low-dose MDCT examinations
in a lung cancer screening program and occurred without compromising the
reading time. Finally, the potential of CAD to assess more accurately the
growth of indeterminate nodules may prove useful in allowing an earlier
decision for intervention.