Magnetic resonance classification of lumbar intervertebral disc degeneration.
- PubMed: 11568697
Abstract
STUDY DESIGN: A reliability study was conducted. OBJECTIVES: To develop a classification system for lumbar disc degeneration based on routine magnetic resonance imaging, to investigate the applicability of a simple algorithm, and to assess the reliability of this classification system. SUMMARY OF BACKGROUND DATA: A standardized nomenclature in the assessment of disc abnormalities is a prerequisite for a comparison of data from different investigations. The reliability of the assessment has a crucial influence on the validity of the data. Grading systems of disc degeneration based on state of the art magnetic resonance imaging and corresponding reproducibility studies currently are sparse. METHODS: A grading system for lumbar disc degeneration was developed on the basis of the literature. An algorithm to assess the grading was developed and optimized by reviewing lumbar magnetic resonance examinations. The reliability of the algorithm in depicting intervertebral disc alterations was tested on the magnetic resonance images of 300 lumbar intervertebral discs in 60 patients (33 men and 27 women) with a mean age of 40 years (range, 10-83 years). All scans were analyzed independently by three observers. Intra- and interobserver reliabilities were assessed by calculating kappa statistics. RESULTS: There were 14 Grade I, 82 Grade II, 72 Grade III, 68 Grade IV, and 64 Grade V discs. The kappa coefficients for intra- and interobserver agreement were substantial to excellent: intraobserver (kappa range, 0.84-0.90) and interobserver (kappa range, 0.69-0.81). Complete agreement was obtained, on the average, in 83.8% of all the discs. A difference of one grade occurred in 15.9% and a difference of two or more grades in 1.3% of all the cases. CONCLUSION: Disc degeneration can be graded reliably on routine T2-weighted magnetic resonance images using the grading system and algorithm presented in this investigation.
©2001, Lippincott Williams & Wilkins, Inc.
Magnetic Resonance Classification of Lumbar
Intervertebral Disc Degeneration
Christian W. A. Pfirrmann, MD,* Alexander Metzdorf, MD,† Marco Zanetti, MD,*
Juerg Hodler, MD,* and Norbert Boos, MD†
[Key words: disc degeneration, intervertebral disc, magnetic resonance imaging, reliability] Spine 2001;26:1873–1878
Magnetic resonance imaging (MRI) is the most important method for the clinical assessment of intervertebral disc pathology. The signal characteristics of the disc in T2-weighted MRIs reflect changes caused by aging or degeneration [14,16,20]. A standardized nomenclature in the assessment of disc alterations is a prerequisite for comparison of data from different investigations [3]. A morphologic grading system relating to the pathologic changes in the disc is needed [9]. The reliability (interobserver and intraobserver reproducibility) of the assessment has a crucial influence on the validity of the data. However, reproducibility studies for the assessment of intervertebral disc degeneration currently are sparse despite their clinical importance.
The MRI technique is developing continuously. Fast spin-echo (FSE) imaging was introduced in spinal MRI during the early 1990s [12] and now is widely used. As compared with the conventional spin-echo technique, FSE imaging offers a significant reduction in scanning time, a better signal-to-noise ratio, and fewer motion artifacts [1]. However, conventional spin-echo images cannot be compared directly with FSE images, and disc disease may not have exactly the same signal characteristics on these two types of sequences. On FSE images, the contrast between fat- and fluid-containing structures is less than on spin-echo images [1,5], and the normal intervertebral disc has lower signal intensity on FSE images. Most prior classification systems of degenerative disc disease were developed and tested on spin-echo MRIs [3,17].
The objective of this investigation was to develop a classification system for lumbar disc degeneration observed on MRIs, to investigate the applicability of a simple algorithm, and to assess the reliability of this classification system.
Methods
Grading System and Algorithm for the Lumbar Intervertebral Discs. A comprehensive grading system for lumbar disc degeneration (Table 1, Figure 1) was developed by the senior author on the basis of the literature and previously published work [2,3,9,13,23]. The feasibility of an algorithm to assess the grades of disc degeneration was tested, and the algorithm was optimized (Figure 2) by reviews of lumbar MRI examinations during routine clinical work.
Participants. The study involved lumbar MRIs of 60 patients (33 men and 27 women) with a mean age of 40 years (range, 10–83 years). Over a period of 3 weeks, 40 routine MRI scans of the lumbar spine were collected consecutively. To ensure an adequate number of nondegenerated and adolescent discs, 20 examinations of patients between the ages of 10 and 20 years were randomly selected from a database of examinations at the same institution in 1999. All the patients included in this study presented initially to the outpatient spine clinic, then were re-
From the *Division of Musculoskeletal Radiology, and the †Department of Orthopedic Surgery, Orthopedic University Hospital, Balgrist Zurich, Switzerland.
Supported by grant 32-52927.97 from the Swiss National Science Foundation.
Acknowledgment date: August 16, 2000.
First revision date: November 27, 2000.
Acceptance date: January 1, 2001.
Device status category: 11.
Conflict of interest category: 14.
used for the development and optimization of the algorithm.
Imaging Technique. The MRIs of the lumbar spine were performed on a 1-T scanner (Siemens Impact Expert; Siemens Medical Systems, Erlangen, Germany) using a dedicated receive-only spine coil. The imaging protocol included sagittal T1-weighted spin-echo (repetition time [TR] 700 msec/echo time [TE] 12 msec) and T2-weighted FSE (TR 5000 msec/TE 130 msec) images with the following parameters: matrix, 512 × 225; field of view, 225 × 300 mm; slice thickness, 4 mm; interslice gap, 0.8 mm; number of excitations, 4; echo train length (ETL), 15 (the first echo of this sequence is discarded). The protocol also included axial T2-weighted FSE scans (TR 5000 msec/TE 72 msec; matrix, 210 × 256; field of view, 150 × 150 mm; interslice gap, 0.8 mm; number of excitations, 2; echo train length, 7). All the sequences were acquired without fat saturation.
Image Assessment. Three observers with different levels of experience analyzing spinal MRIs (i.e., an orthopedic surgeon, a fellowship-trained musculoskeletal radiologist, and a musculoskeletal senior staff radiologist) graded each of the 300 lumbar intervertebral discs on the T2-weighted sagittal images. The observers were not involved in the development of the grading system and the algorithm. In all, 60 selected MRIs were randomly ordered in three sets of 20 MRIs and interpreted independently by the three observers. Each observer was allowed to review only one set per day to avoid rater fatigue. The choice of 20 scans represented the usual number of spinal MRIs read on a daily basis. All the MRIs were analyzed by the observers on a separate occasion, with a minimum interval of 1 week. The observers were asked to follow the algorithm strictly as given. A handout of the classification system (Table 1), the algorithm (Figure 2), and a set of sample MRIs (Figure 1) was available to the raters during the image review. To obtain a reference grade for each disc, a consensus readout was performed after all the data were collected.
Data Analysis. The reliability of the MRI evaluations was estimated using agreement percentage and kappa statistics within raters (intraobserver reliability) and between raters (interobserver reliability) [4]. According to Landis and Koch [11], the agreement was rated as follows: kappa 0 to 0.20 indicated slight agreement, 0.21 to 0.40 fair agreement, 0.41 to 0.60 moderate agreement, 0.61 to 0.80 substantial agreement, and 0.81 upward excellent agreement. With this rating, absolute agreement would be 1. Frequency of disagreement was calculated for each grade.
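The kappa rating described above can be illustrated with a minimal sketch. It assumes the unweighted Cohen's kappa for two readings of the same discs; the text does not state which kappa variant was used, so this is an assumption, and the function names `cohen_kappa` and `landis_koch_label` and the sample ratings are hypothetical. The "poor" label for negative values comes from the original Landis and Koch scale and is not listed in the text.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of grades.

    Assumption: the study's exact kappa variant is not stated in the text.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of discs on which both readings agree exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each reading's grade frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings used a single identical grade throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the agreement bands quoted in the text."""
    if kappa < 0:
        return "poor"  # from the original Landis and Koch scale, not in the text
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "excellent"


# Hypothetical grades (1 = Grade I ... 5 = Grade V) for six discs.
reading_1 = [1, 2, 3, 4, 5, 1]
reading_2 = [1, 2, 3, 4, 5, 2]
kappa = cohen_kappa(reading_1, reading_2)
print(round(kappa, 3), landis_koch_label(kappa))
```

Values falling between two quoted bands (for example 0.805) are assigned to the upper band here; that tie-breaking choice is also an assumption.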
Results
Grades of Disc Degeneration in the Study Population
Altogether, 300 lumbar discs were analyzed in a study population of 60 individuals. The numbers of discs assigned to each degeneration grade by each reader are summarized in Table 2. The consensus reading resulted in 14 Grade I discs (5%), 82 Grade II discs (27%), 72 Grade III discs (24%), 68 Grade IV discs (23%), and 64 Grade V discs (21%).
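As a quick arithmetic check, the consensus percentages can be recomputed from the reported counts. This is only a sketch; the variable names are hypothetical, and rounding to whole percentages is assumed from the figures in the text.

```python
# Consensus grade counts as reported in the text.
consensus_counts = {"I": 14, "II": 82, "III": 72, "IV": 68, "V": 64}

total = sum(consensus_counts.values())

# Share of each grade, rounded to whole percentages (assumed rounding).
percentages = {
    grade: round(100 * count / total)
    for grade, count in consensus_counts.items()
}

for grade, pct in percentages.items():
    print(f"Grade {grade}: {consensus_counts[grade]} of {total} discs ({pct}%)")
```

The counts sum to the 300 discs analyzed, and the rounded shares match the percentages quoted for the consensus reading.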
Intraobserver Agreement
The results of the intraobserver agreement are summarized in Table 3. Intraobserver agreement was “excellent” for all three readers, with kappa values ranging from 0.84 to 0.90. Complete intraobserver agreement was achieved in a range from 264 (88%) to 277 (92.3%) of 300 discs. All but one disagreement were within one grade of difference, with a range of 23 (7.7%) to 35 (11.7%) discs.
Interobserver Agreement
As expected, interobserver agreement was somewhat lower than intraobserver agreement (Table 3). Nevertheless, the agreement ranged from substantial to excellent, with kappa values ranging from 0.69 to 0.81. Complete agreement was achieved in a range from 233 (77.7%) to 257 (85.7%) of all 300 discs. A difference of one grade occurred in 43 (14.3%) to 71 (23.7%) assessments of the discs, a difference of two grades in two discs (0.7%), and a difference of three grades in one disc (0.3%).
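The agreement percentages above follow from simple counts of grade differences between two readings. The sketch below shows one way to tabulate them; the helper names `grade_difference_counts` and `percent_of_discs` and the sample readings are hypothetical, and the default total of 300 is taken from the number of discs in the text.

```python
from collections import Counter


def grade_difference_counts(ratings_a, ratings_b):
    """Count discs by absolute grade difference between two readings."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both readings must cover the same discs")
    return Counter(abs(a - b) for a, b in zip(ratings_a, ratings_b))


def percent_of_discs(count, total=300):
    """Express a disc count as a percentage of all discs, to one decimal."""
    return round(100 * count / total, 1)


# Hypothetical readings (1 = Grade I ... 5 = Grade V) for eight discs.
first_reading = [1, 2, 3, 4, 5, 2, 3, 4]
second_reading = [1, 2, 3, 4, 5, 3, 3, 2]
differences = grade_difference_counts(first_reading, second_reading)
print(dict(differences))

# Reported interobserver counts from the text, converted to percentages.
for count in (233, 257, 43, 71, 2, 1):
    print(count, percent_of_discs(count))
```

In the hypothetical sample, a difference of 0 means complete agreement, 1 a one-grade disagreement, and so on, which mirrors how the results are broken down in the text.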
Evaluation of Distinction Between Grades and Analysis of Disagreement
The relation between the frequency of the different disc degeneration grades in the study population and the frequency of disagreement is displayed in Table 4. The relative disagreement shows a fairly even distribution among grades, indicating that the proposed grading system has good discrimination ability. However, disagreement was more frequent between Grades I and II in terms of inter- and intraobserver agreement, and between Grades III and IV in terms of interobserver agreement. The cases with disagreements of two and three grades referred to disc spaces with marked narrowing of the disc height and normal to slightly decreased signal of the nucleus. The case with a disagreement of three grades referred to a transitional vertebra at the lumbosacral junction. The fifth lum-
Table 1. Classification of Disc Degeneration*
Grade | Structure | Distinction of Nucleus and Anulus | Signal Intensity | Height of Intervertebral Disc
I | Homogeneous, bright white | Clear | Hyperintense, isointense to cerebrospinal fluid | Normal
II | Inhomogeneous with or without horizontal bands | Clear | Hyperintense, isointense to cerebrospinal fluid | Normal
III | Inhomogeneous, gray | Unclear | Intermediate | Normal to slightly decreased
IV | Inhomogeneous, gray to black | Lost | Intermediate to hypointense | Normal to moderately decreased
V | Inhomogeneous, black | Lost | Hypointense | Collapsed disc space
* Modified from Pearce (cited by Eyre et al [9]).
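For readers who want to work with Table 1 programmatically, the five grades can be held in a small lookup structure. This is a minimal sketch of the table contents only, not the grading algorithm of Figure 2, which is not reproduced in this text; the names `DiscGrade`, `TABLE_1`, and `describe_grade` are hypothetical.

```python
from typing import NamedTuple


class DiscGrade(NamedTuple):
    """One row of Table 1 (field names are illustrative)."""
    grade: str
    structure: str
    distinction: str       # distinction of nucleus and anulus
    signal_intensity: str
    disc_height: str


TABLE_1 = {
    "I": DiscGrade("I", "Homogeneous, bright white", "Clear",
                   "Hyperintense, isointense to cerebrospinal fluid", "Normal"),
    "II": DiscGrade("II", "Inhomogeneous with or without horizontal bands", "Clear",
                    "Hyperintense, isointense to cerebrospinal fluid", "Normal"),
    "III": DiscGrade("III", "Inhomogeneous, gray", "Unclear",
                     "Intermediate", "Normal to slightly decreased"),
    "IV": DiscGrade("IV", "Inhomogeneous, gray to black", "Lost",
                    "Intermediate to hypointense", "Normal to moderately decreased"),
    "V": DiscGrade("V", "Inhomogeneous, black", "Lost",
                   "Hypointense", "Collapsed disc space"),
}


def describe_grade(grade):
    """Return a one-line summary of the Table 1 criteria for a grade."""
    row = TABLE_1[grade.strip().upper()]
    return (f"Grade {row.grade}: structure {row.structure.lower()}; "
            f"nucleus/anulus distinction {row.distinction.lower()}; "
            f"signal {row.signal_intensity.lower()}; "
            f"height {row.disc_height.lower()}")


print(describe_grade("iii"))
```

Keying the rows by Roman numeral keeps the lookup aligned with how the grades are written in the text.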
Spine • Volume 26 • Number 17 • 2001