
Long-term reliability and observer comparisons in the radiographic diagnosis of periapical disease
Discussion.

The group of patients used in this study had been studied previously to determine changes in their periapical health status (Halse & Molven 1987, Molven & Halse 1988). In clinical situations, such observations form a basis for diagnostic conclusions regarding both overall and individual treatment results and therapeutic decisions. However, these data and conclusions are influenced by observer variations (Markén 1962, WHO 1997). The value of the findings therefore depends on a satisfactory observer performance and correspondence between the observers’ judgement and what may be regarded as correct diagnoses (Koran 1976, WHO 1997, Wulff & Gøtzsche 2000).
In the present study each examiner (the two original investigators and the one with recent scientific and clinical training in endodontics) recorded normal periapical conditions in approximately three out of four root-filled roots, periapical disease in 7–10% of the cases, and an increased width of the apical periodontal ligament space in the remaining cases (approx. 15%). Thus, they all judged the sample of endodontically treated roots to be characterized by a few teeth with pathosis and a high number of periapically healthy roots, a characteristic that was maintained after the joint evaluation of disagreements and difficult borderline cases. Similar disease status has been reported in other follow-up samples of patients who have had root canal treatment in dental schools (Friedman 1998).
The observers’ assessments of the overall disease status indicate a common opinion amongst two endodontists and the radiologist about the general disease status of the sample. The validity of this finding, however, has to be evaluated to judge its importance, and also because clinicians quite often overestimate their diagnostic competence and ability (Wulff & Gøtzsche 2000). Simultaneously, information about the consistency of each of the three examiners and the variation between them is necessary for two reasons: (1) for revealing the long-term reliability of the original observers, and (2) for comparing their observations with evaluations made by the investigator more recently introduced to the diagnostic strategy.

Long-term reliability of original observers.
The intraexaminer reproducibility, that is each observer's long-term reliability calculated by comparing earlier and present observations, was 83% agreement for both observers tested. Furthermore, when interobserver comparisons were made, the original investigation also revealed 83% agreement between the two examiners, whilst the present agreement was 86%. These findings indicate good intra- and interobserver agreement rates on both occasions. From a methodological point of view, they lie at or close to the lower bound of the general requirement that the percentage of agreement between scores should be in the range 85–98% (WHO 1997). Such levels of observer agreement are regarded as more or less normal for the interpretation of radiographic images (Brorsson & Wall 1985), and this should be expected in samples with few periapical pathoses, probably reflecting the observers' training and experience and the quality of the images. When the prevalence of disease is low, the figures should be calculated to show levels of reproducibility above those expected to occur by chance (Koran 1976, Bulman & Osborn 1989, Wulff & Gøtzsche 2000). The kappa statistic gives such figures and is a more valid assessment of intra- and interobserver agreement than the percentage of agreement between scores. The present kappa values, from 0.53 to 0.61, i.e. chance-corrected agreement levels from 53% to 61%, are regarded as good ratings for evaluation of skeletal structures (Cockshott & Park 1983). Corresponding values have been reported in other endodontic investigations (Trope et al. 1999, Saunders et al. 2000), and higher values, indicating 80% corrected agreement or more, have also been presented (Sjögren et al. 1990, Weiger et al. 1997, Kirkevang et al. 2000). Differences in the number of diagnostic groups and the frequency of diagnoses may explain why the latter values are higher than the present ones.
Therefore, it is reasonable and relevant to conclude that the long-term reliability of the two original observers was good with a moderate to substantial agreement between the present observations and findings made several years earlier, for the same cases viewed on the same series of radiographs.
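The difference between the percentage of agreement and the kappa statistic can be illustrated with a minimal Python sketch. The function name and the cross-tabulated counts below are hypothetical, chosen only to resemble a sample dominated by normal findings; they are not data from the present study.

```python
def cohen_kappa(table):
    """Return (observed agreement, chance agreement, kappa).

    table[i][j] is the number of roots given score i on the first
    occasion and score j on the second occasion (square table).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of roots on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed
    # over the diagnostic categories.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    # Kappa: agreement achieved beyond chance, relative to the maximum
    # possible agreement beyond chance (undefined if p_e equals 1).
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, p_e, kappa


# HYPOTHETICAL counts for 100 roots (not study data).
# Categories: normal, increased width of ligament space, periapical disease.
table = [
    [70, 5, 0],
    [6, 8, 2],
    [0, 3, 6],
]

p_o, p_e, kappa = cohen_kappa(table)
print(f"observed agreement = {p_o:.2f}")  # 0.84
print(f"chance agreement   = {p_e:.2f}")  # 0.60
print(f"kappa              = {kappa:.2f}")  # 0.60
```

In this invented example 84% of the scores agree, but because most roots are scored as normal on both occasions, about 60% agreement would be expected by chance alone, and kappa is therefore close to 0.60.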

Original observers vs. new examiner.
Long-term follow-up studies often imply that observers are brought in for practical, methodological and also educational purposes. These examiners must be tested against standard requirements of observer judgements, and compared with the performance of so-called experts or more experienced observers. The interpretation, understanding and application of codes and criteria should be uniform (Koran 1976, WHO 1997, Wulff & Gøtzsche 2000). Each observer should examine consistently, and original observers and others more recently introduced to the method should be closely correlated in their judgements.
The present findings indicate that these requirements were fulfilled. The interobserver agreement was above 80%, and the kappa values of 0.55, 0.58 and 0.61 revealed good reproducibility. Thus, judgements made by the observer with more recent scientific and clinical training in endodontics corresponded to those made by the two original observers. The three observers therefore appeared to interpret radiographs in the same way, indicating that they were calibrated against a standard, resulting in observations with no marked influence from bias and systematic error (Halse & Molven 1986).
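A comparison of three observers yields three observer pairs, each with its own agreement percentage and kappa value. The Python sketch below shows one way such pairwise figures might be computed from root-by-root scores. The observer labels, score codes and the ten scores per observer are hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations


def agreement_and_kappa(a, b):
    """Return (proportion of agreement, kappa) for two score lists."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each observer's score frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return p_o, (p_o - p_e) / (1 - p_e)


# HYPOTHETICAL scores for ten roots (not study data).
# 1 = normal, 2 = increased width of ligament space, 3 = periapical disease.
scores = {
    "original_1": [1, 1, 1, 1, 1, 1, 2, 2, 3, 1],
    "original_2": [1, 1, 1, 1, 1, 2, 2, 1, 3, 1],
    "new":        [1, 1, 1, 1, 1, 1, 2, 2, 3, 3],
}

results = {}
for first, second in combinations(scores, 2):
    results[(first, second)] = agreement_and_kappa(scores[first], scores[second])

for pair, (p_o, kappa) in results.items():
    print(pair, f"agreement = {p_o:.0%}, kappa = {kappa:.2f}")
```

With so few invented roots the kappa values vary widely between pairs; in a real sample the number of roots would be far larger and the estimates correspondingly more stable.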

Joint agreement.
Observer error and bias are part of clinical research, and can never be eliminated (Koran 1976, WHO 1997, Wulff & Gøtzsche 2000). Measures must, however, be taken to minimize their effect. Therefore, in studies of treatment results after conventional root canal filling and after endodontic surgery, the importance of joint evaluations as part of the diagnostic strategies has been emphasized (Halse & Molven 1986, Molven et al. 1987). Thorough discussions before deciding about cases recorded as being difficult (that is, borderline and deviating cases identified during the investigation) would be expected to increase the chances of obtaining reliable and valid radiographic data. Joint discussions during the study should also ensure that the classification system is continuously repeated and discussed in relation to diagnostic problems, and a calibration effect is to be expected. By these measures the risk of serious observer deviations and obviously wrong recordings should be reduced to an acceptable minimum.
In the present study we included three occasions for discussed agreement, one after each separate evaluation of one third of the material. Altogether 12% of the material was subjected to joint discussions and a decision was obtained for all the reevaluated cases including seven rejections. In an earlier investigation by just two of the same observers, about 18% of the material was scheduled for joint evaluation (Molven & Halse 1988). This suggests that several difficult cases can be observed even in samples with a presumably great number of easily detectable normal findings. Comparable figures are not readily found in the literature and should be given to illustrate diagnostic difficulties in studies otherwise satisfying general methodological requirements regarding observer reproducibility.
The diagnostic conclusions in the difficult cases, the disease or no disease decisions, are important for the estimation of the overall success percentages. And, as also discussed by Kvist (2001), they are crucial as a basis for therapeutic decisions in individual cases.


References.
Brorsson B, Wall S (1985) Validity and deduction. Assessment of Medical Technology - Problems and Methods. Swedish Medical Research Council, Stockholm, 34.
Bulman JS, Osborn JF (1989) Measuring diagnostic consistency. British Dental Journal 166, 377-81.
Cockshott WP, Park WM (1983) Observer variation in skeletal radiology. Skeletal Radiology 10, 86-90.
Friedman S (1998) Treatment outcome and prognosis of endodontic therapy. In: Ørstavik, D, Pitt Ford, TR, eds. Essential Endodontology Prevention and Treatment of Apical Periodontitis. London, UK: Blackwell Science, 368-9.
Halse A, Molven O (1986) A strategy for the diagnosis of periapical pathosis. Journal of Endodontics 12, 534-8.
Halse A, Molven O (1987) Overextended gutta-percha and Kloroperka N-Ö root canal fillings. Radiographic findings after 10-17 years. Acta Odontologica Scandinavica 45, 171-7.
Kirkevang L-L, Ørstavik D, Hörsted-Bindslev P, Wenzel A (2000) Periapical status and quality of root fillings and coronal restorations in a Danish population. International Endodontic Journal 33, 509-15.
Koran LM (1976) Increasing the reliability of clinical data and judgments. Annals of Clinical Research 8, 69-73.
Kvist T (2001) Endodontic Retreatment. Aspects of decision making and clinical outcome. Thesis, Göteborg, Sweden. Swedish Dental Journal (Suppl. 144).
Markén K-E (1962) Studies of deviations between observers in clinico-odontological recording. Thesis. Uppsala, Sweden: Almqvist & Wiksells Boktryckeri AB.
Molven O, Halse A (1988) Success rates for gutta-percha and Kloroperka N-Ö root canal fillings made by undergraduate students: radiographic findings after 10-17 years. International Endodontic Journal 21, 243-50.
Molven O, Halse A, Grung B (1987) Observer strategy and the radiographic classification of healing after endodontic surgery. International Journal of Oral and Maxillofacial Surgery 16, 432-9.
Saunders MB, Gulabivala R, Holt R, Kahan RS (2000) Reliability of radiographic observations recorded on a proforma measured using inter- and intra-observer variation: a preliminary study. International Endodontic Journal 33, 272-8.
Sjögren U, Hägglund B, Sundqvist G, Wing K (1990) Factors affecting the long-term results of endodontic treatment. Journal of Endodontics 16, 498-504.
Tronstad L, Asbjørnsen K, Døving L, Pedersen I, Eriksen HM (2000) Influence of coronal restorations on the periapical health of endodontically treated teeth. Endodontics and Dental Traumatology 16, 218-21.
Trope M, Olutayo Delano E, Ørstavik D (1999) Endodontic treatment of teeth with apical periodontitis: Single vs multivisit treatment. Journal of Endodontics 25, 345-50.
Weiger R, Hitzler S, Hermle G, Löst C (1997) Periapical status, quality of root canal fillings and estimated endodontic treatment needs in an urban German population. Endodontics and Dental Traumatology 13, 69-74.
World Health Organization (1997) Oral Health Surveys, Basic Methods, 4th edn. WHO, Geneva, Switzerland: 13-5, 62-3.
Wulff HR, Gøtzsche PC (2000) Rational Diagnosis and Treatment. Evidence-Based Clinical Decision-Making. London, UK: Blackwell Science, 29.