Journal of the American Osteopathic Association, 2006, No. 8
Interexaminer Reliability of Three Methods of Combining Test Results to Determine Side of Sacral Restriction, Sacral Base Position, and Inno
    From the Department of Physical Medicine and Rehabilitation (Tong, Heyman, Lado, Isser) and the Spine Program (Tong, Heyman), University of Michigan in Ann Arbor. No funding was received from any pharmaceutical or equipment companies.

    Context: Sacroiliac joint dysfunction is diagnosed based on the combined results of several palpatory examinations. Previous studies have compared the interexaminer reliability of only one of these methods of diagnosis.

    Objective: To compare the interexaminer reliability of three methods of combining palpatory examinations to determine the side of sacroiliac joint dysfunction, sacral base position, and innominate bone position.

    Design: Blinded single-cohort reliability study.

    Methods: Patients with low back pain underwent two identical sets of palpatory examinations given by two physicians, separately, at a university spine center. The results of each set were compiled and interpreted by three methods: using the test result with the highest interexaminer reliability (method 1), requiring at least one test result to be abnormal for the variable to be abnormal (method 2), and requiring all test results to be abnormal for the variable to be abnormal (method 3). The κ statistic was calculated for each method.

    Results: There were 24 subjects (mean age, 68.3 years), of whom 15 (62%) were women. The κ was consistently higher with method 1, at 0.47, 0.08, and 0.32 for the sacral position, innominate bone position, and side of sacroiliac joint dysfunction, respectively. Corresponding values for method 2 were 0.09, 0.4, and 0.16, and for method 3 were 0.16, 0.1, and –0.33.

    Conclusion: Using the results of the most reliable examination consistently has the best interexaminer reliability.

    A recent review article on the validity and reliability of tests for low-back dysfunction concluded that no single test has been adequately studied to be able to determine its validity and reliability.1 Similar judgments were expressed in two other review articles, which concluded that sacroiliac joint (SIJ) mobility tests were not proven to be reliable.2,3

    It has been suggested that interexaminer reliability may be improved by combining results from several tests into a composite multitest score (MTS).4 Haas4 noted, based on probability calculations, that the expected rate of agreement is lowest when a middle threshold value is used (eg, three of five tests are required to yield positive results before the MTS is considered positive). Thus, the κ (kappa) statistic is theoretically more likely to be greater when middle threshold values are used. However, this concept has only been evaluated in a few studies.5–7 Two of these studies5,7 evaluated one method of combining the results of four tests to determine the presence of SIJ dysfunction, and had conflicting findings. The method used by Cibulka et al5 required positive results from at least three of four tests before results of the MTS were considered conclusive. The authors showed that a cluster of four tests had substantial interexaminer reliability (κ=0.88).5 When the same four tests were reevaluated in a multicenter study by Freburger and Riddle,6 the interexaminer reliability was found to be fair (κ=0.23). A study evaluating the MTSs of four SIJ provocation tests noted substantial reliability (κ=0.7) when using a method that required five tests, with the results of at least three being positive prior to diagnosis.7 However, these studies did not adequately evaluate the benefit of the MTS because they only presented the reliability of the resulting composite scores and not the interexaminer reliability of the individual tests for comparison.6–9
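    The probability argument attributed to Haas above can be illustrated with a short sketch. It rests on simplifying assumptions that are not taken from the cited paper: every individual test is positive with the same probability `p`, tests are independent, and the two examiners rate independently of each other. The function names and the values `n_tests = 5` and `p = 0.5` are hypothetical choices for illustration only.

```python
from math import comb


def prob_mts_positive(n_tests: int, threshold: int, p: float) -> float:
    """Probability that at least `threshold` of `n_tests` tests are positive.

    Assumes (for illustration) independent tests, each positive with probability p.
    """
    return sum(
        comb(n_tests, k) * p**k * (1 - p) ** (n_tests - k)
        for k in range(threshold, n_tests + 1)
    )


def chance_agreement(p_pos: float) -> float:
    """Expected agreement of two independent examiners on a binary MTS."""
    return p_pos**2 + (1 - p_pos) ** 2


# Hypothetical setting: five tests, each positive half of the time.
n_tests, p = 5, 0.5
expected = {
    t: chance_agreement(prob_mts_positive(n_tests, t, p))
    for t in range(1, n_tests + 1)
}

for t, pe in expected.items():
    print(f"threshold {t} of {n_tests}: expected chance agreement = {pe:.3f}")
```

    Under these assumptions the chance agreement is smallest at the middle threshold (three of five), which is the situation in which κ has the most room to be large for a given observed agreement.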

    Two studies did present the reliability of individual tests and the MTS.9,10 Keating et al9 evaluated 46 subjects and showed only slightly stronger reliability with the MTS. Boline et al10 did not show improvement with the MTS. These two studies suggest that MTSs do not improve interexaminer reliability when compared with the individual test results.9,10 However, these studies only looked at one method of combining the individual test results.

    The effect of MTSs on the interexaminer reliability of diagnoses for the sacral and innominate bone positions was not examined by any of the studies mentioned previously.1–10 When treating patients using a directed manual treatment program, it is not enough to simply determine the presence of an SIJ dysfunction. The results of at least two of the individual palpatory examinations need to be combined to obtain diagnoses for the sacral and innominate bone positions.11

    The purpose of the current study is to compare the interexaminer reliability of three methods of combining the results of osteopathic palpatory examinations to determine the side of SIJ dysfunction, sacral base position, and innominate bone position.

    Methods

    Between December 2002 and April 2003, new patients seen at a university spine center each underwent two separate evaluations by two physicians performing identical palpatory examinations. The first evaluator (H.C.T.) was aware of the patients' clinical histories, while a second evaluator (O.G.H., D.A.L., or M.M.I.) was blinded to these histories and current medical status as well as the results obtained by the first examiner. Subjects were included in the study if they had a chief complaint of low back pain. Subjects were excluded from the study if they could not tolerate the physical examination as a result of pain. Demographic variables, including age, sex, height, and weight, were recorded.

    Palpatory examinations were conducted to evaluate three variables: the presence and side of SIJ dysfunction, the sacral position, and the innominate bone position (Figure). For all of the tests, examiners used their dominant eyes as recommended by Greenman.11 The institutional review board of the University of Michigan Medical School in Ann Arbor approved the study, and informed consent was obtained from all subjects.

    Diagnostic Methods

    Method 1 used the examination that had the best interexaminer reliability to determine the diagnoses for the sacral and innominate bone positions.

    Method 2 required that at least one palpatory examination reveal dysfunction for a variable to be considered abnormal.

    Method 3 required that all of the palpatory examinations reveal dysfunction for a variable to be considered abnormal.

    The resulting data from the three different methods were then combined to determine the side of SIJ dysfunction, sacral position, and innominate bone position.
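    The three combination rules can be sketched in a few lines. This is a simplified illustration, not the authors' procedure: each test is reduced to a binary normal/abnormal finding, whereas the study's variables also carry side and position categories. The κ values are the per-test reliabilities reported in the Results for the SIJ dysfunction tests; the findings for the example patient and all function and key names are hypothetical.

```python
def method_1(results: dict[str, bool], kappas: dict[str, float]) -> bool:
    """Use only the finding of the test with the highest interexaminer kappa."""
    most_reliable = max(kappas, key=kappas.get)
    return results[most_reliable]


def method_2(results: dict[str, bool]) -> bool:
    """Abnormal if at least one test is abnormal."""
    return any(results.values())


def method_3(results: dict[str, bool]) -> bool:
    """Abnormal only if every test is abnormal."""
    return all(results.values())


# Kappa values for the SIJ dysfunction tests as reported in the Results.
sij_kappas = {
    "standing_stork": 0.27,
    "standing_flexion": 0.14,
    "seated_flexion": -0.06,
}

# Hypothetical findings for one patient (True = abnormal).
findings = {
    "standing_stork": False,
    "standing_flexion": False,
    "seated_flexion": True,
}

print("method 1:", method_1(findings, sij_kappas))
print("method 2:", method_2(findings))
print("method 3:", method_3(findings))
```

    For this hypothetical patient the three rules disagree: only method 2 flags the variable as abnormal, which shows how the choice of rule alone can change the diagnosis each examiner records.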

    Statistical Analysis

    Data were analyzed using SPSS software (version 10.1; SPSS Inc, Chicago, Ill). Because all of the tests and the diagnoses for the sacral and innominate bone positions are categorical variables, the value of κ was calculated to determine interexaminer reliability.12 The κ statistic reports the amount of agreement seen after adjusting for the amount of agreement that is expected to occur by chance alone.13 Landis and Koch14 recommended using a κ coefficient of 0.2 as the lower limit for fair, 0.4 for moderate, 0.6 for substantial, and 0.8 for almost perfect reliability.
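    As an illustration of the statistic described above, the following sketch computes Cohen's κ for two examiners as (observed agreement − chance agreement) / (1 − chance agreement) and labels it with the cutoffs quoted from Landis and Koch. It is not the SPSS procedure used in the study; the ratings are hypothetical, and the label "poor" for values below 0.2 follows this article's wording.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters assigning one category per subject."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def reliability_label(kappa: float) -> str:
    """Label kappa using the lower limits quoted in the text."""
    if kappa >= 0.8:
        return "almost perfect"
    if kappa >= 0.6:
        return "substantial"
    if kappa >= 0.4:
        return "moderate"
    if kappa >= 0.2:
        return "fair"
    return "poor"


# Hypothetical side-of-dysfunction ratings for ten subjects.
examiner_1 = ["L", "L", "L", "L", "L", "L", "R", "R", "R", "R"]
examiner_2 = ["L", "L", "L", "L", "L", "R", "R", "R", "R", "L"]

kappa = cohen_kappa(examiner_1, examiner_2)
print(f"kappa = {kappa:.2f} ({reliability_label(kappa)})")
```

    In this made-up example the examiners agree on 8 of 10 subjects, but because 52% agreement would be expected by chance, κ is considerably lower than the raw agreement rate.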

    Results

    Twenty-four subjects were chosen chronologically. No subject was excluded from the study as a result of exclusion criteria. The demographics of the subjects are described in Table 1. The mean age of participants was 68.3 years, and 15 patients (62%) were women. The mean height of participants was 1.7 m, the mean weight was 78.3 kg, and the mean body mass index was 27.8 kg/m2.

    The interexaminer reliability of each palpatory examination is summarized in Table 2. Of the examinations for SIJ dysfunction, the standing stork test had the best interexaminer reliability, with a κ of 0.27 (P=.07), and the seated flexion test had the worst interexaminer reliability, with a κ of –0.06 (P=.68). The standing flexion test had a κ of 0.14 (P=.37). Sacral base position with trunk flexion (κ=0.37; P=.002) had better interexaminer reliability than sacral base position with trunk extension (κ=0.05; P=.26). For the innominate bone position tests, the medial malleolus symmetry test (κ=0.21; P=.3) had better interexaminer reliability than the supine anterior superior iliac spine symmetry test (κ=0.15; P=.48).

    The interexaminer reliability of the resulting sacral position diagnosis and innominate bone position diagnosis for the three diagnostic methods is summarized in Table 3. The interexaminer reliability of sacral position when divided into all nine possible categories was incalculable due to the large number of categories and small number of subjects. Consequently, we calculated the interexaminer reliability of sacral positioning as characterized by the two components that determine sacral position: sacral base position (normal, anterior, or posterior) and side of dysfunction (bilateral, left, or right). The interexaminer reliability of sacral base position for all three methods was poor to fair, with κ scores ranging from 0.08 to 0.16. The interexaminer reliability for determining the side of SIJ dysfunction was fair with the first (κ=0.32) and second method (κ=0.4), and poor with the third method (κ=0.1). The innominate bone position had moderate interexaminer reliability (κ=0.47) when calculated by method 1 and poor reliability when methods 2 and 3 were used. Method 1 yielded the best results, finding moderate reliability for one variable, fair reliability for one variable, and poor reliability for one variable. Method 2 was second, finding fair reliability for one variable and poor reliability for one variable. Method 3 was the worst, finding poor reliability for all three variables.

    Post hoc analysis was performed because it was noted that one of the secondary examiners consistently had poor correlation with the initial examiner. Because of this finding, the data were reanalyzed, excluding the subjects seen by the examiner in the second group. As shown in Table 2, the interexaminer reliability of all of the tests improved with the remaining 18 subjects. Specifically, the κ increased to 0.11 (P=.56) for the seated flexion test, 0.5 (P=.009) for the standing stork test, 0.3 (P=.11) for the standing flexion test, 0.47 (P=.001) for the sacral base position with trunk flexion, 0.26 (P=.12) for the sacral base position with trunk extension, 0.29 (P=.26) for the supine anterior superior iliac spine, and 0.49 (P=.046) for medial malleolus symmetry. The resulting reliability of the sacral base position diagnosis with all three methods remained poor (κ range, 0.16–0.21). The reliability of the side of dysfunction remained poor with method 3, but improved to 0.49 (P=.009) with method 1 and 0.6 (P=.001) with method 2. The reliability of the innominate bone position diagnosis remained poor with methods 2 and 3, but improved to 0.84 (P<.001) with method 1.

    Comment

    Several different palpatory examinations are used to evaluate joint motion as well as joint position to detect any abnormality. It has been suggested that combining results from several tests to form a composite MTS will increase interexaminer reliability.4 In fact, Haas4 recommends that if an MTS is used, based on expected random chance agreements, an intermediate threshold score should be used to "ensure moderate agreement." However, this theory has not been tested with empiric data to ascertain if it maximizes the interexaminer reliability of MTS.

    Maximizing interexaminer reliability is essential, as most of the studies evaluating palpatory examinations of the sacrum and pelvis have shown poor to fair interexaminer reliability.1 When they noted poor interexaminer reliability, Flynn et al15 correctly decided not to include palpatory examinations in their analyses. Dreyfuss et al16 noted poor interexaminer reliability with the palpatory examinations yet still used their results in the analyses. However, they did not explain whether they used the physician's findings or the chiropractor's findings, calling into question the validity of their conclusion that the palpatory examination had poor sensitivity and specificity in determining SIJ pain.16

    The current study found that using the test with the best interexaminer reliability (method 1) consistently yielded the score with the highest interexaminer reliability. For example, the standing stork test had the best interexaminer reliability in testing for SIJ dysfunction. Using looser criteria (method 2) had slightly better interexaminer reliability for determining the side of SIJ dysfunction. However, this method also had significantly worse reliability when determining the innominate bone position. Using stricter criteria (method 3) consistently had the worst interexaminer reliability for both the side of SIJ dysfunction and the innominate bone position.

    Previously mentioned studies either did not give the interexaminer reliability of the individual tests5,7,17 or they evaluated the interexaminer reliability of only one method of combining multiple palpatory examination results.9,18 Our study does not support the recommendation by Haas4 that an intermediate range threshold should be used. However, because of the small number of subjects in our study, the present analysis cannot definitively refute Haas' recommendation.

    Our findings suggest that maximizing the interexaminer reliability is a prerequisite to conducting studies that truly evaluate the sensitivity and specificity of the palpatory examination, and thus validate this aspect of osteopathic medicine. In addition, maximizing interexaminer reliability is important for clinical care because prescribed manual treatments are based on the results of the palpatory examination. By using the most reliable method to diagnose the cause of low back pain, the physician can be more confident in his or her treatment decisions.

    Our study has several strengths. Interexaminer reliability was evaluated for a variety of palpatory examinations and diagnostic methods. We also examined the interexaminer reliability of examinations that detect sacral position and innominate bone position, not only SIJ dysfunction.

    When interpreting the results of this study, several limitations should be considered. First, the results need to be replicated by other studies. A larger study with more tests to further evaluate this issue is planned by the authors. Also, even though the medial malleolus symmetry test has better reliability than the supine anterior superior iliac spine symmetry test, the former test may not be a valid measure of innominate bone position if the subject has a significant leg length discrepancy. Finally, interexaminer reliability is not the only factor to determine what integration method should be used to diagnose structural dysfunction. Sensitivity and specificity may take precedence over reliability in certain instances.

    Our study shows that the maximum interexaminer reliability occurs when only the result of the most reliable test is used to determine the side of SIJ dysfunction, sacral base position, and innominate bone position. Therefore, this method should be used when making clinical management decisions to ensure that the most appropriate treatment is implemented for each patient.

    References

    1. Hestbaek L, Leboeuf-Yde C. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? A systematic critical literature review. J Manipulative Physiol Ther. 2000;23:258–275.

    2. van der Wurff P, Hagmeijer RH, Meyne W. Clinical tests of the sacroiliac joint. A systematic methodological review. Part 1: reliability [review]. Man Ther. 2000;5:30–36.

    3. Freburger JK, Riddle DL. Using published evidence to guide the examination of the sacroiliac joint region. Phys Ther. 2001;81:1135–1143.

    4. Haas M. Interexaminer reliability for multiple diagnostic test regimens. J Manipulative Physiol Ther. 1991;14:95–103.

    5. Cibulka MT, Delitto A, Koldehoff RM. Changes in innominate tilt after manipulation of the sacroiliac joint in patients with low back pain: an experimental study. Phys Ther. 1988;68:1359–1363.

    6. Freburger JK, Riddle DL. Measurement of sacroiliac joint dysfunction: a multicenter intertester reliability study. Phys Ther. 1999;79:1134–1141.

    7. Kokmeyer DJ, Van der Wurff P, Aufdemkampe G, Fickenscher TC. The reliability of multitest regimens with sacroiliac pain provocation tests. J Manipulative Physiol Ther. 2002;25:42–48.

    8. Cibulka MT, Koldehoff R. Clinical usefulness of a cluster of sacroiliac joint tests in patients with and without low back pain. J Orthop Sports Phys Ther. 1999;29:83–89.

    9. Keating JC Jr, Bergmann TF, Jacobs GE, Finer BA, Larson K. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality. J Manipulative Physiol Ther. 1990;13:463–470.

    10. Boline PD, Haas M, Meyer JJ, Kassak K, Nelson C, Keating JC Jr. Interexaminer reliability of eight evaluative dimensions of lumbar segmental abnormality: part II. J Manipulative Physiol Ther. 1993;16:363–374.

    11. Greenman PE. Pelvic girdle dysfunction. In: Principles of Manual Medicine. 2nd ed. Baltimore, Md: Williams & Wilkins; 1996:305–367.

    12. Fleiss JL. The measurement of interrater agreement. In: Statistical Methods for Rates and Proportions. 2nd ed. New York, NY: Wiley; 1981:212–236.

    13. Haas M. Statistical methodology for reliability studies. J Manipulative Physiol Ther. 1991;14:119–132.

    14. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.

    15. Flynn T, Fritz J, Whitman J, Wainner R, Magel J, Rendeiro D, et al. A clinical prediction rule for classifying patients with low back pain who demonstrate short-term improvement with spinal manipulation. Spine. 2002;27:2835–2843.

    16. Dreyfuss P, Michaelsen M, Pauza K, McLarty J, Bogduk N. The value of medical history and physical examination in diagnosing sacroiliac joint pain. Spine. 1996;21:2594–2602.

    17. Riddle DL, Freburger JK. Evaluation of the presence of sacroiliac joint region dysfunction using a combination of tests: a multicenter intertester reliability study. Phys Ther. 2002;82:772–781.

    18. Boline PD, Keating JC Jr, Haas M, Anderson AV. Interexaminer reliability and discriminant validity of inclinometric measurement of lumbar rotation in chronic low-back pain patients and subjects without low-back pain. Spine. 1992;17:335–338.

    (Henry C. Tong, MD; Oscar )