Abstract

Pass/fail decisions on criterion-referenced tests are determined by cut-off scores, which provide the basis for decision-making, so a properly set cut-off score plays a central role in classifying examinees. In Korea, however, passing or failing is usually decided by a uniform standard that requires examinees to exceed an arbitrarily chosen overall score, with no scientific or systematic standard-setting procedure behind it. The "Textbook Authorization Screening," which recently adopted score-based judgments, likewise sets its cut-off score unscientifically. This study therefore aimed to set cut-off scores for textbook authorization screening with the modified Angoff method, using the results of the 2008 screening of mathematics and English textbooks. The main conclusions are as follows. First, a cut-off score fixed arbitrarily by a uniform standard of 75 points or more in total cannot be considered reasonable from the viewpoint of criterion-referenced evaluation. For this reason, instead of fixing a score in advance, this study chose the modified Angoff method from among the available systematic standard-setting methods and used it to set new cut-off scores. Second, when textbooks were reclassified as qualified or unqualified under the new cut-off scores, the results differed substantially from the original ones, which indicates a large gap between the decision scores the reviewers actually applied and the cut-off scores they considered ideal. Finally, several proposals for textbook authorization are summarized, along with the limitations of the study and possibilities for future research.
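
For readers unfamiliar with Angoff-style standard setting, the core aggregation step can be sketched as follows. This is only an illustrative sketch, not the procedure used in the study: the abstract does not describe the rating rounds, the screening criteria, or their point values, so the function names, the four criteria, and every number below are hypothetical. The sketch assumes each judge states, for each criterion, the score a borderline (minimally acceptable) textbook would be expected to earn; the cut-off score is then the sum over criteria of the judges' mean ratings. A modified Angoff procedure typically adds discussion and re-rating rounds, which are omitted here.

```python
from statistics import mean


def angoff_cut_score(ratings):
    """Return an Angoff-style cut-off score.

    ratings[j][i] is judge j's expected score for a borderline
    (minimally acceptable) textbook on criterion i. All values used
    with this function here are hypothetical illustrations.
    """
    n_criteria = len(ratings[0])
    if any(len(judge) != n_criteria for judge in ratings):
        raise ValueError("every judge must rate every criterion")
    # Average the judges' ratings per criterion, then sum across criteria.
    criterion_means = [
        mean(judge[i] for judge in ratings) for i in range(n_criteria)
    ]
    return sum(criterion_means)


def classify(total_scores, cut_score):
    """Label each textbook 'pass' or 'fail' against the cut-off score."""
    return {
        name: "pass" if score >= cut_score else "fail"
        for name, score in total_scores.items()
    }


# Hypothetical data: 3 judges rating 4 criteria worth 25 points each.
ratings = [
    [20, 18, 22, 19],
    [21, 17, 20, 20],
    [19, 19, 21, 18],
]
# Hypothetical total scores for three textbooks.
textbooks = {"A": 80, "B": 76, "C": 70}

cut = angoff_cut_score(ratings)
fixed_result = classify(textbooks, 75)    # uniform 75-point standard
angoff_result = classify(textbooks, cut)  # judge-derived standard
```

With these made-up numbers, textbook B passes under the fixed 75-point standard but fails under the judge-derived cut-off, which is the kind of reclassification the abstract reports.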