Reliability and separation index analysis of mathematics questions integrated with the cultural architecture framework using the Rasch model
Muh Fitrah
Department of Education Research and Evaluation, Graduate School, Yogyakarta State University, Yogyakarta, Indonesia.
https://orcid.org/0000-0003-2189-5256
Anastasia Sofroniou
School of Computing and Engineering, University of West London, London, UK.
https://orcid.org/0009-0001-6792-1838
Ofianto
Padang State University, Padang, Indonesia.
https://orcid.org/0000-0002-0625-2247
Loso Judijanto
IPOSS Jakarta, Jakarta, Indonesia.
https://orcid.org/0009-0007-7766-0647
Widihastuti
Department of Education Research and Evaluation, Graduate School, Yogyakarta State University, Yogyakarta, Indonesia.
https://orcid.org/0000-0001-8242-658X
DOI: https://doi.org/10.20448/jeelr.v11i3.5861
Keywords: Culture, Mathematics test instrument, Rasch model, Reliability, Separation index.
Abstract
This study uses Rasch model analysis to determine the reliability and separation index of a mathematics test instrument integrated with a cultural architecture structure for measuring students' mathematical thinking abilities. The study involved 357 eighth-grade students from six public junior high schools in Bima. Schools were selected on the basis of average school exam scores and the effectiveness of a learning process that used cultural settings to explore mathematical content. Content validity was calculated in Microsoft Excel using Aiken's index with ratings from four experts, and reliability and the separation index were estimated with the jMetrik software. The results indicate that the instrument was validated by mathematics and measurement experts and has a valid level of content validity. Rasch model calibration shows a very high level of instrument reliability. Separation analysis on the logit scale indicates that the instrument can differentiate students of different ability levels, with good homogeneity in the distribution of test items and individual abilities. Scale quality statistics show good item response variability, low error rates and a high separation index. The study is limited by its exclusive focus on multiple-choice questions. Similar research should be conducted using other types of questions (such as the open-constructed and closed-constructed questions used in PISA) and integrating other mathematical materials within relevant cultural architectural structures.
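As an illustration of the two quantities named above, the following minimal Python sketch computes Aiken's V for a single item and converts between Rasch reliability and the separation index using the standard relation G = sqrt(R / (1 - R)). The function names, the 1-to-5 rating scale, and all input values are hypothetical and chosen only for illustration; they are not figures from this study, and the sketch is not a reproduction of the Excel or jMetrik computations used by the authors.

```python
import math


def aiken_v(ratings, lo, hi):
    """Aiken's V for one item: sum(r - lo) / (n * (hi - lo)).

    ratings: expert ratings for the item
    lo, hi:  lowest and highest categories of the rating scale
    """
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))


def separation_from_reliability(reliability):
    """Separation index G implied by a reliability R, with 0 <= R < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


def reliability_from_separation(g):
    """Reliability R implied by a separation index G."""
    return g * g / (1.0 + g * g)


# Hypothetical ratings from four experts on an assumed 1-to-5 scale.
example_ratings = [5, 4, 5, 4]
v = aiken_v(example_ratings, lo=1, hi=5)  # (4 + 3 + 4 + 3) / 16 = 0.875

# Hypothetical reliability value, used only to show the conversion.
g = separation_from_reliability(0.90)  # approximately 3
r = reliability_from_separation(g)     # recovers approximately 0.90

print(f"Aiken's V = {v:.3f}, separation = {g:.2f}, reliability = {r:.2f}")
```

The two conversion functions are inverses of each other, which is why a very high reliability and a high separation index are reported together: both summarize the ratio of true spread to measurement error on the logit scale.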