Reliability and separation index analysis of mathematics questions integrated with the cultural architecture framework using the Rasch model

Muh Fitrah

Department of Education Research and Evaluation, Graduate School, Yogyakarta State University, Yogyakarta, Indonesia.

https://orcid.org/0000-0003-2189-5256

Anastasia Sofroniou

School of Computing and Engineering, University of West London, London, UK.

https://orcid.org/0009-0001-6792-1838

Ofianto

Padang State University, Padang, Indonesia.

https://orcid.org/0000-0002-0625-2247

Loso Judijanto

IPOSS Jakarta, Jakarta, Indonesia.

https://orcid.org/0009-0007-7766-0647

Widihastuti

Department of Education Research and Evaluation, Graduate School, Yogyakarta State University, Yogyakarta, Indonesia.

https://orcid.org/0000-0001-8242-658X

DOI: https://doi.org/10.20448/jeelr.v11i3.5861

Keywords: Culture, Mathematics test instrument, Rasch model, Reliability, Separation index.


Abstract

This research uses Rasch model analysis to determine the reliability and separation index of a mathematics test instrument, integrated with a cultural architecture framework, for measuring students' mathematical thinking abilities. The study involved 357 eighth-grade students from six public junior high schools in Bima. Schools were selected based on average school exam scores, taking into account the effectiveness of a learning process that used cultural settings to explore mathematical content. Data analysis was conducted using Microsoft Excel to calculate content validity with Aiken's index, based on ratings from four experts, and the jMetrik software to estimate reliability and the separation index. The results indicate that the mathematics test instrument passed validation by experts in mathematics and measurement, with content validity at a valid level. Rasch model calibration shows a very high level of instrument reliability. Separation analysis on the logit scale indicates that the instrument can differentiate students of different ability levels, with good homogeneity in the distribution of test items and individual abilities. Scale quality statistics show good item response variability, low error rates and a high separation index. This study is limited in that it focuses solely on multiple-choice questions. Similar research should be conducted using other question types (such as those used in PISA, namely open-constructed and closed-constructed questions) and integrating other mathematical materials within relevant cultural architectural structures.
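The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of Aiken's V for one item rated by several experts, and the usual Rasch relationship between reliability and the separation index. The function names, the 1 to 5 rating scale, and the example ratings are hypothetical and are not taken from the study.

```python
import math


def aikens_v(ratings, lo, hi):
    """Aiken's V for one item: sum(r - lo) / (n * (hi - lo)).

    ratings: expert ratings for the item (hypothetical values).
    lo, hi: lowest and highest possible rating on the scale (assumed).
    """
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))


def separation_from_reliability(reliability):
    """Separation index G = sqrt(R / (1 - R)) for a reliability R < 1."""
    return math.sqrt(reliability / (1.0 - reliability))


def reliability_from_separation(separation):
    """Inverse relation: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


if __name__ == "__main__":
    # Four hypothetical expert ratings on an assumed 1-5 scale.
    print(aikens_v([4, 5, 4, 5], lo=1, hi=5))
    # Illustrative reliability value, not a result from the study.
    print(separation_from_reliability(0.90))
```

Under this relationship a separation index of 2 corresponds to a reliability of 0.8, so a high separation index and a high reliability describe the same property of the scale in two different metrics.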
