Measuring the relationship between social capital, race, and education
Key words: Social capital; Race; Education; Equality
Enhancing social and community support (e.g., social capital) is essential for building healthier communities,
as social capital significantly influences health outcomes. However, the relationship between social capital,
race, and education is complex. Historically marginalized groups often face systemic barriers that reduce
their social capital. Therefore, longitudinal research is essential to understand these dynamics and address
health disparities. This study explores the relationship between social capital, race, and education in U.S.
adults over time, using Midlife in the U.S. (MIDUS) data from Waves 1–3 (1995–1996; 2004–2006; 2013–2014).
We used the disparity assessment framework from Ward et al. and multilevel mixed-effects models to investigate
how social capital evolved differently based on race and education as well as the potential implications of
these differences. Our findings revealed that Black respondents consistently demonstrated higher community
contributions and community involvement compared with White respondents, despite having lower education on average.
This social capital advantage for Black respondents persisted across all three waves of the MIDUS study.
Longitudinal analysis also showed that community contributions remained stable at all time points for all
respondents, while community involvement declined at MIDUS 3. However, Black respondents exhibited a prominent
increase in community involvement at MIDUS 3, suggesting that Black communities may have adapted and thrived
through culturally specific forms of social capital during that period. Our findings indicate that these positive
manifestations of social capital should be explored to see how they can be supported, and they point to the need
for further exploration of racial dynamics and culturally specific forms of social capital.
Citation: Contreras, J., Amissah, C. M., Khademi, A., Valeriann, C., & Villalonga-Olives, E. (2025).
Measuring the relationship between social capital, race, and education. Frontiers in Public Health, Vol 13.
Download
Beware the Little Foxes that Spoil the Vines: Small Inconsistencies in Clinical Data Can Distort Machine Learning Findings
Key words: Biomedical informatics; Noisy data; Noisy covariates; Machine learning; Clinical data
It is well known that Electronic Health Record (EHR) data contain inconsistencies and inaccuracies,
but their effect on predictive model performance and risk/benefit factor identification is often neglected.
This study investigates how varying levels of random and non-random binary differences, often referred to
as "noise", affect modeling tools, such as logistic regression, support vector machines, and gradient boosting models.
Using curated data from the All of Us database, we simulated different noise levels to mimic real-world variability.
Across all models and noise types, increased noise consistently reduced classification accuracy. More importantly,
noise diminished the variance of variable impact scores while leaving their means unchanged, suggesting a muted
ability to identify key predictors. These findings imply that even modest noise levels can obscure meaningful signals.
Measures like accuracy and hazard ratios may thus be misleading in noisy data contexts. The consistency of effects
across models and noise mechanisms suggests this issue stems from inherent data variability rather than model brittleness,
with broad implications for EHR data analyses.
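The abstract above does not give the simulation design, so the following is only a minimal sketch of what injecting random binary noise could look like. It assumes each 0/1 entry is flipped independently with a fixed probability; the function names `flip_binary` and `agreement` are hypothetical and are not taken from the study. Non-random noise, which the study also considers, is not covered here.

```python
import random


def flip_binary(values, rate, seed=0):
    """Return a copy of a 0/1 list with each entry flipped with probability `rate`."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)  # seeded so the illustration is reproducible
    return [1 - v if rng.random() < rate else v for v in values]


def agreement(a, b):
    """Fraction of positions where two equal-length 0/1 lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Synthetic binary covariate (illustrative only, not All of Us data).
clean = [i % 2 for i in range(1000)]
for rate in (0.0, 0.05, 0.20):
    noisy = flip_binary(clean, rate, seed=42)
    print(rate, round(agreement(clean, noisy), 3))
```

In a fuller experiment, the noisy covariates would be passed to each model in turn so that accuracy and variable impact scores can be compared across noise levels.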
Citation: Khademi, A., Tuttle, M. S., Qing, Z., & Nelson, S. J. (2025).
Beware the Little Foxes that Spoil the Vines: Small Inconsistencies in Clinical Data Can Distort Machine Learning Findings. Journal of Health Sciences, Vol 8, Issue 3, 872-879.
Download
Unveiling disparities: examining differential item functioning’s impact on racial health equity among white and black populations
Key words: Social Capital; Item response theory; Measurement invariance; Differential item functioning; Health disparities; Racial difference
This paper aims to examine the psychometric properties of social capital indicators, comparing Black and White respondents to identify the extent of
measurement invariance in social capital by race.
We used data from the longitudinal study Midlife in the United States (MIDUS), waves 1 through 3 (1995-2016).
Data were from 6513 respondents (5604 White and 909 Black respondents). Social capital indicators were social cohesion,
contributions to community, and community involvement.
We used Structural Equation Modeling and Item Response Theory
methods to test for measurement invariance in social capital by race.
We observed violations of longitudinal and multi-group measurement invariance (MI) at configural and metric levels on
two scales. Factor structures and indicator loadings were inconsistent over time. In IRT analysis, ‘Many people come
for advice’ exhibited Differential Item Functioning (DIF), indicating a consistent advantage for White respondents
on the contributions to community scale. Despite similar social capital levels (P(χ2,2) = 0.00), DIF was found in
all contributions to community items and some community involvement items when examining race and education interaction.
Citation: Villalonga-Olives, E., Khademi, A., Pan, Y. Y., & Ransome, Y. (2024).
Unveiling disparities: examining differential item functioning’s impact on racial health equity
among white and black populations. Public Health, 231, 80-87.
Download
A Longitudinal Analysis of the Variability of Cognitive Complexity in IELTS Academic Writing Prompts: a mixed-method study
Key words: Writing prompts; Task complexity; Cognitive complexity; IELTS
Cognitive complexity of writing prompts may affect the linguistic performance of test takers in writing assessment, as evidenced by previous research.
In this mixed-method study, we aim to explore if and to what extent the perceived cognitive complexity of IELTS Writing Task II prompts has changed over time.
We use official IELTS prompts collected from test administrations between 1996 and 2022. For each test year, two prompts were randomly selected (a total of 20 prompts
for ten years) and were structured in a survey. Experienced IELTS teachers and assessment experts (n = 19) rated the perceived cognitive complexity of
each prompt on a 1-8 Likert scale. An intra-class correlation (ICC) coefficient showed a high degree of inter-rater reliability among the raters (r = 0.84, p = 0.00).
A time series analysis showed that the perceived year-on-year cognitive complexity of the prompts showed a significant negative change (b = -0.073, p = 0.00).
In the qualitative part of the study, the prompts with the lowest and highest mean cognitive complexity ratings were presented to the raters, who were
asked what made those prompts easy or difficult for test takers.
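The abstract does not say which form of the intra-class correlation was used. As a rough illustration of how such a coefficient is obtained from a prompts-by-raters table, here is a sketch of the one-way random-effects form, ICC(1,1), which is assumed purely for simplicity. The function name `icc_oneway` and the ratings are made up for illustration.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for a targets-by-raters table of scores."""
    n = len(ratings)      # number of rated targets (e.g. prompts)
    k = len(ratings[0])   # number of raters per target
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-target and within-target mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up ratings for four prompts by three raters (not data from the study).
toy_ratings = [
    [2, 3, 2],
    [5, 5, 6],
    [7, 8, 7],
    [4, 4, 5],
]
print(round(icc_oneway(toy_ratings), 3))
```

Two-way forms that separate rater effects, or average-measure forms, would give different values on the same table, so the choice of form should be reported alongside the coefficient.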
Citation: Khademi, A. (2024, March 19). A Longitudinal Analysis of the Variability of Cognitive Complexity in IELTS Academic Writing Prompts: a mixed-method study.
https://doi.org/10.35542/osf.io/euv87
Download
Examining Appropriacy of CFI and TLI Cutoff Value in Multiple-Group CFA Test of Measurement Invariance to Enhance Accuracy of Test Score Interpretation
Key words: measurement invariance, fit index, MG-CFA, multiple group confirmatory factor analysis, lack of invariance, differential item functioning
The most common effect sizes when using a multiple-group confirmatory factor analysis approach to measurement invariance are ΔCFI and ΔTLI with a cutoff value of 0.01. However,
this recommended cutoff value may not be ubiquitously appropriate and may be of limited application for some tests (e.g., measures using dichotomous items or different estimation methods,
sample sizes, or model complexity). Moreover, prior cutoff value estimations often have ignored consequences resulting in using measures that more accurately estimate countries’ or
learners’ proficiency for some countries or groups versus others. In this study, we investigate whether the cutoff value proposed by Cheung and Rensvold (ΔCFI or ΔTLI > 0.01) is appropriate
across educational measurement contexts. Specifically, we investigated the performance of ΔCFI and ΔTLI in capturing lack of invariance (LOI) at the scalar level in dichotomous items within item response theory on
groups whose test characteristic curves differed by 0.5. Simulation results showed that the proposed cutoff value of 0.01 in ΔCFI and ΔTLI was not appropriate to
capture LOI under the study conditions, which may result in the misinterpretation of test results or inaccurate inferences.
Citation: Khademi, A., Wells, C. S., Oliveri, M. E. & Villalonga-Olives, E. (2023). Examining appropriacy of CFI/TLI cutoff value in
multiple-group CFA test of measurement invariance to enhance accuracy of test score interpretation. https://doi.org/10.31234/osf.io/y36uv
Download
Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance
Key words: ChatGPT; Artificial Intelligence; Deep learning; Natural language processing; Automated item generation
ChatGPT and Bard are AI chatbots based on Large Language Models (LLMs) that promise a range of applications in diverse areas. In education, these AI technologies
have been tested for applications in assessment and teaching. In assessment, AI has long been used in automated essay scoring and automated item generation.
One psychometric property that these tools must have to assist or replace humans in assessment is high reliability in terms of agreement between AI scores and human raters.
In this paper, we measure the reliability of the OpenAI ChatGPT and Google Bard LLM tools against experienced and trained humans in perceiving and rating the complexity of writing prompts.
Intraclass correlation (ICC) as a performance metric showed that the inter-rater reliability of both OpenAI ChatGPT and Google Bard was low against the gold standard of human ratings.
Citation: Khademi, A. (2023). Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance. Journal of Applied Learning and Teaching, 6(1). DOI: 10.37074/jalt.2023.6.1.28
Download
Download Prompts
Investigating Test Content Structure using Multidimensional Scaling
Key words: Multidimensional scaling; Dimension reduction; Construct validation; Writing assessment
Although multidimensional scaling (MDS) has extensively been used in social and physical sciences to visualize and explore the
latent dimensionality of observed data, few studies in language assessment have used MDS for the purpose of test construction, including construct
and content validation. In this study, we use MDS to investigate and compare the similarity of writing prompts in the IELTS and TOEFL iBT tests.
Random prompts from both tests were presented as item pairs and raters were asked to rate the similarity of the pairs in terms of content and cognitive complexity.
The results showed the writing prompts in the TOEFL iBT test represented two major dimensions while those in the IELTS test demonstrated three domains.
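The abstract does not state which MDS variant was fitted. To give a sense of how pairwise ratings can be turned into a low-dimensional map, here is a minimal sketch of classical (Torgerson) metric MDS using NumPy. It assumes the similarity ratings have already been converted to dissimilarities; the function name `classical_mds` and the toy matrix are hypothetical.

```python
import numpy as np


def classical_mds(dist, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric distance matrix (Torgerson MDS)."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering    # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]      # indices of the largest eigenvalues
    top_vals = np.clip(eigvals[order], 0.0, None)   # guard against tiny negative values
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy dissimilarities among four hypothetical prompts (not data from the study).
toy = np.array([
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 3.0, 4.0],
    [4.0, 3.0, 0.0, 1.0],
    [5.0, 4.0, 1.0, 0.0],
])
coords = classical_mds(toy, n_dims=2)
print(coords.round(3))
```

Rating data of this kind are often analyzed with non-metric MDS instead, which uses only the rank order of the dissimilarities; the classical form is shown here because it has a closed-form solution.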
Citation: Khademi, A. (2023). Investigating test content structure using multidimensional scaling. Research Methods in Applied Linguistics, 2(2). https://doi.org/10.1016/j.rmal.2023.100047
Download
Effectiveness of a Negotiated Syllabus on the Reading Achievement of Intermediate-Level EFL Learners
Key words: negotiated syllabus, Iranian EFL students, reading comprehension, syllabus design
Few studies have evaluated the effectiveness of a negotiated syllabus on the reading skill of learners. The present study attempts to establish if a negotiated syllabus had any effect on the reading achievement of female EFL learners at an intermediate level of English proficiency. The study was conducted with the participation of 61 learners placed into the experimental group (n = 32) and the control group (n = 29). The element of negotiation in this study was the topic interests of the experimental group, surveyed with an interest areas questionnaire on the basis of which instructional reading passages were selected. Topic interests were not surveyed in the control group and the texts given to them were selected by the teachers. Both groups were tested at the beginning of the experiment and retested at the end of the experiment with a valid reading comprehension test comprising 22 items. A two-sample t-test was performed to test for a difference in mean scores between the experimental group and the control group on the reading test. The results indicated that no statistically significant difference was observed between the mean scores of the experimental and control groups.
Citation: Khademi, A. (2022). Effectiveness of a Negotiated Syllabus on the Reading Achievement of Intermediate-Level EFL Learners. SAGE Open, 12(4). https://doi.org/10.1177/21582440221143319
Download
Determining Effective Environmental Factors in the Distribution of Endangered Endemic Medicinal Plant Species Using the BMLR Model: The Example of Wild Celery (Kelussia odoratissima Mozaff., Apiaceae) in Zagros (Iran)
Key words: Backward multiple regression
Abstract: Kelussia odoratissima Mozaff. is a medicinal species native to Iran. The goal of this research was to determine the environmental factors important for the distribution of K. odoratissima in Iran using backward multiple linear regression (BMLR) modeling. Six random transects were established throughout the species’ habitat, and 220 quadrats with an area of 4 m² were plotted. The canopy cover percentages of K. odoratissima were estimated in each quadrat. Topographic factors, including elevation, slope, and aspect maps, were generated by creating DEM images. Land use, land evaluation, evaporation, temperature, and precipitation maps of the area were created accordingly. The data collected from the experiments were analyzed using the Minitab and R statistical packages. To determine the effect of the studied factors in the distribution of K. odoratissima, we ran a set of backward multiple linear regressions. The results showed that the effects of evaporation, elevation, and slope were significant in the species’ distribution, with elevation having a positive effect and evaporation and slope showing negative effects. Further, elevation had the highest effect on distribution (greatest absolute value of beta at 9.660). The next most significant factors in the plant’s distribution were evaporation (beta = 8.282) and slope (beta = 0.807), respectively.
Citation: Jahantab, E., Mahmoudi, M. R., Sharafatmandrad, M., Karimian, V., Sheidai-Karkaj, E., Khademi, A., Morshedloo, M. R., et al. (2022). Determining Effective Environmental Factors in the Distribution of Endangered Endemic Medicinal Plant Species Using the BMLR Model: The Example of Wild Celery (Kelussia odoratissima Mozaff., Apiaceae) in Zagros (Iran). Plants, 11(21), 2965. MDPI AG. Retrieved from http://dx.doi.org/10.3390/plants11212965
Download
The relationship between acute cardiac attack and LDL-C serum levels in cardiac and CCU inpatients in Hajar hospital: Replying to a paradox
Key words: ANOVA, Chi-square test
Abstract: Acute myocardial infarction (MI) is one of the most prevalent heart diseases across the world, including in Iran. The purpose of the present study was to investigate the relationship between acute MI and serum low-density lipoprotein (LDL) levels in patients with acute MI.
Citation: Lotfizadeh, M., Khaledifar, A., Heidari, F., & Khademi, A. (2022). The relationship between acute cardiac attack and LDL-C serum levels in cardiac and CCU inpatients in Hajar hospital: Replying to a paradox. Journal of Shahrekord University of Medical Sciences, 24(3), 139-144.
doi: 10.34172/jsums.2022.23
Download
Natural diversity in phenolic components and antioxidant properties of oregano (Origanum vulgare L.) accessions, grown under the same conditions
Key words: Principal Components Analysis
Abstract: Oregano (Origanum vulgare L.) is a rich source of biologically active components such as phenolic compounds. Here, seven pot grown O. vulgare accessions belonging to three subspecies (subsp. virens, subsp. vulgare and subsp. gracile) were investigated for their content in sixteen bioactive phenolic compounds as well as their antioxidant capacities (DPPH• and FRAP tests), total phenolic content (TPC) and total flavonoid content (TFC) in order to identify the most suitable ones on an industrial level. HPLC analyses showed that rosmarinic acid (659.6–1646.9 mg/100 g DW) was by far the most abundant constituent, followed by luteolin (46.5–345.4 mg/100 g DW), chicoric acid (36.3–212.5 mg/100 g DW), coumarin (65.7–193.9 mg/100 g DW) and quercetin (10.6–106.1 mg/100 g DW), with variability in concentration depending on the accession and subspecies. The highest level of rosmarinic acid and TPC was obtained from Ardabil accession (subsp. virens). There was a significant and positive correlation between rosmarinic acid and antioxidant activity (r = 0.46). TFC significantly correlated to TPC (r = 0.57) as well as to chicoric acid (r = 0.73). Cluster (CA) and principal component (PCA) analyses classified the investigated accessions in three different groups. Such natural variabilities in phenolics provide the possibility of using elite plants for nutraceutical and pharmaceutical industries and domestication of highly antioxidative accessions of oregano.
Citation: Jafari Khorsand, G., Morshedloo, M.R., Mumivand, H. et al. Natural diversity in phenolic components and antioxidant properties of oregano (Origanum vulgare L.) accessions, grown under the same conditions. Sci Rep 12, 5813 (2022). https://doi.org/10.1038/s41598-022-09742-4
Download
Exploring and Comparing High Stakes Writing Test Prompts Content Structure
Key words: multidimensional scaling, content structure, dimension reduction, IELTS, TOEFL, writing
Abstract: TOEFL and IELTS are two major tests that measure the language preparedness of prospective college students. The writing sections of these two tests provide a measure of readiness for academic writing. However, the extent to which these two tests measure the same content has not been quantitatively investigated before. In this paper, multidimensional scaling (MDS) is applied to explore the content structure of writing prompts in TOEFL and IELTS examinations.
Citation: Khademi, A. (2022, January 2). Exploring and Comparing High Stakes Writing Test Prompts Content Structure. https://doi.org/10.35542/osf.io/98crg
Download
Comparing DIF detection accuracy with observed score and latent score ordinal logistic model: a simulation study
Key words: DIF, differential item functioning, latent score, logistic regression
We studied and compared the performance of two approaches in the detection of differential item functioning: observed score and hybrid (latent-observed) methods. Twelve polytomous items were simulated using sample sizes of 100, 200, 500, and 1000. The results show superior performance of the observed score approach at smaller sample sizes, and equal performance at larger samples. Under such parameter and test length conditions, the observed score approach seems a better candidate for different contexts, given that large sample sizes may not always be available.
Citation: Khademi, Abdolvahab & Wang, Xiaoyun (2019). Comparing DIF detection accuracy with observed score and latent score ordinal logistic model: a simulation study. Poster presented at Northeastern Educational Research Association, Trumbull, CT, USA (October 21-23, 2019).
Download
Assessing the performance of classical and modern classification methods: LR, TR, RF and SVM
Key words: logistic regression, tree regression, random forest, support vector machine
A primary purpose in educational testing is achieving optimal decision-making results regarding the examinees (admit/reject) or the psychometric aspects of the test (e.g., identifying DIF items or a cut-off score). We assess and compare the classification accuracy of classical logistic regression with that of tree classification, random forests, and support vector machines.
Citation: Khademi, Abdolvahab (2015). Assessing the performance of classical and modern classification methods: LR, TR, RF, and SVM. Poster presented at the Northeastern Educational Research Association, Trumbull, CT, USA (October 21-23, 2015).
Download
The impact of teachers' linguistic and affective feedback on Iranian EFL students' writing skills
Key words: composition, EFL, essay, expectation, expository writing, feedback
Errors and error correction are two basic elements of an EFL writing class, and Iranian EFL teachers have always been grappling with the students’ mistakes and errors. Tracking down applicable techniques and efficient strategies for correcting students’ mistakes and errors has forced teachers to carefully scrutinize the students’ behaviour, skills, aptitudes, and the conditions of the class. On the other hand, one of the students’ needs in EFL writing courses is a clear assessment of their progress. Therefore, teachers may have to respond and comment on the students’ progress according to their performance. This study was an attempt to investigate the impact of teachers’ feedback and comments on EFL students’ writing skill improvement. To answer this question, a one-way Analysis of Variance (ANOVA) was carried out and the results confirmed the existence of a significant effect of teachers’ feedback and comments on EFL students’ writing. Students were motivated to write more creatively and at the same time they had more improvement in the writing skill. The findings based on the reports given by the teachers also suggest that students’ negative attitude toward writing perceived as a boring, difficult, time consuming, and unimportant task gradually leaned toward a positive one.
Citation: Safivand, A., Vahdani Sanavi, R., Khademi Shamami, A. (2010). The Impact of Teachers' Linguistic and Affective Feedback on Iranian EFL Students' Writing Skills. Proceedings of ICERI2010 Conference. 15th-17th November 2010, Madrid, Spain
Download
High-stakes tests writing component: what makes it different from writing in general
Key words: writing test, high-stakes tests
What is it that one needs to know in a language to be able to write it? What challenges do students meet when they try to write for a high-stakes test? How should the teacher react to students taking such high-stakes tests as TOEFL or IELTS? Or yet, does knowledge of all these things for both the teacher and the learner contribute significantly to their successful enhancement of knowledge and subsequent better performance? These are only a few of the problems that students who wish to sit for these exams should deal with. Basically, what the researchers would like to find out is what challenges high-stakes test candidates face. Some scholars (Brown, [1]; Nunan, [2]) have mentioned what makes writing difficult but none have mentioned whether or not there is anything which adds to the difficulty of the high-stakes tests. What other challenges do students meet when taking the TOEFL or IELTS exam? To this end, some 67 IELTS and TOEFL students were asked to fill out a questionnaire to find out what challenges they have when they write in English in general, and what makes the high-stakes writing component difficult in particular. Twenty-four of the participants were then interviewed to have an in-depth analysis of their challenges. The findings, among others, suggest that high-stakes tests exert so much pressure on the candidates. Since they believe their future rests in their performance on such tests, they might not even be able to evince their actual knowledge of the language.
Citation: Behrouz, D., Vahdani Sanavi, R., Safivand, A., Khademi Shamami, Abdolvahab. (2010). High-stakes tests writing component: what makes it different from writing in general. Proceedings of ICERI2010 Conference. 15th-17th November 2010, Madrid, Spain
Download
An empirical rationale for the selection of syntactic structures in ESP materials development.
Key words: corpus linguistics, NLP, syntax, TG grammar, materials development
This study reports on the extension of one part of the SAMT large-scale project designed to develop EAP materials for Iranian university students. The project involves four dimensions of Needs Specification, Content Specification, Purpose Specification, and Task Specification. The present study is within the dimension of Content Specification for which a voluminous academic corpus has been collected for the main project. One of the most important aspects of gradation in structural syllabuses is the selection of various syntactic structures. An impressionistic survey of current EAP textbooks reveals diversity in the manner of presenting structural elements that seems to rest on no theoretical or empirical foundation. Therefore, it seems necessary to investigate the validity of such statements. One solution to this problem in current EAP syllabuses is to base the selection and gradation of syntactic structures on an empirical rationale which arises from a corpus-generated frequency of structures. To develop an objective list of syntactic structures, four steps have been taken. First, all English syntactic structures, in the form of linear phrase structure rules, were determined based on surveying the related sources. Second, the extracted syntactic structures were coded into a machine-readable pattern. Third, the coded patterns were applied to part of the available corpus which was compiled for the whole project. And finally, syntactic structures with the highest frequency were identified, selected, and ordered to be used in EAP materials. The details of the findings will be discussed further and the implications of the study will be presented.
Citation: Nayernia, A., Khademi Shamami, A. (2007). An empirical rationale for the selection of syntactic structures in ESP materials development. Paper presented at First TELSI Conference, Shiraz, Iran.
Download