Measuring the relationship between social capital, race, and education

Key words: Social capital; Race; Education; Equality

Enhancing social and community support (e.g., social capital) is essential for building healthier communities, as social capital significantly influences health outcomes. However, the relationship between social capital, race, and education is complex. Historically marginalized groups often face systemic barriers that reduce their social capital. Therefore, longitudinal research is essential to understand these dynamics and address health disparities. This study explores the relationship between social capital, race, and education in U.S. adults over time, using Midlife in the U.S. (MIDUS) data from Waves 1–3 (1995–1996; 2004–2006; 2013–2014). We used the disparity assessment framework from Ward et al. and multilevel mixed-effects models to investigate how social capital evolved differently based on race and education as well as the potential implications of these differences. Our findings revealed that Black respondents consistently demonstrated higher community contributions and community involvement compared with White respondents, despite having lower education on average. This social capital advantage for Black respondents persisted across all three waves of the MIDUS study. Longitudinal analysis also showed that community contributions remained stable at all time points for all respondents, while community involvement declined at MIDUS 3. However, Black respondents exhibited a prominent increase in community involvement at MIDUS 3, suggesting that Black communities may have adapted and thrived through culturally specific forms of social capital during that period. Our findings indicated that these positive manifestations of social capital should be explored to see how they can be supported, and suggested the need for further exploration of racial dynamics and culturally specific forms of social capital.

Citation: Contreras, J., Amissah, C. M., Khademi, A., Valeriann, C., & Villalonga-Olives, E. (2025). Measuring the relationship between social capital, race, and education. Frontiers in Public Health, Vol 13.

Download

Beware the Little Foxes that Spoil the Vines: Small Inconsistencies in Clinical Data Can Distort Machine Learning Findings

Key words: Biomedical informatics; Noisy data; Noisy covariates; Machine learning; Clinical data

It is well known that Electronic Health Records (EHR) data contain inconsistent and inaccurate entries, the effect of which on predictive model performance and risk/benefit factor identification is often neglected. This study investigates how varying levels of random and non-random binary differences, often referred to as "noise", affect modeling tools, such as logistic regression, support vector machines, and gradient boosting models. Using curated data from the All of Us database, we simulated different noise levels to mimic real-world variability. Across all models and noise types, increased noise consistently reduced classification accuracy. More importantly, noise diminished the variance of variable impact scores while leaving their means unchanged, suggesting a muted ability to identify key predictors. These findings imply that even modest noise levels can obscure meaningful signals. Measures like accuracy and hazard ratios may thus be misleading in noisy data contexts. The consistency of effects across models and noise mechanisms suggests this issue stems from inherent data variability rather than model brittleness, with broad implications for EHR data analyses.

Citation: Khademi, A., Tuttle, M. S., Qing, Z., & Nelson, S. J. (2025). Beware the Little Foxes that Spoil the Vines: Small Inconsistencies in Clinical Data Can Distort Machine Learning Findings. Journal of Health Sciences, Vol 8, Issue 3, 872-879.

Download

Download PDF

Unveiling disparities: examining differential item functioning’s impact on racial health equity among white and black populations

Key words: Social Capital; Item response theory; Measurement invariance; Differential item functioning; Health disparities; Racial difference

This paper aims to examine the psychometric properties of social capital indicators, comparing Black and White respondents to identify the extent of measurement invariance in social capital by race. We used data from the longitudinal study Midlife in the United States (MIDUS), waves 1 through 3 (1995-2016). Data were from 6513 respondents (5604 White and 909 Black respondents). Social capital indicators were social cohesion, contributions to community, and community involvement. We used Structural Equation Modeling and Item Response Theory methods to test for measurement invariance in social capital by race. We observed violations of longitudinal and multi-group measurement invariance (MI) at configural and metric levels on two scales. Factor structures and indicator loadings were inconsistent over time. In IRT analysis, ‘Many people come for advice’ exhibited Differential Item Functioning (DIF), indicating a consistent advantage for White respondents on the contributions to community scale. Despite similar social capital levels (P(χ2,2) = 0.00), DIF was found in all contributions to community items and some community involvement items when examining race and education interaction.

Citation: Villalonga-Olives, E., Khademi, A., Pan, Y. Y., & Ransome, Y. (2024). Unveiling disparities: examining differential item functioning’s impact on racial health equity among white and black populations. Public Health, 231, 80-87.

Download

A Longitudinal Analysis of the Variability of Cognitive Complexity in IELTS Academic Writing Prompts: a mixed-method study

Key words: Writing prompts; Task complexity; Cognitive complexity; IELTS

Cognitive complexity of writing prompts may affect the linguistic performance of test takers in writing assessment, as evidenced by previous research. In this mixed-method study, we aim to explore if and to what extent the perceived cognitive complexity of IELTS Writing Task II prompts has changed over time. We use official IELTS prompts for this study collected over test administration years 1996 to 2022. For each test year, two prompts were randomly selected (total of 20 prompts for ten years) and were structured in a survey. Experienced IELTS teachers and assessment experts (n = 19) rated the perceived cognitive complexity of each prompt on a 1-8 Likert scale. An intra-class correlation (ICC) coefficient showed a high degree of inter-rater reliability among the raters (r = 0.84, p = 0.00). A time series analysis showed that the perceived year-on-year cognitive complexity of the prompts showed significant negative change (b = -0.073, p = 0.00). In the qualitative part of the study, the prompts with the lowest and highest cognitive complexity mean ratings were presented to the raters, who were asked what made those prompts easy or difficult for test takers.

Citation: Khademi, A. (2024, March 19). A Longitudinal Analysis of the Variability of Cognitive Complexity in IELTS Academic Writing Prompts: a mixed-method study. https://doi.org/10.35542/osf.io/euv87

Download

Examining Appropriacy of CFI and TLI Cutoff Value in Multiple-Group CFA Test of Measurement Invariance to Enhance Accuracy of Test Score Interpretation

Key words: measurement invariance, fit index, MG-CFA, multiple group confirmatory factor analysis, lack of invariance, differential item functioning

The most common effect sizes when using a multiple-group confirmatory factor analysis approach to measurement invariance are ΔCFI and ΔTLI with a cutoff value of 0.01. However, this recommended cutoff value may not be ubiquitously appropriate and may be of limited application for some tests (e.g., measures using dichotomous items or different estimation methods, sample sizes, or model complexity). Moreover, prior cutoff value estimations have often ignored consequences, resulting in the use of measures that more accurately estimate countries’ or learners’ proficiency for some countries or groups than for others. In this study, we investigate whether the cutoff value proposed by Cheung and Rensvold (ΔCFI or ΔTLI > 0.01) is appropriate across educational measurement contexts. Specifically, we investigated the performance of ΔCFI and ΔTLI in capturing lack of invariance (LOI) at the scalar level in dichotomous items within item response theory on groups whose test characteristic curves differed by 0.5. Simulation results showed that the proposed cutoff value of 0.01 in ΔCFI and ΔTLI was not appropriate to capture LOI under the study conditions, which may result in the misinterpretation of test results or inaccurate inferences.

Citation: Khademi, A., Wells, C. S., Oliveri, M. E. & Villalonga-Olives, E. (2023). Examining appropriacy of CFI/TLI cutoff value in multiple-group CFA test of measurement invariance to enhance accuracy of test score interpretation. https://doi.org/10.31234/osf.io/y36uv

Download

Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance

Key words: ChatGPT; Artificial Intelligence; Deep learning; Natural language processing; Automated item generation

ChatGPT and Bard are AI chatbots based on Large Language Models (LLMs) that promise applications in diverse areas. In education, these AI technologies have been tested for applications in assessment and teaching. In assessment, AI has long been used in automated essay scoring and automated item generation. One psychometric property that these tools must have to assist or replace humans in assessment is high reliability in terms of agreement between AI scores and human raters. In this paper, we measure the reliability of the OpenAI ChatGPT and Google Bard LLM tools against experienced and trained humans in perceiving and rating the complexity of writing prompts. Intraclass correlation (ICC) as a performance metric showed that the inter-rater reliability of both OpenAI ChatGPT and Google Bard was low against the gold standard of human ratings.

Citation: Khademi, A. (2023). Can ChatGPT and Bard Generate Aligned Assessment Items? A Reliability Analysis against Human Performance. Journal of Applied Learning and Teaching, 6(1). DOI: 10.37074/jalt.2023.6.1.28

Download

Download Prompts

Investigating Test Content Structure using Multidimensional Scaling

Key words: Multidimensional scaling; Dimension reduction; Construct validation; Writing assessment

Although multidimensional scaling (MDS) has extensively been used in social and physical sciences to visualize and explore the latent dimensionality of observed data, few studies in language assessment have used MDS for the purpose of test construction, including construct and content validation. In this study, we use MDS to investigate and compare the similarity of writing prompts in the IELTS and TOEFL iBT tests. Random prompts from both tests were presented as item pairs and raters were asked to rate the similarity of the pairs in terms of content and cognitive complexity. The results showed the writing prompts in the TOEFL iBT test represented two major dimensions while those in the IELTS test demonstrated three domains.

Citation: Khademi, A. (2023). Investigating test content structure using multidimensional scaling. Research Methods in Applied Linguistics, 2(2). https://doi.org/10.1016/j.rmal.2023.100047

Download

Effectiveness of a Negotiated Syllabus on the Reading Achievement of Intermediate-Level EFL Learners

Key words: negotiated syllabus, Iranian EFL students, reading comprehension, syllabus design

Few studies have evaluated the effectiveness of a negotiated syllabus on the reading skill of learners. The present study attempts to establish whether a negotiated syllabus has any effect on the reading achievement of female EFL learners at an intermediate level of English proficiency. The study was conducted with the participation of 61 learners placed into the experimental group (n = 32) and the control group (n = 29). The element of negotiation in this study was the topic interests of the experimental group, surveyed with an interest areas questionnaire on the basis of which instructional reading passages were selected. Topic interests were not surveyed in the control group and the texts given to them were selected by the teachers. Both groups were tested at the beginning of the experiment and retested at the end of the experiment with a valid reading comprehension test comprising 22 items. A two-sample t-test was performed to compare the mean scores of the experimental group and the control group on the reading test. The results indicated that no statistically significant difference was observed between the mean scores of the experimental and control groups.

Citation: Khademi, A. (2022). Effectiveness of a Negotiated Syllabus on the Reading Achievement of Intermediate-Level EFL Learners. SAGE Open, 12(4). https://doi.org/10.1177/21582440221143319

Download

Determining Effective Environmental Factors in the Distribution of Endangered Endemic Medicinal Plant Species Using the BMLR Model: The Example of Wild Celery (Kelussia odoratissima Mozaff., Apiaceae) in Zagros (Iran)

Key words: Backward multiple regression

Abstract: Kelussia odoratissima Mozaff. is a medicinal species native to Iran. The goal of this research was to determine the environmental factors important for the distribution of K. odoratissima in Iran using BMLR modeling. Six random transects were established throughout the species’ habitat, and 220 quadrats with an area of 4 m² were plotted. The canopy cover percentages of K. odoratissima were estimated in each quadrat. Topographic factors, including elevation, slope, and aspect maps, were generated by creating DEM images. Land use, land evaluation, evaporation, temperature, and precipitation maps of the area were created accordingly. The data collected from the experiments were analyzed using the Minitab and R statistical packages. To determine the effect of the studied factors in the distribution of K. odoratissima, we ran a set of backward multiple linear regressions. The results showed that the effects of evaporation, elevation, and slope were significant in the species’ distribution, with elevation having a positive effect and evaporation and slope showing negative effects. Further, elevation had the highest effect on distribution (greatest absolute value of beta at 9.660). The next most significant factors in the plant’s distribution were evaporation (beta = 8.282) and slope (beta = 0.807), respectively.

Citation: Jahantab, E., Mahmoudi, M. R., Sharafatmandrad, M., Karimian, V., Sheidai-Karkaj, E., Khademi, A., Morshedloo, M. R., et al. (2022). Determining Effective Environmental Factors in the Distribution of Endangered Endemic Medicinal Plant Species Using the BMLR Model: The Example of Wild Celery (Kelussia odoratissima Mozaff., Apiaceae) in Zagros (Iran). Plants, 11(21), 2965. MDPI AG. Retrieved from http://dx.doi.org/10.3390/plants11212965

Download

The relationship between acute cardiac attack and LDL-C serum levels in cardiac and CCU inpatients in Hajar hospital: Replying to a paradox

Key words: ANOVA, Chi-square test

Abstract: Acute myocardial infarction (MI) is one of the most prevalent heart diseases across the world, including in Iran. The purpose of the present study was to investigate the relationship between acute MI and serum low-density lipoprotein (LDL) levels in patients with acute MI.

Citation: Lotfizadeh, M., Khaledifar, A., Heidari, F., & Khademi, A. (2022). The relationship between acute cardiac attack and LDL-C serum levels in cardiac and CCU inpatients in Hajar hospital: Replying to a paradox. Journal of Shahrekord University of Medical Sciences, 24(3), 139-144. doi: 10.34172/jsums.2022.23

Download

Natural diversity in phenolic components and antioxidant properties of oregano (Origanum vulgare L.) accessions, grown under the same conditions

Key words: Principal Components Analysis

Abstract: Oregano (Origanum vulgare L.) is a rich source of biologically active components such as phenolic compounds. Here, seven pot grown O. vulgare accessions belonging to three subspecies (subsp. virens, subsp. vulgare and subsp. gracile) were investigated for their content in sixteen bioactive phenolic compounds as well as their antioxidant capacities (DPPH• and FRAP tests), total phenolic content (TPC) and total flavonoid content (TFC) in order to identify the most suitable ones on an industrial level. HPLC analyses showed that rosmarinic acid (659.6–1646.9 mg/100 g DW) was by far the most abundant constituent, followed by luteolin (46.5–345.4 mg/100 g DW), chicoric acid (36.3–212.5 mg/100 g DW), coumarin (65.7–193.9 mg/100 g DW) and quercetin (10.6–106.1 mg/100 g DW), with variability in concentration depending on the accession and subspecies. The highest level of rosmarinic acid and TPC was obtained from Ardabil accession (subsp. virens). There was a significant and positive correlation between rosmarinic acid and antioxidant activity (r = 0.46). TFC significantly correlated to TPC (r = 0.57) as well as to chicoric acid (r = 0.73). Cluster (CA) and principal component (PCA) analyses classified the investigated accessions in three different groups. Such natural variabilities in phenolics provide the possibility of using elite plants for nutraceutical and pharmaceutical industries and domestication of highly antioxidative accessions of oregano.

Citation: Jafari Khorsand, G., Morshedloo, M.R., Mumivand, H. et al. Natural diversity in phenolic components and antioxidant properties of oregano (Origanum vulgare L.) accessions, grown under the same conditions. Sci Rep 12, 5813 (2022). https://doi.org/10.1038/s41598-022-09742-4

Download

Exploring and Comparing High Stakes Writing Test Prompts Content Structure

Key words: multidimensional scaling, content structure, dimension reduction, IELTS, TOEFL, writing

Abstract: TOEFL and IELTS are two major tests that measure the language preparedness of prospective college students. The writing sections of these two tests provide a measure of readiness for academic writing. However, to what extent these two tests measure the same contents has not been quantitatively investigated before. In this paper, multidimensional scaling (MDS) is applied to explore the content structure of writing prompts in TOEFL and IELTS examinations.

Citation: Khademi, A. (2022, January 2). Exploring and Comparing High Stakes Writing Test Prompts Content Structure. https://doi.org/10.35542/osf.io/98crg

Download

Comparing DIF detection accuracy with observed score and latent score ordinal logistic model: a simulation study

Key words: DIF, differential item functioning, latent score, logistic regression

We studied and compared the performance of two approaches in the detection of differential item functioning: observed score and hybrid (latent-observed) methods. Twelve polytomous items were simulated using sample sizes of 100, 200, 500, and 1000. The results show superior performance of the observed score approach at smaller sample sizes, and equal performance at larger samples. Under such parameter and test length conditions, the observed score approach seems a better candidate for different contexts, given that large sample sizes may not always be available.

Citation: Khademi, Abdolvahab & Wang, Xiaoyun (2019). Comparing DIF detection accuracy with observed score and latent score ordinal logistic model: a simulation study. Poster presented at Northeastern Educational Research Association, Trumbull, CT, USA (October 21-23, 2019).

Download

Assessing the performance of classical and modern classification methods: LR, TR, RF and SVM

Key words: logistic regression, tree regression, random forest, support vector machine

A primary purpose in educational testing is achieving optimal decision making results regarding the examinees (admit/reject) or the psychometric aspects of the test (e.g. identifying DIF items or a cut-off score). We assess and compare classification accuracy of the classical logistic regression with tree classification, random forests and support vector machines.

Citation: Khademi, Abdolvahab (2015). Assessing the performance of classical and modern classification methods: LR, TR, RF, and SVM. Poster presented at the Northeastern Educational Research Association, Trumbull, CT, USA (October 21-23, 2015).

Download

The impact of teachers' linguistic and affective feedback on Iranian EFL students' writing skills

Key words: composition, EFL, essay, expectation, expository writing, feedback

Errors and error correction are two basic elements of an EFL writing class, and Iranian EFL teachers have always been grappling with the students’ mistakes and errors. Tracking down applicable techniques and efficient strategies for correcting students’ mistakes and errors has forced teachers to carefully scrutinize the students’ behaviour, skills, aptitudes, and the conditions of the class. On the other hand, one of the students’ needs in EFL writing courses is a clear assessment of their progress. Therefore, teachers may have to respond and comment on the students’ progress according to their performance. This study was an attempt to investigate the impact of teachers’ feedback and comments on EFL students’ writing skill improvement. To answer this question, a one-way Analysis of Variance (ANOVA) was carried out and the results confirmed the existence of a significant effect of teachers’ feedback and comments on EFL students’ writing. Students were motivated to write more creatively and at the same time they had more improvement in the writing skill. The findings based on the reports given by the teachers also suggest that students’ negative attitude toward writing perceived as a boring, difficult, time consuming, and unimportant task gradually leaned toward a positive one.

Citation: Safivand, A., Vahdani Sanavi, R., Khademi Shamami, A. (2010). The Impact of Teachers' Linguistic and Affective Feedback on Iranian EFL Students' Writing Skills. Proceedings of ICERI2010 Conference. 15th-17th November 2010, Madrid, Spain

Download

High-stakes tests writing component: what makes it different from writing in general

Key words: writing test, high-stakes tests

What is it that one needs to know in a language to be able to write it? What challenges do students meet when they try to write for a high-stakes test? How should the teacher react to students taking such high-stakes tests as TOEFL or IELTS? Or yet, does knowledge of all these things for both the teacher and the learner contribute significantly to their successful enhancement of knowledge and subsequent better performance? These are only a few of the problems that students who wish to sit for these exams should deal with. Basically, what the researchers would like to find out is what challenges high-stakes test candidates face. Some scholars (Brown, [1]; Nunan, [2]) have mentioned what makes writing difficult but none have mentioned whether or not there is anything which adds to the difficulty of the high-stakes tests. What other challenges do students meet when taking the TOEFL or IELTS exam? To this end, some 67 IELTS and TOEFL students were asked to fill out a questionnaire to find out what challenges they have when they write in English in general, and what makes the high-stakes writing component difficult in particular. Twenty-four of the participants were then interviewed for an in-depth analysis of their challenges. The findings, among others, suggest that high-stakes tests exert considerable pressure on the candidates. Since they believe their future rests on their performance on such tests, they might not even be able to evince their actual knowledge of the language.

Citation: Behrouz, D., Vahdani Sanavi, R., Safivand, A., Khademi Shamami, Abdolvahab. (2010). High-stakes tests writing component: what makes it different from writing in general. Proceedings of ICERI2010 Conference. 15th-17th November 2010, Madrid, Spain

Download

An empirical rationale for the selection of syntactic structures in ESP materials development

Key words: corpus linguistics, NLP, syntax, TG grammar, materials development

This study reports on the extension of one part of the SAMT large-scale project designed to develop EAP materials for Iranian university students. The project involves four dimensions of Needs Specification, Content Specification, Purpose Specification, and Task Specification. The present study is within the dimension of Content Specification for which a voluminous academic corpus has been collected for the main project. One of the most important aspects of gradation in structural syllabuses is the selection of various syntactic structures. An impressionistic survey of current EAP textbooks reveals diversity in the manner of presenting structural elements that seems to rest on no theoretical or empirical foundation. Therefore, it seems necessary to investigate the validity of such statements. One solution to this problem in current EAP syllabuses is to base the selection and gradation of syntactic structures on an empirical rationale which arises from a corpus-generated frequency of structures. To develop an objective list of syntactic structures, four steps have been taken. First, all English syntactic structures, in the form of linear phrase structure rules, were determined based on surveying the related sources. Second, the extracted syntactic structures were coded into a machine-readable pattern. Third, the coded patterns were applied to part of the available corpus which was compiled for the whole project. And finally, syntactic structures with the highest frequency were identified, selected, and ordered to be used in EAP materials. The details of the findings will be discussed further and the implications of the study will be presented.

Citation: Nayernia, A., Khademi Shamami, A. (2007). An empirical rationale for the selection of syntactic structures in ESP materials development. Paper presented at First TELSI Conference, Shiraz, Iran.

Download