Investigating response behavior through TF-IDF and Word2vec text analysis: A case study of PISA 2012 problem-solving process data.

Heliyon

Collaborative Innovation Center of Assessment Towards Basic Education Quality, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing, 100875, PR China.

Published: August 2024

Article Abstract

The process data in computer-based problem-solving evaluation is rich in valuable implicit information. However, its diverse and irregular structure poses challenges for effective feature extraction, leading to varying degrees of information loss in existing methods. Process-response behavior exhibits similarities to textual data in terms of the key units and contextual relationships. Despite the scarcity of relevant research, exploring text analysis methods for feature recognition in process data is significant. This study investigated the efficacy of Term Frequency-Inverse Document Frequency (TF-IDF) and Word to Vector (Word2vec) in extracting response behavior features and compared the predictive, analytical, and clustering effects of classical machine learning methods (supervised and unsupervised) on response behavior. An analysis of the PISA 2012 computer-based problem-solving dataset revealed that TF-IDF effectively extracted key response behaviors, whereas Word2vec captured effective features from sequenced response behaviors. In addition, in supervised machine learning using both methods, the random forest model based on TF-IDF performed the best, followed by the SVM model based on Word2vec. Word2vec-based models outperformed TF-IDF-based ones in the F1-score, accuracy, and recall (except for precision) across the logistic regression, k-nearest neighbor, and support vector machine algorithms. In unsupervised machine learning, the k-means algorithm effectively clustered different response behavior patterns extracted by these methods. The findings underscore the theoretical and methodological transferability of these text analysis methods in educational and psychological assessment contexts. This study offers valuable insights for research and practice in similar domains by yielding rich feature representations, supplementing fine-grained assessment evidence, fostering personalized learning, and introducing novel insights for educational assessment.
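As a rough illustration of how TF-IDF can be applied to process data, the sketch below treats each respondent's logged action sequence as a "document" and each action as a "term". It is not the authors' implementation: the action names, the `tf_idf` helper, and the unsmoothed weighting tf × log(N/df) are all assumptions chosen for brevity.

```python
import math
from collections import Counter

def tf_idf(sequences):
    """Return one {action: weight} dict per sequence, using tf * log(N / df)."""
    n_docs = len(sequences)
    # Document frequency: number of sequences that contain each action at least once.
    df = Counter(action for seq in sequences for action in set(seq))
    weights = []
    for seq in sequences:
        counts = Counter(seq)
        total = len(seq)
        weights.append({
            action: (count / total) * math.log(n_docs / df[action])
            for action, count in counts.items()
        })
    return weights

# Hypothetical action logs for three respondents (not taken from PISA data).
logs = [
    ["start", "click_map", "click_map", "submit"],
    ["start", "reset", "click_map", "submit"],
    ["start", "click_help", "submit"],
]

weights = tf_idf(logs)
for respondent, row in enumerate(weights):
    print(respondent, row)
```

Actions that every respondent performs, such as `start` and `submit` here, receive a weight of zero, while rarer actions are weighted up. This matches the abstract's description of TF-IDF as a way to surface key response behaviors. The resulting weight vectors could then be passed to a classifier or a clustering algorithm.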

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11379602
DOI: http://dx.doi.org/10.1016/j.heliyon.2024.e35945

Publication Analysis

Top Keywords

response behavior: 16
text analysis: 12
process data: 12
machine learning: 12
pisa 2012: 8
computer-based problem-solving: 8
analysis methods: 8
learning methods: 8
response behaviors: 8
model based: 8

Similar Publications