Data Aphasia: An Institutional Counterfactual Study of the Stability of Academic Cognition Under Letter-Grade Evaluation Systems

Li Li; Yu Cao

arxiv: 2606.12946 · v1 · pith:OWEWWHYK · submitted 2026-06-11 · 💻 cs.CY

Pith reviewed 2026-06-27 05:43 UTC · model grok-4.3

classification 💻 cs.CY

keywords data aphasia · letter-grade evaluation · academic cognition · institutional counterfactual · diagnostic consistency · information entropy · clustering stability · educational evaluation

The pith

Letter-grade conversion induces data aphasia that makes academic structures unstable to single-student changes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that mandated letter-grade presentation restricts the diagnostic information available about student performance, a condition it terms data aphasia, and tests this by counterfactually converting percentage scores from 68 mathematics exams taken by 75 primary school students. It reports three results: information entropy falls by approximately 69 percent; clustering on the full sample is stable at K=4, but removing one anchor student raises the optimal K to 8 and drops diagnostic consistency from 95 percent to 62 percent; and temporal consistency ranges from 52 to 96 percent, against 93 to 96 percent under percentages. A sympathetic reader would care because the findings suggest that evaluation reforms aimed at simplification can erode the education system's ability to maintain a consistent picture of each student's academic identity.

Core claim

The central claim is that the letter-grade system causes data aphasia through discretization that compresses the feature space nineteenfold, flattens density gradients, and creates pseudo-heterogeneity regions, rendering clustering boundaries highly sensitive to minor perturbations. Under the full sample the system appears stable at K=4, but exclusion of one extreme anchor student raises optimal K to 8 and drops individual diagnostic identity consistency from 95 percent to 62 percent, while temporal consistency fluctuates between 52 percent and 96 percent against the percentage system's 93-96 percent baseline.

What carries the argument

Institutional counterfactual simulation that converts percentage scores to A/B/C/D letter grades and compares information entropy, optimal cluster number K, and diagnostic identity consistency before and after conversion.
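The conversion step can be sketched as follows. This is a minimal illustration, not the authors' code: the source does not state the A/B/C/D thresholds, so the cutoffs 60/70/85 and the names `CUTOFFS`, `to_letter`, and `convert_matrix` are assumptions chosen only to make the example concrete.

```python
from bisect import bisect_right

# Hypothetical cutoffs: the source does not state the thresholds it used.
CUTOFFS = (60, 70, 85)          # assumed lower bounds of C, B, A
LETTERS = ("D", "C", "B", "A")  # grade for each interval, lowest first


def to_letter(score, cutoffs=CUTOFFS, letters=LETTERS):
    """Map a percentage score (0-100) to a letter grade."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # bisect_right counts how many cutoffs the score has reached.
    return letters[bisect_right(cutoffs, score)]


def convert_matrix(scores):
    """Convert a students x exams matrix of percentages to letters."""
    return [[to_letter(s) for s in row] for row in scores]
```

Because every downstream quantity (entropy, optimal K, consistency) is computed on the converted matrix, rerunning the analysis with different `cutoffs` is a cheap way to see how much of the reported effect depends on the boundary choice.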

If this is right

Information entropy decreases by approximately 69 percent after conversion to letter grades.
Letter-grade clustering appears stable at K=4 in full samples but becomes unstable upon removal of one extreme anchor student.
Individual diagnostic identity consistency falls from 95 percent to 62 percent when an anchor student is excluded.
Temporal consistency of diagnostic identities ranges 52-96 percent, below the 93-96 percent baseline of the percentage system.
Discretization compresses the feature space nineteenfold and generates pseudo-heterogeneity regions that flatten density gradients.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Education systems using only letter grades may systematically mis-track student progress patterns compared with percentage-based records.
Policy decisions about interventions or grouping could rest on groupings that shift with small data changes.
A dual-track system retaining both letter and percentage data might preserve diagnostic stability while meeting simplification goals.
Similar sensitivity effects could appear in non-mathematics subjects or other age groups if the discretization mechanism is general.

Load-bearing premise

That the clustering procedure and definition of diagnostic identity consistency on discretized grades capture stable academic structures rather than artifacts of the discretization itself.

What would settle it

A replication dataset in which removing any single student leaves optimal K unchanged at 4 and keeps diagnostic identity consistency above 90 percent would falsify the claimed instability.
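A replication check of this kind could be organized as a leave-one-out loop over students. The sketch below covers only the optimal-K half of the test (the consistency threshold would need the paper's own metric), and it takes the K-selection procedure as a parameter: `select_k` is a hypothetical callable standing in for whatever procedure the study actually uses.

```python
def leave_one_out_k(scores, select_k):
    """Re-select K after dropping each student in turn.

    scores:   dict mapping a student id to that student's score vector.
    select_k: callable taking a list of score vectors and returning K
              (hypothetical; plug in the study's actual procedure).
    Returns (baseline K, {student id: K without that student}).
    """
    baseline = select_k(list(scores.values()))
    dropped = {
        sid: select_k([v for other, v in scores.items() if other != sid])
        for sid in scores
    }
    return baseline, dropped


def unstable_students(scores, select_k):
    """Students whose removal changes the selected K."""
    baseline, dropped = leave_one_out_k(scores, select_k)
    return sorted(sid for sid, k in dropped.items() if k != baseline)
```

An empty result from `unstable_students` on a replication dataset would correspond to the "optimal K unchanged for every single-student removal" condition described above.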

Does the letter-grade evaluation system, while achieving its burden-reduction goals, affect the education system's stable understanding of students' academic structures? This paper introduces the concept of "data aphasia," referring to restrictions on diagnostic information expression caused by institutionally mandated forms of data presentation. Using data from 68 mathematics examinations administered to 75 primary school students, we employ an institutional counterfactual simulation method to convert percentage scores into A/B/C/D letter grades and conduct systematic tests at the information, structural, and diagnostic levels. Results show that information entropy decreases by approximately 69% after grade conversion; under the full sample, the letter-grade system appears superficially stable (K=4), but removing a single extreme anchor student causes the optimal K to increase from 4 to 8 and individual diagnostic identity consistency to fall from 95% to 62%; temporal consistency fluctuates between 52% and 96%, far below the 93%-96% baseline of the percentage system. Mechanism analysis indicates that discretization compresses the feature space by approximately nineteenfold across 68 examinations; after standardization, it creates extensive pseudo-heterogeneity regions, flattens density gradients, and makes clustering boundaries highly sensitive to minor perturbations. Based on these findings, this paper proposes a dual-track evaluation mechanism and provides a testable analytical framework for understanding the cognitive costs of educational evaluation reform.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Letter grades cut entropy by 69% and make clusters jump from K=4 to K=8 after dropping one student, but the clustering method and consistency metric are never described, so the numbers cannot be checked.

The main finding is that converting these percentage scores to letter grades loses roughly 69% of the entropy and turns the student groupings fragile: under the full sample the optimal K stays at 4, but removing one extreme student pushes it to 8 and drops individual consistency from 95% to 62%, while temporal consistency swings between 52% and 96% against a 93-96% baseline for raw percentages.

What is new is the concrete counterfactual on 68 real primary-school math exams for 75 students, plus the specific demonstration that a single anchor student can flip the apparent structure. The paper applies standard entropy and clustering ideas to an institutional question in a straightforward way.

The soft spots are substantial. The abstract and available text give no name for the clustering algorithm, no distance measure, no rule for choosing optimal K, and no formula for the diagnostic identity consistency metric. Without those details it is impossible to know whether the reported instability comes from the grade discretization itself or from the particular procedure chosen. The post-hoc removal of one student also looks selective, and deriving both the clusters and the stability claims from the same discretized data creates a circularity risk. The nineteenfold feature-space compression claim is stated but not derived in the text.

This paper is aimed at people working on learning analytics or education evaluation policy who want quantitative illustrations of information loss. A serious referee could usefully check it if the authors supply the exact code, the clustering specification, and the raw processed data, because the underlying question about evaluation formats is worth testing even though the current evidence is too thin to stand on its own.

Referee Report

4 major / 1 minor

Summary. The manuscript claims that letter-grade evaluation systems induce 'data aphasia' by restricting diagnostic information expression. Using an institutional counterfactual simulation on 68 mathematics examinations administered to 75 primary school students, percentage scores are converted to A/B/C/D grades. This yields a ~69% drop in information entropy; the full sample appears stable (optimal K=4) but removal of one extreme anchor student shifts optimal K to 8 and drops individual diagnostic identity consistency from 95% to 62%; temporal consistency fluctuates 52-96% versus a 93-96% baseline for percentages. Mechanism analysis attributes this to ~19-fold feature-space compression, pseudo-heterogeneity regions, and flattened density gradients that make boundaries sensitive to perturbations. The paper proposes a dual-track evaluation mechanism and a testable framework for cognitive costs of grading reform.

Significance. If the quantitative instability claims prove robust once methods are fully specified, the work could offer empirical support for information-loss effects of discretized grading and a simulation-based framework for evaluating educational reforms. The use of real examination data in a counterfactual design is a methodological strength that distinguishes it from purely theoretical critiques of grading systems.

major comments (4)

[Abstract] Abstract: The clustering algorithm (e.g., k-means, hierarchical), distance metric, and optimal-K selection criterion are not stated. These details are load-bearing for the central claims that optimal K shifts from 4 to 8 and consistency falls from 95% to 62% after removal of one anchor student.
[Abstract] Abstract (mechanism analysis paragraph): No equations, formulas, or step-by-step calculations are supplied for the reported 69% entropy decrease or the nineteenfold feature-space compression. These quantities cannot be reproduced from the given text.
[Abstract] Abstract: The precise definition and computational formula for 'individual diagnostic identity consistency' and 'temporal consistency' are omitted. Without them it is impossible to assess whether the reported drops (95%→62%, 52-96% range) are intrinsic to letter-grade discretization or artifacts of the chosen metric and discretization boundaries.
[Abstract] Abstract: The post-hoc removal of a single extreme anchor student is presented as decisive evidence of instability, yet no justification, pre-specified rule, or sensitivity analysis for this removal is provided. This step directly supports the headline contrast between the two grading systems.

minor comments (1)

[Abstract] The term 'data aphasia' is introduced without reference to existing literature on information loss, discretization effects, or related concepts in educational measurement or data science.

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for the thorough and constructive review. The comments correctly identify areas where the abstract lacks necessary methodological transparency. We address each point below and will revise the abstract and relevant sections accordingly to improve reproducibility.

Referee: [Abstract] Abstract: The clustering algorithm (e.g., k-means, hierarchical), distance metric, and optimal-K selection criterion are not stated. These details are load-bearing for the central claims that optimal K shifts from 4 to 8 and consistency falls from 95% to 62% after removal of one anchor student.

Authors: We agree these details must be stated explicitly. The manuscript employs k-means clustering with Euclidean distance and selects optimal K via the silhouette coefficient. We will add this specification to the abstract in the revised version. revision: yes
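
For readers who want to see what the stated specification amounts to, here is a minimal pure-Python sketch of k-means with Euclidean distance and silhouette-based selection of K. Everything beyond those three named ingredients is an assumption: the deterministic farthest-point initialization, the candidate range K=2..8, and the function names `kmeans`, `silhouette`, and `best_k` are illustrative and not taken from the manuscript.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest existing center.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centers[j]))
               for p in points]
        if new == labels:
            break  # assignments stopped changing
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(points[i], points[j]) for j in own if j != i)
        a /= len(own) - 1
        b = min(
            sum(math.dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items() if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom else 0.0
    return total / len(points)


def best_k(points, k_range=range(2, 9)):
    """Return the K in k_range with the highest mean silhouette."""
    scores = {}
    for k in k_range:
        if k >= len(points):
            break
        labels = kmeans(points, k)
        if len(set(labels)) > 1:  # silhouette needs at least two clusters
            scores[k] = silhouette(points, labels)
    return max(scores, key=scores.get)
```

Here `points` is a list of equal-length numeric tuples, one per student. Initialization and the K range are exactly the kind of unstated choices that can move the selected K, which is why the referee treats them as load-bearing.
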
Referee: [Abstract] Abstract (mechanism analysis paragraph): No equations, formulas, or step-by-step calculations are supplied for the reported 69% entropy decrease or the nineteenfold feature-space compression. These quantities cannot be reproduced from the given text.

Authors: The referee is correct that the abstract omits the formulas. The entropy reduction is computed as 1 - (H_letter-grades / H_percentages) using Shannon entropy on the discretized versus continuous score distributions across the 68 examinations; the nineteenfold compression is the ratio of distinct possible values (101 percentages versus 4 letter grades) averaged over the feature space. We will incorporate the equations and a brief derivation into the abstract. revision: yes
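
The entropy formula in this response can be written out directly. The sketch below computes Shannon entropy in bits on the empirical distribution of a list of values and then the ratio 1 - H_letter-grades / H_percentages; how the study pools scores across the 68 examinations is not stated, so treating the input as one flat list is an assumption, as are the function names.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_reduction(percent_scores, letter_grades):
    """Return 1 - H(letter grades) / H(percentage scores)."""
    h_pct = shannon_entropy(percent_scores)
    if h_pct == 0:
        raise ValueError("percentage scores carry no entropy")
    return 1 - shannon_entropy(letter_grades) / h_pct
```

As a toy check, four distinct scores carry 2 bits; if they collapse onto two equally frequent letters (1 bit), the reduction is 0.5.
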
Referee: [Abstract] Abstract: The precise definition and computational formula for 'individual diagnostic identity consistency' and 'temporal consistency' are omitted. Without them it is impossible to assess whether the reported drops (95%→62%, 52-96% range) are intrinsic to letter-grade discretization or artifacts of the chosen metric and discretization boundaries.

Authors: We accept this criticism. Individual diagnostic identity consistency is the percentage of students whose cluster assignment remains unchanged across bootstrap resamples of the data; temporal consistency is the fraction of students retaining the same cluster label between consecutive examinations. Both are computed after optimal-K selection. We will add these definitions and formulas to the abstract. revision: yes
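
The temporal-consistency definition given here reduces to a short computation once cluster labels are available. The sketch assumes that labels are already aligned across examinations (cluster IDs from separate clustering runs are arbitrary, so some matching step would be needed first) and that students appear in the same order in every list; `temporal_consistency` is an illustrative name.

```python
def temporal_consistency(labels_by_exam):
    """Fraction of students keeping the same label between consecutive exams.

    labels_by_exam: list of per-exam label lists, one label per student,
    with students in the same order in every exam.
    Returns one fraction per consecutive pair of exams.
    """
    out = []
    for prev, curr in zip(labels_by_exam, labels_by_exam[1:]):
        if len(prev) != len(curr):
            raise ValueError("each exam needs one label per student")
        same = sum(p == c for p, c in zip(prev, curr))
        out.append(same / len(prev))
    return out
```

A reported range such as 52-96% would then correspond to the minimum and maximum of the returned list.
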
Referee: [Abstract] Abstract: The post-hoc removal of a single extreme anchor student is presented as decisive evidence of instability, yet no justification, pre-specified rule, or sensitivity analysis for this removal is provided. This step directly supports the headline contrast between the two grading systems.

Authors: The comment is valid; the abstract presents the removal without sufficient justification or pre-specification. In the full manuscript this is framed as an illustrative sensitivity check, but we agree a pre-specified rule (e.g., removal of the single highest and lowest scoring students) and a broader sensitivity analysis across multiple candidates must be added. We will revise the abstract and methods to include these elements. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical clustering results on external exam data are independent of the instability claims

The paper applies a counterfactual conversion of percentage scores to letter grades on 68 real examinations from 75 students, then reports entropy drop, optimal cluster count K, and consistency metrics computed on that transformed data. No step reduces the reported instability (K shift or consistency drop) to a definition or fit that presupposes the result; the percentage-system baseline provides an external comparator, and the mechanism analysis (feature-space compression, density flattening) follows directly from the discretization without self-referential closure. The derivation chain remains self-contained against the input examination records.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 1 invented entity

The central claim rests on the assumption that percentage scores are the ground-truth representation of academic structure and that the chosen clustering and consistency metrics validly measure diagnostic stability; the new term 'data aphasia' is introduced without external grounding.

free parameters (2)

letter-grade boundaries
Standard A/B/C/D cutoffs are applied to convert continuous scores; exact thresholds not stated and may affect all downstream entropy and clustering results.
optimal K in clustering
K=4 chosen for full sample and shown to shift to 8 after anchor removal; selection procedure is data-dependent.

axioms (2)

domain assumption: Percentage scores faithfully represent underlying academic structures, with no measurement error beyond the grading conversion
Used as the stable baseline against which letter-grade instability is measured.
domain assumption: Clustering on grade vectors captures diagnostic identity
Central to the claim that letter grades produce unstable student groupings.

invented entities (1)

data aphasia (no independent evidence)
purpose: Name the restriction on diagnostic information caused by mandated letter-grade forms
New conceptual label introduced to frame the entropy and stability findings; no independent evidence supplied.

pith-pipeline@v0.9.1-grok · 5774 in / 1633 out tokens · 24546 ms · 2026-06-27T05:43:25.495866+00:00 · methodology

Reference graph

Works this paper leans on

52 extracted references · 24 canonical work pages

[1]

Opinions on Further Reducing the Homework Burden and Off-Campus Training Burden of Students in Compulsory Education [J]

[1] General Office of the CPC Central Committee; General Office of the State Council. Opinions on Further Reducing the Homework Burden and Off-Campus Training Burden of Students in Compulsory Education [J]. Gazette of the State Council of the People's Republic of China, 2021(22): 14–19. (in Chinese)

2021
[2]

Notice on Further Strengthening the Management of Daily Examinations in Primary and Secondary Schools [Z]

[2] General Office of the Ministry of Education. Notice on Further Strengthening the Management of Daily Examinations in Primary and Secondary Schools [Z]. Department of Basic Education〔2025〕No. 3, 2025-12-

2025
[3]

(in Chinese)

http://www.moe.gov.cn/srcsite/A06/s3321/202512/t20251216_1423634.html. (in Chinese)
[4]

Implementation Opinions on Further Standardizing Examination Management in Compulsory Education Schools [Z]

[3] Department of Education of Anhui Province. Implementation Opinions on Further Standardizing Examination Management in Compulsory Education Schools [Z]. Wan Jiao Ji〔 2021〕 No. 17, 2021-10-22. https://jyt.ah.gov.cn/ztzl/sjgzzxd/zcwj/40490416.html. (in Chinese)

arXiv 2021
[5]

Four key links in deepening the reform of educational evaluation [J]

[4] Xin T. Four key links in deepening the reform of educational evaluation [J]. China Examinations, 2023(10): 1–8. DOI: 10.19360/j.cnki.11-3303/g4.2023.10.001. (in Chinese)

work page doi:10.19360/j.cnki.11-3303/g4.2023.10.001 2023
[6]

Supervised and unsupervised discretization of continuous features [M]// Machine Learning Proceedings 1995

[5] Dougherty J, Kohavi R, Sahami M. Supervised and unsupervised discretization of continuous features [M]// Machine Learning Proceedings 1995. Morgan Kaufmann, 1995: 194–202

1995
[7]

A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning [J]

[6] Garcia S, Luengo J, Sáez J A, et al. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2012, 25(4): 734– 750.DOI:10.1109/TKDE.2012.35

work page doi:10.1109/tkde.2012.35 2012
[8]

Finding Groups in Data: An Introduction to Cluster Analysis [M]

[7] Kaufman L, Rousseeuw P J. Finding Groups in Data: An Introduction to Cluster Analysis [M]. John Wiley & Sons, 2009

2009
[9]

Modern Multidimensional Scaling: Theory and Applications [M]

[8] Borg I, Groenen P J F. Modern Multidimensional Scaling: Theory and Applications [M]. New York, NY: Springer New York, 2005

2005
[10]

Applied Multivariate Statistical Analysis [M]

[9] Johnson R A, Wichern D W. Applied Multivariate Statistical Analysis [M]. 2002.https://doi.org/10.1007/978-3-031-63833-6

work page doi:10.1007/978-3-031-63833-6 2002
[11]

A study on the identification of gifted and talented students: A case study of the curriculum experimental class at Wenlai Junior High School in Shanghai [J]

[10] Xiang R F, Bai B, Liu S X. A study on the identification of gifted and talented students: A case study of the curriculum experimental class at Wenlai Junior High School in Shanghai [J]. Research in Educational Development, 2016, 36(2): 49–53. (in Chinese)

2016
[12]

implementing letter-grade evaluation for examinations

[11] Cheng J J, Zhou X J. The implications of "implementing letter-grade evaluation for examinations" in compulsory education [J]. Journal of Gannan Normal University, 2023, 44(2): 94 – 100. DOI: 10.13698/j.cnki.cn36-1346/c.2023.02.016. (in Chinese)

work page doi:10.13698/j.cnki.cn36-1346/c.2023.02.016 2023
[13]

The replacement of this "ruler" is of great significance [N/OL]

[12] Guangming Online Commentator. The replacement of this "ruler" is of great significance [N/OL]. Guangming Online, 2025-12-19. http://views.ce.cn/view/ent/202512/t20251219_2651262.shtml. (in Chinese)

2025
[14]

Examining the phenomenon of educational utilitarianism: A perspective of instrumental rationality [J]

[13] Zhang X F. Examining the phenomenon of educational utilitarianism: A perspective of instrumental rationality [J]. Research in Educational Development, 2008(21): 26 – 28. DOI: 10.14121/j.cnki.1008- 3855.2008.21.005. (in Chinese)

work page doi:10.14121/j.cnki.1008- 2008
[15]

[14] Ma Y X. Is the letter-grade system versus the 100-point system the criterion for distinguishing quality- oriented education evaluation? — Also on the essence of evaluation system reform [J]. Education Science, 1998(4): 35–37. (in Chinese)

1998
[16]

Learning analytics: The emergence of a discipline [J]

[15] Siemens G. Learning analytics: The emergence of a discipline [J]. American Behavioral Scientist, 2013, 57(10): 1380–1400. https://doi.org/10.1177/0002764213498851

work page doi:10.1177/0002764213498851 2013
[17]

Focus on formative feedback [J]

[16] Shute V J. Focus on formative feedback [J]. Review of Educational Research, 2008, 78(1): 153 – 189. https://doi.org/10.3102/0034654307313795

work page doi:10.3102/0034654307313795 2008
[18]

Analytics 2.0 for precision education [J]

[17] Wu J Y, Yang C C Y, Liao C H, et al. Analytics 2.0 for precision education [J]. Educational Technology & Society, 2021, 24(1): 267–279. https://www.jstor.org/stable/26977872

arXiv 2021
[19]

Research progress on cognitive tracking models in educational big data [J]

[18] Hu X G, Liu F, Bu C Y. Research progress on cognitive tracking models in educational big data [J]. Journal of Computer Research and Development, 2020, 57(12): 2523–2546. (in Chinese)

2020
[20]

Educational data mining and learning analytics: An updated survey [J]

[19] Romero C, Ventura S. Educational data mining and learning analytics: An updated survey [J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2020, 10(3): e1355. https://doi.org/10.1002/widm.1355

work page doi:10.1002/widm.1355 2020
[21]

Learning analytics: Mining the value of educational data in the era of big data [J]

[20] Wei S P. Learning analytics: Mining the value of educational data in the era of big data [J]. Modern Educational Technology, 2013(2): 5–11. (in Chinese)

2013
[22]

Sorting Things Out: Classification and Its Consequences [M]

[21] Bowker G C, Star S L. Sorting Things Out: Classification and Its Consequences [M]. MIT Press, 2000

2000
[23]

Looking beyond learning: Notes towards the critical study of educational technology [J]

[22] Selwyn N. Looking beyond learning: Notes towards the critical study of educational technology [J]. Journal of Computer Assisted Learning, 2010, 26(1): 65–73. https://doi.org/10.1111/j.1365-2729.2009.00338.x

work page doi:10.1111/j.1365-2729.2009.00338.x 2010
[24]

The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences [M]

[23] Kitchin R. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences [M]. Sage, 2014

2014
[25]

Measurement invariance conventions and reporting: The state of the art and future directions for psychological research [J]

[24] Putnick D L, Bornstein M H. Measurement invariance conventions and reporting: The state of the art and future directions for psychological research [J]. Developmental Review, 2016, 41: 71 – 90. https://doi.org/10.1016/j.dr.2016.06.004

work page doi:10.1016/j.dr.2016.06.004 2016
[26]

A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research [J]

[25] Vandenberg R J, Lance C E. A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research [J]. Organizational Research Methods, 2000, 3(1): 4–70. https://doi.org/10.1177/109442810031002

work page doi:10.1177/109442810031002 2000
[27]

Measurement invariance, factor analysis and factorial invariance [J]

[26] Meredith W. Measurement invariance, factor analysis and factorial invariance [J]. Psychometrika, 1993, 58(4): 525–543. https://doi.org/10.1007/BF02294825

work page doi:10.1007/bf02294825 1993
[28]

Statistical Approaches to Measurement Invariance [M]

[27] Millsap R E. Statistical Approaches to Measurement Invariance [M]. Routledge, 2012

2012
[29]

The cost of dichotomising continuous variables [J]

[28] Altman D G, Royston P. The cost of dichotomising continuous variables [J]. BMJ, 2006, 332(7549): 1080. https://doi.org/10.1136/bmj.332.7549.1080

work page doi:10.1136/bmj.332.7549.1080 2006
[30]

Representation in

[29] Hacking I. The looping effects of human kinds [M]// Causal Cognition: A Multidisciplinary Debate. Oxford University Press, 1995: 351–383. https://doi.org/10.1093/acprof:oso/9780198524021.003.0012

work page doi:10.1093/acprof:oso/9780198524021.003.0012 1995
[31]

Statistical Analysis with Missing Data [M]

[30] Little R J A, Rubin D B. Statistical Analysis with Missing Data [M]. 3rd ed. John Wiley & Sons, 2019

2019
[32]

Routledge International Handbook of Ignorance Studies [M]

[31] Gross M, McGoey L, eds. Routledge International Handbook of Ignorance Studies [M]. London: Routledge, 2015

2015
[33]

Institutions and Organizations: Ideas, Interests, and Identities [M]

[32] Scott W R. Institutions and Organizations: Ideas, Interests, and Identities [M]. 4th ed. Sage Publications, 2013

2013
[34]

On the practice of dichotomization of quantitative variables [J]

[33] MacCallum R C, Zhang S, Preacher K J, et al. On the practice of dichotomization of quantitative variables [J]. Psychological Methods, 2002, 7(1): 19–40. DOI: 10.1037/1082-989X.7.1.19

work page doi:10.1037/1082-989x.7.1.19 2002
[35]

Developing the theory of formative assessment [J]

[34] Black P, Wiliam D. Developing the theory of formative assessment [J]. Educational Assessment, Evaluation and Accountability, 2009, 21(1): 5–31. https://doi.org/10.1007/s11092-008-9068-5

work page doi:10.1007/s11092-008-9068-5 2009
[36]

How Institutions Think [M]

[35] Douglas M. How Institutions Think [M]. Syracuse University Press, 1986

1986
[37]

Overall Plan for Deepening the Reform of Educational Evaluation in the New Era [Z]

[36] The CPC Central Committee; The State Council. Overall Plan for Deepening the Reform of Educational Evaluation in the New Era [Z]. 2020-10-13. (in Chinese)

2020
[38]

Guidelines for Evaluating the Quality of Compulsory Education [Z]

[37] Ministry of Education; Organization Department of the CPC Central Committee; Office of the Central Establishment Committee, et al. Guidelines for Evaluating the Quality of Compulsory Education [Z]. 2021- 03-04. (in Chinese)

2021
[39]

Notice on Strengthening Examination Management in Compulsory Education Schools [Z]

[38] General Office of the Ministry of Education. Notice on Strengthening Examination Management in Compulsory Education Schools [Z]. 2021-08-30. http://www.moe.gov.cn/jyb_xwfb/gzdt_gzdt/s5987/202108/t20210831_556381.html. (in Chinese)

2021
[40]

1948 , journal =

[39] Shannon C E. A mathematical theory of communication [J]. The Bell System Technical Journal, 1948, 27(3): 379–423. DOI: 10.1002/j.1538-7305.1948.tb01338.x

work page doi:10.1002/j.1538-7305.1948.tb01338.x 1948
[41]

Silhouettes: A graphical aid to the interpretation and validation of cluster analysis [J]

[40] Rousseeuw P J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis [J]. Journal of Computational and Applied Mathematics, 1987, 20: 53 – 65. https://doi.org/10.1016/0377- 0427(87)90125-7

work page doi:10.1016/0377- 1987
[42]

Jolliffe and Jorge Cadima

[41] Jolliffe I T, Cadima J. Principal component analysis: A review and recent developments [J]. Philosophical Transactions of the Royal Society A, 2016, 374(2065): 20150202. https://doi.org/10.1098/rsta.2015.0202

work page doi:10.1098/rsta.2015.0202 2016
[43]

Some methods of classification and analysis of multivariate observations [C]// Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability

[42] McQueen J B. Some methods of classification and analysis of multivariate observations [C]// Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967: 281–297

1967
[44]

Time-series clustering — A decade review [J]

[43] Aghabozorgi S, Shirkhorshidi A S, Wah T Y. Time-series clustering — A decade review [J]. Information Systems, 2015, 53: 16–38. https://doi.org/10.1016/j.is.2015.04.007

work page doi:10.1016/j.is.2015.04.007 2015
[45]

Guiding Opinions on Strengthening Examination Management and Teaching Quality Evaluation in Compulsory Education Schools [Z]

[44] Education and Sports Bureau of Menghai County. Guiding Opinions on Strengthening Examination Management and Teaching Quality Evaluation in Compulsory Education Schools [Z]. 2025-01-06. (in Chinese)

2025
[46]

A cluster separation measure [J]

[45] Davies D L, Bouldin D W. A cluster separation measure [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979(2): 224–227. DOI: 10.1109/TPAMI.1979.4766909

work page doi:10.1109/tpami.1979.4766909 1979
[47]

Proceedings of the 26th Annual International Conference on Machine Learning , pages =

[46] Vinh N X, Epps J, Bailey J. Information theoretic measures for clusterings comparison: Is a correction for chance necessary? [C]// Proceedings of the 26th Annual International Conference on Machine Learning. 2009: 1073–1080. https://doi.org/10.1145/1553374.1553511

work page doi:10.1145/1553374.1553511 2009
[48]

Ben-David, U

[47] Ben-David S, Von Luxburg U, Pál D. A sober look at clustering stability [C]// International Conference on Computational Learning Theory. Berlin, Heidelberg: Springer, 2006: 5 – 19. https://doi.org/10.1007/11776420_4

work page doi:10.1007/11776420_4 2006
[49]

The unanticipated consequences of purposive social action [J]

[48] Merton R K. The unanticipated consequences of purposive social action [J]. American Sociological Review, 1936, 1(6): 894–904. https://doi.org/10.2307/2084615

work page doi:10.2307/2084615 1936
[50]

Abandoning score worship and returning to the original aspiration of educating people [EB/OL]

[49] Zhang Z Y. Abandoning score worship and returning to the original aspiration of educating people [EB/OL]. (2021-08-31) [2026-06-09]. http://www.moe.gov.cn/jyb_xwfb/moe_2082/2021/2021_zl54/202108/t20210831_556486.html. (in Chinese)

2021
[51]

Double Reduction

[50] Yang Y M. Achievements, problems, and countermeasures of the "Double Reduction" policy implementation [J]. Democracy and Science, 2022(5): 61–64. (in Chinese)

2022
[52]

Double Reduction

[51] Sun D. An assessment of the effectiveness of the "Double Reduction" policy: An empirical study based on internet big data [J]. Future and Development, 2026, 50(1): 128–133. (in Chinese)

2026

[1] General Office of the CPC Central Committee; General Office of the State Council. Opinions on Further Reducing the Homework Burden and Off-Campus Training Burden of Students in Compulsory Education [J]. Gazette of the State Council of the People's Republic of China, 2021(22): 14–19. (in Chinese)

[2] General Office of the Ministry of Education. Notice on Further Strengthening the Management of Daily Examinations in Primary and Secondary Schools [Z]. Department of Basic Education〔2025〕No. 3, 2025-12. http://www.moe.gov.cn/srcsite/A06/s3321/202512/t20251216_1423634.html. (in Chinese)

[3] Department of Education of Anhui Province. Implementation Opinions on Further Standardizing Examination Management in Compulsory Education Schools [Z]. Wan Jiao Ji〔2021〕No. 17, 2021-10-22. https://jyt.ah.gov.cn/ztzl/sjgzzxd/zcwj/40490416.html. (in Chinese)

[4] Xin T. Four key links in deepening the reform of educational evaluation [J]. China Examinations, 2023(10): 1–8. DOI: 10.19360/j.cnki.11-3303/g4.2023.10.001. (in Chinese)

[5] Dougherty J, Kohavi R, Sahami M. Supervised and unsupervised discretization of continuous features [M]// Machine Learning Proceedings 1995. Morgan Kaufmann, 1995: 194–202.

[6] Garcia S, Luengo J, Sáez J A, et al. A survey of discretization techniques: Taxonomy and empirical analysis in supervised learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2012, 25(4): 734–750. DOI: 10.1109/TKDE.2012.35

[7] Kaufman L, Rousseeuw P J. Finding Groups in Data: An Introduction to Cluster Analysis [M]. John Wiley & Sons, 2009.

[8] Borg I, Groenen P J F. Modern Multidimensional Scaling: Theory and Applications [M]. New York, NY: Springer New York, 2005.

[9] Johnson R A, Wichern D W. Applied Multivariate Statistical Analysis [M]. 2002. https://doi.org/10.1007/978-3-031-63833-6

[10] Xiang R F, Bai B, Liu S X. A study on the identification of gifted and talented students: A case study of the curriculum experimental class at Wenlai Junior High School in Shanghai [J]. Research in Educational Development, 2016, 36(2): 49–53. (in Chinese)

[11] Cheng J J, Zhou X J. The implications of "implementing letter-grade evaluation for examinations" in compulsory education [J]. Journal of Gannan Normal University, 2023, 44(2): 94–100. DOI: 10.13698/j.cnki.cn36-1346/c.2023.02.016. (in Chinese)

[12] Guangming Online Commentator. The replacement of this "ruler" is of great significance [N/OL]. Guangming Online, 2025-12-19. http://views.ce.cn/view/ent/202512/t20251219_2651262.shtml. (in Chinese)

[13] Zhang X F. Examining the phenomenon of educational utilitarianism: A perspective of instrumental rationality [J]. Research in Educational Development, 2008(21): 26–28. DOI: 10.14121/j.cnki.1008-3855.2008.21.005. (in Chinese)

[14] Ma Y X. Is the letter-grade system versus the 100-point system the criterion for distinguishing quality-oriented education evaluation? Also on the essence of evaluation system reform [J]. Education Science, 1998(4): 35–37. (in Chinese)

[15] Siemens G. Learning analytics: The emergence of a discipline [J]. American Behavioral Scientist, 2013, 57(10): 1380–1400. https://doi.org/10.1177/0002764213498851

[16] Shute V J. Focus on formative feedback [J]. Review of Educational Research, 2008, 78(1): 153–189. https://doi.org/10.3102/0034654307313795

[17] Wu J Y, Yang C C Y, Liao C H, et al. Analytics 2.0 for precision education [J]. Educational Technology & Society, 2021, 24(1): 267–279. https://www.jstor.org/stable/26977872

[18] Hu X G, Liu F, Bu C Y. Research progress on cognitive tracking models in educational big data [J]. Journal of Computer Research and Development, 2020, 57(12): 2523–2546. (in Chinese)

[19] Romero C, Ventura S. Educational data mining and learning analytics: An updated survey [J]. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2020, 10(3): e1355. https://doi.org/10.1002/widm.1355

[20] Wei S P. Learning analytics: Mining the value of educational data in the era of big data [J]. Modern Educational Technology, 2013(2): 5–11. (in Chinese)

[21] Bowker G C, Star S L. Sorting Things Out: Classification and Its Consequences [M]. MIT Press, 2000.

[22] Selwyn N. Looking beyond learning: Notes towards the critical study of educational technology [J]. Journal of Computer Assisted Learning, 2010, 26(1): 65–73. https://doi.org/10.1111/j.1365-2729.2009.00338.x

[23] Kitchin R. The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences [M]. Sage, 2014.

[24] Putnick D L, Bornstein M H. Measurement invariance conventions and reporting: The state of the art and future directions for psychological research [J]. Developmental Review, 2016, 41: 71–90. https://doi.org/10.1016/j.dr.2016.06.004

[25] Vandenberg R J, Lance C E. A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research [J]. Organizational Research Methods, 2000, 3(1): 4–70. https://doi.org/10.1177/109442810031002

[26] Meredith W. Measurement invariance, factor analysis and factorial invariance [J]. Psychometrika, 1993, 58(4): 525–543. https://doi.org/10.1007/BF02294825

[27] Millsap R E. Statistical Approaches to Measurement Invariance [M]. Routledge, 2012.

[28] Altman D G, Royston P. The cost of dichotomising continuous variables [J]. BMJ, 2006, 332(7549): 1080. https://doi.org/10.1136/bmj.332.7549.1080

[29] Hacking I. The looping effects of human kinds [M]// Causal Cognition: A Multidisciplinary Debate. Oxford University Press, 1995: 351–383. https://doi.org/10.1093/acprof:oso/9780198524021.003.0012

[30] Little R J A, Rubin D B. Statistical Analysis with Missing Data [M]. 3rd ed. John Wiley & Sons, 2019.

[31] Gross M, McGoey L, eds. Routledge International Handbook of Ignorance Studies [M]. London: Routledge, 2015.

[32] Scott W R. Institutions and Organizations: Ideas, Interests, and Identities [M]. 4th ed. Sage Publications, 2013.

[33] MacCallum R C, Zhang S, Preacher K J, et al. On the practice of dichotomization of quantitative variables [J]. Psychological Methods, 2002, 7(1): 19–40. DOI: 10.1037/1082-989X.7.1.19

[34] Black P, Wiliam D. Developing the theory of formative assessment [J]. Educational Assessment, Evaluation and Accountability, 2009, 21(1): 5–31. https://doi.org/10.1007/s11092-008-9068-5

[35] Douglas M. How Institutions Think [M]. Syracuse University Press, 1986.

[36] The CPC Central Committee; The State Council. Overall Plan for Deepening the Reform of Educational Evaluation in the New Era [Z]. 2020-10-13. (in Chinese)

[37] Ministry of Education; Organization Department of the CPC Central Committee; Office of the Central Establishment Committee, et al. Guidelines for Evaluating the Quality of Compulsory Education [Z]. 2021-03-04. (in Chinese)

[38] General Office of the Ministry of Education. Notice on Strengthening Examination Management in Compulsory Education Schools [Z]. 2021-08-30. http://www.moe.gov.cn/jyb_xwfb/gzdt_gzdt/s5987/202108/t20210831_556381.html. (in Chinese)

[39] Shannon C E. A mathematical theory of communication [J]. The Bell System Technical Journal, 1948, 27(3): 379–423. DOI: 10.1002/j.1538-7305.1948.tb01338.x

[40] Rousseeuw P J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis [J]. Journal of Computational and Applied Mathematics, 1987, 20: 53–65. https://doi.org/10.1016/0377-0427(87)90125-7

[41] Jolliffe I T, Cadima J. Principal component analysis: A review and recent developments [J]. Philosophical Transactions of the Royal Society A, 2016, 374(2065): 20150202. https://doi.org/10.1098/rsta.2015.0202

[42] MacQueen J B. Some methods for classification and analysis of multivariate observations [C]// Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability. 1967: 281–297.

[43] Aghabozorgi S, Shirkhorshidi A S, Wah T Y. Time-series clustering – A decade review [J]. Information Systems, 2015, 53: 16–38. https://doi.org/10.1016/j.is.2015.04.007

[44] Education and Sports Bureau of Menghai County. Guiding Opinions on Strengthening Examination Management and Teaching Quality Evaluation in Compulsory Education Schools [Z]. 2025-01-06. (in Chinese)

[45] Davies D L, Bouldin D W. A cluster separation measure [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979(2): 224–227. DOI: 10.1109/TPAMI.1979.4766909

[46] Vinh N X, Epps J, Bailey J. Information theoretic measures for clusterings comparison: Is a correction for chance necessary? [C]// Proceedings of the 26th Annual International Conference on Machine Learning. 2009: 1073–1080. https://doi.org/10.1145/1553374.1553511

[47] Ben-David S, Von Luxburg U, Pál D. A sober look at clustering stability [C]// International Conference on Computational Learning Theory. Berlin, Heidelberg: Springer, 2006: 5–19. https://doi.org/10.1007/11776420_4

[48] Merton R K. The unanticipated consequences of purposive social action [J]. American Sociological Review, 1936, 1(6): 894–904. https://doi.org/10.2307/2084615

[49] Zhang Z Y. Abandoning score worship and returning to the original aspiration of educating people [EB/OL]. (2021-08-31) [2026-06-09]. http://www.moe.gov.cn/jyb_xwfb/moe_2082/2021/2021_zl54/202108/t20210831_556486.html. (in Chinese)

[50] Yang Y M. Achievements, problems, and countermeasures of the "Double Reduction" policy implementation [J]. Democracy and Science, 2022(5): 61–64. (in Chinese)

[51] Sun D. An assessment of the effectiveness of the "Double Reduction" policy: An empirical study based on internet big data [J]. Future and Development, 2026, 50(1): 128–133. (in Chinese)