pith. machine review for the scientific record.

arxiv: 1012.2609 · v4 · submitted 2010-12-13 · 💻 cs.LG · cs.AI

Recognition: unknown

Inverse-Category-Frequency based supervised term weighting scheme for text categorization

classification 💻 cs.LG cs.AI
keywords term weighting schemes supervised categorization scheme terms text

Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifiers, and SVMs. The most widely used term weighting scheme in text categorization, tf.idf, originated in the information retrieval (IR) field. The intuition behind idf seems less reasonable for text categorization than for IR. In this paper, we introduce inverse category frequency (icf) into term weighting and propose two novel approaches: tf.icf and an icf-based supervised term weighting scheme. The tf.icf scheme substitutes icf for the idf factor and thus favors terms occurring in fewer categories rather than in fewer documents. The icf-based approach combines icf with relevance frequency (rf) to weight terms in a supervised way. Our cross-classifier and cross-corpus experiments show that the proposed approaches are superior or comparable to six supervised term weighting schemes and three traditional schemes in terms of macro-F1 and micro-F1.
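
The tf.icf idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the formula, so the smoothed log-ratio form `tf * log(1 + |C| / cf(t))` used here is an assumption chosen by analogy with idf, and the function name `tf_icf` and the toy data are hypothetical. Here `cf(t)` is the number of categories in which term `t` occurs and `|C|` is the number of categories.

```python
import math
from collections import Counter, defaultdict

def tf_icf(docs, labels):
    """Return one {term: weight} dict per document.

    docs:   list of token lists
    labels: category label of each document
    Assumed form (not taken from the abstract): tf * log(1 + |C| / cf(t)).
    """
    num_categories = len(set(labels))

    # cf(t): the set of distinct categories in which term t occurs
    cats_by_term = defaultdict(set)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            cats_by_term[term].add(label)

    weights = []
    for tokens in docs:
        tf = Counter(tokens)  # raw term frequency within the document
        weights.append({
            term: count * math.log(1 + num_categories / len(cats_by_term[term]))
            for term, count in tf.items()
        })
    return weights

# Toy corpus: "the" occurs in both categories, "goal" only in "sports".
docs = [["the", "goal", "ball"], ["the", "vote"]]
labels = ["sports", "politics"]
weights = tf_icf(docs, labels)
```

In this toy corpus, "goal" receives a larger weight than "the" in the first document because it occurs in one category rather than two, which is the behavior the abstract attributes to tf.icf.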

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Model-Agnostic Meta Learning for Class Imbalance Adaptation

    cs.CL 2026-04 conditional novelty 5.0

    HAMR combines meta-learning with hardness-aware weighting and neighborhood resampling to improve minority-class performance on imbalanced NLP datasets.