Feature Selection via Mutual Information: New Theoretical Insights
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevancy of a subset of features in predicting the target variable and the redundancy with respect to other variables. However, existing algorithms are mostly heuristic and do not offer any guarantee on the proposed solution. In this paper, we provide novel theoretical results showing that conditional mutual information naturally arises when bounding the ideal regression/classification errors achieved by different subsets of features. Leveraging these insights, we propose a novel stopping condition for backward and forward greedy methods which ensures that the ideal prediction error using the selected feature subset remains bounded by a user-specified threshold. We provide numerical simulations to support our theoretical claims and compare against common heuristic methods.
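The backward greedy idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes discrete features, uses a simple plug-in estimate of conditional mutual information, and treats the hypothetical parameter `delta` as a plain budget on the cumulative estimated information discarded. How such a budget maps to a bound on the ideal prediction error is given in the paper and is not reproduced here. The function names `cond_mutual_info` and `backward_select` are illustrative.

```python
from collections import Counter
from math import log


def cond_mutual_info(xi, y, z):
    """Plug-in estimate of I(xi; y | z) in nats for discrete sequences."""
    n = len(y)
    c_xyz = Counter(zip(xi, y, z))
    c_xz = Counter(zip(xi, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (a, b, c), n_abc in c_xyz.items():
        # p(a,b,c) * log[ p(c) p(a,b,c) / (p(a,c) p(b,c)) ], written with counts
        ratio = n_abc * c_z[c] / (c_xz[(a, c)] * c_yz[(b, c)])
        total += (n_abc / n) * log(ratio)
    return total


def backward_select(X, y, delta):
    """Greedy backward elimination with a cumulative-CMI budget (assumed rule).

    X is a list of rows (tuples of discrete values), y the target sequence.
    Returns the kept column indices and the budget spent.
    """
    kept = list(range(len(X[0])))
    spent = 0.0
    while kept:
        scores = {}
        for j in kept:
            rest = [k for k in kept if k != j]
            # Condition on the joint value of all other remaining features.
            z = [tuple(row[k] for k in rest) for row in X]
            xj = [row[j] for row in X]
            scores[j] = cond_mutual_info(xj, y, z)
        j_min = min(scores, key=scores.get)
        # Stop before the discarded information would exceed the budget.
        if spent + scores[j_min] > delta:
            break
        spent += scores[j_min]
        kept.remove(j_min)
    return kept, spent
```

Conditioning on the joint value of all remaining features makes the plug-in estimate unreliable unless the sample is large relative to the number of distinct feature combinations, so a practical version would likely need a different estimator.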
fields: cs.LG (1)
years: 2025 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Predicting Fetal Birthweight from High Dimensional Data using Advanced Machine Learning
Machine learning pipeline with MICE imputation, tree-based feature selection, and ensemble models predicts birth weight, claiming improved performance on constrained clinical datasets.