pith. sign in

arxiv: 2206.06587 · v1 · pith:BV5GA3ZOnew · submitted 2022-06-14 · 💻 cs.LG · cs.AI

Learning Enhanced Representations for Tabular Data via Neighborhood Propagation

classification 💻 cs.LG cs.AI
keywords datapropagationtabularinstancemodelpredictiontasksenhance
0
0 comments X
read the original abstract

Prediction over tabular data is an essential and fundamental problem in many important downstream tasks. However, existing methods either take a data instance of the table independently as input or do not fully utilize the multi-rows features and labels to directly change and enhance the target data representations. In this paper, we propose to 1) construct a hypergraph from relevant data instance retrieval to model the cross-row and cross-column patterns of those instances, and 2) perform message Propagation to Enhance the target data instance representation for Tabular prediction tasks. Specifically, our specially-designed message propagation step benefits from 1) fusion of label and features during propagation, and 2) locality-aware high-order feature interactions. Experiments on two important tabular data prediction tasks validate the superiority of the proposed PET model against other baselines. Additionally, we demonstrate the effectiveness of the model components and the feature enhancement ability of PET via various ablation studies and visualizations. The code is included in https://github.com/KounianhuaDu/PET.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. TabICL: A Tabular Foundation Model for In-Context Learning on Large Data

    cs.LG 2025-02 unverdicted novelty 6.0

    TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datase...