A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
Individual comparisons by ranking methods,
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 8roles
background 2polarities
background 2representative citing papers
A large-scale study of real-world repositories finds that AI-generated code differs from human-written code in complexity, structural traits, defect indicators, and commit-level activity patterns.
LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.
A black-box LLM approach for fault localization in system-level test code that estimates execution traces from failure logs to rank potential faults with reduced inference cost.
Hybrid Bayesian-graph LLM agent reaches competitive performance against large models and achieves 67% win rate against humans in controlled Avalon play, outperforming baselines and human teammates.
RESCORE recovers task-coherent simulations from 40.7% of 500 CDC papers via a three-component LLM agent pipeline and claims a 10X speedup over manual human replication.
A reproducible grid-based pipeline converts Austin e-scooter trips into spatiotemporal demand images; a correlation-plus-error method plus ablation study on UNET selects temporal inputs that cut next-hour MSE by up to 37% and next-24-hour MSE by up to 35% versus simple baselines.
Multi-objective LTR combining clicks, VLM labels, and locale boosting improves relevance and local content visibility across five growth markets.
citing papers explorer
-
STRABLE: Benchmarking Tabular Machine Learning with Strings
A new corpus of 108 mixed string-numeric tables shows that advanced tabular learners with basic string embeddings perform well on most real-world data, while large LLM encoders help on free-text heavy tables.
-
A Large-Scale Empirical Study of AI-Generated Code in Real-World Repositories
A large-scale study of real-world repositories finds that AI-generated code differs from human-written code in complexity, structural traits, defect indicators, and commit-level activity patterns.
-
Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software
LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.
-
Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models
A black-box LLM approach for fault localization in system-level test code that estimates execution traces from failure logs to rank potential faults with reduced inference cost.
-
Bayesian Social Deduction with Graph-Informed Language Models
Hybrid Bayesian-graph LLM agent reaches competitive performance against large models and achieves 67% win rate against humans in controlled Avalon play, outperforming baselines and human teammates.
-
RESCORE: LLM-Driven Simulation Recovery in Control Systems Research Papers
RESCORE recovers task-coherent simulations from 40.7% of 500 CDC papers via a three-component LLM agent pipeline and claims a 10X speedup over manual human replication.
-
A Grid-Based Framework for E-Scooter Demand Representation and Temporal Input Design for Deep Learning: Evidence from Austin, Texas
A reproducible grid-based pipeline converts Austin e-scooter trips into spatiotemporal demand images; a correlation-plus-error method plus ablation study on UNET selects temporal inputs that cut next-hour MSE by up to 37% and next-24-hour MSE by up to 35% versus simple baselines.
-
Localization Boosting for Growth Markets: Mitigating Cross-Locale Behavioral Bias in Learning-to-Rank
Multi-objective LTR combining clicks, VLM labels, and locale boosting improves relevance and local content visibility across five growth markets.