
Beyond correctness: Benchmarking multi-dimensional code generation for large language models

6 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: background 2


fields: cs.SE 4, cs.HC 2

years: 2026 4, 2025 2

roles: background 2

polarities: background 1, support 1


representative citing papers

MetaLint: Easy-to-Hard Generalization for Code Linting

cs.SE · 2025-07-15 · unverdicted · novelty 7.0

MetaLint uses meta-learning to let models generalize from easy synthetic linting data to hard human-curated best practices, yielding large F-score gains on a new PEP-inspired benchmark.

Subjective Code Preferences in Experts and Large Language Models

cs.HC · 2026-05-24 · unverdicted · novelty 6.0

LLMs frequently reverse their stated coding preferences when shown actual code instead of descriptions, show positional bias, and produce more polarized ratings than human experts on complexity, commenting, modularity, and readability.
