LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.LG (2)

years: 2026 (2)

verdicts: unverdicted (2)

representative citing papers

Majority Voting for Code Generation

cs.LG · 2026-04-17 · unverdicted · novelty 5.0

Functional Majority Voting selects code by runtime agreement on tests, boosting LiveCodeBench performance and serving as an aggregation method for label-free test-time RL without exceeding base model limits.

citing papers explorer

Showing 2 of 2 citing papers.

  • CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test · cs.LG · 2026-05-22 · unverdicted · none · ref 12

    CoSPlay jointly refines self-generated code and unit tests via bidirectional pass-count signals and consensus selection, raising pass@N and unit-test (UT) accuracy on code benchmarks without ground-truth data.

  • Majority Voting for Code Generation · cs.LG · 2026-04-17 · unverdicted · none · ref 4

    Functional Majority Voting selects code by runtime agreement on tests, boosting LiveCodeBench performance and serving as an aggregation method for label-free test-time RL without exceeding base model limits.
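
The summary of Functional Majority Voting above describes selecting code by runtime agreement on tests. A minimal sketch of that idea, not the paper's implementation, might look like the following. It assumes candidates are plain in-process Python callables and that two candidates "agree" when they produce identical outputs on every test input; the names `run_safely` and `functional_majority_vote` are hypothetical, and a real system would most likely run generated programs in a sandbox with timeouts.

```python
from collections import defaultdict
from typing import Any, Callable, Sequence, Tuple


def run_safely(program: Callable[[Any], Any], test_input: Any) -> str:
    """Run one candidate on one input and return a comparable string.

    Assumption: outputs are compared by their repr, and an exception is
    recorded as a marker instead of aborting the vote.
    """
    try:
        return repr(program(test_input))
    except Exception as exc:
        return f"<error:{type(exc).__name__}>"


def functional_majority_vote(
    candidates: Sequence[Callable[[Any], Any]],
    test_inputs: Sequence[Any],
) -> Tuple[int, int]:
    """Return (index of the chosen candidate, size of its agreement cluster)."""
    if not candidates:
        raise ValueError("need at least one candidate")

    # Group candidates by their behavior signature: the tuple of outputs
    # they produce across all test inputs.
    clusters = defaultdict(list)
    for index, program in enumerate(candidates):
        signature = tuple(run_safely(program, x) for x in test_inputs)
        clusters[signature].append(index)

    # The largest cluster wins; ties go to the cluster containing the
    # earliest candidate. No reference outputs are needed.
    winner = max(clusters.values(), key=lambda members: (len(members), -members[0]))
    return winner[0], len(winner)


if __name__ == "__main__":
    # Three hypothetical candidates for "double the input": two agree, one differs.
    samples = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2]
    print(functional_majority_vote(samples, [0, 1, 2, 3]))  # prints (0, 2)
```

Note that in this sketch candidates that crash with the same exception type on the same inputs also form a cluster; filtering such clusters out before voting is a design choice left open here.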