DS-1000: A natural and reliable benchmark for data science code generation

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

representative citing papers

KernelBench: Can LLMs Write Efficient GPU Kernels?

cs.LG · 2025-02-14 · accept · novelty 7.0

KernelBench shows that even the best current LLMs generate correct and faster-than-baseline GPU kernels in fewer than 20 percent of realistic ML workloads.

Compass: SLO-aware Query Planner for Compound AI Serving at Scale

cs.DB · 2025-04-23 · unverdicted · novelty 6.0

Compass decomposes multi-query multi-SLO planning for compound AI serving, exploits plan similarities, uses selective profiling, and applies bipartite matching at runtime to deliver 2.4-5.1x higher goodput and 3.8-4.5x lower costs.

Business Utility of Large Language Models as Exploratory Data Analysis Agents

cs.CY · 2026-05-08 · unverdicted · novelty 5.0

Evaluation of 15 LLM configurations across four conditions in a supply chain EDA benchmark finds that most lack sufficient repeatability for autonomous deployment, with GPT-5.4 at extra-high reasoning effort scoring highest on both mean score (0.8748) and the proposed Business Utility (0.6952).

StarCoder: may the source be with you!

cs.CL · 2023-05-09 · accept · novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

Trading Human Curation for Synthetic Augmentation in RLVR

cs.LG · 2026-06-02 · unverdicted · novelty 4.0

Gated synthetic augmentations can substitute for additional human-authored RLVR tasks at a cost-adjusted trade rate of 1.4x-11.6x while retaining held-out generalization on ten benchmarks spanning code, instruction following, reasoning, and agentic function calling.
