Adding product context retrieval to AI coding agents raises decision compliance from 46% to 95% on a new benchmark of 8 tasks with 41 weighted decision points.
hub
The Impact of AI on Developer Productivity: Evidence from GitHub Copilot
33 Pith papers cite this work. Polarity classification is still indexing.
abstract
Generative AI tools hold promise to increase human productivity. This paper presents results from a controlled experiment with GitHub Copilot, an AI pair programmer. Recruited software developers were asked to implement an HTTP server in JavaScript as quickly as possible. The treatment group, with access to the AI pair programmer, completed the task 55.8% faster than the control group. Observed heterogenous effects show promise for AI pair programmers to help people transition into software development careers.
hub tools
claims ledger
- abstract Generative AI tools hold promise to increase human productivity. This paper presents results from a controlled experiment with GitHub Copilot, an AI pair programmer. Recruited software developers were asked to implement an HTTP server in JavaScript as quickly as possible. The treatment group, with access to the AI pair programmer, completed the task 55.8% faster than the control group. Observed heterogenous effects show promise for AI pair programmers to help people transition into software development careers.
co-cited works
representative citing papers
A network analysis of software mentions in 1.3 million papers identifies 520 tools in eight communities and shows disciplines maintain distinct, stable tool portfolios that are crystallizing toward common sets.
The Mise en Place methodology uses contextual grounding, collaborative specification, and task decomposition to prepare AI agents for coding tasks, demonstrated in a hackathon where two hours of prep enabled rapid parallel development of a full-stack platform.
Generative AI boosted solo entrepreneurial entry on Product Hunt after ChatGPT but teams still dominate the top quality tiers.
SynConfRoute routes code completions using syntax validation and token confidence, improving pass@1 by up to 31% on hard tasks and reducing accelerator usage by 58% versus always using the largest model.
Freelancers use generative AI to support exploratory skill acquisition but not as their main resource due to reliability issues, leading to a shift toward survival-oriented upskilling and the emergence of invisible competencies that lack market validation.
SpecValidator detects lexical vagueness, under-specification, and syntax-formatting defects in LLM code-generation prompts with F1 0.804, outperforming GPT-5-mini and Claude Sonnet 4, and shows that under-specification is the most damaging defect type while richer benchmarks are more resilient.
A game-theoretic model shows that individually rational adoption of generative AI causes model collapse that reduces collective social welfare for important tasks, with habit formation creating spillovers from low-stakes to high-value domains.
BONSAI introduces a four-layer architecture and four-phase workflow for human-AI co-development of visual analytics applications, shown in case studies to enable efficient novel tool creation and reconstruction from paper descriptions.
Co-locating tests with implementation code yields substantially higher preservation and correctness in foundation-model-generated programs than separated test syntax.
LLMs produce executable code only 42.55% of the time under API evolution without full documentation, improving to 66.36% with structured docs and by 11% more with reasoning strategies, yet outdated patterns persist.
REAgent improves LLM patch generation for software issues by 17.4% on average through automated construction, quality checking, and iterative refinement of structured issue-oriented requirements.
StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.
Reverie is a new AI-powered game that reduced stress levels in a pilot study of 20 students while providing excellent user experience and improved cognitive emotion regulation.
Meta-analysis of 23 studies shows moderate productivity gains from GenAI coding assistants (Hedges' g=0.33) but no significant effect on learning (g=0.14).
The Productivity-Reliability Paradox arises because AI code generators produce variable output while developers lack sufficient specification discipline, making governance models focused on specifications the binding constraint rather than model improvements.
Agentic AI systems are shifting software engineering from line-level code generation to delegated repository-scale execution under supervision, with SWE-bench performance rising from 1.96% to 78.4% and productivity gains of 13.6-55.8%.
Among novice programmers using AI code generators, trust did not predict compliance with suggestions, while performance correlated with both compliance and increased subsequent trust.
AI-native software ecosystems exhibit emergent behaviors best explained by complex adaptive systems theory, requiring new ecosystem-level monitoring and seven testable propositions that may extend or replace Lehman's laws.
Agentic Consensus replaces code as the main artifact with a typed property graph world model that maintains commitments and evidence through synchronization operators, shifting evaluation to alignment fidelity and consensus entropy.
Sema Code decouples AI coding agents into a programmable npm library with eight mechanisms for isolation, queuing, compression, scheduling, permissions, and integration.
The AI Codebase Maturity Model defines six sequential levels of AI-driven development based on feedback loop topologies, validated by experience reports showing 5x PR and 37x issue throughput gains from level 2 to level 6.
Collaborative ML reproducibility requires socio-technical interactional support beyond artifacts, demonstrated via a clinical deployment and addressed by a proposed two-layer system with an AI semantic interface.
EcoAssist embeds energy estimation and optimization into AI-assisted frontend coding, reducing website energy use by 13-16% in benchmarks while preserving developer productivity.