archive
Every paper Pith has read.
718 papers in cs.CY · page 1
-
Standard rules understaff SNAP call centers by ignoring redials
Due Process on Hold: A Queueing Framework for Improving Access in SNAP
-
This paper uses data from 26 million U.S
Tradeoffs are Domain Dependent: Improving Accuracy and Fairness in Property Tax Assessments
-
ViMU benchmark tests video AI on hidden meanings
ViMU: Benchmarking Video Metaphorical Understanding
-
4B genome agent matches larger LLMs on microbial trait prediction
GGBound: A Genome-Grounded Agent for Microbial Life-Boundary Prediction
-
Moderate starters gain most in AI agent workshops
Computational Thinking Development in AI Agent Creation: A Mixed-Methods Study
-
Agent harnesses allow unsafe actions even with correct final outputs
Auditing Agent Harness Safety
-
AI benchmarks redefine capabilities to fit their own rules
The Evaluation Trap: Benchmark Design as Theoretical Commitment
-
Safety refusals rise with Korean language but drop with Korean context
ROK-FORTRESS: Measuring the Effect of Geopolitical Transcreation for National Security and Public Safety
-
Generative models automate social doing
Synthetic Sociality: How Generative Models Privatize the Social Fabric
-
Formal checks can keep AI legal reasoning inside the text
Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning
-
GraphRAG retrieval aligns LLM agents with social values
From Descriptive to Prescriptive: Uncover the Social Value Alignment of LLM-based Agents
-
AI Overviews appear in 14% of searches with 11% unsupported claims
Measuring Google AI Overviews: Activation, Source Quality, Claim Fidelity, and Publisher Impact
-
Original content in election tweets on X rose from 59 percent in 2016 to 93 percent in 2024
Amplification to Synthesis: A Comparative Analysis of Cognitive Operations Before and After Generative AI
-
Canary tokens link scrapers to the LLMs they feed
Identifying AI Web Scrapers Using Canary Tokens
-
Fine-tuning plus hierarchical prompts strengthen propaganda detection
Fine-tuning with Hierarchical Prompting for Robust Propaganda Classification Across Annotation Schemas
-
Europe needs a preparedness plan for AGI by 2030-2040
Europe and the Geopolitics of AGI: The Need for a Preparedness Plan
-
Students rate AI slides equal to instructor ones
AI-Generated Slides: Are They Good? Can Students Tell?
-
3C framework links competition and networks to women's computing participation
3C: Competition, Competence, and Collaboration for Women in Computing
-
Bias audits for AI image generators must match use-case risks
Context Matters: Auditing Gender Bias in T2I Generation through Risk-Tiered Use-Case Profiles
-
Aggregation turns watermarking into monitoring
Watermarking Should Be Treated as a Monitoring Primitive
-
Watermarking turns into entity monitoring via output aggregation
Watermarking Should Be Treated as a Monitoring Primitive
-
Chinese tech writing needs separate terms for safety and security
Not All Anquan Is the Same: A Terminological Proposal for Chinese Computer Science and Engineering
-
Use 'anbao' for security, keep 'anquan' for safety in Chinese tech writing
Not All Anquan Is the Same: A Terminological Proposal for Chinese Computer Science and Engineering
-
GenAI flattens L2 writers' voices into uniform English
The Cost of Perfect English: Pragmatic Flattening and the Erasure of Authorial Voice in L2 Writing Supported by GenAI
-
KITE tutor raises simulated student accuracy on algorithm tasks
Retrieval-Augmented Tutoring for Algorithm Tracing and Problem-Solving in AI Education
-
87% of teachers quit AI agent creation weeks after training
An Activity-Theoretical Approach to Teacher Professional Development in Pedagogical AI Agent Design
-
The MIRACLE system uses multiple AI agents to guide students through planning
MIRACLE: Multi-Agent Intelligent Regulation to Advance Collaborative Learning Environment
-
AI-TPACK forms through thinking style and beliefs
Modeling AI-TPACK in Practice: Insights from Teachers' Multi-Agent Workflow Design
-
Clinical AI models passing accuracy tests can fail hidden deployment checks
RISED: A Pre-Deployment Safety Evaluation Framework for Clinical AI Decision-Support Systems
-
Scale separates mechanistic explanation from reproduction in LLM models
Mechanism Plausibility in Generative Agent-Based Modeling
-
Synthetic dataset benchmarks AI for swim coaching
Synthesizing the Expert: A Validated Multimodal Dataset for Trustworthy AI-Assisted Swimming Coaching
-
Feature models cut error 22-33% on student effort forecasts
From Heuristics to Analytics: Forecasting Effort and Progress in Online Learning
-
AI forces new rules for how universities change teaching
A Framework for institutional change in the age of AI
-
LLM simulators fix answers regardless of feedback relevance
Simulating Students or Sycophantic Problem Solving? On Misconception Faithfulness of LLM Simulators
-
Outcome-fair models still reason differently for similar applicants
Do Fair Models Reason Fairly? Counterfactual Explanation Consistency for Procedural Fairness in Credit Decisions
-
Nobody knows the state of the art in geospatial foundation models
No One Knows the State of the Art in Geospatial Foundation Models
-
Multisector moves boost upward mobility for planning alumni
Career Mobility of Planning Alumni in the United States: Evidence from Professional Profile Data using Large Language Models
-
Simulator trains AI agents on utility demand response
Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs
-
LLM political discourse lacks real population variation in crises
The Algorithmic Caricature: Auditing LLM-Generated Political Discourse Across Crisis Events
-
Embedding geometry flags LLM rating disagreements
Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals
-
AI in exams makes judging solutions the new measure of learning
Reimagining Assessment in the Age of Generative AI: Lessons from Open-Book Exams with ChatGPT
-
Culturally responsive outreach builds AI knowledge in Black youth
Early AI Literacy in Culturally Responsive STEM Outreach for Black Youth
-
LLM arbitration cuts delays at signal-free intersections
LISA: Cognitive Arbitration for Signal-Free Autonomous Intersection Management
-
Budget split cuts gender skew in ads without excluding unknowns
Into the Unknown: Accounting for Missing Demographic Data when Mitigating Ad Delivery Skew
-
Same facts produce different conclusions when inference profiles differ
Why Conclusions Diverge from the Same Observations: Formalizing World-Model Non-Identifiability via an Inference
-
Adaptive weights add feature selection to FGW distances
Fused Gromov-Wasserstein Distance with Feature Selection
-
Poetic prompts create separate processing paths that evade LLM safety
Metaphor Is Not All Attention Needs
-
GDPR access requests expose contracts of African content moderators
Auditing African Content Moderators' Working Conditions by Using the European General Data Protection Regulation (GDPR)
-
Polymarket shows single fill-side cluster for all addresses
Fill-Side Non-Retail Trading on Polymarket: An Empirical Study of Behavioral Tiers and Microstructure Signatures Under Quote-Attribution Constraints
-
The paper introduces the Evaluation Differential (ED) as a divergence in AI model…
The Evaluation Differential: When Frontier AI Models Recognise They Are Being Tested