pith. machine review for the scientific record.

archive

Every paper Pith has read.

221 papers in cs.DB · page 1

  1. cs.DS 2026-05-14 reviewed
    Hybrid sketches match best space bounds for dynamic graph connectivity

    Hybrid Sketching Methods for Dynamic Connectivity on Sparse Graphs

    David Tench +4

  2. cs.DB 2026-05-14 reviewed
    Retrieval augments schema graphs for relational database predictions

    From Schema to Signal: Retrieval-Augmented Modeling for Relational Data Analytics

    Beng Chin Ooi +5

  3. cs.AR 2026-05-13 reviewed
    FPGA lock agents boost OLTP throughput 51X over CPUs

    FPGA-Accelerated Lock Management and Transaction Processing: Architecture, Optimization, and Design Space Exploration

    Gustavo Alonso +1

  4. cs.LO 2026-05-13 reviewed
    EL_⊥^≼ extends DL-Lite with reachability in NL

    A Horn extension of DL-Lite with NL data complexity

    Bartosz Jan Bednarczyk +2

  5. cs.DL 2026-05-13 reviewed
    Graph links 200k research repos to papers and artifacts

    SemRepo: A Knowledge Graph for Research Software and Its Scholarly Ecosystem

    Abdul Rafay +3

  6. cs.DB 2026-05-13 reviewed
    Benchmark shows top multimodal models lag on e-commerce

    OxyEcomBench: Benchmarking Multimodal Foundation Models across E-Commerce Ecosystems

    Bing Bai +7

  7. cs.CV 2026-05-12 reviewed
    3D primitives in code raise VLM spatial scores up to 17 percent

    3D Primitives are a Spatial Language for VLMs

    Alejandro Mottini +10

  8. eess.SP 2026-05-12 reviewed
    Commercial 5G dataset aids AI handover and beam management

    Enabling AI-Native Mobility in 6G: A Real-World Dataset for Handover, Beam Management, and Timing Advance

    Deepa M.R +3

  9. cs.DB 2026-05-12 reviewed
    Chase termination undecidable even for decidable queries

    Will My Favorite Chases Terminate if Evaluating Conjunctive Queries Does? One Does Not Simply Decide This

    Lucas Larroque +1

  10. cs.DB 2026-05-12 reviewed
    Separating instances pick correct NL2SQL candidate

    Data-aware candidate selection in NL2SQL translation via small separating instances

    Alexander Shulgin +2

  11. cs.IR 2026-05-12 reviewed
    BatchBench framework equalizes autoscaling policy tests

    BatchBench: Toward a Workload-Aware Benchmark for Autoscaling Policies in Big Data Batch Processing -- A Proposed Framework

    Siri Chandana Sirigiri +1

  12. cs.DB 2026-05-12 reviewed
    Knowledge graphs source optimization problems via queries

    Graph-Grounded Optimization: Rao-Family Metaheuristics, Classical OR, and SLM-Driven Formulation over Knowledge Graphs

    Madhulatha Mandarapu (samyama.ai) +1

  13. cs.DB 2026-05-12 reviewed
    Graph queries for optimization reveal hidden data flaws

    Graph-Grounded Optimization: Rao-Family Metaheuristics, Classical OR, and SLM-Driven Formulation over Knowledge Graphs

    Madhulatha Mandarapu (samyama.ai) +1

  14. cs.DB 2026-05-12 reviewed
    Replicas detect and repair database corruption without stopping work

    PROTECT-DB: Protecting Data using Replicated State Machines: Efficient Corruption Detection & Recovery

    Anant Utgikar +1

  15. cs.AI 2026-05-12 reviewed
    LLMs cannot always be correct

    A CAP-like Trilemma for Large Language Models: Correctness, Non-bias, and Utility under Semantic Underdetermination

    Vinu Ellampallil Venugopal

  16. cs.LG 2026-05-12 reviewed
    Benchmark with 40 epidemic datasets enables fair model comparisons

    EpiCastBench: Datasets and Benchmarks for Multivariate Epidemic Forecasting

    Danny D'Agostino +4

  17. cs.LG 2026-05-12 reviewed
    Relational signals lift membership inference on tabular diffusion models

    FERMI: Exploiting Relations for Membership Inference Against Tabular Diffusion Models

    Abtin Mahyar +3

  18. cs.DB 2026-05-11 reviewed
    SHACL-DS validates named graphs faster than standard SHACL

    Keeping track of errors: A study of SHACL-DS for RDF dataset validation on the ERA RINF Knowledge Graph

    Christophe Debruyne +2

  19. cs.DB 2026-05-11 reviewed
    Single GPU kernel fuses IO and query steps for faster analytics

    Data Path Fusion in GPU for Analytical Query Processing

    Kazuo Goda +1

  20. cs.DB 2026-05-11 reviewed
    Text2Cypher must reason across multiple graph databases

    Toward Multi-Database Query Reasoning for Text2Cypher

    Makbule Gulcin Ozsoy

  21. cs.AI 2026-05-11 reviewed
    Autonomous objects resolve over half of scientific data conflicts

    Autonomous FAIR Digital Objects: From Passive Assertions to Active Knowledge

    Christoph Lange +3

  22. cs.DB 2026-05-11 reviewed
    Cloud GPUs speed graph index construction by 9x at 6x lower cost

    ScaleGANN: Accelerate Large-Scale ANN Indexing by Cost-effective Cloud GPUs

    Boon Thau Loo +7

  23. cs.IR 2026-05-11 reviewed
    Graph of codecs compresses data smaller and faster

    OpenZL: Using Graphs to Compress Smaller and Faster

    Danielle Rozenblit +12

  24. cs.CL 2026-05-10 reviewed
    Home activity benchmark shows AI question-answering gaps

    HOME-KGQA: A Benchmark Dataset for Multimodal Knowledge Graph Question Answering on Household Daily Activities

    Aoi Ohta +7

  25. cs.DB 2026-05-09 reviewed
    Krone decomposes logs into entity-action-status units for modular anomaly detection

    Detect, Localize, and Explain: Interactive Hierarchical Log Anomaly Analytics with LLM Augmentation

    Athanasios Tassiadamis +7

  26. cs.AI 2026-05-09 reviewed
    One-to-one matching boosts ontology alignment precision

    Open Ontologies: Tool-Augmented Ontology Engineering with Stable Matching Alignment

    Fabio Rovai

  27. cs.DB 2026-05-09 reviewed
    Personalized privacy cuts infinite stream estimation error by 53.6%

    Personalized w-Event Privacy for Infinite Stream Estimation

    Kenli Li +6

  28. cs.AI 2026-05-09 reviewed
    Diagnosis consistency links to actual causality for AI explanations

    Reconciling Consistency-Based Diagnosis with Actual-Causality-Based Explanations

    Leopoldo Bertossi

  29. cs.DB 2026-05-09 reviewed
    LLMs fall short on natural language data prep tasks

    PrepBench: How Far Are We from Natural-Language-Driven Data Preparation?

    Guoliang Li +3

  30. cs.DB 2026-05-09 reviewed
    Elastic scheduling meets stream deadlines at lowest cost

    Elastic Scheduling of Intermittent Query Processing in a Cluster Environment

    Saranya Chandrasekaran +1

  31. cs.DB 2026-05-08 reviewed
    Heavy-light partitioning maintains arbitrary joins under updates

    Maintaining Queries under Updates Using Heavy-Light Partitioning of the Input Relations

    Ahmet Kara +4

  32. cs.DB 2026-05-07 reviewed
    SkipDisk hits 63% HNSW latency at 20% memory

    Low-Latency Out-of-Core ANN Search in High-Dimensional Space

    Bin Wang +3

  33. cs.DB 2026-05-07 reviewed
    Query rewrite rules written once deploy across database engines

    An Extensible and Verifiable Language for Query Rewrite Rules

    Alvin Cheung +5

  34. cs.DB 2026-05-07 reviewed
    Every query reduces to Filter

    Anatomy of a Query: W5H Dimensions and FAR Patterns for Text-to-SQL Evaluation

    Eduardo Valverde +2

  35. cond-mat.mtrl-sci 2026-05-06 reviewed
    Diversity selection builds versatile materials datasets

    Building informative materials datasets beyond targeted objectives

    Adji Bousso Dieng +8

  36. cs.DB 2026-05-06 reviewed
    Caching cuts redundant CBO calls in cost-based query rewrite

    Efficient Cost-Based Rewrite in a Bottom-Up Optimizer

    Chong Chen +6

  37. cs.LG 2026-05-06 reviewed
    Only solution concentration ranks consistently across electrospinning ML models

    Cross-Model Consistency of Feature Importance in Electrospinning: Separating Robust from Model-Dependent Features

    Ferenc Ender +2

  38. cs.LG 2026-05-06 reviewed
    Concentration alone has zero rank variability in electrospinning models

    Cross-Model Consistency of Feature Importance in Electrospinning: Separating Robust from Model-Dependent Features

    Ferenc Ender +2

  39. cs.DB 2026-05-06 reviewed
    Hierarchical agents clean messy time series without ground truth

    AegisTS: A Hierarchical Agent System with Reinforcement Learning for Multivariate Time Series Data Cleaning

    Lu Chen +4

  40. cs.LG 2026-05-05 reviewed
    Fused soil dataset pretrains representations matching real processes

    LUCAS-MEGA: A Large-Scale Multimodal Dataset for Representation Learning in Soil-Environment Systems

    Kuangdai Leng +3

  41. cs.LG 2026-05-05 reviewed
    Fused soil dataset pretrains model to capture real processes

    LUCAS-MEGA: A Large-Scale Multimodal Dataset for Representation Learning in Soil-Environment Systems

    Kuangdai Leng +3

  42. cs.DB 2026-05-05 reviewed
    Database repairs match preferred extensions in SETAFs

    Inconsistent Databases and Argumentation Frameworks with Collective Attacks

    Axel-Cyrille Ngonga Ngomo +3

  43. cs.DB 2026-05-05 reviewed
    ConRAD introduces a framework that applies conformal risk control inside neural graph…

    ConRAD: Conformal Risk-Aware Neural Databases

    Fabian Zeiher +6

  44. cs.DB 2026-05-05 reviewed
    Sliced kd-trees speed up multi-dimensional queries in memory

    In-memory Multidimensional Indexing Using the skd-tree

    Achilleas Michalopoulos +2

  45. cs.AI 2026-05-05 reviewed
    AI agents average 45 percent on workspace tasks with 20k files

    Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

    Chunwei Liu +19

  46. cs.AI 2026-05-05 reviewed
    AI agents top out at 60% on workspace file dependency tasks

    Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

    Chunwei Liu +19

  47. cs.AI 2026-05-05 reviewed
    Agents reach 68.7% on workspace tasks with big file sets

    Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

    Chunwei Liu +19

  48. cs.AI 2026-05-05 reviewed
    Agents hit 43% average on realistic workspace tasks

    Workspace-Bench 1.0: Benchmarking AI Agents on Workspace Tasks with Large-Scale File Dependencies

    Chunwei Liu +19

  49. cs.DB 2026-05-05 reviewed
    3B model hits 85% Text-to-SQL accuracy using fine-grained rewards

    FINER-SQL: Boosting Small Language Models for Text-to-SQL

    Hongzhi Yin +6

  50. cs.SE 2026-05-05 reviewed
    AI models recover semantics from legacy database code

    Semantic Reverse Engineering Legacy Software Applications with ChatGPT, Gemini AI, and Claude AI

    Christian Mancas +1