Title resolution pending
4 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (4)
citing papers explorer
- Evaluating LLMs on Large-Scale Graph Property Estimation via Random Walks
  EstGraph benchmark evaluates LLMs on estimating properties of very large graphs from random-walk samples that fit in context limits.
- Long-Term Embeddings for Balanced Personalization
  Long-Term Embeddings anchor sequential recommendation models to fixed content-based item representations to capture stable preferences and ensure version compatibility, resulting in uplifts in user engagement and financial metrics.
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
  RetrievalAttention approximates full attention in long-context LLMs by retrieving relevant KV vectors from CPU-based ANNS indexes with an attention-aware algorithm, achieving near-full accuracy while accessing only 1-3% of the data.
- Knowing When to Defer: Selective Prediction for Responsible Knowledge Tracing
  MC-Dropout uncertainty in DKT, SAKT and AKT models allows targeted abstention that raises accuracy 2.3-3.0 points and captures 77-90% architecture-specific epistemic signal unexplained by IRT or psychometric factors.