citation dossier
Pengcheng He, Jianfeng Gao, and Weizhu Chen
why this work matters in Pith
Pith has found this work cited in 18 reviewed papers. Its strongest current cluster is cs.CL (11 papers). The largest review-status bucket among citing papers is UNVERDICTED (16 papers). For highly cited works, this page shows a dossier first and a bounded explorer second; it never tries to render every citing paper at once.
representative citing papers
A framework jointly models annotator-specific NLI labels and explanations using conditioned representations and two explainer architectures, improving predictive performance over baselines.
RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.
Encoder models trained on SEC filings struggle with earnings calls due to domain shift, while LLMs enable open-ended KPI extraction with 79.7% human-verified precision on newly introduced benchmarks.
TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
ADE scales multi-anchor word representations to transformers via Vocabulary Projection, Grouped Positional Encoding, and context-aware reweighting, achieving 98.7% fewer trainable parameters than DeBERTa-v3-base while matching or exceeding it on two text-classification benchmarks.
SHADE adaptively combines coverage and spectral signals to estimate semantic alphabet size from few LLM samples, yielding better performance than baselines in low-sample regimes for alphabet estimation and QA error detection.
AdaLoRA uses SVD-based pruning to allocate the parameter budget for low-rank fine-tuning updates according to per-matrix importance scores, yielding better performance than uniform allocation especially under tight budgets.
Feature-augmented DeBERTa-v3-base with attention-based fusion reaches 85.9% balanced accuracy on the multi-domain M4 benchmark under fixed-threshold evaluation, outperforming zero-shot baselines by up to 7.22 points.
The SHIELD dataset and a distilled DeBERTa-v3 model achieve 0.88 micro precision and 0.86 recall on PHI de-identification while matching teacher performance on structured categories.
MILD reformulates two-stage learning to defer as cost-sensitive learning over the input-expert domain and derives new margin-based losses with guarantees, yielding better performance than baselines on image classification and LLM routing tasks.
ZSG-IAD is a zero-shot multimodal system that uses language-guided two-hop grounding and rule-based reinforcement learning to produce anomaly masks and explainable reports from industrial sensor data.
AGSC combines NLI neutral probabilities for adaptive granularity with GMM semantic clustering to improve uncertainty quantification in long-text LLM generation, claiming SOTA factuality correlation and 60% faster inference.
A cascaded generative system for e-commerce recommendations using theme and keyword generation with teacher-student fine-tuning achieves a 2.7% lift in cart adds per page view.
Safactory integrates three platforms for simulation, data management, and agent evolution to create a unified pipeline for training trustworthy autonomous AI.
A language-adaptive combination of generalist, specialist, and ensemble transformer models achieves 0.796 macro F1 and 0.826 accuracy on multilingual polarization detection across 22 languages.
A heterogeneous ensemble of XLM-RoBERTa-large and mDeBERTa-v3-base with independent task modeling and class weighting is reported as effective for multilingual, multicultural, and multievent online polarization detection.
citing papers explorer
- Proactive Instance Navigation with Comparative Judgment for Ambiguous User Queries
  ProCompNav improves success rate and shortens user responses in ambiguous instance navigation by using comparative binary questions that prune a candidate pool rather than requesting detailed descriptions.
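To make the pruning mechanism concrete, here is a minimal sketch of comparative binary questioning, assuming candidates carry discrete attributes and that `answer_question` stands in for the user; `Candidate`, `best_question`, and `answer_question` are illustrative names, not from the paper.

```python
# Illustrative sketch of comparative pruning: each yes/no question splits the
# candidate pool on one attribute, so the pool shrinks geometrically instead
# of requiring the user to produce a full description up front.
# All names here (Candidate, answer_question) are hypothetical.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    attrs: dict  # e.g. {"color": "red", "on_table": True}

def best_question(pool):
    """Pick the (attribute, value) whose yes/no answer splits the pool most evenly."""
    best, best_balance = None, float("inf")
    for attr in {a for c in pool for a in c.attrs}:
        for val in {c.attrs.get(attr) for c in pool}:
            yes = sum(1 for c in pool if c.attrs.get(attr) == val)
            balance = abs(2 * yes - len(pool))  # 0 = perfect halving
            if 0 < yes < len(pool) and balance < best_balance:
                best, best_balance = (attr, val), balance
    return best

def navigate(pool, answer_question):
    """Prune with binary comparative questions until one candidate remains."""
    while len(pool) > 1:
        q = best_question(pool)
        if q is None:  # candidates indistinguishable by known attributes
            break
        attr, val = q
        if answer_question(attr, val):  # user answers yes/no
            pool = [c for c in pool if c.attrs.get(attr) == val]
        else:
            pool = [c for c in pool if c.attrs.get(attr) != val]
    return pool
```

With near-even splits, a pool of N candidates resolves in roughly log2(N) yes/no answers, which is why user responses stay short.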
- Fine-Grained Perspectives: Modeling Explanations with Annotator-Specific Rationales
  A framework jointly models annotator-specific NLI labels and explanations using conditioned representations and two explainer architectures, improving predictive performance over baselines.
- RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration
  RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
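A minimal sketch of the joint-training idea, assuming per-token hallucination labels are available; the single linear probe, the layer size, and the 0.5 loss weight are illustrative choices, not RAGognizer's actual architecture.

```python
# Sketch: a detection head reads per-token hallucination logits off the LM's
# hidden states, and its loss is added to the generation loss so both
# objectives are trained jointly. Shapes and weighting are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetectionHead(nn.Module):
    """Per-token hallucination logits from the LM's hidden states."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.probe = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states):               # (batch, seq, hidden)
        return self.probe(hidden_states).squeeze(-1)  # (batch, seq)

def joint_loss(lm_logits, target_tokens, det_logits, hallu_labels, alpha=0.5):
    """Generation loss plus a weighted token-level detection loss."""
    lm = F.cross_entropy(lm_logits.flatten(0, 1), target_tokens.flatten())
    det = F.binary_cross_entropy_with_logits(det_logits.flatten(),
                                             hallu_labels.float().flatten())
    return lm + alpha * det
```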
- Dual Triangle Attention: Effective Bidirectional Attention Without Positional Embeddings
  Dual Triangle Attention achieves effective bidirectional attention with built-in positional inductive bias via dual triangular masks, outperforming standard bidirectional attention on position-sensitive tasks and showing strong masked language modeling results with or without positional embeddings.
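The masking idea can be sketched directly; how the two directional passes are combined is a modeling choice in the paper, so the plain averaging below is an assumption.

```python
# Sketch of attention under dual triangular masks: one pass sees only the
# past, the other only the future. Each pass is unidirectional, so order is
# implicit in the mask itself; together they cover both directions without
# positional embeddings. Averaging the two passes is an assumed combination.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_attention(Q, K, V, mask):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = np.where(mask, scores, -1e9)  # block disallowed positions
    return softmax(scores) @ V

T, d = 6, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(T, d)) for _ in range(3))

lower = np.tril(np.ones((T, T), dtype=bool))  # token i attends to j <= i
upper = np.triu(np.ones((T, T), dtype=bool))  # token i attends to j >= i

out = 0.5 * (masked_attention(Q, K, V, lower)
             + masked_attention(Q, K, V, upper))
```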
- Effective Performance Measurement: Challenges and Opportunities in KPI Extraction from Earnings Calls
  Encoder models trained on SEC filings struggle with earnings calls due to domain shift, while LLMs enable open-ended KPI extraction with 79.7% human-verified precision on newly introduced benchmarks.
- TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning
  TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
- ADE: Adaptive Dictionary Embeddings -- Scaling Multi-Anchor Representations to Large Language Models
  ADE scales multi-anchor word representations to transformers via Vocabulary Projection, Grouped Positional Encoding, and context-aware reweighting, achieving 98.7% fewer trainable parameters than DeBERTa-v3-base while matching or exceeding it on two text-classification benchmarks.
- Mind the Unseen Mass: Unmasking LLM Hallucinations via Soft-Hybrid Alphabet Estimation
  SHADE adaptively combines coverage and spectral signals to estimate semantic alphabet size from few LLM samples, yielding better performance than baselines in low-sample regimes for alphabet estimation and QA error detection.
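For intuition about the estimation problem, here is a classic coverage-only estimator (Chao1) over semantic classes; SHADE's soft-hybrid estimator, which also uses spectral signals, is not reproduced here.

```python
# Chao1 species-richness estimation, shown only to illustrate the problem of
# estimating how many distinct semantic answers ("the alphabet") exist when
# only a few samples are available. This is a stand-in, not SHADE's method.
from collections import Counter

def chao1(class_labels):
    """Estimate semantic alphabet size from sampled class labels."""
    counts = Counter(class_labels)
    observed = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)  # classes seen once
    f2 = sum(1 for c in counts.values() if c == 2)  # classes seen twice
    if f2 == 0:
        return observed + f1 * (f1 - 1) / 2  # bias-corrected variant
    return observed + f1 * f1 / (2 * f2)

# e.g. 8 LLM answers already grouped into semantic equivalence classes
print(chao1(["a", "a", "b", "c", "c", "d", "e", "e"]))  # > 5 observed classes
```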
- AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
  AdaLoRA uses SVD-based pruning to allocate the parameter budget for low-rank fine-tuning updates according to per-matrix importance scores, yielding better performance than uniform allocation especially under tight budgets.
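A sketch of the budget-allocation step, using |lambda| as a stand-in importance score; AdaLoRA itself scores each singular triplet with a smoothed gradient-sensitivity measure.

```python
# Sketch of AdaLoRA-style global budget allocation: each adapted matrix holds
# rank-1 "triplets" (p_i, lambda_i, q_i); triplets are ranked by importance
# across ALL matrices and the lowest-scoring ones are masked out, so the most
# important matrices end up with the highest effective rank.
import numpy as np

def allocate_budget(lambdas, budget):
    """lambdas: dict matrix_name -> 1-D array of singular values.
    Returns boolean masks keeping the `budget` most important triplets overall."""
    scored = [(abs(v), name, i)
              for name, vec in lambdas.items()
              for i, v in enumerate(vec)]
    keep = {(name, i) for _, name, i in sorted(scored, reverse=True)[:budget]}
    return {name: np.array([(name, i) in keep for i in range(len(vec))])
            for name, vec in lambdas.items()}

lambdas = {"attn.q": np.array([0.9, 0.4, 0.05]),
           "attn.v": np.array([1.2, 0.7, 0.3]),
           "ffn.up": np.array([0.2, 0.1, 0.02])}
masks = allocate_budget(lambdas, budget=4)
# attn.v keeps more rank than ffn.up under the tight total budget
```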
- Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators
  Feature-augmented DeBERTa-v3-base with attention-based fusion reaches 85.9% balanced accuracy on the multi-domain M4 benchmark under fixed-threshold evaluation, outperforming zero-shot baselines by up to 7.22 points.
- SHIELD: A Diverse Clinical Note Dataset and Distilled Small Language Models for Enterprise-Scale De-identification
  The SHIELD dataset and a distilled DeBERTa-v3 model achieve 0.88 micro precision and 0.86 recall on PHI de-identification while matching teacher performance on structured categories.
- Optimized Deferral for Imbalanced Settings
  MILD reformulates two-stage learning to defer as cost-sensitive learning over the input-expert domain and derives new margin-based losses with guarantees, yielding better performance than baselines on image classification and LLM routing tasks.
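The cost-sensitive view MILD builds on reduces, at inference time, to a simple routing rule; the margin-based surrogate losses that make this rule learnable are the paper's contribution and are not shown, and `consult_cost` is an illustrative parameter.

```python
# The hard decision rule behind cost-sensitive learning to defer: route each
# input to whichever decision-maker (the model or one of the experts) has the
# lowest expected cost, where querying an expert adds a consultation cost.
def route(model_error_prob, expert_error_probs, consult_cost=0.1):
    """Return 'model' or the index of the expert with the lowest expected cost."""
    costs = [model_error_prob] + [p + consult_cost for p in expert_error_probs]
    best = min(range(len(costs)), key=costs.__getitem__)
    return "model" if best == 0 else best - 1

route(0.30, [0.05, 0.25])  # -> 0: expert 0 is worth the consultation cost
```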
- ZSG-IAD: A Multimodal Framework for Zero-Shot Grounded Industrial Anomaly Detection
  ZSG-IAD is a zero-shot multimodal system that uses language-guided two-hop grounding and rule-based reinforcement learning to produce anomaly masks and explainable reports from industrial sensor data.
- AGSC: Adaptive Granularity and Semantic Clustering for Uncertainty Quantification in Long-text Generation
  AGSC combines NLI neutral probabilities for adaptive granularity with GMM semantic clustering to improve uncertainty quantification in long-text LLM generation, claiming SOTA factuality correlation and 60% faster inference.
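A sketch of the clustering half only, with random vectors standing in for claim embeddings and an arbitrary component count; the adaptive-granularity step driven by NLI neutral probabilities is omitted.

```python
# Sketch: embed generated claims, fit a Gaussian mixture, and read uncertainty
# off the entropy of the soft cluster distribution; many mutually consistent
# claims concentrated in few clusters -> low uncertainty. Random vectors stand
# in for claim embeddings, and n_components=3 is an arbitrary choice.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))  # stand-in for claim embeddings

gmm = GaussianMixture(n_components=3, random_state=0).fit(embeddings)
weights = gmm.predict_proba(embeddings).mean(axis=0)       # soft cluster mass
uncertainty = -np.sum(weights * np.log(weights + 1e-12))   # entropy over clusters
```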
- A Cascaded Generative Approach for e-Commerce Recommendations
  A cascaded generative system for e-commerce recommendations using theme and keyword generation with teacher-student fine-tuning achieves a 2.7% lift in cart adds per page view.
- Safactory: A Scalable Agentic Infrastructure for Training Trustworthy Autonomous Intelligence
  Safactory integrates three platforms for simulation, data management, and agent evolution to create a unified pipeline for training trustworthy autonomous AI.
- MKJ at SemEval-2026 Task 9: A Comparative Study of Generalist, Specialist, and Ensemble Strategies for Multilingual Polarization
  A language-adaptive combination of generalist, specialist, and ensemble transformer models achieves 0.796 macro F1 and 0.826 accuracy on multilingual polarization detection across 22 languages.
- YEZE at SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization via Heterogeneous Ensembling
  A heterogeneous ensemble of XLM-RoBERTa-large and mDeBERTa-v3-base with independent task modeling and class weighting is reported as effective for multilingual, multicultural, and multievent online polarization detection.