Apparent psychological profiles of LLMs are largely measurement artifacts driven by directional response bias rather than actual traits.
ToolGen: Unified Tool Retrieval and Calling via Generation
21 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles
background 4
representative citing papers
MAVN adaptively selects and connects virtual nodes in MPNNs via learned dual-perspective preferences, proves it can realize any connectivity pattern, and reports up to 46.5% gains over backbones on nine datasets.
SkCC introduces a typed intermediate representation and compiler pipeline to make LLM agent skills portable across frameworks and enforce security constraints before deployment.
RAG-Reflect achieves F1=0.78 on valid comment-edit prediction using retrieval-augmented reasoning and self-reflection, outperforming baselines and approaching fine-tuned models without retraining.
The Decision Event Schema (DES) is a unified JSON schema that records governance evidence from four infrastructure layers in a single per-decision event structure with tiered completeness options.
Agentic AI re-identifies 72% of individuals from simulated mobility traces by cross-referencing public web sources without human intervention.
Mahalanobis PatchCore adds covariance-aware whitening and incremental streaming aggregation to PatchCore, preserving benchmark performance while cutting peak memory from 5.41 GB to 2.78 GB and raising mean industrial AUC from 0.981 to 0.986.
Asteria is a runtime system that enables second-order optimization for LLMs by dynamically distributing optimizer state across GPU, CPU, and NVMe while using asynchronous inverse-root computations and bounded-staleness synchronization.
Pilot study shows agent decision reconstructability varies by vendor SDK regime, with completeness scores from 42.9% to 85.7% and consistent gaps in reasoning traces.
Hard distractors trigger a nonlinear 'First Drop of Ink' performance collapse in long-context LLM reasoning, with most damage from the initial small fraction via disproportionate attention.
Autonomous excavator controller achieves 1.8 cm RMSE in heavy-duty grading across different hydraulic architectures, outperforming commercial solutions by a factor of 2.6 in precision while better utilizing machine pressure.
Dual-Guard embeds complementary watermarks in diffusion image generation to verify provenance and localize tampering with low error rates on a 2400-sample benchmark under reprompting and editing attacks.
Empirical study of eight LLMs finds overuse of popular libraries like NumPy in up to 45% of cases where they are unnecessary, and a strong default preference for Python even when it is suboptimal.
ATM is a CID-brokered governance framework that maps write intents to semantic atoms for pre-admission control, validation, and neutral-steward application in single-domain multi-agent code synthesis.
KAPPS is a knowledge-based CPPS architecture that uses an ontology-grounded knowledge graph as the unifying data backbone and authoritative write-time state for handling uncertainty in circular manufacturing, demonstrated via anomaly detection and constraint enforcement use cases.
Synthesizes a governance evidence framework revealing a coverage gradient from full auditability in rule engines to structural breaks in agentic AI, with a cascade of uncertainty and four formal propositions.
DNNs plus SHAP/SSHAP applied to 39 European bidding zones identify solar and gas as key price drivers and simulate a single-price EU market.
MimirRAG, a multi-agent RAG framework with metadata integration and table-aware chunking, reaches 89.3% accuracy on FinanceBench and outperforms prior baselines for financial document retrieval.
DEMM defines four executable evidence-sufficiency categories plus a conflicting category for agentic AI decisions and rolls per-property verdicts into a five-level maturity rubric.
A human-in-control LLM architecture translates natural language to OpenSearch DSL queries using hybrid lexical and semantic search in a secure private-cloud setup, shown via prototype on the Enron dataset.
Rule-based annotation generation for ACSL outperforms LLM-based methods in achieving successful formal verification of C programs.
citing papers explorer
- SkCC: Portable and Secure Skill Compilation for Cross-Framework LLM Agents
  SkCC introduces a typed intermediate representation and compiler pipeline to make LLM agent skills portable across frameworks and enforce security constraints before deployment.
- RAG-Reflect: Agentic Retrieval-Augmented Generation with Reflections for Comment-Driven Code Maintenance on Stack Overflow
  RAG-Reflect achieves F1=0.78 on valid comment-edit prediction using retrieval-augmented reasoning and self-reflection, outperforming baselines and approaching fine-tuned models without retraining.
- High Precision Hydraulic Excavator Control for Heavy-Duty Grading
  Autonomous excavator controller achieves 1.8 cm RMSE in heavy-duty grading across different hydraulic architectures, outperforming commercial solutions by a factor of 2.6 in precision while better utilizing machine pressure.
- Governed Auditable Decisioning Under Uncertainty: Synthesis and Agentic Extension
  Synthesizes a governance evidence framework revealing a coverage gradient from full auditability in rule engines to structural breaks in agentic AI, with a cascade of uncertainty and four formal propositions.