Canonical reference

Title resolution pending

· 1901

Canonical reference. 86% of classified citations from Pith papers (6 of 7) cite this work as background.

10 Pith papers citing it

Background 86% of classified citations

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 6 · dataset: 1

citation-polarity summary

background: 6 · use dataset: 1

representative citing papers

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

cs.CL · 2023-05-17 · accept · novelty 8.0

Tree of Thoughts enables language models to solve complex planning tasks by generating, evaluating, and searching over coherent intermediate thoughts in a tree, raising Game of 24 success from 4% to 74% with GPT-4.

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

cs.RO · 2023-07-12 · unverdicted · novelty 7.0

VoxPoser uses LLMs to compose 3D value maps via VLM interaction for model-based synthesis of robust robot trajectories on open-set language-specified manipulation tasks.

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

cs.LG · 2023-05-29 · accept · novelty 7.0

DPO derives the optimal policy directly from human preferences via a reparameterized reward model, solving the RLHF objective with only a binary classification loss and no sampling or separate reward model.

Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

cs.SE · 2023-05-02 · accept · novelty 7.0

EvalPlus augments HumanEval with 80x more tests via LLM and mutation strategies, exposing up to 28.9% more incorrect LLM-generated code and reversing some model performance rankings.

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

cs.RO · 2022-04-04 · accept · novelty 7.0

SayCan combines an LLM's high-level semantic knowledge with robot skill value functions to select only feasible actions, enabling completion of abstract natural-language instructions on a real mobile manipulator.

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

cs.CV · 2022-04-01 · unverdicted · novelty 7.0

Socratic Models compose zero-shot multimodal reasoning by prompting pretrained language and vision models to exchange information and enable new capabilities without finetuning.

Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment

cs.LG · 2025-05-30 · unverdicted · novelty 6.0

Disentangled Safety Adapters decouple safety computations from task-optimized LLMs via lightweight adapters, yielding up to 53% better AUC on safety tasks and dynamic inference-time alignment with reduced performance trade-offs.

Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges

cs.CL · 2024-12-17 · unverdicted · novelty 5.0

XTransplant empirically shows that cross-lingual latent transplantation yields mutual benefits for multilingual capability and cultural adaptability in LLMs, especially low-resource ones, while revealing underutilized model potential.

Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own

cs.RO · 2023-10-04 · unverdicted · novelty 5.0

RLFP and the FAC algorithm combine foundation-model priors for policy, value, and rewards to produce sample-efficient robotic RL that reaches 86% real-robot success after one hour and 100% success on 7/8 Meta-world tasks in under 100k frames.

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

cs.CL · 2025-08-08 · unverdicted · novelty 4.0

GLM-4.5, a 355B-parameter MoE model with hybrid reasoning, scores 70.1% on TAU-Bench, 91.0% on AIME 24, and 64.2% on SWE-bench Verified while ranking 3rd overall and 2nd on agentic benchmarks.

citing papers explorer

Showing 10 of 10 citing papers.

Tree of Thoughts: Deliberate Problem Solving with Large Language Models · cs.CL · 2023-05-17 · accept · none · ref 1
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models · cs.RO · 2023-07-12 · unverdicted · none · ref 1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model · cs.LG · 2023-05-29 · accept · none · ref 8
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation · cs.SE · 2023-05-02 · accept · none · ref 6
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances · cs.RO · 2022-04-04 · accept · none · ref 5
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language · cs.CV · 2022-04-01 · unverdicted · none · ref 2
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment · cs.LG · 2025-05-30 · unverdicted · none · ref 6
Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges · cs.CL · 2024-12-17 · unverdicted · none · ref 5
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own · cs.RO · 2023-10-04 · unverdicted · none · ref 18
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models · cs.CL · 2025-08-08 · unverdicted · none · ref 6
