arxiv: 2605.14262 · v1 · submitted 2026-05-14 · 💻 cs.RO · cs.HC

Recognition: 2 theorem links

· Lean Theorem

Distill: Uncovering the True Intent behind Human-Robot Communication

Ting Li , David Porfirio

Authors on Pith no claims yet

Pith reviewed 2026-05-15 02:42 UTC · model grok-4.3

classification 💻 cs.RO cs.HC

keywords human-robot interactionintent elicitationtask specificationnatural language interfacesend-user programmingrobot task refinementcrowdsourcing evaluation

0 comments

The pith

Distill refines initial robot task specifications by removing steps, generalizing meanings, and relaxing order constraints to better match users' true intent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents Distill as a way to bridge the gap between how people actually instruct robots and what the robots need to understand. Natural language instructions tend to be too vague while end-user programs are too rigid, so Distill starts from whatever the user first provides and applies three targeted changes: it drops steps that are not essential, it broadens the meaning of remaining steps, and it loosens the required sequence among them. The authors built a web interface that performs these operations and tested it through crowdsourced participants, showing that the refined specifications more accurately reflect what users really wanted. If the approach holds, robot interfaces could start from imperfect human inputs and still arrive at usable plans without forcing users to be either perfectly precise or perfectly general from the start.

Core claim

Given a task specification provided by the user, Distill removes unnecessary steps, generalizes the meaning behind individual steps, and relaxes ordering constraints between steps, thereby eliciting and refining the user's true underlying intent, as shown by implementation in a web interface and validation through a crowdsourcing study.

What carries the argument

The Distill process, which applies three operations to a user's initial task specification: removing unnecessary steps, generalizing the meanings of individual steps, and relaxing ordering constraints between steps.

If this is right

Robot interfaces can start from imprecise natural-language or overly specific program inputs and still produce usable task plans.
The web interface implementation shows that the three operations can be applied automatically to refine user specifications in real time.
Crowdsourced validation demonstrates measurable improvement in how well the refined output captures what users meant.
Communication between humans and robots becomes less dependent on users providing perfect initial specifications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same refinement steps could be applied to task specifications in non-robotic settings such as virtual assistants or automated planning tools.
Combining Distill with later user feedback loops might further reduce the risk of unintended changes to intent.
The approach implies that user intent is often best represented at a level of abstraction higher than the initial specification, which could guide design of future intent-capture systems.

Load-bearing premise

The three operations accurately uncover and preserve the user's true underlying intent without introducing distortions or requiring additional user feedback.

What would settle it

A follow-up crowdsourcing study in which participants judge whether the Distill-refined specifications match their original intent; if a large fraction of participants report mismatches, the claim that the operations reliably elicit true intent would be falsified.

Figures

Figures reproduced from arXiv: 2605.14262 by David Porfirio, Ting Li.

**Figure 1.** Figure 1: The Distill approach to eliciting ground-truth user input from natural task specification paradigms. Abstract As robots become increasingly integrated into everyday environments, intuitive communication paradigms such as natural language and end-user programming have become indispensable for specifying autonomous robot behavior. However, these mechanisms are ineffective at fully capturing user intent—nat… view at source ↗

**Figure 2.** Figure 2: Example input to the first and second phases of the [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Distill’s third phase (left) involves filtering non-critical actions from the user’s initial task trace. Distill’s fourth phase (top-right) relaxes the constraint that the robot must perform specific actions in order to achieve desired goal predicates. The fifth phase (bottom-right) relaxes the constraint that the robot must follow instructions in a certain order. action (the goals). Outcomes are represent… view at source ↗

**Figure 4.** Figure 4: Our implementation of the first and second phases of the [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Our implementation of the third, fourth, and fifth [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 6.** Figure 6: Comparison of natural language input length (left) and of the different lexical features occurring (right) between the [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

**Figure 7.** Figure 7: Trace and plan length for the structured study (n=21). Lower values are better. †p<0.1, *p<0.05, **p<0.01, ***p<0.001 Structured study. In novel environments, user-created traces required 𝑀 = 25.10 steps (𝑆𝐷 = 7.82), while system-filtered traces required 𝑀 = 19.62 steps (𝑆𝐷 = 6.28), user-filtered traces required 𝑀 = 20.95 steps (𝑆𝐷 = 8.10), and abstracted traces required 𝑀 = 17.76 steps (𝑆𝐷 = 5.73). User-c… view at source ↗

**Figure 8.** Figure 8: Trace and plan length on a logarithmic scale for the [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗

read the original abstract

As robots become increasingly integrated into everyday environments, intuitive communication paradigms such as natural language and end-user programming have become indispensable for specifying autonomous robot behavior. However, these mechanisms are ineffective at fully capturing user intent: natural language is imprecise and ambiguous, whereas end-user programming can be overly specific. As a result, understanding what users truly mean when they interact with robots remains a central challenge for human-AI communication systems. To address this issue, we propose the Distill approach for human-robot communication interfaces. Given a task specification provided by the user, Distill (1) removes unnecessary steps; (2) generalizes the meaning behind individual steps; and (3) relaxes ordering constraints between steps. We implemented Distill on a web interface and, through a crowdsourcing study, demonstrated its ability to elicit and refine user intent from initial task specifications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Distill gives a concrete three-step way to clean up robot task specs but the crowdsourcing study offers no check that the output matches the user's actual intent.

read the letter

The paper's core contribution is a simple pipeline called Distill that takes an initial task specification and runs three operations on it: dropping steps that look unnecessary, generalizing the meaning of remaining steps, and relaxing the order between them. They built this into a web interface and ran a crowdsourcing study to show that the results help refine user intent. That combination of operations is presented as new for the human-robot interaction setting, and it directly targets the mismatch between vague natural language and overly rigid end-user programs. The operations themselves are easy to understand and implement, which is a practical plus for anyone building robot interfaces. The abstract is clear about the problem and the proposed fix, and the implementation claim is concrete enough that someone could recreate the web tool from the description. The evaluation is the clear weak point. The study is described only at the level of showing that Distill can elicit and refine intent, with no numbers, no task examples, no metrics, and no independent way to verify whether the distilled version actually captures what the user meant rather than just producing a simpler or vaguer spec that participants rate as helpful. Without a ground-truth check, a re-execution match, or a comparison to another elicitation method, the central claim rests on perceived helpfulness alone. This paper is aimed at HRI researchers who work on communication interfaces and need lightweight ways to handle ambiguous inputs. A reader could take the three operations and try them in their own system without much trouble. It is worth sending to peer review because the method is well-defined and the implementation is reproducible; referees can push for stronger validation without starting from scratch.

Referee Report

2 major / 1 minor

Summary. The paper proposes the Distill approach for refining user intent in human-robot communication. Given an initial task specification, Distill applies three operations—removing unnecessary steps, generalizing the meaning of individual steps, and relaxing ordering constraints between steps—to produce a more accurate representation of the user's underlying intent. The authors implemented the method in a web interface and report that a crowdsourcing study demonstrates its ability to elicit and refine intent from initial specifications.

Significance. If the central claim holds with proper validation, the work could meaningfully advance intuitive interfaces for robot task specification by bridging the gap between imprecise natural language and overly rigid end-user programming. The approach is conceptually straightforward and targets a recognized challenge in HRI, but its significance is currently constrained by the absence of detailed empirical evidence.

major comments (2)

[Crowdsourcing study description] The description of the crowdsourcing study (mentioned in the abstract and the implementation paragraph) provides no methodology details, participant information, metrics, quantitative results, or error analysis. This absence leaves the central empirical claim—that Distill refines user intent—without visible data support and prevents assessment of whether the three operations recover latent intent or merely produce simpler but altered specifications.
[Distill approach definition] The weakest assumption—that the three operations (remove steps, generalize meanings, relax ordering) accurately uncover and preserve true user intent without distortion—is not tested against an independent ground truth. No post-distillation confirmation step, behavioral re-execution match, follow-up interview, or comparison to alternative elicitation methods (e.g., demonstrations) is described that could falsify the possibility of plausible but incorrect refinements.

minor comments (1)

[Abstract] The abstract would be strengthened by including at least one concrete metric or qualitative finding from the crowdsourcing study rather than a general statement of demonstration.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive comments, which highlight important gaps in the empirical validation of our work. We address each major comment below and commit to a major revision that strengthens the manuscript's claims with additional details and discussion.

read point-by-point responses

Referee: [Crowdsourcing study description] The description of the crowdsourcing study (mentioned in the abstract and the implementation paragraph) provides no methodology details, participant information, metrics, quantitative results, or error analysis. This absence leaves the central empirical claim—that Distill refines user intent—without visible data support and prevents assessment of whether the three operations recover latent intent or merely produce simpler but altered specifications.

Authors: We agree that the current version of the manuscript provides insufficient detail on the crowdsourcing study. This was an oversight during preparation. In the revised manuscript, we will add a full subsection describing the study protocol, participant recruitment and demographics (e.g., number of participants, age range, prior robot experience), the exact metrics collected (intent alignment ratings on a Likert scale), quantitative results with statistical tests, and an error analysis showing cases where distillation improved versus altered the specification. These additions will directly support the claim that the three operations recover latent intent. revision: yes
Referee: [Distill approach definition] The weakest assumption—that the three operations (remove steps, generalize meanings, relax ordering) accurately uncover and preserve true user intent without distortion—is not tested against an independent ground truth. No post-distillation confirmation step, behavioral re-execution match, follow-up interview, or comparison to alternative elicitation methods (e.g., demonstrations) is described that could falsify the possibility of plausible but incorrect refinements.

Authors: The crowdsourcing study asked participants to compare original and distilled specifications and rate which better matched their intended task, providing direct user validation of the operations. However, we acknowledge that this self-report approach does not constitute an independent ground truth such as behavioral re-execution or comparison to demonstrations. In the revision we will explicitly discuss this limitation, add a paragraph on potential distortion risks, and include a small additional analysis comparing a subset of distilled outputs against user-provided demonstrations where available. We believe the user-centric validation is appropriate for an intent-elicitation interface but will strengthen it as noted. revision: partial

Circularity Check

0 steps flagged

No circularity: Distill operations defined independently and evaluated externally

full rationale

The paper defines the Distill method through three explicit, non-recursive operations applied to user-provided task specifications: removing unnecessary steps, generalizing step meanings, and relaxing ordering constraints. These are introduced as a direct proposal without equations, fitted parameters, self-citations for uniqueness theorems, or any reduction where the output is defined in terms of itself. Validation occurs via an external crowdsourcing study on a web interface that measures participant ratings, which serves as an independent check rather than a self-referential loop. No load-bearing step reduces by construction to the inputs, satisfying the criteria for a self-contained method proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that user intent can be uncovered through these specific transformations without additional validation.

axioms (1)

domain assumption User-provided task specifications contain unnecessary steps, over-specific meanings, and strict ordering constraints that can be removed, generalized, or relaxed while still capturing true intent.
This premise directly justifies the three core operations described in the abstract.

pith-pipeline@v0.9.0 · 5438 in / 1128 out tokens · 41741 ms · 2026-05-15T02:42:54.220080+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Distill (1) removes unnecessary steps; (2) generalizes the meaning behind individual steps; and (3) relaxes ordering constraints between steps.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We implemented Distill on a web interface and, through a crowdsourcing study, demonstrated its ability to elicit and refine user intent from initial task specifications.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages · 1 internal anchor

[1]

[n. d.]. LimeZu. https://limezu.itch.io/. Accessed: 2026-04-25

work page 2026
[2]

Gopika Ajaykumar, Maureen Steele, and Chien-Ming Huang. 2021. A survey on end-user robot programming.ACM Computing Surveys (CSUR)54, 8 (2021), 1–36. doi:10.1145/3466819

work page doi:10.1145/3466819 2021
[3]

Sonya Alexandrova, Zachary Tatlock, and Maya Cakmak. 2015. RoboFlow: A flow-based visual programming language for mobile manipulation tasks. In2015 IEEE international conference on robotics and automation (ICRA). IEEE, 5537–5544. doi:10.1109/ICRA.2015.7139973

work page doi:10.1109/icra.2015.7139973 2015
[4]

Saleema Amershi, Dan Weld, Mihaela Vorvoreanu, Adam Fourney, Besmira Nushi, Penny Collisson, Jina Suh, Shamsi Iqbal, Paul N Bennett, Kori Inkpen, et al. 2019. Guidelines for human-AI interaction. InProceedings of the 2019 chi conference on human factors in computing systems. 1–13. doi:10.1145/3290605.3300233 Distill: Uncovering the True Intent behind Huma...

work page doi:10.1145/3290605.3300233 2019
[5]

Virginia Braun and Victoria Clarke. 2021. Thematic analysis: A practical guide. (2021)

work page 2021
[6]

Gordon Briggs, Tom Williams, and Matthias Scheutz. 2017. Enabling robots to understand indirect speech acts in task-based interactions.Journal of Human- Robot Interaction6, 1 (2017), 64–94. doi:10.5898/JHRI.6.1.Briggs

work page doi:10.5898/jhri.6.1.briggs 2017
[7]

Anthony Brohan, Yevgen Chebotar, Chelsea Finn, Karol Hausman, Alexander Herzog, Daniel Ho, Julian Ibarz, Alex Irpan, Eric Jang, Ryan Julian, et al. 2023. Do as i can, not as i say: Grounding language in robotic affordances. InConference on robot learning. PMLR, 287–318

work page 2023
[8]

Yuanzhi Cao, Tianyi Wang, Xun Qian, Pawan S Rao, Manav Wadhawan, Ke Huo, and Karthik Ramani. 2019. GhostAR: A time-space editor for embodied authoring of human-robot collaborative task with augmented reality. InProceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology. 521–534. doi:10.1145/3332165.3347902

work page doi:10.1145/3332165.3347902 2019
[9]

Yuanzhi Cao, Zhuangying Xu, Fan Li, Wentao Zhong, Ke Huo, and Karthik Ramani. 2019. V.Ra: An in-situ visual authoring system for robot-iot task planning with augmented reality. InProceedings of the 2019 on designing interactive systems conference. 1059–1070. doi:10.1145/3322276.3322278

work page doi:10.1145/3322276.3322278 2019
[10]

Eugenio Chisari, Jan Ole Von Hartz, Fabien Despinoy, and Abhinav Valada. 2025. Robotic Task Ambiguity Resolution via Natural Language Interaction. In2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). 14821– 14827. doi:10.1109/IROS60139.2025.11247661

work page doi:10.1109/iros60139.2025.11247661 2025
[11]

Dagoberto Cruz-Sandoval, Michele Murakami, Alyssa Kubota, and Laurel D Riek. 2025. PODER: A Robot Programming Framework to Further Inclusion of People with Mild Cognitive Impairment in HRI Research. In2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 599–609. doi:10. 1109/HRI61500.2025.10974039

work page arXiv 2025
[12]

Fethiye Irmak Doğan, Ilaria Torre, and Iolanda Leite. 2022. Asking follow-up clarifications to resolve ambiguities in human-robot conversation. In2022 17th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 461–

work page 2022
[13]

doi:10.1109/HRI53351.2022.9889368

work page doi:10.1109/hri53351.2022.9889368 2022
[14]

Maria Fox and Derek Long. 2003. PDDL2. 1: An extension to PDDL for expressing temporal planning domains.Journal of artificial intelligence research20 (2003), 61–124. doi:10.1613/jair.1129

work page doi:10.1613/jair.1129 2003
[15]

2016.Automated planning and acting

Malik Ghallab, Dana Nau, and Paolo Traverso. 2016.Automated planning and acting. Cambridge University Press

work page 2016
[16]

Dylan Glas, Satoru Satake, Takayuki Kanda, and Norihiro Hagita. 2012. An interaction design framework for social robots. InRobotics: Science and Systems, Vol. 7. 89. doi:10.15607/RSS.2011.VII.014

work page doi:10.15607/rss.2011.vii.014 2012
[17]

Javi F Gorostiza and Miguel A Salichs. 2011. End-user programming of a social robot by dialog.Robotics and Autonomous Systems59, 12 (2011). doi:10.1016/j. robot.2011.07.009

work page doi:10.1016/j 2011
[19]

Peter E Hart, Nils J Nilsson, and Bertram Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths.IEEE transactions on Systems Science and Cybernetics4, 2 (1968), 100–107. doi:10.1109/TSSC.1968.300136

work page doi:10.1109/tssc.1968.300136 1968
[20]

Malte Helmert. 2006. The fast downward planning system.Journal of Artificial Intelligence Research26 (2006), 191–246. doi:10.1613/jair.1705

work page doi:10.1613/jair.1705 2006
[21]

Eric Horvitz. 1999. Principles of mixed-initiative user interfaces. InProceedings of the SIGCHI conference on Human Factors in Computing Systems. 159–166. doi:10. 1145/302979.303030

work page arXiv 1999
[22]

Gaoping Huang, Pawan S Rao, Meng-Han Wu, Xun Qian, Shimon Y Nof, Karthik Ramani, and Alexander J Quinn. 2020. Vipo: Spatial-visual programming with functions for robot-IoT workflows. InProceedings of the 2020 CHI conference on human factors in computing systems. 1–13. doi:10.1145/3313831.3376670

work page doi:10.1145/3313831.3376670 2020
[23]

Justin Huang and Maya Cakmak. 2017. Code3: A system for end-to-end program- ming of mobile manipulator robots for novices and experts. InProceedings of the 2017 ACM/IEEE International Conference on Human-Robot Interaction. 453–462. doi:10.1145/2909824.3020215

work page doi:10.1145/2909824.3020215 2017
[24]

Justin Huang, Tessa Lau, and Maya Cakmak. 2016. Design and evaluation of a rapid programming system for service robots. In2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 295–302. doi:10.1109/HRI. 2016.7451765

work page doi:10.1109/hri 2016
[25]

Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Paul Saldyt, and Anil B Murthy. 2024. Position: LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks. InPro- ceedings of the 41st International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 235), Ruslan Salakhut...

work page 2024
[26]

Stephanie Kim, Jacy Reese Anthis, and Sarah Sebo. 2024. A taxonomy of robot autonomy for human-robot interaction. InProceedings of the 2024 ACM/IEEE In- ternational Conference on Human-Robot Interaction. 381–393. doi:10.1145/3610977. 3634993

work page doi:10.1145/3610977 2024
[27]

Alyssa Kubota, Emma IC Peterson, Vaishali Rajendren, Hadas Kress-Gazit, and Laurel D Riek. 2020. Jessie: Synthesizing social robot behaviors for personalized neurorehabilitation and beyond. InProceedings of the 2020 ACM/IEEE international conference on human-robot interaction. 121–130. doi:10.1145/3319502.3374836

work page doi:10.1145/3319502.3374836 2020
[28]

Christine P Lee, David Porfirio, Xinyu Jessica Wang, Kevin Chenkai Zhao, and Bilge Mutlu. 2025. Veriplan: Integrating formal verification and llms into end- user planning. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–19. doi:10.1145/3706598.3714113

work page doi:10.1145/3706598.3714113 2025
[29]

Yuan-Hong Liao, Xavier Puig, Marko Boben, Antonio Torralba, and Sanja Fidler

work page
[30]

InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Synthesizing environment-aware activities via activity sketches. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6291–6299. doi:10.1109/CVPR.2019.00645

work page doi:10.1109/cvpr.2019.00645 2019
[31]

Stephanie Lukin, Claire Bonial, Matthew Marge, Taylor A Hudson, Cory Hayes, Kimberly Pollard, Anthony Baker, Ashley N Foots, Ron Artstein, Felix Gervits, et al. 2024. SCOUT: A situated and multi-modal human-robot dialogue corpus. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC...

work page 2024
[32]

Matthew Marge, Felix Gervits, Gordon Briggs, Matthias Scheutz, and Antonio Roque. 2020. Let’s do that first! a comparative analysis of instruction-giving in human-human and human-robot situated dialogue. InProceedings of the 24th Workshop on the Semantics and Pragmatics of Dialogue (SemDial). 18–19

work page 2020
[33]

Andrea Micheli, Arthur Bit-Monnot, Gabriele Röger, Enrico Scala, Alessandro Valentini, Luca Framba, Alberto Rovetta, Alessandro Trapasso, Luigi Bonassi, Alfonso Emilio Gerevini, et al. 2025. Unified Planning: Modeling, manipulating and solving AI planning problems in Python.SoftwareX29 (2025), 102012. doi:10.1016/j.softx.2024.102012

work page doi:10.1016/j.softx.2024.102012 2025
[34]

Dipendra K Misra, Jaeyong Sung, Kevin Lee, and Ashutosh Saxena. 2016. Tell me dave: Context-sensitive grounding of natural language to manipulation in- structions.The International Journal of Robotics Research35, 1-3 (2016), 281–300. doi:10.1177/0278364915602060

work page doi:10.1177/0278364915602060 2016
[35]

Aishwarya Padmakumar, Jesse Thomason, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, and Dilek Hakkani-Tur. 2022. TEACh: Task-Driven Embodied Agents That Chat. Proceedings of the AAAI Conference on Artificial Intelligence36, 2 (Jun. 2022), 2017–2025. doi:10.1609/aaai.v36i2.20097

work page doi:10.1609/aaai.v36i2.20097 2022
[36]

Chris Paxton, Andrew Hundt, Felix Jonathan, Kelleher Guerin, and Gregory D Hager. 2017. CoSTAR: Instructing collaborative robots with behavior trees and vision. In2017 IEEE international conference on robotics and automation (ICRA). IEEE, 564–571. doi:10.1109/ICRA.2017.7989070

work page doi:10.1109/icra.2017.7989070 2017
[37]

Steven T Piantadosi, Harry Tily, and Edward Gibson. 2012. The communicative function of ambiguity in language.Cognition122, 3 (2012), 280–291. doi:10.1016/ j.cognition.2011.10.004

work page 2012
[38]

See You Later, Alligator

Kaitlynn Taylor Pineda, Ethan Brown, and Chien-Ming Huang. 2025. “See You Later, Alligator”: Impacts of Robot Small Talk on Task, Rapport, and In- teraction Dynamics in Human-Robot Collaboration. In2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 819–828. doi:10.1109/HRI61500.2025.10973942

work page doi:10.1109/hri61500.2025.10973942 2025
[39]

David Porfirio, Vincent Hsiao, Morgan Fine-Morris, Leslie Smith, and Laura M. Hiatt. 2025. Bootstrapping Human-Like Planning via LLMs. In2025 34th IEEE International Conference on Robot and Human Interactive Communication (RO- MAN). 665–670. doi:10.1109/RO-MAN63969.2025.11217637

work page doi:10.1109/ro-man63969.2025.11217637 2025
[40]

David Porfirio, Mark Roberts, and Laura M Hiatt. 2024. Goal-oriented end- user programming of robots. InProceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. 582–591. doi:10.1145/3610977.3634974

work page doi:10.1145/3610977.3634974 2024
[41]

David Porfirio, Mark Roberts, and Laura M Hiatt. 2025. An Interaction Speci- fication Language for Robot Application Development. In2025 20th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 1062–1066. doi:10.1109/HRI61500.2025.10973839

work page doi:10.1109/hri61500.2025.10973839 2025
[42]

David Porfirio, Allison Sauppé, Aws Albarghouthi, and Bilge Mutlu. 2018. Au- thoring and verifying human-robot interactions. InProceedings of the 31st annual acm symposium on user interface software and technology. 75–86. doi:10.1145/ 3242587.3242634

work page arXiv 2018
[43]

David Porfirio, Allison Sauppé, Maya Cakmak, Aws Albarghouthi, and Bilge Mutlu. 2023. Crowdsourcing Task Traces for Service Robotics. InCompanion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction. 389–393. doi:10.1145/3568294.3580112

work page doi:10.1145/3568294.3580112 2023
[44]

Emmanuel Pot, Jérôme Monceaux, Rodolphe Gelin, and Bruno Maisonnier. 2009. Choregraphe: a graphical tool for humanoid robot programming. InRo-man 2009-the 18th ieee international symposium on robot and human interactive com- munication. IEEE, 46–51. doi:10.1109/ROMAN.2009.5326209

work page doi:10.1109/roman.2009.5326209 2009
[45]

Xavier Puig, Kevin Ra, Marko Boben, Jiaman Li, Tingwu Wang, Sanja Fidler, and Antonio Torralba. 2018. Virtualhome: Simulating household activities via programs. InProceedings of the IEEE conference on computer vision and pattern recognition. 8494–8502. doi:10.1109/CVPR.2018.00886

work page doi:10.1109/cvpr.2018.00886 2018
[46]

Benedict Quartey, Eric Rosen, Stefanie Tellex, and George Konidaris. 2025. Ver- ifiably following complex robot instructions with foundation models. In2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 1–8. DIS ’26, June 13–17, 2026, Singapore, Singapore Li & Porfirio doi:10.1109/ICRA55743.2025.11127418

work page doi:10.1109/icra55743.2025.11127418 2025
[47]

Rebecca Ramnauth, Dražen Brščić, and Brian Scassellati. 2025. A Robot-Assisted Approach to Small Talk Training for Adults with ASD. InProceedings of Robotics: Science and Systems. LosAngeles, CA, USA. doi:10.15607/RSS.2025.XXI.088

work page doi:10.15607/rss.2025.xxi.088 2025
[48]

Allison Sauppé and Bilge Mutlu. 2014. Design patterns for exploring and pro- totyping human-robot interactions. InProceedings of the SIGCHI conference on human factors in computing systems. 1439–1448. doi:10.1145/2556288.2557057

work page doi:10.1145/2556288.2557057 2014
[49]

Andrew Schoen, Curt Henrichs, Mathias Strohkirch, and Bilge Mutlu. 2020. Authr: A task authoring environment for human-robot teams. InProceedings of the 33rd annual acm symposium on user interface software and technology. 1194–1208. doi:10.1145/3379337.3415872

work page doi:10.1145/3379337.3415872 2020
[50]

Andrew Schoen and Bilge Mutlu. 2024. OpenVP: A Customizable Visual Programming Environment for Robotics Applications. InProceedings of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. 944–948. doi:10.1145/3610977.3637477

work page doi:10.1145/3610977.3637477 2024
[51]

Andrew Schoen, Dakota Sullivan, Ze Dong Zhang, Daniel Rakita, and Bilge Mutlu

work page
[52]

InProceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction

Lively: Enabling multimodal, lifelike, and extensible real-time robot motion. InProceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction. 594–602. doi:10.1145/3568162.3576982

work page doi:10.1145/3568162.3576982 2023
[53]

Andrew Schoen, Nathan White, Curt Henrichs, Amanda Siebert-Evenstone, David Shaffer, and Bilge Mutlu. 2022. CoFrame: A system for training novice cobot programmers. In2022 17th ACM/IEEE International Conference on Human-Robot Interaction (HRI). IEEE, 185–194. doi:10.5555/3523760.3523788

work page doi:10.5555/3523760.3523788 2022
[54]

Emmanuel Senft, Michael Hagenow, Robert Radwin, Michael Zinn, Michael Gleicher, and Bilge Mutlu. 2021. Situated live programming for human-robot collaboration. InThe 34th Annual ACM Symposium on User Interface Software and Technology. 613–625. doi:10.1145/3472749.3474773

work page doi:10.1145/3472749.3474773 2021
[55]

Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, and Mehrdad Farajtabar. 2025. The illusion of thinking: Understanding the strengths and limitations of reasoning models via the lens of problem complexity. arXiv preprint arXiv:2506.06941(2025)

work page internal anchor Pith review arXiv 2025
[56]

Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han, Roozbeh Mottaghi, Luke Zettlemoyer, and Dieter Fox. 2020. Alfred: A benchmark for interpreting grounded instructions for everyday tasks. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 10740–10749. doi:10.1109/CVPR42600.2020.01075

work page doi:10.1109/cvpr42600.2020.01075 2020
[57]

Saurav Singh, Esa Rantanen, and Jamison Heard. 2026. Human–Robot Teaming: A Comprehensive Survey on Collaboration, Communication, and Cognition.ACM Transactions on Human-Robot Interaction15, 2 (2026), 1–48. doi:10.1145/3776548

work page doi:10.1145/3776548 2026
[58]

David E Smith, Jeremy Frank, and William Cushing. 2008. The ANML language. InThe ICAPS-08 Workshop on Knowledge Engineering for Planning and Scheduling (KEPS), Vol. 31

work page 2008
[59]

David Speck. 2023. SymK–A versatile symbolic search planner.Tenth International Planning Competition (IPC-10): Planner Abstracts(2023)

work page 2023
[60]

Laura Stegner, Yuna Hwang, David Porfirio, and Bilge Mutlu. 2024. Understanding on-the-fly end-user robot programming. InProceedings of the 2024 ACM Designing Interactive Systems Conference. 2468–2480. doi:10.1145/3643834.3660721

work page doi:10.1145/3643834.3660721 2024
[61]

J Gregory Trafton and Brian J Reiser. 1991. Providing natural representations to facilitate novices’ understanding in a new domain: Forward and backward reasoning in programming. InProceedings of the Annual Meeting of the Cognitive Science Society, Vol. 13

work page 1991
[62]

Thank You for Sharing that Interesting Fact!

Tom Williams, Daria Thames, Julia Novakoff, and Matthias Scheutz. 2018. "Thank You for Sharing that Interesting Fact!" Effects of Capability and Context on Indirect Speech Act Use in Task-Based Human-Robot Dialogue. InProceedings of the 2018 acm/ieee international conference on human-robot interaction. 298–306. doi:10.1145/3171221.3171246

work page doi:10.1145/3171221.3171246 2018
[63]

Alex Wuqi Zhang, Rafael Queiroz, and Sarah Sebo. 2025. Balancing User Control and Perceived Robot Social Agency through the Design of End-User Robot Pro- gramming Interfaces. In2025 20th ACM/IEEE International Conference on Human- Robot Interaction (HRI). IEEE, 899–908. doi:10.1109/HRI61500.2025.10974063

work page doi:10.1109/hri61500.2025.10974063 2025
[64]

Yan Zhang, Tharaka Sachintha Ratnayake, Cherie Sew, Jarrod Knibbe, Jorge Goncalves, and Wafa Johal. 2025. Can you pass that tool?: Implications of indirect speech in physical human-robot collaboration. InProceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 1–18. doi:10.1145/3706598. 3713780

work page doi:10.1145/3706598 2025
[65]

Fangyun Zhao, Curt Henrichs, and Bilge Mutlu. 2020. Task interdependence in human-robot teaming. In2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). IEEE, 1143–1149. doi:10.1109/RO- MAN47096.2020.9223555

work page doi:10.1109/ro- 2020
[66]

Sanketi, Grecia Salazar, Michael S

Brianna Zitkovich, Tianhe Yu, Sichun Xu, Peng Xu, Ted Xiao, Fei Xia, Jialin Wu, Paul Wohlhart, Stefan Welker, Ayzaan Wahid, Quan Vuong, Vincent Van- houcke, Huong Tran, Radu Soricut, Anikait Singh, Jaspiar Singh, Pierre Ser- manet, Pannag R. Sanketi, Grecia Salazar, Michael S. Ryoo, Krista Reymann, Kanishka Rao, Karl Pertsch, Igor Mordatch, Henryk Michale...

work page
[67]

InProceedings of The 7th Conference on Robot Learning (Proceedings of Ma- chine Learning Research, Vol

RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. InProceedings of The 7th Conference on Robot Learning (Proceedings of Ma- chine Learning Research, Vol. 229), Jie Tan, Marc Toussaint, and Kourosh Darvish (Eds.). PMLR, 2165–2183. https://proceedings.mlr.press/v229/zitkovich23a.html Distill: Uncovering the True Intent behind Hu...

work page 2026