Pith Number

pith:4FMPN5N6

pith:2026:4FMPN5N6VZ6FLX7NF37KVTPMQC
not attested · not anchored · not stored · refs resolved

Learning Transferable Latent User Preferences for Human-Aligned Decision Making

Alina Hyk, Sandhya Saisubramanian

CLIPR learns transferable natural language rules from minimal conversations to align LLM decisions with latent user preferences.

arxiv:2605.12682 v1 · 2026-05-12 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4FMPN5N6VZ6FLX7NF37KVTPMQC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We introduce CLIPR ... a framework that learns actionable, transferable natural language rules that represent latent user preferences from minimal conversational input. These rules are iteratively refined through adaptive feedback and applied to both in-distribution and out-of-distribution ambiguous tasks across multiple environments. Evaluations on three datasets and a user study show that CLIPR consistently outperforms existing methods in improving alignment and reducing inference costs.

C2 · weakest assumption

That the natural language rules extracted from limited conversations are sufficiently transferable and actionable to guide downstream decision making across in- and out-of-distribution tasks without introducing new misalignment or requiring extensive validation.

C3 · one-line summary

CLIPR learns transferable natural language rules for latent user preferences from minimal conversational input to improve LLM alignment in decision making and outperforms prior methods on three datasets plus a user study.

References

29 extracted · 29 resolved · 0 Pith anchors

Receipt and verification
First computed: 2026-05-18T03:09:49.985123Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a

Aliases

arxiv: 2605.12682 · arxiv_version: 2605.12682v1 · doi: 10.48550/arxiv.2605.12682 · pith_short_12: 4FMPN5N6VZ6F · pith_short_16: 4FMPN5N6VZ6FLX7N · pith_short_8: 4FMPN5N6
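
For this record, the full identifier and its short aliases line up with the canonical hash: the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest equal the Pith Number, and the `pith_short_*` aliases are prefixes of it. The sketch below checks that relationship for this record only. Whether this is the scheme's general definition is an assumption, and the helper name `pith_from_hash` is hypothetical.

```python
import base64

# Canonical hash as published on this record.
CANONICAL_HASH = "e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode a hex digest and keep a prefix.

    The 26-character length is inferred from this record, not from a spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full_id = pith_from_hash(CANONICAL_HASH)

# Short aliases observed on this page are prefixes of the full identifier.
aliases = {f"pith_short_{n}": full_id[:n] for n in (8, 12, 16)}

print(full_id)  # 4FMPN5N6VZ6FLX7NF37KVTPMQC
for name, value in aliases.items():
    print(name, value)
```

If the inference holds, a client can go from a canonical hash to the identifier without a lookup, but not back, since the identifier carries only the first 130 bits of the digest.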
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4FMPN5N6VZ6FLX7NF37KVTPMQC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ce38683c9821fecc4e2c6f4a527cb6bfeb40ad2037bf24756058168346ddcb78",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T19:32:10Z",
    "title_canon_sha256": "ced1e3e3b5d684046218a072f3657eb550a60cd2c52d1fd2291f19383b884340"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12682",
    "kind": "arxiv",
    "version": 1
  }
}
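
The same check can be run offline against a saved copy of the canonical record. The following is a minimal sketch that mirrors the canonicalization in the shell pipeline above (sorted keys, compact separators, UTF-8, SHA-256). The record is pasted from this page, the function name `canonical_sha256` is hypothetical, and whether the digest reproduces the published hash depends on the pasted copy being faithful to the served record.

```python
import hashlib
import json

# Canonical record as shown on this page (pasted copy, not fetched live).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "ce38683c9821fecc4e2c6f4a527cb6bfeb40ad2037bf24756058168346ddcb78",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T19:32:10Z",
    "title_canon_sha256": "ced1e3e3b5d684046218a072f3657eb550a60cd2c52d1fd2291f19383b884340"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.12682", "kind": "arxiv", "version": 1}
}
"""

# Hash published on this page under "Canonical hash".
PUBLISHED_HASH = "e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)

print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the stored copy, only on its parsed content.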