Pith Number

pith:4FMPN5N6

pith:2026:4FMPN5N6VZ6FLX7NF37KVTPMQC

not attested · not anchored · not stored · refs resolved

Learning Transferable Latent User Preferences for Human-Aligned Decision Making

Alina Hyk, Sandhya Saisubramanian

CLIPR learns transferable natural language rules from minimal conversations to align LLM decisions with latent user preferences.

arxiv:2605.12682 v1 · 2026-05-12 · cs.AI

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{4FMPN5N6VZ6FLX7NF37KVTPMQC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim: open
4 Citations: open
5 Replications: open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

We introduce CLIPR ... a framework that learns actionable, transferable natural language rules that represent latent user preferences from minimal conversational input. These rules are iteratively refined through adaptive feedback and applied to both in-distribution and out-of-distribution ambiguous tasks across multiple environments. Evaluations on three datasets and a user study show that CLIPR consistently outperforms existing methods in improving alignment and reducing inference costs.

C2 · weakest assumption

That the natural language rules extracted from limited conversations are sufficiently transferable and actionable to guide downstream decision making across in- and out-of-distribution tasks without introducing new misalignment or requiring extensive validation.

C3 · one line summary

CLIPR learns transferable natural language rules for latent user preferences from minimal conversational input to improve LLM alignment in decision making and outperforms prior methods on three datasets plus a user study.

Receipt and verification

First computed	2026-05-18T03:09:49.985123Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a

Aliases

arxiv: 2605.12682 · arxiv_version: 2605.12682v1 · doi: 10.48550/arxiv.2605.12682 · pith_short_12: 4FMPN5N6VZ6F · pith_short_16: 4FMPN5N6VZ6FLX7N · pith_short_8: 4FMPN5N6
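
The identifiers on this page appear to be related in a simple way: for this record, the 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and the `pith_short_*` aliases are prefixes of it. This is an observation drawn from the values shown here, not a documented rule. A minimal Python sketch of the check follows; the helper names `base32_prefix` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a"
PITH_NUMBER = "4FMPN5N6VZ6FLX7NF37KVTPMQC"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Return the first `length` characters of the RFC 4648 base32
    encoding of a hex digest."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Hypothetical helper: rebuild the short aliases as plain prefixes.
    Assumes the aliases are always the first 8, 12 and 16 characters."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print(derived == PITH_NUMBER)
print(short_aliases(PITH_NUMBER))
```

If the relationship holds for other records, a mirror could rebuild every alias from the canonical hash alone, but that generalization is an assumption.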

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/4FMPN5N6VZ6FLX7NF37KVTPMQC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "ce38683c9821fecc4e2c6f4a527cb6bfeb40ad2037bf24756058168346ddcb78",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-12T19:32:10Z",
    "title_canon_sha256": "ced1e3e3b5d684046218a072f3657eb550a60cd2c52d1fd2291f19383b884340"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12682",
    "kind": "arxiv",
    "version": 1
  }
}
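
The verification command serializes the canonical record with sorted keys, compact separators, and non-ASCII characters left unescaped before hashing it with SHA-256. The sketch below applies the same serialization offline to the record shown above. The function name `canonical_sha256` is hypothetical, and the record is copied by hand from this page, so whether the digest equals the published hash depends on that copy matching what the resolver serves.

```python
import hashlib
import json

# Canonical record copied from the JSON shown on this page.
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "ce38683c9821fecc4e2c6f4a527cb6bfeb40ad2037bf24756058168346ddcb78",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-12T19:32:10Z",
        "title_canon_sha256": "ced1e3e3b5d684046218a072f3657eb550a60cd2c52d1fd2291f19383b884340",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12682", "kind": "arxiv", "version": 1},
}

# Hash published on this page under "Canonical hash".
PUBLISHED_HASH = "e158f6f5beae7c55dfed2efeaacdec80a46b15c1022bd734bddfb03da611998a"


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the one-liner:
    sorted keys, no whitespace between tokens, UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


digest = canonical_sha256(CANONICAL_RECORD)
print(digest)
print("matches published hash:", digest == PUBLISHED_HASH)
```

Sorting keys makes the digest independent of the order in which a mirror happens to store the fields, which is what lets anyone recompute it from their own copy.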