pith:56GWHSSX
SafeManip: A Property-Driven Benchmark for Temporal Safety Evaluation in Robotic Manipulation
A new benchmark shows that robotic manipulation policies often violate temporal safety rules even on tasks they complete successfully.
arxiv:2605.12386 v2 · 2026-05-12 · cs.RO
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{56GWHSSX22L4FBE6S7D3WLCK2W}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Results show that even strong models often behave unsafely. Task-success gains do not reliably translate into safer execution: many successful rollouts remain unsafe, while longer-horizon or more complex tasks expose more violations.
The LTLf safety templates and the mapping from observed rollouts to symbolic predicate traces accurately capture all relevant temporal safety properties without introducing false positives or missing critical real-world constraints.
SafeManip is a new benchmark that applies LTLf monitors to assess temporal safety properties across eight categories in robotic manipulation, demonstrating that task success frequently fails to ensure safe execution in vision-language-action policies.
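To make the gap between task success and safe execution concrete, the following is a minimal sketch of a finite-trace check for one safety pattern, G(a -> !b), alongside a success check, F(done). The predicate names (`holding_knife`, `near_human`, `task_done`) and the trace are hypothetical illustrations, not taken from the benchmark; the actual SafeManip templates and trace extraction may differ.

```python
from typing import Dict, List

# One step of a symbolic predicate trace: predicate name -> truth value.
Step = Dict[str, bool]


def always_implies_not(trace: List[Step], a: str, b: str) -> bool:
    """Check G(a -> !b) on a finite trace: no step has both a and b true."""
    return all(not (step.get(a, False) and step.get(b, False)) for step in trace)


def eventually(trace: List[Step], p: str) -> bool:
    """Check F(p) on a finite trace: p holds at some step."""
    return any(step.get(p, False) for step in trace)


# Hypothetical rollout: the task finishes, but one step violates the rule.
trace = [
    {"holding_knife": False, "near_human": False, "task_done": False},
    {"holding_knife": True, "near_human": False, "task_done": False},
    {"holding_knife": True, "near_human": True, "task_done": False},
    {"holding_knife": False, "near_human": False, "task_done": True},
]

success = eventually(trace, "task_done")
safe = always_implies_not(trace, "holding_knife", "near_human")
print(f"success={success} safe={safe}")
```

Under these assumptions the rollout counts as successful yet unsafe, which is the kind of case the claims above describe.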
Receipt and verification
| First computed | 2026-06-11T02:09:30.756267Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
ef8d63ca57d697c2849e97c7bb2c4ad5be7da783abc59f60446a80065f37b7c0
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/56GWHSSX22L4FBE6S7D3WLCK2W \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ef8d63ca57d697c2849e97c7bb2c4ad5be7da783abc59f60446a80065f37b7c0
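The same check can be sketched offline in Python, mirroring the serialization in the one-liner above (sorted keys, compact separators, non-ASCII preserved, UTF-8 bytes). This sketch assumes the canonical record served by the API is identical to the JSON shown on this page; the function name `canonical_sha256` is introduced here for illustration only.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as displayed on this page (assumed to match the served record).
record = {
    "metadata": {
        "abstract_canon_sha256": "e48b1d6e7014fefe724fb74a1bce032a262a5f70c8fb634c1c9da864a8b24a10",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.RO",
        "submitted_at": "2026-05-12T16:49:28Z",
        "title_canon_sha256": "c33e6927696d000caf790804a5cdb6c955cff6d8f248b7bd48ea53f3c7078a90",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.12386", "kind": "arxiv", "version": 2},
}

EXPECTED = "ef8d63ca57d697c2849e97c7bb2c4ad5be7da783abc59f60446a80065f37b7c0"

digest = canonical_sha256(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because `sort_keys=True` fixes the key order at every nesting level, the digest does not depend on the order in which the JSON keys arrive.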
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "e48b1d6e7014fefe724fb74a1bce032a262a5f70c8fb634c1c9da864a8b24a10",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.RO",
"submitted_at": "2026-05-12T16:49:28Z",
"title_canon_sha256": "c33e6927696d000caf790804a5cdb6c955cff6d8f248b7bd48ea53f3c7078a90"
},
"schema_version": "1.0",
"source": {
"id": "2605.12386",
"kind": "arxiv",
"version": 2
}
}