{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:XZ7356RACVIGFLLJUSPJGGD2GV","short_pith_number":"pith:XZ7356RA","schema_version":"1.0","canonical_sha256":"be7fbefa20155062ad69a49e93187a356a8eb3aa6f3e505e579e6222de10c808","source":{"kind":"arxiv","id":"2605.13153","version":1},"attestation_state":"computed","paper":{"title":"Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A strikingness measure shows that temporal knowledge graph models degrade on rare outstanding events and that ensemble gains often come from fitting trivial repetitions instead.","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Rikui Huang, Shengzhe Zhang, Wei Wei","submitted_at":"2026-05-13T08:17:54Z","abstract_excerpt":"Temporal Knowledge Graph Reasoning (TKGR) aims at inferring missing (especially future) events from historical data. Current evaluation in TKGR uniformly weights all events, ignoring that most are trivial repetitions, which overestimate the true reasoning ability. Therefore, the rare outstanding events, whose prediction demands deeper reasoning, should be distinguished and emphasized. To this end, we propose a strikingness-aware evaluation framework, which introduces a rule-based strikingness measuring framework (RSMF) to quantify event strikingness by comparing its expected occurrence with pe"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":false},"canonical_record":{"source":{"id":"2605.13153","kind":"arxiv","version":1},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.AI","submitted_at":"2026-05-13T08:17:54Z","cross_cats_sorted":[],"title_canon_sha256":"266cbd1d0f89791f59a15ed3a4a3320ed904690d05b8a1406cbbdaa616828a8e","abstract_canon_sha256":"b33dd82b932514ea497c29560e32b72fc1ddc729e38e1548a38bd17ddde1203b"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-18T03:08:57.077804Z","signature_b64":"SZRE/02cNAr8vvJZPK2kvhMf+gmxG5GLVDJ/YvIVY+wFAP8Tp5xecphdKbWG1jdpbHomilSPkFD6NFslMifOCA==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"be7fbefa20155062ad69a49e93187a356a8eb3aa6f3e505e579e6222de10c808","last_reissued_at":"2026-05-18T03:08:57.076931Z","signature_status":"signed_v1","first_computed_at":"2026-05-18T03:08:57.076931Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A strikingness measure shows that temporal knowledge graph models degrade on rare outstanding events and that ensemble gains often come from fitting trivial repetitions instead.","cross_cats":[],"primary_cat":"cs.AI","authors_text":"Rikui Huang, Shengzhe Zhang, Wei Wei","submitted_at":"2026-05-13T08:17:54Z","abstract_excerpt":"Temporal Knowledge Graph Reasoning (TKGR) aims at inferring missing (especially future) events from historical data. Current evaluation in TKGR uniformly weights all events, ignoring that most are trivial repetitions, which overestimate the true reasoning ability. Therefore, the rare outstanding events, whose prediction demands deeper reasoning, should be distinguished and emphasized. To this end, we propose a strikingness-aware evaluation framework, which introduces a rule-based strikingness measuring framework (RSMF) to quantify event strikingness by comparing its expected occurrence with pe"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experiments on four TKG benchmarks reveal that all representative models perform worse as event strikingness increases, path-based methods excel on low-strikingness events while representation-based ones excel on high-strikingness events, and an ensemble method's gains stem from fitting trivial events rather than reasoning improvement.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the rule-based strikingness measuring framework accurately identifies events whose prediction requires deeper reasoning by comparing expected occurrence against peer events derived from temporal rules extracted from the same data.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A rule-based strikingness measure is added to TKGR metrics to weight rare events higher, revealing that models weaken on striking events and ensemble gains come mostly from trivial fits.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A strikingness measure shows that temporal knowledge graph models degrade on rare outstanding events and that ensemble gains often come from fitting trivial repetitions instead.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7b6f814834fde28b78a6f96e49fe127c530bb6e415685bb6fafd68a8100f20d7"},"source":{"id":"2605.13153","kind":"arxiv","version":1},"verdict":{"id":"61339480-babf-4c51-8c8f-14fc14a08057","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T20:22:24.189556Z","strongest_claim":"Experiments on four TKG benchmarks reveal that all representative models perform worse as event strikingness increases, path-based methods excel on low-strikingness events while representation-based ones excel on high-strikingness events, and an ensemble method's gains stem from fitting trivial events rather than reasoning improvement.","one_line_summary":"A rule-based strikingness measure is added to TKGR metrics to weight rare events higher, revealing that models weaken on striking events and ensemble gains come mostly from trivial fits.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the rule-based strikingness measuring framework accurately identifies events whose prediction requires deeper reasoning by comparing expected occurrence against peer events derived from temporal rules extracted from the same data.","pith_extraction_headline":"A strikingness measure shows that temporal knowledge graph models degrade on rare outstanding events and that ensemble gains often come from fitting trivial repetitions instead."},"references":{"count":73,"sample":[{"doi":"10.1145/3450287","year":2022,"title":"ACM Computing Surveys , volume =","work_id":"68e55231-a3ba-4594-a455-d4520c28124a","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2010,"title":"International studies review , volume=","work_id":"46cfcfec-a40a-4a82-aec0-d5bcc1c4cb20","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.1038/s42256-023-00735-0","year":2023,"title":"Coutinho and Sagi Eppel and Jacob Gates Foster and Andrew Gritsevskiy and Harlin Lee and Yichao Lu and Jo","work_id":"79210512-438b-40f5-95d2-d4990554c8be","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":1979,"title":"ISA annual convention , volume=","work_id":"c132259e-6cde-49f5-9e62-f1f86798c4c4","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.1109/tpami.2024.3417451","year":2024,"title":"IEEE Transactions on Pattern Analysis and Machine Intelligence , author =","work_id":"9cef28b2-439e-433b-8327-06658bf2d77a","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":73,"snapshot_sha256":"a1e388d6068c7f7c5b5479fcbc3c0af46eca0ab3e6f26bea3ef30e72cca10bc7","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2605.13153","created_at":"2026-05-18T03:08:57.077082+00:00"},{"alias_kind":"arxiv_version","alias_value":"2605.13153v1","created_at":"2026-05-18T03:08:57.077082+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.13153","created_at":"2026-05-18T03:08:57.077082+00:00"},{"alias_kind":"pith_short_12","alias_value":"XZ7356RACVIG","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"XZ7356RACVIGFLLJ","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"XZ7356RA","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV","json":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV.json","graph_json":"https://pith.science/api/pith-number/XZ7356RACVIGFLLJUSPJGGD2GV/graph.json","events_json":"https://pith.science/api/pith-number/XZ7356RACVIGFLLJUSPJGGD2GV/events.json","paper":"https://pith.science/paper/XZ7356RA"},"agent_actions":{"view_html":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV","download_json":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV.json","view_paper":"https://pith.science/paper/XZ7356RA","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2605.13153&json=true","fetch_graph":"https://pith.science/api/pith-number/XZ7356RACVIGFLLJUSPJGGD2GV/graph.json","fetch_events":"https://pith.science/api/pith-number/XZ7356RACVIGFLLJUSPJGGD2GV/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV/action/timestamp_anchor","attest_storage":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV/action/storage_attestation","attest_author":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV/action/author_attestation","sign_citation":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV/action/citation_signature","submit_replication":"https://pith.science/pith/XZ7356RACVIGFLLJUSPJGGD2GV/action/replication_record"}},"created_at":"2026-05-18T03:08:57.077082+00:00","updated_at":"2026-05-18T03:08:57.077082+00:00"}