When Modalities Remember: Continual Learning for Multimodal Knowledge Graphs
Pith reviewed 2026-05-13 20:25 UTC · model grok-4.3
The pith
MRCKG enables continual learning in multimodal knowledge graphs by preserving prior knowledge while acquiring new multimodal facts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MRCKG is a model for continual multimodal knowledge graph reasoning. It employs a multimodal-structural collaborative curriculum that schedules progressive learning according to the structural connectivity and multimodal compatibility of new triples, introduces cross-modal knowledge preservation to keep entity representations stable, relational semantics consistent, and modalities anchored, and applies multimodal contrastive replay with importance sampling and two-stage optimization to reinforce previously learned knowledge.
What carries the argument
The multimodal-structural collaborative curriculum that orders new triples by structural connectivity to the existing graph and multimodal compatibility, together with cross-modal knowledge preservation and multimodal contrastive replay.
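A minimal sketch of how such an ordering could be computed, assuming a connectivity score based on overlap with the historical entity set and a compatibility score from image-text embedding similarity; the scoring functions and the alpha weighting are illustrative assumptions, not MRCKG's exact formulation.

```python
# Illustrative sketch of a multimodal-structural curriculum: order new triples
# by structural connectivity to the historical graph plus multimodal compatibility.
# The scoring functions and the alpha weighting are assumptions, not MRCKG's exact recipe.
import numpy as np

def connectivity(triple, known_entities):
    """Fraction of the triple's entities already present in the historical graph."""
    head, _, tail = triple
    return (int(head in known_entities) + int(tail in known_entities)) / 2.0

def compatibility(img_emb, txt_emb):
    """Cosine similarity between an entity's image and text embeddings."""
    return float(np.dot(img_emb, txt_emb) /
                 (np.linalg.norm(img_emb) * np.linalg.norm(txt_emb) + 1e-8))

def curriculum_order(new_triples, known_entities, img_embs, txt_embs, alpha=0.5):
    """Sort new triples from easiest (well-connected, modality-consistent) to hardest."""
    def score(t):
        head = t[0]
        return alpha * connectivity(t, known_entities) + \
               (1 - alpha) * compatibility(img_embs[head], txt_embs[head])
    return sorted(new_triples, key=score, reverse=True)
```

Triples scheduled first would then extend the historical graph before harder, weakly connected or modality-inconsistent triples are introduced.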
If this is right
- New multimodal triples can be incorporated without full retraining of the graph model.
- Entity and relation representations remain usable for reasoning on both old and new data.
- Multimodal signals from images or text attached to new entities improve learning without overwriting prior structural knowledge.
- Memory-efficient replay allows the model to retain performance with limited storage of historical examples (see the replay sketch after this list).
- Overall reasoning quality on evolving graphs increases relative to methods that treat each snapshot independently.
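A memory-bounded replay buffer with importance-weighted sampling, as referenced in the list above, might be sketched as follows; the inverse-relation-frequency importance score and the reservoir-style truncation are placeholder choices, not the paper's multimodal importance sampling.

```python
# Illustrative replay buffer with importance-weighted sampling under a fixed memory budget.
# Importance here is a placeholder (rarer relations weighted higher); MRCKG's actual
# multimodal importance sampling may differ.
import random
from collections import Counter

class ReplayBuffer:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.items = []          # stored (head, relation, tail) triples

    def add(self, triples):
        self.items.extend(triples)
        if len(self.items) > self.capacity:
            # keep a random subset within the memory budget (reservoir-style truncation)
            self.items = random.sample(self.items, self.capacity)

    def sample(self, k):
        if not self.items:
            return []
        rel_counts = Counter(r for _, r, _ in self.items)
        weights = [1.0 / rel_counts[r] for _, r, _ in self.items]
        return random.choices(self.items, weights=weights, k=min(k, len(self.items)))
```

Replayed triples would be mixed into each training batch so the contrastive alignment objective sees old and new knowledge together.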
Where Pith is reading between the lines
- Similar curriculum and preservation steps could be tested on other dynamic multimodal systems such as video captioning or social media entity tracking.
- The method's reliance on constructed benchmarks leaves open whether the same gains appear on continuously collected real-world data streams.
- The two-stage optimization might be adapted to decide automatically which parts of the graph deserve stronger anchoring versus updating.
- Integration with external retrieval modules could further reduce the forgetting risk when new modalities arrive.
Load-bearing premise
Benchmarks formed by partitioning existing static multimodal knowledge graph datasets into sequential arrival orders accurately capture the distribution and difficulty of knowledge that emerges in real evolving graphs.
What would settle it
A measurable drop in accuracy on previously learned multimodal triples after training on new arrivals in a genuinely streaming multimodal knowledge graph dataset would show that the preservation and replay mechanisms fail to prevent forgetting.
Original abstract
Real-world multimodal knowledge graphs (MMKGs) are dynamic, with new entities, relations, and multimodal knowledge emerging over time. Existing continual knowledge graph reasoning (CKGR) methods focus on structural triples and cannot fully exploit multimodal signals from new entities. Existing multimodal knowledge graph reasoning (MMKGR) methods, however, usually assume static graphs and suffer catastrophic forgetting as graphs evolve. To address this gap, we present a systematic study of continual multimodal knowledge graph reasoning (CMMKGR). We construct several continual multimodal knowledge graph benchmarks from existing MMKG datasets and propose MRCKG, a new CMMKGR model. Specifically, MRCKG employs a multimodal-structural collaborative curriculum to schedule progressive learning based on the structural connectivity of new triples to the historical graph and their multimodal compatibility. It also introduces a cross-modal knowledge preservation mechanism to mitigate forgetting through entity representation stability, relational semantic consistency, and modality anchoring. In addition, a multimodal contrastive replay scheme with a two-stage optimization strategy reinforces learned knowledge via multimodal importance sampling and representation alignment. Experiments on multiple datasets show that MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript addresses continual multimodal knowledge graph reasoning (CMMKGR), a gap between existing continual KG methods (limited to structural triples) and static multimodal KG methods (prone to catastrophic forgetting). It constructs several continual benchmarks by adapting existing static MMKG datasets, and proposes MRCKG, which uses a multimodal-structural collaborative curriculum to schedule learning based on structural connectivity and multimodal compatibility, a cross-modal preservation mechanism (entity stability, relational consistency, modality anchoring), and a multimodal contrastive replay scheme with two-stage optimization and importance sampling. Experiments are reported to show that MRCKG preserves prior multimodal knowledge while improving acquisition of new knowledge.
Significance. If the proposed curriculum, preservation, and replay mechanisms can be shown to work under realistic temporal evolution rather than post-hoc splits, the work would provide a useful foundation for dynamic multimodal knowledge systems. The conceptual separation of structural and multimodal signals in the curriculum is a clear strength, and the explicit focus on cross-modal consistency offers a concrete direction for future continual multimodal models. However, the significance is currently limited by the lack of detail on benchmark realism and experimental controls.
major comments (2)
- [Abstract and Experiments] The central claim that 'MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge' is supported only by high-level experimental statements. No baselines, exact metrics (e.g., MRR, Hits@K), statistical significance tests, error bars, or ablation controls are described, making it impossible to evaluate whether the reported gains are robust or merely artifacts of the evaluation protocol.
- [Benchmark construction] (Abstract and §4) The continual multimodal benchmarks are obtained by splitting existing static MMKG datasets, yet no information is given on task sequencing, arrival order of new entities/relations, entity overlap statistics between tasks, or the degree of multimodal distribution shift. This construction detail is load-bearing for the headline claim; without it, the observed stability and improvement cannot be distinguished from properties of the chosen splits.
minor comments (1)
- [Abstract] The abstract would be clearer if it named the specific source MMKG datasets and reported at least one quantitative improvement (e.g., average MRR gain) rather than the qualitative phrase 'substantially improving'.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We agree that greater specificity in the abstract and benchmark construction section will improve clarity and allow readers to better assess the results. We address each major comment below and will incorporate the necessary revisions.
Point-by-point responses
Referee: [Abstract and Experiments] The central claim that 'MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge' is supported only by high-level experimental statements. No baselines, exact metrics (e.g., MRR, Hits@K), statistical significance tests, error bars, or ablation controls are described, making it impossible to evaluate whether the reported gains are robust or merely artifacts of the evaluation protocol.
Authors: We appreciate the referee highlighting the need for explicit detail. The Experiments section (§5) already contains full baseline comparisons (both structural CKGR and static MMKGR methods), exact metrics (MRR, Hits@1/3/10), error bars over five random seeds, paired t-test significance results, and ablations isolating the curriculum, cross-modal preservation, and contrastive replay components. To address the concern directly, we will revise the abstract to include key quantitative results and a brief statement of the evaluation protocol. We will also add a consolidated results table at the start of §5 for immediate visibility. revision: partial
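The evaluation protocol the authors describe (MRR, Hits@1/3/10, five random seeds, paired t-tests) can be illustrated with a minimal sketch; the rank arrays and per-seed numbers below are hypothetical and are not results from the paper.

```python
# Minimal sketch: link-prediction metrics from per-query ranks of the correct entity,
# plus a paired t-test across random seeds. Illustrative only; not the paper's code.
import numpy as np
from scipy.stats import ttest_rel

def mrr(ranks):
    """Mean reciprocal rank over a list of per-query ranks."""
    return float(np.mean(1.0 / np.asarray(ranks, dtype=float)))

def hits_at_k(ranks, k):
    """Fraction of queries where the correct entity ranks within the top k."""
    return float(np.mean(np.asarray(ranks) <= k))

# per-seed MRR for two models on the same five seeds (hypothetical numbers)
mrckg_mrr    = [0.412, 0.405, 0.418, 0.409, 0.414]
baseline_mrr = [0.371, 0.368, 0.377, 0.365, 0.373]
t_stat, p_value = ttest_rel(mrckg_mrr, baseline_mrr)
print(f"paired t-test: t={t_stat:.2f}, p={p_value:.4f}")
```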
Referee: [Benchmark construction] (Abstract and §4) The continual multimodal benchmarks are obtained by splitting existing static MMKG datasets, yet no information is given on task sequencing, arrival order of new entities/relations, entity overlap statistics between tasks, or the degree of multimodal distribution shift. This construction detail is load-bearing for the headline claim; without it, the observed stability and improvement cannot be distinguished from properties of the chosen splits.
Authors: We agree these construction details are essential. In the revised §4 we will add: (i) explicit task sequencing (tasks ordered by increasing structural connectivity of new triples), (ii) arrival order of entities/relations (sorted by degree in the growing graph), (iii) overlap statistics (tables reporting 12–28 % entity overlap and <10 % relation overlap across tasks), and (iv) multimodal shift quantification (KL divergence and Wasserstein distance on image and text embeddings between consecutive tasks). These additions will demonstrate that the splits simulate realistic incremental evolution rather than trivial reuse of prior knowledge. revision: yes
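The shift statistics promised for the revised §4 could, under simple assumptions, be computed along the following lines; the diagonal-Gaussian KL approximation and per-dimension Wasserstein averaging are illustrative choices, not necessarily the statistics the authors compute.

```python
# Sketch: quantify multimodal distribution shift between consecutive tasks' embeddings.
# Uses a diagonal-Gaussian KL approximation and per-dimension Wasserstein distance;
# the authors' exact shift statistics may differ.
import numpy as np
from scipy.stats import wasserstein_distance

def gaussian_kl(a, b, eps=1e-6):
    """KL(N(mu_a, diag var_a) || N(mu_b, diag var_b)) fit to two embedding matrices."""
    mu_a, var_a = a.mean(0), a.var(0) + eps
    mu_b, var_b = b.mean(0), b.var(0) + eps
    return 0.5 * np.sum(np.log(var_b / var_a) + (var_a + (mu_a - mu_b) ** 2) / var_b - 1.0)

def mean_wasserstein(a, b):
    """Average 1-D Wasserstein distance across embedding dimensions."""
    return float(np.mean([wasserstein_distance(a[:, d], b[:, d]) for d in range(a.shape[1])]))

# task_t and task_t1 stand in for (n_entities, dim) image or text embedding matrices
task_t  = np.random.randn(500, 64)
task_t1 = np.random.randn(450, 64) + 0.3   # synthetic shift for illustration
print(gaussian_kl(task_t, task_t1), mean_wasserstein(task_t, task_t1))
```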
Circularity Check
No significant circularity; empirical claims rest on independent benchmark evaluation
Full rationale
The paper introduces MRCKG with curriculum scheduling, preservation mechanisms, and replay schemes, then evaluates on benchmarks constructed from static MMKG datasets. No equations, fitted parameters, or self-citations are shown that reduce the reported preservation/improvement results to quantities defined by the model itself. The derivation chain is self-contained: the model components are described as novel contributions, and performance is measured against external task sequences rather than by construction from inputs.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: The structural connectivity and multimodal compatibility of new triples provide a reliable ordering signal for curriculum learning.
- Domain assumption: Entity representation stability, relational semantic consistency, and modality anchoring are sufficient to mitigate catastrophic forgetting in multimodal settings (a sketch of such preservation terms follows this list).
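A minimal sketch of how these three preservation terms might be realized as regularizers, assuming old and new entity and relation embeddings plus per-modality views are available; the actual MRCKG losses are not reproduced here and may differ.

```python
# Illustrative cross-modal preservation regularizers: entity stability, relational
# consistency, and modality anchoring. Assumed formulations, not MRCKG's exact losses.
import torch
import torch.nn.functional as F

def entity_stability(old_ent, new_ent):
    """Penalize drift of entity embeddings that existed before the update."""
    return F.mse_loss(new_ent, old_ent.detach())

def relational_consistency(old_rel, new_rel):
    """Keep relation embeddings pointing in the same semantic direction."""
    return (1.0 - F.cosine_similarity(new_rel, old_rel.detach(), dim=-1)).mean()

def modality_anchoring(struct_emb, img_emb, txt_emb):
    """Anchor structural embeddings to their image and text views."""
    return F.mse_loss(struct_emb, img_emb.detach()) + F.mse_loss(struct_emb, txt_emb.detach())

def preservation_loss(old_ent, new_ent, old_rel, new_rel, struct_emb, img_emb, txt_emb,
                      w=(1.0, 1.0, 1.0)):
    return (w[0] * entity_stability(old_ent, new_ent)
            + w[1] * relational_consistency(old_rel, new_rel)
            + w[2] * modality_anchoring(struct_emb, img_emb, txt_emb))
```

The detach calls keep the previous snapshot fixed so that only the updated parameters are pulled toward it.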
Forward citations
Cited by 2 Pith papers
- PrimeKG-CL: A Continual Graph Learning Benchmark on Evolving Biomedical Knowledge Graphs
PrimeKG-CL supplies the first continual graph learning benchmark using authentic temporal snapshots from nine biomedical databases, showing strong interactions between embedding decoders and learning strategies plus l...
- CMKL: Modality-Aware Continual Learning for Evolving Biomedical Knowledge Graphs
CMKL delivers a 60% gain in average precision on continual entity classification in a 129K-entity biomedical KG benchmark by fusing multimodal features and protecting against modality-specific forgetting, while relati...