Fine-Tune Language Models as Multi-Modal Differential Equation Solvers

Liu Yang; Siting Liu; Stanley J. Osher

arxiv: 2308.05061 · v4 · pith:N2A7IWD3new · submitted 2023-08-09 · cs.LG · cs.NA · math.NA · stat.ML

classification cs.LG cs.NA math.NA stat.ML

keywords learning, operator, language, models, in-context, data, multi-modal, differential


In the growing domain of scientific machine learning, in-context operator learning has shown notable potential for building foundation models: in this framework, the model is trained to learn operators and solve differential equations from prompted data at inference time, without weight updates. However, the current model's overdependence on function data overlooks valuable human insight into the operator. To address this, we transform in-context operator learning into a multi-modal paradigm. In particular, taking inspiration from the recent success of large language models, we propose using "captions" to integrate human knowledge about the operator, expressed through natural language descriptions and equations. We also introduce a novel approach to train a language-model-like architecture, or directly fine-tune existing language models, for in-context operator learning. We beat the baseline on single-modal learning tasks and demonstrate the effectiveness of multi-modal learning in enhancing performance and reducing function data requirements. The proposed method not only significantly advances the in-context operator learning paradigm but also creates a new path for the application of language models.
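As a rough illustration of what a multi-modal prompt might contain, the sketch below lays out an optional caption, a few example function pairs that share one operator, and a new question, all as a single ordered sequence. It is only a data-layout sketch, not the authors' implementation: every name in it (`Example`, `Prompt`, `to_sequence`, `condition`, `qoi`) is a hypothetical choice made for this illustration, and no model or tokenizer is involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# All names here are hypothetical and exist only for this illustration.
Samples = List[Tuple[float, float]]  # (input location, function value) pairs


@dataclass
class Example:
    condition: Samples  # sampled input function of the operator
    qoi: Samples        # sampled output function (quantity of interest)


@dataclass
class Prompt:
    examples: List[Example]        # function data generated by one shared operator
    question_condition: Samples    # new input function to be solved for
    query_locations: List[float]   # locations where the answer is requested
    caption: Optional[str] = None  # optional text or equation describing the operator


def to_sequence(prompt: Prompt) -> List[Tuple[str, object]]:
    """Flatten a prompt into an ordered list of (kind, payload) items."""
    seq: List[Tuple[str, object]] = []
    if prompt.caption is not None:
        seq.append(("caption", prompt.caption))
    for ex in prompt.examples:
        seq.append(("condition", ex.condition))
        seq.append(("qoi", ex.qoi))
    seq.append(("question", prompt.question_condition))
    seq.append(("query", prompt.query_locations))
    return seq


def sample(f: Callable[[float], float], xs: List[float]) -> Samples:
    """Evaluate f at each location in xs."""
    return [(x, f(x)) for x in xs]


# Toy operator, assumed for the demo: v(x) = 2 * u(x).
xs = [0.0, 0.5, 1.0]
prompt = Prompt(
    examples=[Example(sample(lambda x: x, xs), sample(lambda x: 2 * x, xs))],
    question_condition=sample(lambda x: x * x, xs),
    query_locations=xs,
    caption="The operator doubles its input function: v(x) = 2 u(x).",
)
kinds = [kind for kind, _ in to_sequence(prompt)]
```

Setting `caption=None` gives the single-modal layout, with function data only. Supplying a caption is one way to picture how textual knowledge about the operator could reduce the number of example pairs a model needs.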


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Multiple Neural Operators Achieve Near-Optimal Rates for Multi-Task Learning
cs.LG 2026-05 unverdicted novelty 5.0

Multiple Neural Operators achieve near-optimal approximation and generalization rates for multi-task operator learning, matching single-task scaling laws and performing similarly to a multi-task DeepONet extension.