DN-Hypo-Pipeline: An AI-Driven Workflow for Generating Hypotheses using Large Language Models and Scientific Explanations
Pith reviewed 2026-06-27 18:58 UTC · model grok-4.3
The pith
A workflow that applies the structure of scientific explanations lets large language models generate hypotheses that outperform those from direct prompting and can be turned into working algorithms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The DN-Hypo-Pipeline adopts a layered scaffold in which Hempel's deductive-nomological model supplies the output form and deductive validity, Salmon's causal-process account organizes the search for governing laws, and Armstrong's view of laws as relations between universals bridges from a phenomenon's constituent processes to candidate laws. Given an explanandum, the workflow abstracts the universals instantiated in the formation process, retrieves the laws that relate those universals, and deductively reconstructs a new, testable explanation. Hypotheses produced this way significantly outperform those from direct prompting in expert and LLM judgments, and the two top hypotheses were transl
What carries the argument
The DN-Hypo-Pipeline, a layered explanation-theoretic scaffold that combines Hempel's DN model for hypothesis form, Salmon's causal-process account for search constraints, and Armstrong's universals-relations view to connect processes to laws, so that the LLM abstracts universals, retrieves laws, and deductively generates new hypotheses.
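The three steps named above (abstract universals, retrieve laws, reconstruct an explanation) can be sketched as a small chain of model calls. This is a minimal sketch and not the authors' implementation: the `LLM` callable, the `DNHypothesis` record, the prompt tags, and the `canned_llm` stand-in are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class DNHypothesis:
    explanandum: str       # phenomenon to be explained
    universals: List[str]  # universals abstracted from its formation process
    laws: List[str]        # candidate laws relating those universals
    explanation: str       # argument from laws and conditions to the explanandum


def split_lines(text: str) -> List[str]:
    """Return the non-empty, stripped lines of a completion."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def generate_hypothesis(explanandum: str, llm: LLM) -> DNHypothesis:
    # Step 1: abstract the universals instantiated in the formation process.
    universals = split_lines(llm("UNIVERSALS\n" + explanandum))
    # Step 2: retrieve laws that relate those universals.
    laws = split_lines(llm("LAWS\n" + "\n".join(universals)))
    # Step 3: reconstruct an explanation; nothing here verifies entailment.
    explanation = llm("EXPLAIN\n" + "\n".join(laws) + "\n" + explanandum).strip()
    return DNHypothesis(explanandum, universals, laws, explanation)


def canned_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline (assumption)."""
    if prompt.startswith("UNIVERSALS"):
        return "pairwise interaction\n\nsequence position\n"
    if prompt.startswith("LAWS"):
        return "convolution theorem\n"
    return "Given the law and the stated conditions, the phenomenon follows.\n"


demo = generate_hypothesis("Attention cost grows with sequence length.", canned_llm)
print(demo.universals, demo.laws)
```

Each step consumes only the previous step's output, so the search is steered by the abstracted universals rather than by the literature directly. The sketch makes no attempt to check that the final explanation is deductively valid.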
If this is right
- Hypotheses generated through this principled reasoning significantly outperform those from direct prompting when judged by both LLMs and human experts.
- The two highest-scoring hypotheses were translated into novel algorithms, one reducing the Transformer's theoretical complexity with only minimal performance loss and another achieving competitive accuracy with substantially fewer parameters.
- The framework searches the space of principles that govern a phenomenon rather than the space of what has already been written.
- The layered scaffold supplies output form and deductive validity from Hempel, search constraints from Salmon, and the bridge from processes to laws from Armstrong.
Where Pith is reading between the lines
- The same workflow could be tested on phenomena outside data science, such as physical or biological systems, to see whether the philosophical scaffolding transfers.
- If the deductive reconstructions remain valid across domains, the approach might support iterative refinement where a generated hypothesis is tested and the results fed back to update the universals or laws.
- One could examine whether the method produces hypotheses that are more readily falsifiable in experiment than those from unguided prompting.
Load-bearing premise
The three cited philosophical accounts of explanation can be operationalized into an LLM workflow that reliably abstracts universals from a phenomenon's formation process, retrieves governing laws, and produces deductively valid novel hypotheses.
What would settle it
A side-by-side evaluation in which human experts or LLMs rate direct-prompt hypotheses as equal to or better than those from the DN-Hypo-Pipeline, or in which the two translated Transformer algorithms fail to show the claimed reductions in complexity or parameter count while preserving accuracy.
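One way such a side-by-side evaluation could be scored is an exact paired sign-flip (randomization) test on per-item score differences. This is a sketch of a generic technique, not the statistical procedure used in the paper, and the scores below are invented placeholders rather than reported results.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_p_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact one-sided p-value for sum(a - b) under random sign flips.

    Under the null hypothesis that the two methods are exchangeable, each
    paired difference is equally likely to be positive or negative.
    """
    if len(a) != len(b):
        raise ValueError("score lists must be paired")
    diffs = [x - y for x, y in zip(a, b)]
    observed = sum(diffs)
    at_least_as_large = 0
    total = 0
    # Enumerate all 2^n sign assignments; only feasible for small n.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = sum(s * d for s, d in zip(signs, diffs))
        if flipped >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Placeholder ratings for six hypothetical explananda (not data from the paper).
pipeline_scores = [7, 8, 6, 9, 7, 8]
direct_scores = [6, 6, 5, 7, 7, 6]
p_value = paired_sign_flip_p_value(pipeline_scores, direct_scores)
print(p_value)
```

A large p-value in such a comparison would be the outcome that counts against the paper's claim. Full enumeration is exponential in the number of pairs, so a larger study would sample sign assignments instead.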
Original abstract
Modern artificial intelligence excels at prediction but cannot explain. From large language models to AI-for-science systems, today's machines answer what by recombining patterns already present in the human literature, yet they cannot reason out why a phenomenon must arise from underlying principles even though explanation, not prediction, lies at the heart of scientific discovery. Here we ask whether the structure of scientific explanation can be operationalized to guide how a machine generates hypotheses. We introduce DN-Hypo-Pipeline, a hypothesis-generation framework that adopts a layered, explanation-theoretic scaffold: Hempel's deductive-nomological (DN) model supplies the output form and deductive validity of a hypothesis, Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws, and Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws that may be associated with it. Rather than searching the space of what has been written, the framework searches the space of what principles govern a phenomenon: given an explanandum, it abstracts the universals instantiated in the phenomenon's formation process, retrieves the laws relating those universals, and deductively reconstructs a new, testable explanation. Evaluated in data-science modeling and judged by both LLMs and human experts, hypotheses generated through this principled reasoning significantly outperform those from direct prompting. Crucially, we translated the two highest-scoring hypotheses into novel algorithms one that reduces the Transformer's theoretical complexity with only minimal performance loss, and another that achieves competitive accuracy with substantially fewer parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the DN-Hypo-Pipeline, a hypothesis-generation framework that operationalizes Hempel's deductive-nomological (DN) model for output form and deductive validity, Salmon's causal-process account for constraining the search for governing laws, and Armstrong's universals-relations view for bridging phenomena to laws. Given an explanandum, the pipeline abstracts universals from formation processes, retrieves laws, and deductively reconstructs novel testable explanations. It claims that this yields hypotheses that significantly outperform direct prompting in data-science modeling (judged by LLMs and humans), and that the two highest-scoring hypotheses were translated into novel algorithms: one reducing the Transformer's theoretical complexity with minimal performance loss, the other achieving competitive accuracy with fewer parameters.
Significance. If the claims hold with demonstrated deductive validity and quantitative support, the work could advance AI-for-science by providing a structured, philosophy-grounded alternative to pattern-recombination approaches in LLMs. The explicit linkage of three philosophical accounts to a concrete workflow and the downstream translation of hypotheses into algorithms are potential strengths if rigorously evidenced.
major comments (2)
- [Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.
- [Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.
minor comments (1)
- [Abstract] The sentence 'we translated the two highest-scoring hypotheses into novel algorithms one that reduces...' lacks punctuation (e.g., a colon or period) for readability.
Simulated Author's Rebuttal
Thank you for the constructive referee report. We address each major comment below. Where the comments identify gaps in the abstract or need for clarification on the framework, we will revise accordingly while preserving the core contributions.
Point-by-point responses
- Referee: [Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.
  Authors: We agree the abstract presents the claims at a high level. The full manuscript (Sections 4–6) reports the experimental protocol, including LLM and human evaluation metrics for hypothesis quality, direct-prompting baselines, controls for prompt length and temperature, and statistical tests (e.g., paired t-tests) showing significant differences. The algorithm translations include complexity analysis and accuracy comparisons on standard benchmarks. We will revise the abstract to include concise quantitative summaries and pointers to these sections so the claims can be assessed from the abstract alone.
  Revision: yes
- Referee: [Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.
  Authors: The observation is accurate: the pipeline implements the DN structure through LLM-guided natural-language steps rather than a formal theorem prover or entailment verifier. We will revise the abstract and framework sections to state explicitly that the workflow operationalizes the DN model heuristically (structuring prompts to encourage deductive reconstruction) while acknowledging that it does not guarantee formal logical validity. This change clarifies the distinction between philosophical inspiration and formal proof without altering the reported empirical results.
  Revision: yes
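To make concrete what an entailment check would add, here is a toy truth-table test for propositional arguments. It is a sketch under strong assumptions: the laws, conditions, and explanandum are already formalized as Boolean functions over named atoms, which is precisely the step a natural-language workflow does not provide. The `entails` helper and the atoms `C` and `E` are hypothetical and do not come from the paper.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List

# A formula is any function from a truth assignment to a truth value.
Formula = Callable[[Dict[str, bool]], bool]


def entails(premises: Iterable[Formula], conclusion: Formula, atoms: List[str]) -> bool:
    """Return True if every assignment satisfying all premises satisfies the conclusion."""
    premise_list = list(premises)
    for values in product((False, True), repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premise_list) and not conclusion(world):
            return False  # counterexample: premises hold, conclusion fails
    return True


# Toy DN argument: law "if C then E", initial condition "C", explanandum "E".
law = lambda w: (not w["C"]) or w["E"]
condition = lambda w: w["C"]
explanandum = lambda w: w["E"]

with_condition = entails([law, condition], explanandum, ["C", "E"])
without_condition = entails([law], explanandum, ["C", "E"])
print(with_condition, without_condition)
```

Dropping the initial condition breaks the entailment, which is the kind of silently missing premise the referee is concerned about. Real scientific laws generally need richer logics than propositional truth tables, so this only illustrates the shape of the check.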
Circularity Check
No circularity: framework operationalizes external philosophical accounts into empirical workflow
Full rationale
The paper's derivation chain consists of defining a hypothesis-generation pipeline that adopts Hempel's DN model for deductive form, Salmon's account for search constraints, and Armstrong's universals for bridging processes to laws, then applies this scaffold via LLMs to abstract universals, retrieve laws, and reconstruct explanations from an explanandum. Evaluation compares generated hypotheses against direct prompting baselines, with downstream translation to algorithms presented as empirical outcomes. No equations, fitted parameters, or self-citations appear in the provided text; the central claims rest on the operationalization of externally cited philosophical sources rather than any reduction of outputs to inputs by construction. The workflow is self-contained against external benchmarks (LLM and human expert judgments) without load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
axioms (3)
- domain assumption: Hempel's deductive-nomological model supplies the output form and deductive validity of a hypothesis
- domain assumption: Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws
- domain assumption: Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws
invented entities (1)
- DN-Hypo-Pipeline: no independent evidence
Reference graph
Works this paper leans on
- [1] Experimental Results Analysis and Discussion. The experiments show that not only can LLMs propose hypotheses, but also the best hypotheses generated by DN-Hypo-Pipeline improved the aggregate sum of scores by an average of more than 6 points (53 - 46.33 = 6.67) across a total of 80 scores (4 LLMs * total 20 scores), as outlined in Table 6. Additionally, Figs....
- [2] Limitations and Conclusions. The limitations of our approach largely parallel the limitations of LLMs. When using an LLM to generate open-ended ontologies, such as when generating universals and laws, hallucinations can manifest in a particularly stubborn and intractable form. Hence, when a model is required to construct a complete conceptual system from...
- [75] G.E. Karniadakis, I.G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nat. Rev. Phys. 3 (2021) 422–440. https://doi.org/10.1038/s42254-021-00314-5.
Appendix A. Continuous-Time Attention Transformer
A.1 Overall Architecture
CTAT (Continuous-Time Attention Transformer) Models: Instead of using positional encoding and...
- [76] Dense Softmax, which explicitly computes O(L^2) attention scores and adds a Gaussian distance kernel bias.
- [77] Dense Linear (ELU+1), which is a linear form of attention that explicitly computes the kernel-weighted sum in O(L^2).
- [78] FFT Linear, which uses the convolution theorem and a Fast Fourier Transform to reduce the complexity of the linear attention to O(L log L).
- [79] Gauss-Legendre finite-interval approximation, which approximates the continuous-time integral using a fixed number of quadrature nodes but is restricted to a learnable causal window, at cost O(LM), where M is the number of interpolation nodes.
A.2 Key Mathematical Definitions
A.2.1 Gaussian Distance Kernel
For positions i and j (j ≤ i), the time distance is r =...
- [80] [First step in the logical deduction connecting laws and conditions to the result.]
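The complexity gap between the Dense Linear and FFT Linear variants listed above rests on the convolution theorem: a causal kernel-weighted sum is a convolution, so it can be computed with FFTs instead of a double loop. The sketch below illustrates only that step and is not the paper's CTAT code. The kernel form `exp(-r^2 / (2 * sigma^2))` with `r = i - j` is an assumption (the source definition is truncated), and all function names are hypothetical.

```python
import cmath
import math
from typing import List


def fft(values: List[complex], inverse: bool = False) -> List[complex]:
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def causal_sum_direct(x: List[float], kernel: List[float]) -> List[float]:
    """O(L^2): y[i] = sum over j <= i of kernel[i - j] * x[j]."""
    return [sum(kernel[i - j] * x[j] for j in range(i + 1)) for i in range(len(x))]


def causal_sum_fft(x: List[float], kernel: List[float]) -> List[float]:
    """O(L log L): the same sums via zero-padded FFT convolution."""
    length = len(x)
    size = 1
    while size < 2 * length:  # pad so circular convolution equals linear
        size *= 2
    fx = fft([complex(v) for v in x] + [0j] * (size - length))
    fk = fft([complex(v) for v in kernel] + [0j] * (size - length))
    spectrum = [a * b for a, b in zip(fx, fk)]
    result = fft(spectrum, inverse=True)
    return [result[i].real / size for i in range(length)]


x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5]
sigma = 2.0  # assumed kernel width
kernel = [math.exp(-(r * r) / (2 * sigma * sigma)) for r in range(len(x))]
direct = causal_sum_direct(x, kernel)
fast = causal_sum_fft(x, kernel)
print(max(abs(a - b) for a, b in zip(direct, fast)))
```

The two routines agree up to floating-point error. The FFT route only applies when the weights depend on the distance `i - j` alone, which is why a distance kernel fits it while content-dependent softmax scores do not.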