Objective-Behavior Alignment: Diagnostics for MORL Policy Selection

Antonio Mone; Florian Felten; Frans A. Oliehoek; Luciano Cavalcante Siebert; Mark Fuge; Pradeep K. Murukannaiah; Zuzanna Osika

arxiv: 2606.21321 · v1 · pith:CTXOBTUXnew · submitted 2026-06-19 · 💻 cs.LG

Pith reviewed 2026-06-26 14:32 UTC · model grok-4.3

keywords: multi-objective reinforcement learning, Pareto front, policy selection, behavioral diagnostics, MORL, objective alignment, policy inspection

The pith

Policies achieving similar objective trade-offs in multi-objective reinforcement learning can still differ substantially in their actual behaviors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

In multi-objective reinforcement learning, sets of policies are generated to represent different trade-offs between competing goals. These policies are usually evaluated only by their expected returns on each objective, which can make two policies look identical even when they produce very different sequences of actions. The paper introduces a diagnostic workflow that automatically detects and visualizes these hidden behavioral differences along the Pareto front. A sympathetic reader would care because decision makers in real applications need to choose policies based on more than just numbers if the behaviors have different practical consequences. The workflow provides both quantitative measures and visual tools to support such inspection.

Core claim

The paper claims that value vectors alone can obscure substantial behavioral variation among policies on the Pareto front in MORL, and introduces an exploratory diagnostic workflow that highlights this variation using quantitative and visual tools, validated on gridworld examples and continuous control benchmarks.

What carries the argument

The exploratory diagnostic workflow that automatically highlights behavioral variation along the Pareto front.

Load-bearing premise

That policies with similar value vectors exhibit substantial behavioral variation that the diagnostic workflow can detect and present usefully.

What would settle it

Running the workflow on a set of policies known to have identical behaviors and similar values and confirming that it reports no variation, or finding that it fails to detect differences in cases where behaviors clearly differ.

Figures

Figures reproduced from arXiv: 2606.21321 by Antonio Mone, Florian Felten, Frans A. Oliehoek, Luciano Cavalcante Siebert, Mark Fuge, Pradeep K. Murukannaiah, Zuzanna Osika.

**Figure 1.** Left–Right DST. Motivating Example. To illustrate why evaluating behavioral dynamics alongside objective trade-offs is essential, we introduce a modified version of the well-established Deep Sea Treasure (DST) benchmark (Vamplew et al., 2011; Felten et al., 2022). In the standard DST task, an agent controls a submarine in a grid world to balance treasure value against time-to-target. We propose a variati…

**Figure 2.** Overview of the proposed model-agnostic workflow. It extracts trajectories and expected returns…

**Figure 3.** Lipschitz scatter plot with zones. Lipschitz scatterplots: Since the above metrics are based on neighborhood rank orderings, they provide only a generalized view of local structure preservation and may not capture the magnitude of distance changes. To address this, we introduce scatterplots inspired by the concept of Lipschitz continuity (Cobzaş et al., 2019). A function f is Lipschitz continuous if its r…

**Figure 4.** Pareto front and behavioral embeddings for Left–Right DST. Colours indicate directional behavior (left vs. right). Both manual and transformer embeddings produce consistent behavioral clustering. Another component of our assessment is a qualitative analysis of local relationships between policies using the Lipschitz-inspired scatter plots. In Left–Right DST, the largest discrepancies occur between policies…

**Figure 5.** Pareto front and behavioral embeddings for Smooth DST. Both manual and transformer embeddings produce consistent behavioral clustering.

**Figure 6.** Distances between consecutive policies in the objective and behavior space (mean over random seeds 0–4). Please note that the axes have different scales in the two plots. 4.3 MuJoCo environments: Having demonstrated the effectiveness of our approach on DST, we extend the analysis to more complex environments: 2-objective MO-HalfCheetah and 3-objective MO-Hopper. As shown in table 2, trustworthiness…

**Figure 7.** Pareto fronts and mean transformer embedding distances between consecutive policies in the objective and behavior space for MO-HalfCheetah and MO-Hopper. Highlighted policy pairs occupy either the critical upper-left region (close in objective space, far in behavior space) or exhibit large distances in both spaces, and are selected for further trajectory analysis. Please note that the axes have different s…

**Figure 8.** Smooth DST.

**Figure 9.** Left–Right DST.

**Figure 10.** Distances between consecutive policies over the PF in the objective and behavior space (mean…

**Figure 11.** Distances between consecutive policies over the PF in the objective and behavior spaces across…

**Figure 12.** Distances between consecutive policies over the PF in the objective and behavior spaces across…

**Figure 13.** Distances between consecutive policies over the PF in the objective and behavior spaces across…

**Figure 14.** Distances between consecutive policies over the PF in the objective and behavior spaces across…
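
The Figure 3 caption mentions scatterplots inspired by Lipschitz continuity, where a map f is Lipschitz continuous with constant K if the distance between f(x) and f(y) never exceeds K times the distance between x and y. A minimal sketch of that idea, assuming each policy has a value vector and a behavior embedding and that Euclidean distance is used in both spaces (both are assumptions; the function name `pairwise_ratios` and the toy data are hypothetical and not taken from the paper), might look like this:

```python
import math
from itertools import combinations

def pairwise_ratios(values, behaviors, eps=1e-12):
    """For every policy pair, return (i, j, d_obj, d_beh, ratio).

    ratio = behavior distance / objective distance. A pair that coincides in
    objective space but differs in behavior gets an infinite ratio; a pair that
    coincides in both spaces gets 0.0.
    """
    rows = []
    for i, j in combinations(range(len(values)), 2):
        d_obj = math.dist(values[i], values[j])
        d_beh = math.dist(behaviors[i], behaviors[j])
        if d_obj > eps:
            ratio = d_beh / d_obj
        elif d_beh > eps:
            ratio = math.inf
        else:
            ratio = 0.0
        rows.append((i, j, d_obj, d_beh, ratio))
    return rows

# Hypothetical toy data: three policies, 2-D value vectors, 1-D behavior embeddings.
values = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
behaviors = [(0.0,), (2.0,), (2.5,)]

rows = pairwise_ratios(values, behaviors)
empirical_k = max(r[4] for r in rows)    # largest observed ratio
worst = max(rows, key=lambda r: r[4])    # the pair that attains it
print(empirical_k, worst[:2])
```

Plotting `d_obj` against `d_beh` for every row gives a scatterplot of the kind the caption describes; pairs with a large ratio are the ones where a small change in objectives goes with a large change in behavior.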

Abstract

Real-world decision-making often requires optimizing multiple competing objectives simultaneously. In reinforcement learning (RL), this is typically addressed by combining reward signals into a single scalar objective via a scalarization function, which can be fragile: small changes in the weights can induce drastically different policies. Multi-objective reinforcement learning (MORL) instead produces sets of policies that explicitly represent trade-offs between objectives. However, these policies are typically presented to the decision maker only through their value vectors, which can obscure substantial behavioral variation: policies that induce distinct trajectories may appear indistinguishable when evaluated solely by expected returns. We propose an exploratory diagnostic workflow that automatically highlights behavioral variation along the Pareto front that objective values alone do not reveal, providing both quantitative and visual tools to support policy inspection. We validate our approach on simple grid examples and scale it to continuous control benchmarks, demonstrating that it remains effective as problem complexity increases.
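
To make the idea of "behavioral variation that objective values alone do not reveal" concrete, here is a minimal sketch of one way to flag it: compare distances between consecutive policies along the front in objective space and in behavior space, and report pairs that are close in the first but far apart in the second. The thresholds `obj_tol` and `beh_min`, the function names, and the toy embeddings are all assumptions made for illustration; the paper's actual metrics and embeddings are not specified in the text above.

```python
import math

def consecutive_distances(points):
    """Euclidean distance between each policy and its successor along the front."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]

def flag_hidden_variation(values, behaviors, obj_tol, beh_min):
    """Return indices i where policies i and i+1 are close in objective space
    (distance <= obj_tol) but far apart in behavior space (distance >= beh_min)."""
    d_obj = consecutive_distances(values)
    d_beh = consecutive_distances(behaviors)
    return [
        i for i, (o, b) in enumerate(zip(d_obj, d_beh))
        if o <= obj_tol and b >= beh_min
    ]

# Hypothetical toy data: four policies sorted along a 2-objective front.
values = [(1.0, -1.0), (2.0, -3.0), (2.1, -3.1), (5.0, -7.0)]
# Hypothetical 2-D behavior embeddings, one per policy.
behaviors = [(0.0, 0.0), (0.5, 0.2), (4.0, 3.0), (4.2, 3.1)]

flagged = flag_hidden_variation(values, behaviors, obj_tol=0.5, beh_min=2.0)
print(flagged)  # [1]: policies 1 and 2 have near-identical values but distant behaviors
```

The flagged pairs correspond to what the Figure 7 caption calls the critical upper-left region (close in objective space, far in behavior space), and would be natural candidates for closer trajectory inspection.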

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper proposes a diagnostic workflow to flag behavioral differences among MORL policies that share similar value vectors, but the abstract supplies almost no implementation details or results.

The main thing to know is that this work introduces an exploratory diagnostic workflow aimed at revealing behavioral variation along MORL Pareto fronts that objective values alone miss, with both quantitative and visual outputs for policy inspection.

It does a reasonable job naming a genuine usability problem: scalarization is fragile and value vectors can hide trajectory differences that matter when a decision maker has to pick one policy. Testing the idea first on grid examples then moving to continuous control benchmarks is a sensible way to check whether the approach holds as complexity grows.

The soft spots are straightforward. The abstract gives no equations, no description of the actual diagnostics or metrics, and no quantitative results or error analysis, so there is no way to judge whether the workflow reliably detects variation or produces useful outputs. Without those specifics it is also hard to tell how much this overlaps with existing RL visualization or trajectory comparison methods. The validation claim is stated but not supported here.

This paper is for people already working in multi-objective RL who need better tools for policy selection in applied settings. A reader focused on that subfield could pick up the core idea and think about how to implement or extend it.

I would send it to peer review. The problem it targets is real and the proposal is a direct attempt to address it, so referees could usefully push on the methods and evidence even if the current version needs substantial work.

Referee Report

1 major / 0 minor

Summary. The paper proposes an exploratory diagnostic workflow for multi-objective reinforcement learning (MORL) that automatically highlights behavioral variation along the Pareto front not revealed by objective value vectors alone. It supplies quantitative and visual tools to support policy inspection and claims validation on grid examples scaled to continuous control benchmarks, showing the workflow remains effective as complexity increases.

Significance. If the workflow reliably detects and presents behavioral differences among policies with similar value vectors, it could meaningfully aid decision-making in applied MORL settings by moving beyond scalarized or vector-valued summaries. The scaling claim to continuous-control domains is a positive indicator of practicality, but the absence of any reported metrics, baselines, or error analysis makes the practical significance difficult to gauge from the manuscript.

major comments (1)

[Abstract] Abstract: the manuscript states that the workflow is validated on grid examples and continuous control benchmarks 'demonstrating that it remains effective as problem complexity increases,' yet supplies no methods, quantitative results, error analysis, or comparison to existing MORL inspection techniques. This directly undermines assessment of the central claim that the diagnostic reveals behaviorally distinct policies.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment below.

Referee: [Abstract] Abstract: the manuscript states that the workflow is validated on grid examples and continuous control benchmarks 'demonstrating that it remains effective as problem complexity increases,' yet supplies no methods, quantitative results, error analysis, or comparison to existing MORL inspection techniques. This directly undermines assessment of the central claim that the diagnostic reveals behaviorally distinct policies.

Authors: The manuscript presents the diagnostic workflow through a series of illustrative case studies on grid environments and continuous-control tasks. These examples include both visual trajectory comparisons and quantitative measures (e.g., divergence metrics between policies that share similar value vectors) to show that behavioral differences exist and can be surfaced by the workflow. We acknowledge, however, that the abstract's claim of demonstrating effectiveness as complexity increases is stated without accompanying error bars, statistical tests, or explicit comparisons to prior MORL inspection methods. We will revise the abstract to describe the validation as exploratory and illustrative rather than comprehensive, and we will add a dedicated limitations subsection that discusses the absence of baselines and outlines directions for more rigorous quantitative evaluation.

revision: partial
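
The response refers to "divergence metrics between policies" without naming one. As a purely illustrative stand-in, the sketch below computes the Jensen-Shannon divergence between the empirical state-visitation distributions of two policies in a gridworld. The choice of Jensen-Shannon divergence, the pooling of visits across trajectories, and the toy trajectories are all assumptions, not the authors' method.

```python
import math
from collections import Counter

def visitation_distribution(trajectories):
    """Empirical distribution over visited states, pooled across trajectories."""
    counts = Counter(state for traj in trajectories for state in traj)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete
    distributions given as {state: probability} dicts."""
    def kl(a, m):
        # KL(a || m); m is positive wherever a is, so the ratio is well defined.
        return sum(pa * math.log2(pa / m[s]) for s, pa in a.items() if pa > 0)

    support = set(p) | set(q)
    m = {s: 0.5 * (p.get(s, 0.0) + q.get(s, 0.0)) for s in support}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical gridworld trajectories as lists of (row, col) cells.
left_policy = [[(0, 2), (0, 1), (0, 0)]]
right_policy = [[(0, 2), (0, 3), (0, 4)]]

p = visitation_distribution(left_policy)
q = visitation_distribution(right_policy)
print(js_divergence(p, p))  # identical behavior gives 0.0
print(js_divergence(p, q))  # partially overlapping visits give a value between 0 and 1
```

With log base 2 the divergence is bounded between 0 and 1, which makes values comparable across policy pairs; a measure that also accounts for the order of visited states would need a different construction.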

Circularity Check

0 steps flagged

No significant circularity

The paper presents a methodological proposal for an exploratory diagnostic workflow in MORL without equations, fitted parameters, derivations, or self-citation chains that reduce claims to inputs by construction. Validation is descriptive on gridworlds and benchmarks; no load-bearing steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities.

Reference graph

Works this paper leans on

93 extracted references · 23 canonical work pages · 1 internal anchor

[1] Jarkko Venna and Samuel Kaski. Artificial Neural Networks —. 2001.
[2] W. Bradley Knox, Alessandro Allievi, Holger Banzhaf, Felix Schmitt, and Peter Stone. Reward (Mis)design for autonomous driving. 2023. doi:10.1016/j.artint.2022.103829
[3] The Perils of Trial-and-Error Reward Design: Misdesign through Overfitting and Invalid Task Specifications. Proceedings of the AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/aaai.v37i5.25733
[4] Emanuel Todorov, Tom Erez, and Yuval Tassa. 2012. doi:10.1109/IROS.2012.6386109
[5] A. Journal of Artificial Intelligence Research, 2013. doi:10.1613/jair.3987
[6] Outracing champion. Nature, 2022. doi:10.1038/s41586-021-04357-7
[7] Hyeon Jeon, Yun-Hsin Kuo, Michael Aupetit, Kwan-Liu Ma, and Jinwook Seo. 2024. doi:10.1109/TVCG.2023.3327187
[8] Lipschitz regularity of deep neural networks: analysis and efficient estimation. Advances in Neural Information Processing Systems.
[9] Grigory Khromov and Sidak Pal Singh. Proceedings of the International Conference on Learning Representations (ICLR).
[10] Lipschitz functions. 2019.
[11] Florian Felten, Umut Ucak, Hicham Azmani, Gao Peng, Willem Röpke, Hendrik Baier, Patrick Mannion, Diederik M. Roijers, Jordan K. Terry, El-Ghazali Talbi, Grégoire Danoy, Ann Nowé, and Roxana Rădulescu. doi:10.48550/arXiv.2407.16312
[12] Florian Felten. Multi-
[13] Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft. Proceedings of the 35th
[14] IEEE Transactions on Evolutionary Computation, 2007. doi:10.1109/TEVC.2007.892759
[15] Multi-objective reinforcement learning based on decomposition: A taxonomy and framework. Journal of Artificial Intelligence Research.
[16] George D. Greenwade. The Comprehensive TeX Archive Network (CTAN). TUGBoat, 1993.
[17] Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 1987.
[18] Faster k-medoids clustering: improving the PAM, CLARA, and CLARANS algorithms. Similarity Search and Applications: 12th International Conference, SISAP 2019, Newark, NJ, USA, October 2–4, 2019, Proceedings, 2019.
[19] Finding knees in multi-objective optimization. Parallel Problem Solving from Nature-PPSN VIII: 8th International Conference, Birmingham, UK, September 18-22, 2004, Proceedings, 2004.
[20] Dynamic weights in multi-objective deep reinforcement learning. International Conference on Machine Learning, 2019.
[21] Lawrence Hubert and Phipps Arabie. Comparing partitions. Journal of Classification, 1985. doi:10.1007/BF01908075
[22] HIGHLIGHTS: Summarizing Agent Behavior to People. Adaptive Agents and Multi-Agent Systems.
[23] Multi-objective reinforcement learning using sets of pareto dominating policies. The Journal of Machine Learning Research, 2014.
[24] A Review of the Deep Sea Treasure problem as a Multi-Objective Reinforcement Learning Benchmark. 2021.
[25] Lisa Torrey and Matthew Taylor. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems, 2013.
[26] Daniel Lizotte, Michael Bowling, and Susan Murphy. Efficient Reinforcement Learning with Multiple Reward Functions for Randomized Controlled Trial Analysis.
[27] Multi-objective optimization of radiotherapy: distributed Q-learning and agent-based simulation. Journal of Experimental & Theoretical Artificial Intelligence, 2017.
[28] Changjian Li and Krzysztof Czarnecki. Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019.
[29] Curses, tradeoffs, and scalable management: Advancing evolutionary multiobjective direct policy search to improve water reservoir operations. Journal of Water Resources Planning and Management, 2016.
[30] Andrea Castelletti, Francesca Pianosi, and Marcello Restelli. Tree-based Fitted Q-iteration for Multi-Objective Markov Decision problems.
[31] “I Don’t Think So”: Summarizing Policy Disagreements for Agent Comparison. Proceedings of the AAAI Conference on Artificial Intelligence.
[32] Deepsynth: Automata synthesis for automatic task segmentation in deep reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence.
[33] Establishing appropriate trust via critical states. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018.
[34] Exploring computational user models for agent policy summarization. IJCAI: proceedings of the conference, 2019.
[35] Iterative bounding mdps: Learning interpretable policies via non-interpretable methods. Proceedings of the AAAI Conference on Artificial Intelligence.
[36] Tldr: Policy summarization for factored ssp problems using temporal abstractions. Proceedings of the International Conference on Automated Planning and Scheduling.
[37] Graying the black box: Understanding dqns. International Conference on Machine Learning, 2016.
[38] Generation of policy-level explanations for reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence.
[39] Prediction-guided multi-objective reinforcement learning for continuous robot control. International Conference on Machine Learning, 2020.
[40] Multi-objective optimization models for patient allocation during a pandemic influenza outbreak. Computers & Operations Research, 2014.
[41] Edouard Leurent. GitHub repository, 2018.
[42] Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, and Pradeep K. Murukannaiah. Proceedings of the 32nd International Joint Conference on Artificial Intelligence, 2023.
[43] Jesús Guillermo Falcón-Cardona, Hisao Ishibuchi, and Carlos A. Coello Coello. Riesz s-energy-based Reference Sets for Multi-Objective optimization.
[44] Lucas N. Alegre, Ana L. C. Bazzan, Diederik M. Roijers, Now… Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023.
[45] Stephanie Milani, Nicholay Topin, Manuela Veloso, and Fei Fang. ACM Comput. Surv., 2023. doi:10.1145/3616864
[46] Diederik M. Roijers, Peter Vamplew, Shimon Whiteson, and Richard Dazeley. J. Artif. Int. Res., 2013.
[47] Lucas N. Alegre, Florian Felten, El-Ghazali Talbi, Gr… Proceedings of the 34th Benelux Conference on Artificial Intelligence BNAIC/Benelearn 2022, 2022.
[48] Kristof Van Moffaert, Madalina M. Drugan, and Ann Nowé. Scalarized multi-objective reinforcement learning: Novel design techniques.
[49] Tamara Ulrich. Pareto-Set Analysis: Biobjective Clustering in Decision and Objective Spaces. 2013. doi:10.1002/mcda.1477
[50] Sanyapong Petchrompo, David W. Coit, Alexandra Brintrup, Anupong Wannakrairot, and Ajith Kumar Parlikad. A review of Pareto pruning methods for multi-objective optimization. Computers & Industrial Engineering, 2022.
[51] Sunith Bandaru, Amos H.C. Ng, and Kalyanmoy Deb. Data mining methods for knowledge discovery in multi-objective optimization: Part A - Survey. Expert Systems with Applications, 2017.
[52] Multi-objective optimization methodology for net zero energy buildings. Journal of Building Engineering, 2018.
[53] Multiobjective gas turbine engine controller design using genetic algorithms. IEEE Transactions on Industrial Electronics, 1996.
[54] Multi-objective genetic algorithms: Problem difficulties and construction of test problems. Evolutionary Computation, 1999.
[55] Florian Felten, Lucas N. Alegre, Now… A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning.
[56] Hypervolume indicator and dominance reward based multi-objective Monte-Carlo Tree Search. Machine Learning.
[57] J. D. Quinn, P. M. Reed, M. Giuliani, and A. Castelletti. Water Resources Research. doi:10.1029/2018WR024177
[58] Multi-Objective Battery Charging Strategy Based on Deep Reinforcement Learning. IEEE Transactions on Transportation Electrification.
[59] Multi-objective reinforcement learning for autonomous drone navigation in urban areas with wind zones. Automation in Construction, 2024.
[60] Multi-Objective Reinforcement Learning for Power Allocation in Massive MIMO Networks: A Solution to Spectral and Energy Trade-Offs. IEEE Access.
[61] A practical guide to multi-objective reinforcement learning and planning. Autonomous Agents and Multi-Agent Systems, 2022.
[62] Anurag Ajay, Yilun Du, Abhi Gupta, Joshua B. Tenenbaum, Tommi S. Jaakkola, and Pulkit Agrawal. Is Conditional Generative Modeling all you need for Decision-Making? CoRR, 2022. doi:10.48550/arXiv.2211.15657
[63] Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, and Sam Devlin. Proceedings of the 36th International Conference on Neural Information Processing Systems, 2022.
[64] Zichang Ge, Changyu Chen, Arunesh Sinha, and Pradeep Varakantham. Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems, 2025.
[65]

ECAI , year=

Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning , author=. ECAI , year=
[66]

International Conference on Learning Representations , year=

Evolutionary diversity optimization with clustering-based selection for reinforcement learning , author=. International Conference on Learning Representations , year=
[67]

arXiv preprint arXiv:1802.06971 , year=

A survey on trajectory clustering analysis , author=. arXiv preprint arXiv:1802.06971 , year=

Pith/arXiv arXiv
[68]

Information systems , volume=

Time-series clustering--a decade review , author=. Information systems , volume=. 2015 , publisher=

2015
[69]

Machine Learning , author =

Empirical evaluation methods for multiobjective reinforcement learning algorithms , volume =. Machine Learning , author =. 2011 , pages =. doi:10.1007/s10994-010-5232-5 , abstract =

work page doi:10.1007/s10994-010-5232-5 2011
[70]

Metaheuristics-based

Felten, Florian and Danoy, Grégoire and Talbi, El-Ghazali and Bouvry, Pascal , year =. Metaheuristics-based. Proceedings of the 14th. doi:10.5220/0010989100003116 , language =

work page doi:10.5220/0010989100003116
[71]

Autonomous Agents and Multi-Agent Systems , author =

Scalar reward is not enough: a response to. Autonomous Agents and Multi-Agent Systems , author =. 2022 , keywords =. doi:10.1007/s10458-022-09575-5 , abstract =

work page doi:10.1007/s10458-022-09575-5 2022
[72]

Marius Z

Abhishek Vivekanandan and Christian Hubschneider and J. Marius Z. Contrast. CoRR , volume =. 2025 , url =. doi:10.48550/ARXIV.2506.02571 , eprinttype =. 2506.02571 , timestamp =

work page doi:10.48550/arxiv.2506.02571 2025
[73]

Yanchuan Chang and Jianzhong Qi and Yuxuan Liang and Egemen Tanin , title =. 39th. 2023 , url =. doi:10.1109/ICDE55515.2023.00224 , timestamp =

work page doi:10.1109/icde55515.2023.00224 2023
[74]

Learning Options via Compression , url =

Yiding Jiang and Evan Zheran Liu and others , bibsource =. Learning Options via Compression , url =. Adv. Neural Inf. Process. Syst. (NIPS) , timestamp =
[75]

Gomez and Lukasz Kaiser and Illia Polosukhin , bibsource =

Ashish Vaswani and Noam Shazeer and Niki Parmar and Jakob Uszkoreit and Llion Jones and Aidan N. Gomez and Lukasz Kaiser and Illia Polosukhin , bibsource =. Attention is All you Need , url =. Adv. Neural Inf. Process. Syst. (NIPS) , pages =
[76]

Trading positional complexity vs deepness in coordinate networks , year =

Zheng, Jianqiao and Ramasinghe, Sameera and others , booktitle =. Trading positional complexity vs deepness in coordinate networks , year =. doi:http://dx.doi.org/10.1007/978-3-031-19812-0_9 , organization =

work page doi:10.1007/978-3-031-19812-0_9
[77]

Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding , url =

Yang Li and Si Si and others , bibsource =. Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding , url =. Adv. Neural Inf. Process. Syst. (NIPS) , pages =
[78]

Srinivasan and others , bibsource =

Matthew Tancik and Pratul P. Srinivasan and others , bibsource =. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains , url =. Adv. Neural Inf. Process. Syst. (NIPS) , timestamp =
[79]

Tianyu Gao, Xingcheng Yao, Danqi Chen. SimCSE: Simple Contrastive Learning of Sentence Embeddings. 2021. doi:10.18653/v1/2021.emnlp-main.552
[80]

Aaron van den Oord, Yazhe Li, Oriol Vinyals.
