Recognition: unknown
Arena as Offline Reward: Efficient Fine-Grained Preference Optimization for Diffusion Models
Pith reviewed 2026-05-08 14:12 UTC · model grok-4.3
The pith
ArenaPO estimates absolute quality gaps from pairwise preferences by modeling each generator's capability as a Gaussian, and uses those gaps as fine-grained offline rewards for diffusion model optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ArenaPO treats each image as a sample from its model's capability distribution and uses truncated normal inference conditioned on observed preferences to compute the absolute quality gap between paired images, which then acts as the reward signal in preference optimization training without requiring an explicit reward model.
What carries the argument
Model Arena with Gaussian capability distributions and truncated normal latent-variable inference to estimate absolute quality gaps from pairwise preferences.
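For concreteness, the Bayesian updating step behind the Arena can be written out as follows. This is a reconstruction from the fragment of the paper's Appendix A visible in the source page, not a verbatim quote: φ and Φ denote the standard normal pdf and cdf, θ_i the latent capability of model M_i with prior N(µ_i, σ_i²), and β_i the single-instance uncertainty of model M_i; D is the observation that M1 beat M2.

```latex
% Joint posterior over latent capabilities after observing that M1 outperforms M2,
% and the marginal posterior of theta_1 (reconstructed from the Appendix A fragment).
\[
P(\theta_1, \theta_2 \mid D) \propto
  \phi\!\left(\frac{\theta_1 - \mu_1}{\sigma_1}\right)
  \phi\!\left(\frac{\theta_2 - \mu_2}{\sigma_2}\right)
  \Phi\!\left(\frac{\theta_1 - \theta_2}{\sqrt{\beta_1^2 + \beta_2^2}}\right),
\]
\[
P(\theta_1 \mid D) \propto
  \phi\!\left(\frac{\theta_1 - \mu_1}{\sigma_1}\right)
  \Phi\!\left(\frac{\theta_1 - \mu_2}{\sqrt{\beta_1^2 + \beta_2^2 + \sigma_2^2}}\right)
  \quad \text{(after marginalizing out } \theta_2\text{)}.
\]
```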
If this is right
- ArenaPO achieves more precise optimization than binary DPO by using continuous quality gap rewards.
- The method requires no online reward model training or inference, reducing computational overhead.
- Models trained with ArenaPO show improved performance on standard preference datasets like Pick-a-Pic v2 and HPD v3.
- Offline computation of rewards allows seamless integration into existing DPO pipelines.
Where Pith is reading between the lines
- This technique might generalize to other domains where pairwise preferences are available, such as text generation, by applying similar Gaussian modeling.
- The probabilistic nature of the inference could provide uncertainty estimates for rewards, potentially improving training stability.
- If the quality gap estimates align with human perception across diverse prompts, it could scale preference learning with less annotation effort.
Load-bearing premise
Modeling each model's capability as a Gaussian distribution and estimating quality gaps with truncated normal inference on pairwise preferences produces accurate absolute rewards.
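As one concrete reading of this premise: under the Gaussian assumptions the winner-minus-loser capability difference is itself Gaussian, and conditioning on the observed sign of the preference gives a truncated normal whose mean can serve as the continuous reward. The sketch below is a minimal illustration of that idea; the function and variable names are hypothetical, and whether ArenaPO uses the posterior mean or a different statistic is not stated on this page.

```python
import numpy as np
from scipy.stats import norm

def expected_quality_gap(mu_w, sigma_w, mu_l, sigma_l):
    """Expected gap theta_w - theta_l given the observed preference theta_w > theta_l.

    theta_w ~ N(mu_w, sigma_w^2) and theta_l ~ N(mu_l, sigma_l^2) are the capability
    distributions of the winning and losing models (assumed independent). Their
    difference is Gaussian; conditioning on a positive difference yields a truncated
    normal, whose mean is returned as the continuous reward.
    """
    mu_d = mu_w - mu_l
    sigma_d = np.sqrt(sigma_w**2 + sigma_l**2)
    alpha = -mu_d / sigma_d                    # truncation point 0, standardized
    lam = norm.pdf(alpha) / norm.sf(alpha)     # inverse Mills ratio
    return mu_d + sigma_d * lam

# Example: a clear favorite yields a larger inferred gap than a near-tie.
print(expected_quality_gap(1.5, 0.4, 0.2, 0.4))   # ~1.32
print(expected_quality_gap(0.6, 0.4, 0.5, 0.4))   # ~0.49
```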
What would settle it
A direct comparison where humans rate images generated by ArenaPO-trained models versus DPO-trained models on the same prompts, checking if the preference rates match the predicted quality gaps.
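A lightweight proxy for that check, assuming one has per-pair human win rates on held-out prompts alongside the precomputed gaps (names hypothetical): test whether larger predicted gaps track larger human preference margins.

```python
from scipy.stats import spearmanr

def gap_vs_human_agreement(predicted_gaps, human_win_rates):
    """Rank correlation between inferred quality gaps and held-out human win rates.

    A strong positive correlation would support the claim that the gaps are
    fine-grained in a human-meaningful sense; a near-zero value would suggest
    the gaps add little beyond the binary label.
    """
    rho, pval = spearmanr(predicted_gaps, human_win_rates)
    return rho, pval
```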
Original abstract
Reinforcement learning from human feedback (RLHF) effectively promotes preference alignment of text-to-image (T2I) diffusion models. To improve computational efficiency, direct preference optimization (DPO), which avoids explicit reward modeling, has been widely studied. However, its reliance on binary feedback limits it to coarse-grained modeling on chosen-rejected pairs, resulting in suboptimal optimization. In this paper, we propose ArenaPO, which leverages Arena scores as offline rewards to provide refined feedback, thus achieving efficient and fine-grained optimization without a reward model. This enables ArenaPO to benefit from both the rich rewards of traditional RLHF and the efficiency of DPO. Specifically, we first construct a model Arena in which each model's capability is represented as a Gaussian distribution, and infer these capabilities by traversing the annotated pairwise preferences. Each output image is treated as a sample from the corresponding capability distribution. Then, for a image pair, conditioned on the two capability distributions and the observed pairwise preference, the absolute quality gap is estimated using latent-variable inference based on truncated normal distribution, which serves as fine-grained feedback during training. It does not require a reward model and can be computed offline, thus introducing no additional training overhead. We conduct ArenaPO training on Pick-a-Pic v2 and HPD v3 datasets, showing that ArenaPO consistently outperforms existing baselines.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes ArenaPO for fine-grained preference optimization of text-to-image diffusion models. It constructs a model Arena in which each model's capability is represented as a Gaussian distribution inferred by traversing annotated pairwise preferences. Each output image is treated as a sample from its model's capability distribution. For any image pair, the absolute quality gap is then estimated via latent-variable inference under a truncated normal distribution conditioned on the two capability distributions and the observed preference; this gap serves as an offline reward signal in a DPO-style training procedure. The method is evaluated on the Pick-a-Pic v2 and HPD v3 datasets and is reported to outperform existing baselines while requiring no separate reward model and incurring no additional training overhead.
Significance. If the inferred quality gaps prove faithful to underlying human-perceived differences and supply information beyond the binary preference labels, ArenaPO would usefully combine the richness of RLHF-style rewards with the computational efficiency of DPO. The offline, reward-model-free design is a concrete practical strength that could scale to larger diffusion-model alignment pipelines.
major comments (3)
- [§3.2] §3.2 (latent-variable inference): The absolute quality gap is obtained by conditioning a truncated normal on the same pairwise preference labels that were already used to fit the per-model Gaussian parameters. No held-out human correlation study or ablation that isolates the contribution of the continuous gap versus the binary label is reported; without such evidence the claim that the procedure supplies genuinely fine-grained, non-circular feedback remains unverified and load-bearing for the central advantage over standard DPO.
- [§4.1] §4.1 (experimental protocol): Both Arena construction and subsequent ArenaPO training are performed on the identical Pick-a-Pic v2 and HPD v3 splits. This shared data usage risks inflating apparent gains; a clear separation (e.g., Arena built on a disjoint subset) or cross-dataset transfer results are needed to substantiate that the inferred rewards generalize.
- [§3.1] §3.1 (Gaussian capability model): The assumption that model capabilities follow symmetric Gaussian distributions is introduced without diagnostic checks against the empirical distribution of pairwise outcomes. If the true capability distribution is skewed or multimodal, the downstream truncated-normal gap estimates become biased; a sensitivity analysis or alternative distributional assumption would be required to bound the robustness of the reward signal.
minor comments (3)
- [Abstract] Abstract: 'a image pair' should read 'an image pair'.
- [Figure 1] Figure 1 (Arena diagram): The arrow labeled 'traversing pairwise preferences' does not indicate whether the inference is performed once globally or per training batch; clarify the computational graph.
- [§4.3] §4.3 (baseline comparison): The reported metrics lack error bars or the number of random seeds; adding these would strengthen the claim of consistent outperformance.
Simulated Author's Rebuttal
Thank you for the detailed and constructive feedback. We appreciate the opportunity to clarify and strengthen our manuscript. Below, we provide point-by-point responses to the major comments and outline the revisions we will make.
Point-by-point responses
-
Referee: [§3.2] §3.2 (latent-variable inference): The absolute quality gap is obtained by conditioning a truncated normal on the same pairwise preference labels that were already used to fit the per-model Gaussian parameters. No held-out human correlation study or ablation that isolates the contribution of the continuous gap versus the binary label is reported; without such evidence the claim that the procedure supplies genuinely fine-grained, non-circular feedback remains unverified and load-bearing for the central advantage over standard DPO.
Authors: We thank the referee for highlighting this important point. The inference procedure is designed to extract a continuous quality gap from the binary preference by modeling the latent capabilities as Gaussians and using truncated normal conditioning, which is not merely reusing the label but deriving a magnitude estimate based on the distributional assumptions. However, we acknowledge that empirical validation isolating this contribution is necessary. In the revised version, we will add an ablation study that compares ArenaPO using the inferred continuous gaps against a variant that uses only binary preferences (equivalent to standard DPO). We will also attempt to correlate the inferred gaps with any available human rating data on held-out pairs from the datasets to provide additional evidence of faithfulness. revision: yes
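As a sketch of what such an ablation could look like: the page does not give ArenaPO's exact objective, so the snippet below is a generic gap-weighted Bradley-Terry-style preference loss for illustration only; the weighting scheme, the `beta` scale, and all names are assumptions rather than the paper's method.

```python
import torch
import torch.nn.functional as F

def preference_loss(logratio_w, logratio_l, gap=None, beta=5000.0):
    """Bradley-Terry style loss on (chosen, rejected) implicit-reward log-ratios.

    logratio_w / logratio_l: log pi_theta - log pi_ref terms for the chosen and
    rejected images (or their diffusion-space surrogates).
    gap: optional per-pair continuous quality gap used as a weight; passing None
    recovers the binary variant, so the two settings form the ablation pair.
    """
    margin = beta * (logratio_w - logratio_l)
    if gap is None:
        return -F.logsigmoid(margin).mean()        # binary feedback (DPO-like)
    return -(gap * F.logsigmoid(margin)).mean()    # gap-weighted feedback
```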
-
Referee: [§4.1] §4.1 (experimental protocol): Both Arena construction and subsequent ArenaPO training are performed on the identical Pick-a-Pic v2 and HPD v3 splits. This shared data usage risks inflating apparent gains; a clear separation (e.g., Arena built on a disjoint subset) or cross-dataset transfer results are needed to substantiate that the inferred rewards generalize.
Authors: This is a valid concern regarding potential data leakage or overfitting to the specific splits. To address it, we will revise the experimental protocol to construct the Arena model using a disjoint subset of the training data (e.g., 50% split for Arena construction and the remaining for preference optimization). We will report the performance under this separated setting. Additionally, we will include cross-dataset transfer experiments, such as building the Arena on Pick-a-Pic v2 and evaluating ArenaPO on HPD v3, and vice versa, to demonstrate generalization of the inferred rewards. revision: yes
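A minimal sketch of the separated protocol described above, assuming the dataset is indexed by preference pairs; the 50/50 ratio is the one proposed in the response, not a tuned value.

```python
import numpy as np

def disjoint_arena_split(num_pairs, arena_frac=0.5, seed=0):
    """Split preference-pair indices into a subset used only for Arena construction
    and a disjoint subset used only for preference optimization."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_pairs)
    cut = int(arena_frac * num_pairs)
    return perm[:cut], perm[cut:]   # (arena_indices, training_indices)
```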
-
Referee: [§3.1] §3.1 (Gaussian capability model): The assumption that model capabilities follow symmetric Gaussian distributions is introduced without diagnostic checks against the empirical distribution of pairwise outcomes. If the true capability distribution is skewed or multimodal, the downstream truncated-normal gap estimates become biased; a sensitivity analysis or alternative distributional assumption would be required to bound the robustness of the reward signal.
Authors: We agree that the Gaussian assumption should be validated. In the revised manuscript, we will include diagnostic plots comparing the empirical distribution of pairwise preference outcomes to the fitted Gaussian model. Furthermore, we will conduct a sensitivity analysis by experimenting with alternative distributions, such as a Laplace distribution or a Gaussian mixture model, and report the impact on the final performance metrics to bound the robustness of our approach. revision: yes
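One inexpensive diagnostic of the kind proposed, assuming fitted per-model means and single-instance uncertainties (the β terms of the capability model): compare per-pair empirical win rates against the win probabilities the Gaussian model implies. The helper names are illustrative, not the authors' tooling.

```python
import numpy as np
from scipy.stats import norm

def implied_win_rate(mu_a, beta_a, mu_b, beta_b):
    """Win probability of model A over model B implied by Gaussian capabilities:
    P(theta_a > theta_b) when each single-image score carries uncertainty beta."""
    return norm.cdf((mu_a - mu_b) / np.sqrt(beta_a**2 + beta_b**2))

def calibration_report(observed_win_rates, implied_win_rates):
    """Mean absolute deviation between empirical and model-implied win rates;
    large values flag model pairs where the symmetric-Gaussian assumption strains."""
    obs = np.asarray(observed_win_rates, dtype=float)
    imp = np.asarray(implied_win_rates, dtype=float)
    return float(np.abs(obs - imp).mean())
```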
Circularity Check
No circularity: the parametric inference produces continuous rewards from binary preferences and does not reduce to its input labels by construction.
Full rationale
The paper's core chain models each model's capability as a Gaussian inferred from annotated pairwise preferences, treats images as samples from those distributions, and then applies latent-variable inference under a truncated normal (conditioned on the observed preference) to estimate an absolute quality gap used as the offline reward. This is an explicit statistical estimation procedure that maps binary labels to continuous values via modeling assumptions; it does not equate the output gap to the input preference by definition or by fitting a parameter that is then renamed as the target. No self-citations, uniqueness theorems, or ansatzes are invoked to justify the steps. The subsequent DPO-style training on the diffusion model uses these precomputed gaps as an independent objective; the derivation is therefore self-contained, although its soundness still rests on external validation of the Gaussian/truncated-normal model.
Axiom & Free-Parameter Ledger
free parameters (2)
- per-model Gaussian mean and variance
- parameters of the truncated-normal inference
axioms (2)
- domain assumption: Each model's capability can be faithfully represented by a single Gaussian distribution over image quality.
- domain assumption: The absolute quality gap between two images is a latent variable whose posterior can be recovered via truncated-normal inference conditioned on the observed preference.
invented entities (1)
- Model Arena with per-model Gaussian capability distributions (no independent evidence)
Reference graph
Works this paper leans on
-
[1]
A general theoretical paradigm to understand learning from human preferences
Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Remi Munos, Mark Rowland, Michal Valko, and Daniele Calandriello. A general theoretical paradigm to understand learning from human preferences. In International Conference on Artificial Intelligence and Statistics, pages 4447–4455. PMLR, 2024
2024
-
[2]
Improving image generation with better captions
James Betker, Gabriel Goh, Li Jing, Tim Brooks, Jianfeng Wang, Linjie Li, Long Ouyang, Juntang Zhuang, Joyce Lee, Yufei Guo, et al. Improving image generation with better captions. Computer Science. https://cdn.openai.com/papers/dall-e-3.pdf, 2(3):8, 2023
2023
-
[3]
Flux
Black Forest Labs. Flux. https://github.com/black-forest-labs/flux, 2024
2024
-
[4]
Rank analysis of incomplete block designs: I
Ralph Allan Bradley and Milton E Terry. Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika, 39(3/4):324–345, 1952
1952
-
[5]
Getting it right: Improving spatial consistency in text-to-image models
Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Aflalo, Sayak Paul, Dhruba Ghosh, Tejas Gokhale, Ludwig Schmidt, Hannaneh Hajishirzi, Vasudev Lal, Chitta Baral, et al. Getting it right: Improving spatial consistency in text-to-image models. In European Conference on Computer Vision, pages 204–222. Springer, 2024
2024
-
[6]
PixArt-$\alpha$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, et al. Pixart-alpha: Fast training of diffusion transformer for photorealistic text-to-image synthesis. arXiv preprint arXiv:2310.00426, 2023
2023
-
[7]
Deep reinforcement learning from human preferences
Paul F Christiano, Jan Leike, Tom Brown, Miljan Martic, Shane Legg, and Dario Amodei. Deep reinforcement learning from human preferences. Advances in neural information processing systems, 30, 2017
2017
-
[8]
Diffusion models in vision: A survey
Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu, and Mubarak Shah. Diffusion models in vision: A survey. IEEE transactions on pattern analysis and machine intelligence, 45(9):10850–10869, 2023
2023
-
[9]
Diffusion-sdpo: Safeguarded direct preference optimization for diffusion models
Minghao Fu, Guo-Hua Wang, Tianyu Cui, Qing-Guo Chen, Zhao Xu, Weihua Luo, and Kaifu Zhang. Diffusion-sdpo: Safeguarded direct preference optimization for diffusion models. arXiv preprint arXiv:2511.03317, 2025
2025
-
[10]
Denoising diffusion probabilistic models
Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020
2020
-
[11]
Margin-aware preference optimization for aligning diffusion models without reference
Jiwoo Hong, Sayak Paul, Noah Lee, Kashif Rasul, James Thorne, and Jongheon Jeong. Margin-aware preference optimization for aligning diffusion models without reference. In First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models, 2024
2024
-
[12]
Continuous univariate distributions, volume 2
Norman L Johnson, Samuel Kotz, and Narayanaswamy Balakrishnan. Continuous univariate distributions, volume 2. John Wiley & Sons, 1995
1995
-
[13]
Elucidating the design space of diffusion-based generative models
Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. Advances in neural information processing systems, 35:26565–26577, 2022
2022
-
[14]
Scalable ranked preference optimization for text-to-image generation
Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, Jian Ren, and Anil Kag. Scalable ranked preference optimization for text-to-image generation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 18399–18410, 2025
2025
-
[15]
Pick-a-pic: An open dataset of user preferences for text-to-image generation
Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, and Omer Levy. Pick-a-pic: An open dataset of user preferences for text-to-image generation. Advances in neural information processing systems, 36:36652–36663, 2023
2023
-
[16]
Divergence minimization preference optimization for diffusion model alignment
Binxu Li, Minkai Xu, Jiaqi Han, Meihua Dang, and Stefano Ermon. Divergence minimization preference optimization for diffusion model alignment. arXiv preprint arXiv:2507.07510, 2025
2025
-
[17]
Aligning diffusion models by optimizing human utility
Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, and Kazuki Kozuka. Aligning diffusion models by optimizing human utility. Advances in Neural Information Processing Systems, 37:24897–24925, 2024
2024
-
[18]
K-sort eval: Efficient preference evaluation for visual generation via corrected vlm-as-a-judge
Zhikai Li, Xuewen Liu, Wangbo Zhao, Pan Du, Kaicheng Zhou, Qingyi Gu, Yang You, Zhen Dong, Kurt Keutzer, et al. K-sort eval: Efficient preference evaluation for visual generation via corrected vlm-as-a-judge. In The Fourteenth International Conference on Learning Representations
-
[19]
K-sort arena: Efficient and reliable benchmarking for generative models via k-wise human preferences
Zhikai Li, Xuewen Liu, Dongrong Joe Fu, Jianquan Li, Qingyi Gu, Kurt Keutzer, and Zhen Dong. K-sort arena: Efficient and reliable benchmarking for generative models via k-wise human preferences. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 9131–9141, 2025
2025
-
[20]
Flow Matching for Generative Modeling
Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022
2022
-
[21]
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow. arXiv preprint arXiv:2209.03003, 2022
2022
-
[22]
Hpsv3: Towards wide-spectrum human preference score
Yuhang Ma, Xiaoshi Wu, Keqiang Sun, and Hongsheng Li. Hpsv3: Towards wide-spectrum human preference score. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15086–15095, 2025
2025
-
[23]
Scalable diffusion models with transformers
William Peebles and Saining Xie. Scalable diffusion models with transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4195–4205, 2023
2023
-
[24]
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. Sdxl: Improving latent diffusion models for high-resolution image synthesis. arXiv preprint arXiv:2307.01952, 2023
2023
-
[25]
Learning transferable visual models from natural language supervision
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021
2021
-
[26]
Direct preference optimization: Your language model is secretly a reward model
Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. Advances in neural information processing systems, 36:53728–53741, 2023
2023
-
[27]
High-resolution image synthesis with latent diffusion models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022
2022
-
[28]
LAION-Aesthetics: Predicting the aesthetic quality of images
Christoph Schuhmann, Romain Vencu, Romai Beaumont, et al. LAION-Aesthetics: Predicting the aesthetic quality of images. https://laion.ai/blog/laion-aesthetics/, 2022
2022
-
[29]
Proximal Policy Optimization Algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017
2017
-
[30]
Freeu: Free lunch in diffusion u-net
Chenyang Si, Ziqi Huang, Yuming Jiang, and Ziwei Liu. Freeu: Free lunch in diffusion u-net. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4733–4743, 2024
2024
-
[31]
Generative modeling by estimating gradients of the data distribution
Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. Advances in neural information processing systems, 32, 2019
2019
-
[32]
Score-Based Generative Modeling through Stochastic Differential Equations
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020
2020
-
[33]
Learning to summarize with human feedback
Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, and Paul F Christiano. Learning to summarize with human feedback. Advances in neural information processing systems, 33:3008–3021, 2020
2020
-
[34]
A law of comparative judgment
LL Thurstone. A law of comparative judgment. Psychological Review, 34(4):273–286, 1927
1927
-
[35]
Diffusion model alignment using direct preference optimization
Bram Wallace, Meihua Dang, Rafael Rafailov, Linqi Zhou, Aaron Lou, Senthil Purushwalkam, Stefano Ermon, Caiming Xiong, Shafiq Joty, and Nikhil Naik. Diffusion model alignment using direct preference optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8228–8238, 2024
2024
-
[36]
Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis
Xiaoshi Wu, Yiming Hao, Keqiang Sun, Yixiong Chen, Feng Zhu, Rui Zhao, and Hongsheng Li. Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis. arXiv preprint arXiv:2306.09341, 2023
2023
-
[37]
Imagereward: Learning and evaluating human preferences for text-to-image generation
Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. Imagereward: Learning and evaluating human preferences for text-to-image generation. Advances in Neural Information Processing Systems, 36:15903–15935, 2023
2023
-
[38]
DanceGRPO: Unleashing GRPO on Visual Generation
Zeyue Xue, Jie Wu, Yu Gao, Fangyuan Kong, Lingting Zhu, Mengzhao Chen, Zhiheng Liu, Wei Liu, Qiushan Guo, Weilin Huang, et al. Dancegrpo: Unleashing grpo on visual generation. arXiv preprint arXiv:2505.07818, 2025
2025
-
[39]
Diffusion models: A comprehensive survey of methods and applications
Ling Yang, Zhilong Zhang, Yang Song, Shenda Hong, Runsheng Xu, Yue Zhao, Wentao Zhang, Bin Cui, and Ming-Hsuan Yang. Diffusion models: A comprehensive survey of methods and applications. ACM computing surveys, 56(4):1–39, 2023
2023
-
[40]
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, Zirui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, et al. Scaling autoregressive models for content-rich text-to-image generation. arXiv preprint arXiv:2206.10789, 2(3):5, 2022
2022
-
[41]
Self-rewarding language models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, and Jason E Weston. Self-rewarding language models. In Forty-first International Conference on Machine Learning, 2024
2024
-
[42]
Dspo: Direct score preference optimization for diffusion model alignment
Huaisheng Zhu, Teng Xiao, and Vasant G Honavar. Dspo: Direct score preference optimization for diffusion model alignment. In The Thirteenth International Conference on Learning Representations, 2025
2025