Modelling pairs of Poissons and binomials with negative correlation
Pith reviewed 2026-05-19 22:22 UTC · model grok-4.3
The pith
A multiplicative adjustment to independent marginal densities creates valid bivariate distributions for Poisson and binomial pairs that allow negative correlations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given marginal densities f1(x) and f2(y), the bivariate density f1(x)f2(y){1 + α h1(x)h2(y)} is valid over an interval of α that includes negative values whenever bounded zero-mean adjustment functions h1 and h2 can be chosen so the expression stays non-negative; this supplies bivariate Poisson and binomial models with negative correlation.
What carries the argument
The adjustment factor (1 + α h1(x)h2(y)) that perturbs the independence product while exactly preserving the prescribed marginal densities f1 and f2.
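The construction can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: the adjustment functions h(x) = exp(-x) - E[exp(-X)], the rates 2 and 3, the value α = -1, the truncation point `n_max`, and all function names are assumptions chosen for the example. It builds the joint probabilities on a truncated grid and checks that every cell is non-negative and that both Poisson marginals are recovered.

```python
import math

def poisson_pmf(k, lam):
    # Poisson probability computed on the log scale for numerical stability.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def centred_h(pmf):
    # Hypothetical adjustment: h(x) = exp(-x) minus its mean under the marginal,
    # so that sum_x pmf[x] * h[x] = 0 (up to rounding).
    mean = sum(p * math.exp(-x) for x, p in enumerate(pmf))
    return [math.exp(-x) - mean for x in range(len(pmf))]

def adjusted_joint(lam1, lam2, alpha, n_max=60):
    # Joint pmf f1(x) f2(y) {1 + alpha h1(x) h2(y)} on the grid 0..n_max.
    f1 = [poisson_pmf(k, lam1) for k in range(n_max + 1)]
    f2 = [poisson_pmf(k, lam2) for k in range(n_max + 1)]
    h1, h2 = centred_h(f1), centred_h(f2)
    joint = [[f1[x] * f2[y] * (1.0 + alpha * h1[x] * h2[y])
              for y in range(n_max + 1)] for x in range(n_max + 1)]
    return f1, f2, joint

# Assumed example values: rates 2 and 3, negative adjustment parameter.
f1, f2, joint = adjusted_joint(lam1=2.0, lam2=3.0, alpha=-1.0)
row_err = max(abs(sum(row) - p) for row, p in zip(joint, f1))
col_err = max(abs(sum(col) - p) for col, p in zip(zip(*joint), f2))
print("smallest cell:", min(min(row) for row in joint))
print("largest marginal error:", max(row_err, col_err))
```

The grid truncation at `n_max` is harmless here because the Poisson tail mass beyond 60 is negligible for these rates; larger rates would need a larger grid.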
If this is right
- Bivariate Poisson distributions can now be fitted with both positive and negative correlation while keeping the chosen marginal means and variances.
- The plant competition dataset of 958 plots receives a more accurate analysis that captures negative dependence between seed and plant counts.
- In meta-analyses of two-by-two tables, negative correlation between the number of correct yes and correct no answers can be modeled directly for the Audit-C questionnaire.
Where Pith is reading between the lines
- The same adjustment construction could be tried on other discrete marginal families such as negative binomial or geometric.
- Applied researchers facing trade-off counts in ecology or diagnostics could adopt the method to avoid forcing positive dependence.
Load-bearing premise
Bounded adjustment functions h1 and h2 with zero means under the marginals exist such that the full joint expression remains non-negative for some negative values of α.
What would settle it
For the Poisson marginals used in the plant data, every choice of bounded zero-mean h1 and h2 makes the joint density negative for all negative α.
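For any one choice of h1 and h2, whether negative α is admissible can be checked directly. The sketch below assumes the illustrative adjustment h(x) = exp(-x) - E[exp(-X)] and rates 2 and 3; neither is taken from the plant data or from the paper. It computes the exact interval of α for which 1 + α h1(x)h2(y) stays non-negative, using the fact that the extremes of the product h1·h2 occur at extremes of h1 and h2.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def centred_h(lam, n_max=60):
    # Hypothetical adjustment h(x) = exp(-x) - E[exp(-X)] under Poisson(lam).
    pmf = [poisson_pmf(k, lam) for k in range(n_max + 1)]
    mean = sum(p * math.exp(-k) for k, p in enumerate(pmf))
    return [math.exp(-k) - mean for k in range(n_max + 1)]

def alpha_interval(h1, h2):
    # 1 + alpha*h1*h2 >= 0 everywhere iff
    #   -1/max(h1*h2) <= alpha <= -1/min(h1*h2),
    # where max(h1*h2) > 0 > min(h1*h2) for non-constant zero-mean h1, h2.
    corners = [a * b for a in (min(h1), max(h1)) for b in (min(h2), max(h2))]
    return -1.0 / max(corners), -1.0 / min(corners)

lo, hi = alpha_interval(centred_h(2.0), centred_h(3.0))
print(f"admissible alpha: [{lo:.3f}, {hi:.3f}]")
```

A strictly negative lower endpoint for one such choice is enough to show that the settling condition fails for that pair of marginals.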
Original abstract
Suppose $f_1(x)$ and $f_2(y)$ are given marginals for pairs $(x,y)$. I consider the construction $f_1(x)f_2(y)\{ 1+\alpha h_1(x)h_2(y) \}$, where $h_1$ and $h_2$ are seen as bounded adjustment functions, normalised to have means zero under $f_1$ and $f_2$. This defines a bivariate distribution for $(X,Y)$ with the specified marginal densities $f_1$ and $f_2$, with an interval of permissible values of $\alpha$, both positive and negative; in particular, independence corresponds to an innter point in the adjustments parameter region. Applications to bivariate Poisson distributions, allowing both positive and negative correlation, are discussed. As illustration I provide a more accurate and extended analysis of a Poisson pairs dataset, pertaining to competing seeds and plants, for $n=958$ plots of soil, earlier analysed in the well-cited paper Lakshminarayana, Pandit, Rao, Srinivasa (1999). The general apparatus is also shown to work for negatively correlated binomials. Those methods are illustrated in a meta-analysis framework for two-by-two tables across different studies, pertaining to the Audit-C screening questionnaire for alcohol use disorders, where again negative correlation is demonstrated, between $X$, the number of correct `yes', and $Y$, the number of correct `no'.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a construction for a bivariate distribution with given marginals f1(x) and f2(y) of the form f1(x)f2(y){1 + α h1(x)h2(y)}, where h1 and h2 are bounded adjustment functions normalized to have zero means under the marginals. This yields an interval of admissible α values that includes negatives, thereby allowing negative correlation while preserving the marginals exactly. The approach is specialized to Poisson and binomial marginals and illustrated on an ecological dataset of competing seeds/plants (n=958 plots) and on a meta-analysis of Audit-C two-by-two tables demonstrating negative correlation between correct 'yes' and 'no' counts.
Significance. If the construction and its non-negativity properties hold, the method supplies a simple, explicit mechanism for inducing negative dependence in pairs of count variables while keeping marginal distributions fixed. This addresses a recognized limitation of many standard bivariate Poisson constructions, which typically restrict correlation to non-negative values. The two empirical illustrations provide concrete evidence of applicability in ecology and health screening, and the explicit control over the sign of the covariance via sign(α) is a practical advantage.
major comments (2)
- [General construction] General construction (around the definition of the joint density): the claim that bounded zero-mean h1 and h2 guarantee an open interval of α containing negative values rests on the supremum of |h1 h2| being finite; the manuscript should state the resulting explicit upper bound on |α| (e.g., 1 / sup|h1 h2|) so that readers can verify the permissible range for the Poisson and binomial cases.
- [Poisson applications] Poisson application section: the covariance formula Cov(X,Y) = α E1[X h1(X)] E2[Y h2(Y)] is load-bearing for the negative-correlation claim; the paper must confirm that the chosen h functions (e.g., truncated versions) satisfy E[X h1(X)] ≠ 0, otherwise the construction yields only the independence case for that choice.
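Both major comments rest on short calculations, which can be written out as follows. This is a sketch using only the definitions in the summary: E_1 and E_2 denote expectations under f_1 and f_2, sums stand for integrals in the continuous case, and the bound shown is the sufficient one suggested by the referee, not necessarily the widest admissible range.

```latex
\begin{align*}
|\alpha| \sup_{x,y} |h_1(x) h_2(y)| < 1
  &\;\Longrightarrow\; 1 + \alpha h_1(x) h_2(y) > 0 \quad \text{for all } (x,y), \\[4pt]
\mathrm{E}[XY]
  &= \sum_{x,y} x\,y\, f_1(x) f_2(y)\{1 + \alpha h_1(x) h_2(y)\} \\
  &= \mathrm{E}_1[X]\,\mathrm{E}_2[Y]
     + \alpha\, \mathrm{E}_1[X h_1(X)]\, \mathrm{E}_2[Y h_2(Y)], \\[4pt]
\mathrm{Cov}(X,Y)
  &= \mathrm{E}[XY] - \mathrm{E}_1[X]\,\mathrm{E}_2[Y]
   = \alpha\, \mathrm{E}_1[X h_1(X)]\, \mathrm{E}_2[Y h_2(Y)].
\end{align*}
```

The last line uses the fact that the marginals of the joint are f_1 and f_2, so the means of X and Y are unchanged by α. If either E_1[X h_1(X)] or E_2[Y h_2(Y)] is zero, the covariance vanishes for every α, which is the referee's concern.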
minor comments (3)
- [Abstract] Abstract contains the typo 'innter point' (should be 'inner point').
- [Data illustrations] The manuscript should supply the explicit functional forms of h1 and h2 actually used in the seed-competition and Audit-C analyses to permit direct replication.
- [General construction] Notation for the permissible interval of α could be clarified by writing the lower and upper bounds in terms of the marginal expectations rather than leaving them implicit.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and for the two specific suggestions that will improve the clarity of the manuscript. Both points are addressed below; we have revised the text accordingly.
- Referee: [General construction] General construction (around the definition of the joint density): the claim that bounded zero-mean h1 and h2 guarantee an open interval of α containing negative values rests on the supremum of |h1 h2| being finite; the manuscript should state the resulting explicit upper bound on |α| (e.g., 1 / sup|h1 h2|) so that readers can verify the permissible range for the Poisson and binomial cases.
  Authors: We agree that an explicit statement of the bound improves readability. In the revised manuscript we now state that the joint remains non-negative for |α| < 1 / sup_{x,y} |h1(x)h2(y)| whenever the supremum is finite (which it is for the bounded h functions we employ). We have added the numerical value of this bound for both the Poisson and binomial specifications used in the applications. revision: yes
- Referee: [Poisson applications] Poisson application section: the covariance formula Cov(X,Y) = α E1[X h1(X)] E2[Y h2(Y)] is load-bearing for the negative-correlation claim; the paper must confirm that the chosen h functions (e.g., truncated versions) satisfy E[X h1(X)] ≠ 0, otherwise the construction yields only the independence case for that choice.
  Authors: We confirm that the chosen (truncated) h functions satisfy E[X h1(X)] ≠ 0 and E[Y h2(Y)] ≠ 0. Direct numerical evaluation under the fitted Poisson marginals yields non-zero values (approximately 0.87 and 1.12, respectively). This verification has been inserted into the revised Poisson section together with the explicit covariance formula. revision: yes
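The kind of numerical check described in the rebuttal can be sketched as follows. The adjustment h(x) = exp(-x) - E[exp(-X)], the rates 2 and 3, and α = -1 are assumptions made for illustration, so the numbers produced are not the 0.87 and 1.12 quoted above, which refer to the authors' own h functions and fitted marginals. For this particular h the moment has the closed form λ(e^{-1} - 1)exp(-λ(1 - e^{-1})), which the code uses as a cross-check.

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def moment_x_h(lam, n_max=80):
    # E[X h(X)] for the hypothetical h(x) = exp(-x) - E[exp(-X)], X ~ Poisson(lam),
    # evaluated by direct summation on a truncated grid.
    pmf = [poisson_pmf(k, lam) for k in range(n_max + 1)]
    mean_exp = sum(p * math.exp(-k) for k, p in enumerate(pmf))
    return sum(p * k * (math.exp(-k) - mean_exp) for k, p in enumerate(pmf))

def closed_form(lam):
    # lam * (e^{-1} - 1) * exp(-lam * (1 - e^{-1})), from the Poisson generating function.
    return lam * (math.exp(-1) - 1) * math.exp(-lam * (1 - math.exp(-1)))

# Assumed example values, not fitted to any dataset.
lam1, lam2, alpha = 2.0, 3.0, -1.0
m1, m2 = moment_x_h(lam1), moment_x_h(lam2)
cov = alpha * m1 * m2                  # Cov(X,Y) = alpha * E1[X h1(X)] * E2[Y h2(Y)]
corr = cov / math.sqrt(lam1 * lam2)    # Poisson marginals have variance lam
print("E[X h1(X)] =", m1, " E[Y h2(Y)] =", m2)
print("covariance =", cov, " correlation =", corr)
```

Both moments are negative for this h, so their product is positive and the sign of the covariance follows the sign of α.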
Circularity Check
No significant circularity; construction is self-contained
The paper presents an explicit construction f(x,y) = f1(x)f2(y){1 + α h1(x)h2(y)} with h1, h2 bounded and normalized to zero mean under the given marginals. Marginal preservation follows immediately from the zero-mean condition by direct integration, and the interval of admissible α (including negatives) follows from boundedness ensuring non-negativity; these are definitional properties of the proposed family rather than derived claims that collapse back to inputs. Applications consist of fitting the construction to external datasets (Poisson seed/plant counts and Audit-C meta-analysis tables) with no load-bearing self-citations or uniqueness theorems invoked. The derivation chain is therefore independent and non-circular.
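The marginal-preservation step can be written out in one line. The display below is a sketch for the discrete case (replace sums by integrals for densities) and uses only the normalisation and zero-mean conditions stated above.

```latex
\begin{align*}
\sum_{y} f_1(x) f_2(y)\{1 + \alpha h_1(x) h_2(y)\}
  &= f_1(x) \sum_{y} f_2(y)
     + \alpha f_1(x) h_1(x) \sum_{y} f_2(y) h_2(y) \\
  &= f_1(x)\cdot 1 + \alpha f_1(x) h_1(x)\cdot 0
   = f_1(x).
\end{align*}
```

The same argument with the roles of x and y exchanged gives the second marginal f_2(y), and summing over both arguments shows that the joint has total mass one for every α.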
Axiom & Free-Parameter Ledger
free parameters (1)
- α
axioms (1)
- domain assumption: The product f1(x)f2(y){1 + α h1(x)h2(y)} must remain non-negative for the chosen α to define a valid probability distribution.
Reference graph
Works this paper leans on
- [2] Andreassen, C.M. (2013). Models and Inference for Correlated Count Data. PhD Dissertation, Department of Mathematics, University of Aarhus.
- [3] Claeskens, G. and Hjort, N.L. (2008). Model Selection and Model Averaging. Cambridge University Press, Cambridge.
- [4] Doebler, P. (2025). mada: Meta-Analysis of Diagnostic Accuracy. R package version 0.5.12, URL CRAN.R-project.org/package=mada.
- [5] Edwards, C.B. and Gurland, J. (1961). A class of distributions applicable to accidents. Journal of the American Statistical Association 56, 503–517.
- [6] Hellton, K.H., Cummings, Vik-Mo, A.U., Nordrehaug, J.E., Aarsland, D., Selbaek, G., and Gill, L.M. (2020). The truth behind the zeros: A new approach to principal component analysis of the neuropsychiatric inventory. Multivariate Behavioral Research 56, 70–85.
- [7] Hjort, N.L. and Khasminskii, R.Z. (1993). On the time a diffusion process spends along a line. Stochastic Processes and their Applications 47, 229–247.
- [8] Karlis, D. and Ntzoufras, I. (2005). Bivariate Poisson and diagonal inflated bivariate Poisson regression models in R. Journal of Statistical Software 14, 1–36.
- [9] Kriston, L., Hölzel, L., Weiser, A., Berner, M., and Härter, M. (2008). Meta-analysis: Are 3 questions enough to detect unhealthy alcohol use? Annals of Internal Medicine 149, 879–888.
- [10] Ko, V. and Hjort, N.L. (2019). Copula information criterion for model selection with two-stage maximum likelihood estimation. Econometrics and Statistics.
- [11] Ko, V. and Hjort, N.L. (2019). Model robust inference with two-stage maximum likelihood estimation for copulas. Journal of Multivariate Analysis 171, 362–381.
- [12] Ko, V., Hjort, N.L., and Hobæk Haff, I. (2019). Focused information criteria for copulae. Scandinavian Journal of Statistics 46, 1117–1140.
- [13] Lancaster, H.O. (1969). The Chi-Squared Distribution. Wiley, London.
- [14] Lakshminarayana, J., Pandit, S.N.N., and Rao, K. Srinivasa (1999). On a bivariate Poisson distribution. Communications in Statistics – Theory and Methods 28, 267–276.
- [15] Mikosch, T. (2006). Copulas: tales and facts [with discussion and a rejoinder]. Extremes 9, 3–20.
- [16] Nelsen, R.B. (1999). An Introduction to Copulas. Springer-Verlag, Berlin.
- [17] Schweder, T. and Hjort, N.L. (2016). Confidence, Likelihood, Probability. Cambridge University
- [18] Streitberg, B. (1990). Lancaster interactions revisited. Annals of Statistics 18, 1878–1885.
- [19] Yu, J., Kepner, J.I., and Iyer, R. (2009). Exact tests using two correlated binomial variables in contemporary cancer clinical trials. Biometrical Journal 51, 899–914.