pith. machine review for the scientific record.

arxiv: 1610.00378 · v2 · submitted 2016-10-03 · 💻 cs.AI

Recognition: unknown

Improving Accuracy and Scalability of the PC Algorithm by Maximizing P-value

classification 💻 cs.AI
keywords algorithm · accuracy · p-value · bidirected · buhlmann · choose · colliders · conditioning

A number of attempts have been made to improve the accuracy and/or scalability of the PC (Peter and Clark) algorithm, some of them well known (e.g., Bühlmann et al., 2010; Kalisch and Bühlmann, 2007; 2008; Zhang, 2012). We add one more tool to the toolbox: the simple observation that if one is forced to choose among several possible conditioning sets for a pair of variables, one should choose the one with the highest p-value. One can use the CPC (Conservative PC; Ramsey et al., 2012) algorithm as a guide to the possible sepsets (separating sets) for a pair of variables. However, whereas CPC uses a voting rule to classify colliders versus noncolliders, our proposed algorithm, PC-Max, picks the conditioning set with the highest p-value, so that there are no ambiguities. We combine this with two other optimizations: (a) avoiding bidirected edges in the orientation of colliders, and (b) parallelization. For (b) we borrow ideas from the PC-Stable algorithm (Colombo and Maathuis, 2014). The result is an algorithm that scales quite well in terms of both accuracy and running time, with no risk of bidirected edges.
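The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the candidate conditioning sets for a nonadjacent pair (x, z) are the subsets of each variable's neighbours, as in CPC-style checks, and that a conditional independence test is supplied by the caller. The names `candidate_sets`, `max_p_sepset`, `is_collider`, and `ci_pvalue` are hypothetical, and the p-values in the example are made-up toy numbers rather than results from the paper.

```python
from itertools import combinations


def candidate_sets(x, z, adjacencies):
    """Yield each distinct subset of adj(x) minus {z} and adj(z) minus {x}.

    Assumption: these CPC-style subsets are the candidate conditioning sets.
    """
    seen = set()
    for node, other in ((x, z), (z, x)):
        neighbours = sorted(adjacencies[node] - {other})
        for size in range(len(neighbours) + 1):
            for subset in combinations(neighbours, size):
                key = frozenset(subset)
                if key not in seen:
                    seen.add(key)
                    yield key


def max_p_sepset(x, z, adjacencies, ci_pvalue):
    """Return (conditioning set, p-value) with the highest p-value.

    ci_pvalue(x, z, cond) is a caller-supplied (hypothetical) test that
    returns the p-value for "x independent of z given cond".
    """
    best_set, best_p = None, -1.0
    for cond in candidate_sets(x, z, adjacencies):
        p = ci_pvalue(x, z, cond)
        if p > best_p:
            best_set, best_p = cond, p
    return best_set, best_p


def is_collider(x, y, z, adjacencies, ci_pvalue):
    """Classify the unshielded triple x - y - z using the max-p set only."""
    sepset, _ = max_p_sepset(x, z, adjacencies, ci_pvalue)
    return y not in sepset


# Toy example: X - Y - Z is an unshielded triple and X has one more neighbour, W.
adjacencies = {"X": {"Y", "W"}, "Y": {"X", "Z"}, "Z": {"Y"}, "W": {"X"}}

# Made-up p-values for "X independent of Z given S", keyed by S.
toy_pvalues = {
    frozenset(): 0.40,
    frozenset({"W"}): 0.62,
    frozenset({"Y"}): 0.01,
    frozenset({"W", "Y"}): 0.03,
}


def toy_test(x, z, cond):
    return toy_pvalues[cond]


sepset, p = max_p_sepset("X", "Z", adjacencies, toy_test)
collider = is_collider("X", "Y", "Z", adjacencies, toy_test)
print(sorted(sepset), p, collider)
```

In this toy case the set {W} has the highest p-value and does not contain Y, so the triple would be treated as a collider. A voting rule over all four candidate sets would see Y in some sets and not in others, which is the kind of ambiguity the single max-p choice avoids.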

This paper has not been read by Pith yet.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data

    stat.ML · 2026-05 · unverdicted · novelty 7.0

    FFML approximates GP marginal likelihood scores and FFCI provides an RFF-based CI test for mixed data, enabling scalable nonlinear causal discovery with empirical gains over baselines.

  2. Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data

    stat.ML · 2026-05 · unverdicted · novelty 6.0

    FFML, TRFF, and FFCI are practical RFF-based approximations that replace expensive GP kernel matrices with finite feature maps, delivering competitive precision-recall trade-offs for score-based and constraint-based c...