arxiv: 2605.18662 · v1 · pith:VVZDF2AFnew · submitted 2026-05-18 · 💻 cs.LG

Efficient and Noise-Tolerant PAC Learning of Multiclass Linear Classifiers

Pith reviewed 2026-05-20 11:36 UTC · model grok-4.3

classification 💻 cs.LG
keywords PAC learning · multiclass classifiers · linear classifiers · nasty noise · noise tolerance · computational efficiency · margin condition

The pith

There is an efficient PAC learning algorithm for multiclass linear classifiers that tolerates constant-rate nasty noise with sample complexity O(k^2 (d log d + log k)).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that multiclass linear classifiers can be learned from data that has been maliciously corrupted at a constant rate. It gives an algorithm based on cluster-based pruning and hinge loss minimization that succeeds with high probability using only O(k^2 (d log d + log k)) samples. This holds when the input marginal is a mixture of bounded-variance distributions and the data satisfies a margin condition. A reader would care because previous efficient noise-tolerant algorithms were known only for binary classification, and this extends the result to any fixed number of classes with quadratic dependence on k.

Core claim

Under the stated distributional assumptions, there exists a computationally efficient algorithm that PAC learns the multiclass linear classifier h_w where h_w(x) = arg max_{y in [k]} w_y · x using at most O(k^2 (d log d + log k)) samples even in the presence of constant-rate nasty noise. The algorithm combines a cluster-based pruning scheme to remove noisy points with a standard multiclass hinge loss minimization program.
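To make the two objects in this claim concrete, here is a minimal sketch of the prediction rule h_w(x) = arg max_y w_y · x and of one common multiclass hinge loss. The Crammer-Singer form of the loss, the tie-breaking rule, and the names `predict` and `multiclass_hinge` are assumptions for illustration; the summary does not say which hinge variant the paper uses.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict(W: List[Vector], x: Vector) -> int:
    """h_w(x) = argmax_y w_y . x; ties go to the smallest class index (an assumption)."""
    scores = [dot(w_y, x) for w_y in W]
    return max(range(len(W)), key=lambda y: scores[y])


def multiclass_hinge(W: List[Vector], x: Vector, y: int) -> float:
    """Crammer-Singer hinge (assumed variant): max(0, 1 + max_{j != y} w_j.x - w_y.x)."""
    scores = [dot(w_y, x) for w_y in W]
    best_other = max(s for j, s in enumerate(scores) if j != y)
    return max(0.0, 1.0 + best_other - scores[y])


# Toy instance with k = 3 classes in d = 2 dimensions (made-up numbers).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x = [2.0, 0.5]
label = predict(W, x)
loss_if_true_label_is_0 = multiclass_hinge(W, x, 0)
loss_if_true_label_is_1 = multiclass_hinge(W, x, 1)
```

The loss is zero only when the correct class beats every other class by at least one unit of score, which is why minimizing it rewards a margin and not just a correct arg max.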

What carries the argument

A cluster-based pruning scheme that removes corrupted examples while preserving the margin condition, followed by multiclass hinge loss minimization on the points that remain.
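The summary does not describe how the pruning step works, so the following is not the paper's scheme. It is only a generic density-based filter, sketched to show the kind of preprocessing that "pruning before loss minimization" refers to: points with too few nearby neighbors are discarded. The function name `prune_isolated` and the parameters `radius` and `min_neighbors` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[Sequence[float], int]  # (feature vector, label)


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def prune_isolated(data: List[Point], radius: float, min_neighbors: int) -> List[Point]:
    """Keep points that have at least min_neighbors other points within radius.

    Illustrative heuristic only; quadratic time in the number of points.
    """
    r2 = radius * radius
    kept = []
    for i, (xi, yi) in enumerate(data):
        neighbors = sum(
            1 for j, (xj, _) in enumerate(data) if j != i and sq_dist(xi, xj) <= r2
        )
        if neighbors >= min_neighbors:
            kept.append((xi, yi))
    return kept


# Three clustered points and one far-away point (made-up data).
data = [([0.0, 0.0], 0), ([0.1, 0.0], 0), ([0.0, 0.1], 0), ([5.0, 5.0], 1)]
cleaned = prune_isolated(data, radius=0.5, min_neighbors=2)
```

A filter like this only catches corruptions that are geometrically isolated; an adversary under nasty noise can also place points inside dense regions, which is presumably why the paper needs a more careful scheme and analysis.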

If this is right

  • The algorithm achieves PAC learning for multiclass linear classifiers under nasty noise for k >= 3, which was previously open.
  • The sample complexity depends quadratically on the number of classes k.
  • The result holds even in the binary case and is stronger than prior binary results.
  • Learning succeeds with high probability if the number of samples meets the stated bound.
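As a rough illustration of how the stated bound scales, the sketch below evaluates k^2 (d log d + log k) for a fixed dimension and a few values of k. The constant `c` stands in for whatever the O-notation hides, the natural logarithm is an assumption, and any dependence on ε and δ is left out because the summary does not state it.

```python
import math


def sample_bound(k: int, d: int, c: float = 1.0) -> float:
    """c * k^2 * (d ln d + ln k); c is a placeholder for the hidden constant."""
    return c * k * k * (d * math.log(d) + math.log(k))


d = 100
for k in (2, 4, 8):
    # Doubling k roughly quadruples the bound, since d ln d dominates ln k here.
    print(k, round(sample_bound(k, d)))

ratio = sample_bound(4, d) / sample_bound(2, d)
```

For d much larger than log k, the d log d term dominates, so the ratio between the bounds at 2k and k stays close to 4.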

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the margin condition can be weakened, the algorithm might apply to a broader class of distributions.
  • Similar pruning techniques could potentially be adapted to other loss functions or noise models in multiclass settings.
  • The quadratic dependence on k suggests that for very large k the method may need further improvement for practicality.

Load-bearing premise

The input points come from a mixture of bounded-variance distributions and the true labels satisfy a margin condition relative to the classifier.
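The exact margin condition is not reproduced in this summary. One common formalization, assumed here purely for illustration, measures the signed distance from a labeled point to the nearest pairwise decision boundary between its own class and any other class. The name `geometric_margin` is hypothetical, and the sketch assumes the rows of `W` are pairwise distinct.

```python
import math
from typing import List, Sequence


def geometric_margin(W: List[Sequence[float]], x: Sequence[float], y: int) -> float:
    """Smallest signed distance from x to a boundary {z : (w_y - w_j) . z = 0}, j != y.

    Positive means x is classified as y with room to spare; negative means misclassified.
    """
    margins = []
    for j, w_j in enumerate(W):
        if j == y:
            continue
        diff = [a - b for a, b in zip(W[y], w_j)]
        norm = math.sqrt(sum(c * c for c in diff))  # assumes w_y != w_j
        margins.append(sum(c * xi for c, xi in zip(diff, x)) / norm)
    return min(margins)


# Toy instance with k = 3 classes in d = 2 dimensions (made-up numbers).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
x = [3.0, 1.0]
gamma = geometric_margin(W, x, 0)
```

Under this reading, a data set satisfies a margin condition with parameter γ when every clean labeled point has geometric margin at least γ.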

What would settle it

A concrete counterexample would falsify the claim: a distribution that is a mixture of bounded-variance distributions and satisfies the margin condition, yet on which the algorithm fails to PAC learn the classifier from O(k^2 (d log d + log k)) samples under constant-rate nasty noise.

Abstract

Noise-tolerant PAC learning of linear models has been of central interest in the machine learning community since the last century. In recent years, many computationally efficient algorithms have been proposed for the problem of learning linear threshold functions under multiple noise models. Yet, when the problem is considered under multiclass learning settings, i.e. when the number of classes $k$ is at least $3$, it is unknown whether there exist computationally efficient PAC learning algorithms when the data sets are maliciously corrupted. In this paper, we consider the setting where the marginal distribution is a mixture of bounded-variance distributions and the data sets also satisfy a margin condition. We show that there exists a computationally efficient algorithm that PAC learns multiclass linear classifiers $\{h_w:x\mapsto \arg\max_{y\in[k]}w_y\cdot x, x\in \mathbb{R}^d, w\in\mathbb{R}^{kd}\}$ using at most $O(k^2\cdot (d\log d+\log k))$ samples even under a constant rate of nasty noise. Our algorithm consists of two main ingredients: a cluster-based pruning scheme and a standard multiclass hinge loss minimization program. Even in the special case of the binary setting, i.e. $k=2$, our result is strictly stronger than all prior works.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper claims there exists a computationally efficient algorithm for PAC learning of multiclass linear classifiers h_w : x ↦ arg max_{y∈[k]} w_y · x under constant-rate nasty noise. The setting assumes the marginal distribution is a mixture of bounded-variance distributions and the data satisfy a margin condition. The algorithm proceeds in two stages—cluster-based pruning to remove noisy points followed by multiclass hinge-loss minimization—and achieves sample complexity O(k² (d log d + log k)). The result is presented as the first efficient algorithm for k ≥ 3 and strictly stronger than prior binary (k=2) results.

Significance. If the central claims hold, the work closes an important gap by supplying the first poly-time noise-tolerant PAC learner for multiclass linear classifiers with favorable dependence on both dimension and number of classes. The explicit two-stage construction (pruning plus hinge minimization) and the strengthening of the binary case are concrete strengths. The result is internally consistent with the stated mixture-plus-margin assumptions and supplies a falsifiable algorithmic procedure together with an explicit sample bound.

major comments (1)
  1. [§3, Theorem 3.1] The stated sample bound O(k² (d log d + log k)) does not visibly include the usual 1/ε and log(1/δ) factors; if these are absorbed into the O-notation or stated separately in the full theorem, the dependence should be written explicitly so that optimality relative to the binary case can be verified.
minor comments (2)
  1. [§2.2] The precise definition of the margin condition (minimum distance to the decision boundary) should be restated when it is first used in the pruning analysis to avoid forward references.
  2. [Figure 1] The caption should indicate which panels correspond to the pruning step versus the hinge-minimization step so that the two-stage flow is immediately clear.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their positive evaluation of our work and for the recommendation of minor revision. We address the major comment point by point below.

Point-by-point responses
  1. Referee: [§3, Theorem 3.1] The stated sample bound O(k² (d log d + log k)) does not visibly include the usual 1/ε and log(1/δ) factors; if these are absorbed into the O-notation or stated separately in the full theorem, the dependence should be written explicitly so that optimality relative to the binary case can be verified.

    Authors: We agree that the dependence on ε and δ should be stated explicitly rather than being absorbed into the O-notation. We will update the statement of Theorem 3.1 (and the corresponding discussion in the abstract and introduction) to make the full sample complexity, including the factors of 1/ε and log(1/δ), visible. This will allow readers to easily compare the dependence on d and k with prior binary results. The revision will be included in the next version of the paper.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

Full rationale

The paper constructs a new two-stage algorithm (cluster-based pruning followed by multiclass hinge-loss minimization) and proves its PAC sample bound O(k²(d log d + log k)) under the explicit assumptions of a mixture-of-bounded-variance marginal and a margin condition. No equation or theorem reduces a claimed prediction to a fitted parameter by construction, no load-bearing uniqueness theorem is imported from the authors' prior work, and the binary-case strengthening is presented as a direct consequence of the same construction rather than a renaming or self-referential fit. The argument therefore remains independent of its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard PAC-learning assumptions plus two domain-specific conditions (mixture of bounded-variance distributions and margin condition) that are not derived inside the paper.

axioms (2)
  • Domain assumption: the marginal distribution is a mixture of bounded-variance distributions.
    Invoked in the abstract as the distributional setting required for the algorithm to succeed.
  • Domain assumption: the data sets satisfy a margin condition.
    Stated together with the mixture assumption as the joint precondition for the PAC guarantee.

pith-pipeline@v0.9.0 · 5755 in / 1302 out tokens · 31573 ms · 2026-05-20T11:36:27.497100+00:00 · methodology


