We combine a new data model, in which the random classification noise is subject to rather weak restrictions based on the Mammen-Tsybakov [E. Mammen and A.B. Tsybakov, Ann. Statis. 27 (1999) 1808-1829; A.B. Tsybakov, Ann. Statis. 32 (2004) 135-166] small margin conditions, with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983-1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML '97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83-91] to a noise model that is orthogonal to a set of query functions. We show that every polynomial-time PAC + SQ learning algorithm can be efficiently simulated, provided that the random noise rate is orthogonal, given the target concept, to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model, also due to Decatur, to what we call the constant-partition piecewise orthogonal (CPPO) noise model, and we show how statistical queries can be simulated in the CPPO scenario, provided the partition is known to the learner. Finally, we show how to use PAC + SQ simulators in practice under noise orthogonal to the query space by presenting two examples, one from bioinformatics and one from software engineering. In this way, we demonstrate that our new noise model is realistic.
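The device at the heart of the abstract, answering statistical queries from noise-corrupted examples, can be illustrated in its classical special case: constant classification noise with a known rate η < 1/2, as in Kearns' original simulation. The sketch below is not the paper's orthogonal-noise construction, and all function names (`noisy_sample`, `simulate_sq`) are illustrative.

```python
import random

# A minimal sketch of answering a statistical query E[chi(x, c(x))] from
# examples whose labels are flipped with known probability eta < 1/2
# (Kearns' constant-noise setting, NOT the paper's orthogonal-noise model).

def noisy_sample(concept, draw, eta, n, rng):
    """Draw n examples (x, label), flipping the correct label concept(x)
    with probability eta."""
    data = []
    for _ in range(n):
        x = draw(rng)
        label = concept(x)
        if rng.random() < eta:
            label = 1 - label
        data.append((x, label))
    return data

def simulate_sq(chi, data, eta):
    """Estimate E[chi(x, c(x))] from noisy examples.

    Uses the identity
        E_noisy[chi(x, y)] = (1 - 2*eta) * E[chi(x, c(x))]
                             + eta * E[chi(x, 0) + chi(x, 1)],
    whose second expectation ignores labels and is estimable directly,
    so the noise can be inverted whenever eta != 1/2.
    """
    e_noisy = sum(chi(x, y) for x, y in data) / len(data)
    e_both = sum(chi(x, 0) + chi(x, 1) for x, _ in data) / len(data)
    return (e_noisy - eta * e_both) / (1.0 - 2.0 * eta)
```

For example, with x uniform on [0, 1], target concept c(x) = 1 iff x ≥ 1/2, and query χ(x, ℓ) = ℓ·x, the corrected estimate recovers E[x·1{x ≥ 1/2}] = 0.375 even at noise rate 0.3, while the raw noisy average is biased (to about 0.3).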

Keywords: PAC learning with classification noise, Mammen−Tsybakov small margin conditions, statistical queries, noise model orthogonal to a set of query functions, bioinformatics, software engineering

@article{ITA_2014__48_2_209_0,
     author = {Brodag, Thomas and Herbold, Steffen and Waack, Stephan},
     title = {A {Generalized} {Model} of {PAC} {Learning} and its {Applicability}},
     journal = {RAIRO - Theoretical Informatics and Applications - Informatique Th\'eorique et Applications},
     pages = {209--245},
     publisher = {EDP-Sciences},
     volume = {48},
     number = {2},
     year = {2014},
     doi = {10.1051/ita/2014005},
     mrnumber = {3302485},
     language = {en},
     url = {http://www.numdam.org/articles/10.1051/ita/2014005/}
}

Brodag, Thomas; Herbold, Steffen; Waack, Stephan. A Generalized Model of PAC Learning and its Applicability. RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications, Volume 48 (2014) no. 2, pp. 209-245. doi: 10.1051/ita/2014005. http://www.numdam.org/articles/10.1051/ita/2014005/

[1] D.W. Aha, D. Kibler and M.K. Albert, Instance-based learning algorithms. Machine Learn. 6 (1991) 37-66.

[2] D. Angluin and P. Laird, Learning from noisy examples. Machine Learn. 2 (1988) 343-370.

[3] The Apache Software Foundation, Apache HTTP Server, http://httpd.apache.org/ (2011).

[4] J.A. Aslam, Noise Tolerant Algorithms for Learning and Searching, Ph.D. thesis. MIT (1995).

[5] J.A. Aslam and S.E. Decatur, Specification and Simulation of Statistical Query Algorithms for Efficiency and Noise Tolerance. J. Comput. Syst. Sci. 56 (1998) 191-208.

[6] P.L. Bartlett, S. Boucheron and G. Lugosi, Model selection and error estimation. Machine Learn. 48 (2002) 85-113.

[7] P.L. Bartlett, M.I. Jordan and J.D. McAuliffe, Convexity, classification, and risk bounds. J. Amer. Stat. Assoc. 101 (2006) 138-156.

[8] P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: Risk bounds and structural results, in 14th COLT and 5th EuroCOLT (2001) 224-240.

[9] P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: Risk bounds and structural results. J. Mach. Learn. Res. 3 (2002) 463-482.

[10] A. Blumer, A. Ehrenfeucht, D. Haussler and M.K. Warmuth, Learnability and the Vapnik−Chervonenkis dimension. J. ACM 36 (1989) 929-969.

[11] O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn. (2003) 169-207.

[12] O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn., vol. 3176 of Lect. Notes in Artificial Intelligence. Springer, Heidelberg (2004) 169-207.

[13] T. Brodag, PAC-Lernen zur Insolvenzerkennung und Hotspot-Identifikation, Ph.D. thesis. Ph.D. Programme in Computer Science of the Georg-August University School of Science (GAUSS) (2008).

[14] N. Cesa-Bianchi, S. Shalev-Shwartz and O. Shamir, Online learning of noisy data. IEEE Trans. Inform. Theory 57 (2011) 7907-7931.

[15] S.E. Decatur, Learning in hybrid noise environments using statistical queries, in Fifth International Workshop on Artificial Intelligence and Statistics. Lect. Notes Statis. Springer (1993).

[16] S.E. Decatur, Statistical Queries and Faulty PAC Oracles, in COLT (1993) 262-268.

[17] S.E. Decatur, Efficient Learning from Faulty Data, Ph.D. thesis. Harvard University (1995).

[18] S.E. Decatur, PAC learning with constant-partition classification noise and applications to decision tree induction, in ICML '97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83-91.

[19] S.E. Decatur and R. Gennaro, On learning from noisy and incomplete examples, in COLT (1995) 353-360.

[20] L. Devroye, L. Györfi and G. Lugosi, A Probabilistic Theory of Pattern Recognition. Springer, New York (1997).

[21] The Eclipse Foundation, Eclipse Java Development Tools (JDT), http://www.eclipse.org/jdt/ (2011).

[22] The Eclipse Foundation, Eclipse Platform, http://www.eclipse.org/platform/ (2011).

[23] N.E. Fenton and S.L. Pfleeger, Software metrics: a rigorous and practical approach. PWS Publishing Co., Boston, MA, USA (1997).

[24] S.A. Goldman and R.H. Sloan, Can PAC learning algorithms tolerate random attribute noise? Algorithmica 14 (1995) 70-84.

[25] Protein-protein interactions: coupling of structurally conserved residues and of hot spots across interfaces. Implications for docking. Structure 12 (2004) 1027-1036.

[26] D. Haussler, Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artificial Intelligence 36 (1988) 177-221.

[27] D. Haussler, M. Kearns, N. Littlestone and M.K. Warmuth, Equivalence of models for polynomial learnability. Inform. Comput. 95 (1991) 129-161.

[28] S. Herbold, J. Grabowski and S. Waack, Calculation and optimization of thresholds for sets of software metrics. Empirical Software Engrg. (2011) 1-30. doi: 10.1007/s10664-011-9162-z.

[29] International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, Switzerland. Software engineering - Product quality, Parts 1-4 (2001-2004).

[30] G.H. John and P. Langley, Estimating continuous distributions in Bayesian classifiers, in Proc. of the Eleventh Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann (1995) 338-345.

[31] M.J. Kearns, Efficient noise-tolerant learning from statistical queries. J. ACM 45 (1998) 983-1006.

[32] M. Kearns and M. Li, Learning in the presence of malicious errors. SIAM J. Comput. 22 (1993) 807-837.

[33] M.J. Kearns and R.E. Schapire, Efficient Distribution-Free Learning of Probabilistic Concepts. J. Comput. Syst. Sci. 48 (1994) 464-497.

[34] V. Koltchinskii, Rademacher penalties and structural risk minimization. IEEE Trans. Inform. Theory 47 (2001) 1902-1914.

[35] E. Mammen and A.B. Tsybakov, Smooth discrimination analysis. Ann. Statis. 27 (1999) 1808-1829.

[36] P. Massart, Some applications of concentration inequalities to statistics. Annales de la Faculté des Sciences de Toulouse, volume spécial dédié à Michel Talagrand (2000) 245-303.

[37] S. Mendelson, Rademacher averages and phase transitions in Glivenko-Cantelli classes. IEEE Trans. Inform. Theory 48 (2002) 1977-1991.

[38] Hot spots - A review of the protein-protein interface determinant amino-acid residues. Proteins: Structure, Function, and Bioinformatics 68 (2007) 803-812.

[39] A study of the effect of different types of noise on the precision of supervised learning techniques. Artif. Intell. Rev. 33 (2010) 275-306.

[40] Y. Ofran and B. Rost, ISIS: interaction sites identified from sequence. Bioinform. 23 (2007) 13-16.

[41] Y. Ofran and B. Rost, Protein-protein interaction hotspots carved into sequences. PLoS Comput. Biol. 3 (2007).

[42] J.C. Platt, Fast training of support vector machines using sequential minimal optimization, in Advances in kernel methods. Edited by B. Schölkopf, Ch.J.C. Burges and A.J. Smola. MIT Press, Cambridge, MA, USA (1999) 185-208.

[43] J.R. Quinlan, C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1993).

[44] L. Ralaivola, F. Denis and C.N. Magnan, CN = CPCN, in ICML '06: Proc. of the 23rd Int. Conf. on Machine Learn. ACM, New York, NY, USA (2006) 721-728.

[45] B. Schölkopf and A.J. Smola, Learning with Kernels. MIT Press (2002).

[46] K.S. Thorn and A.A. Bogan, ASEdb: a database of alanine mutations and their effects on the free energy of binding in protein interactions. Bioinformatics 17 (2001) 284-285.

[47] A.B. Tsybakov, Optimal aggregation of classifiers in statistical learning. Ann. Statis. 32 (2004) 135-166.

[48] L.G. Valiant, A theory of the learnable. Communic. ACM 27 (1984) 1134-1142.

[49] L.G. Valiant, Learning disjunctions of conjunctions, in Proc. of 9th Int. Joint Conf. Artificial Int. (1985) 560-566.