Classification tasks are frequent in many applications in science and engineering, and a wide variety of statistical learning methods exist to deal with them. However, in many industrial applications, the samples available to train a classifier are scarce, which degrades classification performance. In this work, we consider the case in which some a priori information on the system is available in the form of a mathematical model. In particular, a set of numerical simulations of the system can be integrated into the experimental dataset. The main question we address is how to integrate them systematically in order to improve classification performance. The proposed method is based on Nearest Neighbours and on the notion of Hausdorff distance between sets. Some theoretical results and several numerical studies are presented.
Keywords: Classification, Hausdorff distance, nearest neighbors
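As a rough illustration of the Hausdorff distance named in the abstract, the sketch below computes it for two finite point sets under the Euclidean metric. It is not the authors' algorithm: the function names `directed_hausdorff` and `hausdorff`, the choice of metric, and the toy "experimental" and "simulated" points are all assumptions introduced here.

```python
import math


def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest neighbour in B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Hypothetical toy data: two measured points and three simulated points.
experimental = [(0.0, 0.0), (1.0, 0.0)]
simulated = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

# Every experimental point coincides with a simulated one, so the directed
# distance from experimental to simulated is zero; the simulated point
# (4, 0) lies 3 away from its nearest experimental point.
print(directed_hausdorff(experimental, simulated))
print(hausdorff(experimental, simulated))
```

In this toy case the symmetric distance is driven entirely by the simulated point that lies far from any measurement, which is the kind of mismatch between simulated and experimental sets that a set-to-set distance can expose.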
@article{M2AN_2021__55_5_2259_0,
author = {Lombardi, Damiano and Raphel, Fabien},
title = {A method to enrich experimental datasets by means of numerical simulations in view of classification tasks},
journal = {ESAIM: Mathematical Modelling and Numerical Analysis},
pages = {2259--2291},
year = {2021},
publisher = {EDP-Sciences},
volume = {55},
number = {5},
doi = {10.1051/m2an/2021060},
mrnumber = {4323407},
zbl = {07477245},
language = {en},
url = {https://www.numdam.org/articles/10.1051/m2an/2021060/}
}
TY - JOUR
AU - Lombardi, Damiano
AU - Raphel, Fabien
TI - A method to enrich experimental datasets by means of numerical simulations in view of classification tasks
JO - ESAIM: Mathematical Modelling and Numerical Analysis
PY - 2021
SP - 2259
EP - 2291
VL - 55
IS - 5
PB - EDP-Sciences
UR - https://www.numdam.org/articles/10.1051/m2an/2021060/
DO - 10.1051/m2an/2021060
LA - en
ID - M2AN_2021__55_5_2259_0
ER -
%0 Journal Article
%A Lombardi, Damiano
%A Raphel, Fabien
%T A method to enrich experimental datasets by means of numerical simulations in view of classification tasks
%J ESAIM: Mathematical Modelling and Numerical Analysis
%D 2021
%P 2259-2291
%V 55
%N 5
%I EDP-Sciences
%U https://www.numdam.org/articles/10.1051/m2an/2021060/
%R 10.1051/m2an/2021060
%G en
%F M2AN_2021__55_5_2259_0
Lombardi, Damiano; Raphel, Fabien. A method to enrich experimental datasets by means of numerical simulations in view of classification tasks. ESAIM: Mathematical Modelling and Numerical Analysis, Volume 55 (2021) no. 5, pp. 2259-2291. doi: 10.1051/m2an/2021060





