This paper presents a biased random-key genetic algorithm for k-medoids clustering problem. A novel heuristic operator was implemented and combined with a parallelized local search procedure. Experiments were carried out with fifty literature data sets with small, medium, and large sizes, considering several numbers of clusters, showed that the proposed algorithm outperformed eight other algorithms, for example, the classics PAM and CLARA algorithms. Furthermore, with the results of a linear integer programming formulation, we found that our algorithm obtained the global optimal solutions for most cases and, despite its stochastic nature, presented stability in terms of quality of the solutions obtained and the number of generations required to produce such solutions. In addition, considering the solutions (clusterings) produced by the algorithms, a relative validation index (average silhouette) was applied, where, again, was observed that our method performed well, producing cluster with a good structure.
Keywords: $$-Medoids, Optimization, metaheuristics, Rio de Janeiro, Brazilmathematical programming
@article{RO_2022__56_4_3137_0,
author = {Brito, Jose Andre and Semaan, Gustavo and Fadel, Augusto},
title = {The effective {BRKGA} algorithm for the $k$-medoids clustering problem},
journal = {RAIRO. Operations Research},
pages = {3137--3153},
year = {2022},
publisher = {EDP-Sciences},
volume = {56},
number = {4},
doi = {10.1051/ro/2022141},
mrnumber = {4474769},
zbl = {1500.90055},
language = {en},
url = {https://www.numdam.org/articles/10.1051/ro/2022141/}
}
TY - JOUR AU - Brito, Jose Andre AU - Semaan, Gustavo AU - Fadel, Augusto TI - The effective BRKGA algorithm for the $k$-medoids clustering problem JO - RAIRO. Operations Research PY - 2022 SP - 3137 EP - 3153 VL - 56 IS - 4 PB - EDP-Sciences UR - https://www.numdam.org/articles/10.1051/ro/2022141/ DO - 10.1051/ro/2022141 LA - en ID - RO_2022__56_4_3137_0 ER -
%0 Journal Article %A Brito, Jose Andre %A Semaan, Gustavo %A Fadel, Augusto %T The effective BRKGA algorithm for the $k$-medoids clustering problem %J RAIRO. Operations Research %D 2022 %P 3137-3153 %V 56 %N 4 %I EDP-Sciences %U https://www.numdam.org/articles/10.1051/ro/2022141/ %R 10.1051/ro/2022141 %G en %F RO_2022__56_4_3137_0
Brito, Jose Andre; Semaan, Gustavo; Fadel, Augusto. The effective BRKGA algorithm for the $k$-medoids clustering problem. RAIRO. Operations Research, Tome 56 (2022) no. 4, pp. 3137-3153. doi: 10.1051/ro/2022141
[1] and , A new global optimization algorithm for diameter minimization clusteringIn: Proceedings of Global Optimization Workshop (GOW16), Portugal, edited by . University of Minho/Algoritmi Research Centre. (2016) 171–174.
[2] , and , An optimization algorithm applied to the one-dimensional stratification problem. Surv. Methodol. 45 (2019) 295–315.
[3] , and , On the efficiency of evolutionary fuzzy clustering. J. Heuristics 15 (2009) 43–75. | Zbl | DOI
[4] , and , Improved search strategies and extensions to k-medoids-based clustering algorithms. Int. J. Bus. Intell. Data Min. 3 (2008) 212–231.
[5] , Network and Discrete Location: Models, Algorithms, and Applications, 2nd edition. John Wiley & Sons (2013). | MR | Zbl
[6] , and , Minimization of the number of iterations in k-medoids clustering with purity algorithm. Rev. Intell. Artif. 35 (2021) 193–199.
[7] , , and , Microaggregation heuristic applied to statistical disclosure control. Inf. Sci. 548 (2021) 37–55. | MR | DOI
[8] , Genetic Algorithms and Grouping Problems. John Wiley & Sons (1998).
[9] , A biased random-key genetic algorithm for data clustering. Math. Biosci. 245 (2013) 76–85. | MR | Zbl | DOI
[10] and , Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company (1979). | MR | Zbl
[11] and , Biased random-key genetic algorithms for combinatorial optimization. J. Heuristics 17 (2011) 487–525. | DOI
[12] , Clustering overview and applications. In: Encyclopedia of Database Systems, edited by and . Springer, Boston, MA (2009). | DOI
[13] , , and , Multivariate Data Analysis, 8th edition. Cengage Learning (2018).
[14] , Optimum location of switching centers and the absolute centers and medians of a graph. Oper. Res. 12 (1964) 450–459. | Zbl | DOI
[15] and , Clarans: a method for clustering objects for spatial datamining. IEEE Trans. Knowl. Data Eng. 14 (2002) 1003–1016. | DOI
[16] , and , Data Mining: Concepts and Techniques, 4th edition. Morgan Kaufmann Publishers 2022).
[17] and , Cluster analysis and mathematical programming. Math. Program. 79 (1997) 191–215. | MR | Zbl | DOI
[18] , and , An improved version of -medoid algorithm using CRO. Modern Appl. Sci. 12 (2018) 116–127. | DOI
[19] and , Applied Multivariate Statistical Analysis, 6th edition. Pearson (2018). | MR
[20] and , Finding Groups in Data: An Introduction to Cluster Analysis. Wiley-Interscience (1990). | MR | Zbl | DOI
[21] , , and , A BRKGA-de algorithm for parallel-batching scheduling with deterioration and learning effects on parallel machines under preventive maintenance consideration. Ann. Math. Artif. Intell. 88 (2020) 237–267. | MR | DOI
[22] , and , Location Science, 2nd edition. Springer (2019). | Zbl | MR
[23] , and , On -medoid clustering of large data sets with the aid of a genetic algorithm: background, feasibility and comparison. Anal. Chim. Acta 282 (1993) 647–669. | DOI
[24] , and , Handbook of Heuristics, 1st edition. Springer (2018). | Zbl
[25] and , On the complexity of some common geometric location problems. SIAM J. Comput. 13 (1984) 182–196. | MR | Zbl | DOI
[26] , and , Investigation of a new grasp- based clustering algorithm applied to biological data. Comput. Oper. Res. 37 (2010) 1381–1388. | Zbl | DOI
[27] , and , A hybrid heuristic for the -medoids clustering problem. In: GECCO’12: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation. Association for Computing Machinery, New York, NY, USA (2012) 417–424.
[28] and , The capacitated centred clustering problem. Comput. Oper. Res. 33 (2006) 1639–1663. | Zbl | DOI
[29] , , , and , Capacitated clustering problems applied to the layout of it-teams in software factories. Ann. Oper. Res. (2020). DOI: . | DOI | MR | Zbl
[30] , and , A comparison of two hybrid methods for constrained clustering problems. Appl. Soft Comput. 54 (2017) 256–266. | DOI
[31] , and , A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification. Fuzzy Sets Syst. 155 (2005) 191–214. | MR | DOI
[32] and , A simple and fast algorithm for -medoids clustering. Expert Syst. App. 36 (2009) 3336–3341. | DOI
[33] , and , -prototype algorithm for clustering large data sets with categorical values to established product segmentation. In: Proceedings of Data Analytics and Management, edited by , , , and . Springer, Singapore (2022) 343–353. | DOI
[34] , Cluster analysis and mathematical programming. J. Am. Stat. Assoc. 66 (1971) 622–626. | Zbl | DOI
[35] and , Faster -medoids clustering: Improving the pam, clara, and clarans algorithms. In: Similarity Search and Applications, edited by , , and . Springer International Publishing, Cham (2019) 171–187. | DOI
[36] and , Fast and eager -medoids clustering: runtime improvement of the pam, clara, and clarans algorithms. Inf. Syst. 101 (2021) 101804. | DOI
[37] , Algoritmos para o Problema de Agrupamento Automático. Ph.D. thesis, Federal Fluminense University (2013).
[38] , , , , , and , A brief history of heuristics: from bounded rationality to intractability. IEEE Latin Am. Trans. 18 (2021) 1975–1986. | DOI
[39] and , A hybrid algorithm for -medoid clustering of large data sets. In: Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753) 1 (2004) 77–82. | DOI
[40] and , A genetic -medoids clustering algorithm. J. Heuristics 12 (2006) 447–466. | DOI
[41] , and , PAMAE: parallel -medoids clustering with high accuracy and efficiency. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, New York, NY, USA, (2017) 1087–1096.
[42] and , On the virtues of parameterized uniform crossover. In: Proceedings of the Fourth International Conference on Genetic Algorithms. (1991) 230–236.
[43] and , Fuzzy kernel -medoids clustering algorithm for uncertain data objects. Pattern Anal. App. 24 (2021) 1287–1302. | DOI
[44] and , A genetic approach to the automatic clustering problem. Pattern Recogn. 34 (2001) 415–424. | Zbl | DOI
[45] and , Near-optimal large-scale -medoids clustering. Inf. Sci. 545 (2021) 344–362. | MR | Zbl | DOI
[46] and , A parallel heuristic for a -medoids clustering problem with unfixed number of clusters. In: 2019 42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (2019) 1116–1120. | DOI
[47] , and , A new partitioning around medoids algorithm. J. Stat. Comput. Simul. 78 (2003) 575–584. | MR | Zbl | DOI
[48] , Integer programming and theory of grouping. J. Am. Stat. Assoc. 64 (1969) 506–517. | Zbl | DOI
[49] and , A fast -medoids clustering algorithm for image segmentation based object recognition. J. Rob. Autom. 4 (2021) 202–211.
[50] , Integer Programming, 2nd edition. Wiley (2020). | MR
[51] and , comprehensive survey of clustering algorithms. Ann. Data Sci. 2 (2015) 165–193. | DOI
[52] and , Survey clustering algorithms. IEEE Trans. Neural Netw. 16 (2005) 645–678. | DOI
[53] , , and , An improved -medoids algorithm based on step increasing and optimizing. Expert Syst. App. 92 (2018) 464–473. | DOI
[54] , and , Ranked -medoids: a fast and accurate rank-based partitioning algorithm for clustering large datasets. Knowl.-Based Syst. 39 (2013) 133–143. | DOI
[55] and , A new and efficient -medoid algorithm for spatial clustering. Lecture Notes Comput. Sci. 3482 (2005) 181–189. | DOI
[56] , and , Evolutionary multi-objective automatic clustering enhanced with quality metrics and ensemble strategy. Knowl. Based Syst. 188 (2020) 105018. | DOI
Cité par Sources :





