Heuristic approach applied to the optimum stratification problem
RAIRO. Operations Research, Volume 55 (2021) no. 2, pp. 979-996

The problem of finding an optimal sample stratification has been extensively studied in the literature. In this paper, we propose a heuristic optimization method for solving the univariate optimum stratification problem, in which the sample size is minimized for a given precision level. The method is based on the variable neighborhood search metaheuristic combined with an exact method. Numerical experiments were performed on a dataset of 24 instances, and the results of the proposed algorithm were compared with those of two well-known methods from the literature. The proposed algorithm outperformed them in 94% of the cases considered. In addition, we developed an enumeration algorithm that finds the globally optimal solution for some populations and scenarios, which enabled us to validate our metaheuristic method. We found that our algorithm obtained the globally optimal solution in the vast majority of these cases.
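As a rough illustration of the objective described above, the sketch below computes the sample size needed to reach a target coefficient of variation for a fixed set of strata, and then enumerates all boundary placements on a tiny population. It is not the authors' algorithm: it assumes simple random sampling without replacement in each stratum and uses the textbook Neyman allocation formula in place of the exact allocation method mentioned in the abstract. It also ignores practical constraints such as n_h ≤ N_h. The function names `neyman_sample_size` and `enumerate_best_boundaries` are hypothetical.

```python
from itertools import combinations
from math import ceil
from statistics import mean, stdev


def neyman_sample_size(strata, target_cv):
    """Sample size needed for the stratified mean to reach target_cv,
    assuming Neyman allocation (an assumption, not the paper's exact method).

    n = (sum W_h S_h)^2 / (V + (1/N) sum W_h S_h^2), with V = (cv * mean)^2.
    Every stratum must hold at least two units so that S_h is defined.
    """
    N = sum(len(s) for s in strata)
    ybar = mean(y for s in strata for y in s)
    weights = [len(s) / N for s in strata]   # W_h = N_h / N
    sds = [stdev(s) for s in strata]         # S_h, divisor N_h - 1
    target_var = (target_cv * ybar) ** 2
    num = sum(w * s for w, s in zip(weights, sds)) ** 2
    den = target_var + sum(w * s * s for w, s in zip(weights, sds)) / N
    return ceil(num / den)


def enumerate_best_boundaries(values, n_strata, target_cv, min_size=2):
    """Brute-force search over all cut points; feasible only for tiny inputs."""
    ys = sorted(values)
    # A cut is admissible only between two distinct consecutive values.
    cuts = [i for i in range(1, len(ys)) if ys[i - 1] < ys[i]]
    best = None
    for combo in combinations(cuts, n_strata - 1):
        edges = (0, *combo, len(ys))
        strata = [ys[a:b] for a, b in zip(edges, edges[1:])]
        if min(len(s) for s in strata) < min_size:
            continue
        n = neyman_sample_size(strata, target_cv)
        if best is None or n < best[0]:
            # Report the upper boundary of each stratum except the last.
            best = (n, [s[-1] for s in strata[:-1]])
    return best


population = [1, 2, 3, 4, 10, 11, 12, 13, 50, 52, 55, 60]
print(enumerate_best_boundaries(population, n_strata=3, target_cv=0.05))
```

The number of candidate stratifications grows combinatorially with the population size and the number of strata, which is why exhaustive enumeration of this kind can only serve as a validation tool on small cases while a metaheuristic handles the larger ones.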

DOI: 10.1051/ro/2021051
Classification: 90C59, 62D05
Keywords: Sampling, stratification, VNDS, exact methods
@article{RO_2021__55_2_979_0,
     author = {Andr\'e Brito, Jos\'e and de Lima, Leonardo and Henrique Gonz\'alez, Pedro and Oliveira, Breno and Maculan, Nelson},
     title = {Heuristic approach applied to the optimum stratification problem},
     journal = {RAIRO. Operations Research},
     pages = {979--996},
     year = {2021},
     publisher = {EDP-Sciences},
     volume = {55},
     number = {2},
     doi = {10.1051/ro/2021051},
     mrnumber = {4254327},
     language = {en},
     url = {https://www.numdam.org/articles/10.1051/ro/2021051/}
}
André Brito, José; de Lima, Leonardo; Henrique González, Pedro; Oliveira, Breno; Maculan, Nelson. Heuristic approach applied to the optimum stratification problem. RAIRO. Operations Research, Volume 55 (2021) no. 2, pp. 979-996. doi: 10.1051/ro/2021051

