dc.contributor.author | Flores-Villamil C.A.; Luna-García H.; Ramírez-Villegas M.; Espino-Salinas C.H.; Mauricio-González A.; Arceo-Olague J.G. | |
dc.date.accessioned | 2025-04-28T22:09:14Z | |
dc.date.available | 2025-04-28T22:09:14Z | |
dc.date.created | 2025 | |
dc.identifier.isbn | 978-303180016-0 | |
dc.identifier.issn | 18650929 | |
dc.identifier.uri | http://hdl.handle.net/11407/8811 | |
dc.description | Schools and their populations are prone to safety and health risks due to their proximity to hazardous features, either natural or infrastructural. In Mexico, an official standard specifies which features to consider when selecting the location for a new school, and how close these features may be to the school’s location without representing a threat. This work applied both geospatial analysis and unsupervised machine learning techniques to detect the hazardous features near each school in Mexico, and to group these schools according to the patterns in the data. For this, a data set containing Mexico’s schools and the proximal hazardous features for each school was built by spatially combining multiple official data sets. The K-Modes partitional clustering algorithm was then applied to this data set. Multiple clustering models were built by testing various K values (number of clusters), and their clustering quality was measured with internal clustering evaluation metrics. The highest-quality model grouped Mexico’s schools into 11 clusters, each one indicating the most common hazardous feature(s) among the schools in that cluster. The evaluation metric results for this model were: Silhouette Score (0.72), Calinski-Harabasz Index (32863.96), and Davies-Bouldin Index (0.72), indicating strong clustering results, with well-separated and cohesive clusters. The study followed the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, which consists of multiple phases and tasks for executing data mining projects. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025. | |
dc.language.iso | eng | |
dc.publisher | ANTACOM A.C.; SIP-IPN; UPIITA-IPN | |
dc.relation.isversionof | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85214094797&doi=10.1007%2f978-3-031-80017-7_6&partnerID=40&md5=39df252f180ce64e93cec29edd3b6a73 | |
dc.source | Communications in Computer and Information Science | |
dc.source | Commun. Comput. Info. Sci. | |
dc.source | Scopus | |
dc.subject | Clustering | |
dc.subject | Data mining | |
dc.subject | Geospatial analysis | |
dc.subject | Machine learning | |
dc.subject | Schools | |
dc.subject | Unsupervised learning | |
dc.subject | Adversarial machine learning | |
dc.subject | Contrastive learning | |
dc.subject | Federated learning | |
dc.subject | Clustering model | |
dc.subject | Clusterings | |
dc.subject | Data set | |
dc.subject | Evaluation metrics | |
dc.subject | Geo-spatial analysis | |
dc.subject | Machine-learning | |
dc.subject | Mexico | |
dc.subject | Safety and health | |
dc.subject | School | |
dc.subject | Unsupervised machine learning | |
dc.subject | Health risks | |
dc.title | School Clustering Through Machine Learning and Geospatial Analysis | |
dc.type | Conference paper | |
dc.rights.accessrights | info:eu-repo/semantics/restrictedAccess | |
dc.publisher.program | Ingeniería de Sistemas | |
dc.type.spa | Documento de conferencia | |
dc.identifier.doi | 10.1007/978-3-031-80017-7_6 | |
dc.relation.citationvolume | 2298 CCIS | |
dc.relation.citationstartpage | 86 | |
dc.relation.citationendpage | 104 | |
dc.publisher.faculty | Facultad de Ingenierías | |
dc.affiliation | Flores-Villamil C.A., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico | |
dc.affiliation | Luna-García H., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico | |
dc.affiliation | Ramírez-Villegas M., Universidad de Medellin, Antioquia, Medellin, Colombia | |
dc.affiliation | Espino-Salinas C.H., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico | |
dc.affiliation | Mauricio-González A., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico | |
dc.affiliation | Arceo-Olague J.G., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico | |
dc.relation.references | Amram O., Abernethy R., Brauer M., Davies H., Allen R.W., Proximity of public elementary schools to major roads in Canadian urban areas, Int. J. Health Geogr, (2011) | |
dc.relation.references | Grineski S.E., Clark S.E., Collins T.W., School-based exposure to hazardous air pollutants and grade point average: A multi-level study, Environ. Res., 147, pp. 164-171, (2016) | |
dc.relation.references | Kweon B.-S., Mohai P., Lee S., Sametshaw A.M., Proximity of public schools to major highways and industrial facilities, and students’ school performance and health hazards, Environ. Plan. B Urban Anal. City Sci., 45, 2, pp. 312-329, (2018) | |
dc.relation.references | Garcia-Zarate M.A., Arellano E., Villada-Canela M., Barajas-Carrillo V.W., Risk scenarios, based on the IDLH of benzene, The Population near Gas Stations of Three Cities in Baja California, Mexico. J. Earth Environ. Sci. 4(1), 1–7, (2016) | |
dc.relation.references | Pacheco Martinez N.J., Evaluación del impacto de la infraestructura física educativa en la educación, RIDE Revista Iberoamericana Para La Investigación Y El Desarrollo Educativo, (2021) | |
dc.relation.references | Espinosa-Zuniga J.J., Aplicación de metodología CRISP-DM para segmentación geográfica de una base de datos pública, Ingeniería Investigación Y Tecnología, Vol, 1, (2020) | |
dc.relation.references | Xia J., Wang J., Chen H., Zhuang J., Cao Z., Chen P., An unsupervised machine learning approach to evaluate sports facilities condition in primary school, Plos One, (2022) | |
dc.relation.references | Rahim R., Santoso J.T., Jumini S., Bhawika G.W., Susilo D., Unsupervised data mining technique for clustering library in Indonesia, Libr. Philos. Pract., (2021) | |
dc.relation.references | Chrusciel J., Et al., Making sense of the French public hospital system: A network-based approach to hospital clustering using unsupervised learning methods. BMC Health Serv. Res, XXI(1244), (2021) | |
dc.relation.references | Inventario nacional de fenómenos geológicos, Escala 1, 250 | |
dc.relation.references | Pedregosa F., Et al., Scikit-learn: Machine learning in Python, J. Mach. Learn. Res, 12, pp. 2825-2830, (2011) | |
dc.relation.references | Balusamy B., Abirami N., Kadry S., Gandomi A.H., Big Data: Concepts, (2021) | |
dc.relation.references | de Vries N.J., Olech L.P., Moscato P., Introducing clustering with a focus in marketing and consumer analysis, Business and Consumer Analytics: New Ideas. Springer, (2019) | |
dc.relation.references | Sarkar D., Bali R., Sharma T., Practical Machine Learning with Python, Apress, (2018) | |
dc.type.version | info:eu-repo/semantics/publishedVersion | |
dc.identifier.reponame | reponame:Repositorio Institucional Universidad de Medellín | |
dc.identifier.repourl | repourl:https://repository.udem.edu.co/ | |
dc.identifier.instname | instname:Universidad de Medellín | |
dc.contributor.event | 5th Latin American Conference on Geographical Information Systems, GIS-LATAM 2024 | |