dc.contributor.author: Flores-Villamil C.A.; Luna-García H.; Ramírez-Villegas M.; Espino-Salinas C.H.; Mauricio-González A.; Arceo-Olague J.G.
dc.date.accessioned: 2025-04-28T22:09:14Z
dc.date.available: 2025-04-28T22:09:14Z
dc.date.created: 2025
dc.identifier.isbn: 978-303180016-0
dc.identifier.issn: 18650929
dc.identifier.uri: http://hdl.handle.net/11407/8811
dc.description: Schools and their populations are exposed to safety and health risks when they are located close to hazardous features, whether natural or infrastructural. In Mexico, an official standard specifies which features must be considered when selecting a site for a new school, and how close each feature may be to the site without representing a threat. This work applied geospatial analysis and unsupervised machine learning to detect the hazardous features near each school in Mexico and to group the schools according to the patterns in the data. To do so, a data set containing Mexico’s schools and the hazardous features proximal to each one was built by spatially combining multiple official data sets. The K-Modes partitional clustering algorithm was then applied to this data set. Multiple clustering models were built by testing various values of K (the number of clusters), and their quality was measured with internal clustering evaluation metrics. The highest-quality model grouped Mexico’s schools into 11 clusters, each characterized by the hazardous feature or features most common among its schools. The evaluation metrics for this model were a Silhouette Score of 0.72, a Calinski-Harabasz Index of 32863.96, and a Davies-Bouldin Index of 0.72, indicating strong clustering results with well-separated and cohesive clusters. The study followed the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, which consists of multiple phases and tasks for executing data mining projects. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
dc.language.iso: eng
dc.publisher: ANTACOM A.C.; SIP-IPN; UPIITA-IPN
dc.relation.isversionof: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85214094797&doi=10.1007%2f978-3-031-80017-7_6&partnerID=40&md5=39df252f180ce64e93cec29edd3b6a73
dc.source: Communications in Computer and Information Science
dc.source: Commun. Comput. Info. Sci.
dc.source: Scopus
dc.subject: Clustering
dc.subject: Data mining
dc.subject: Geospatial analysis
dc.subject: Machine learning
dc.subject: Schools
dc.subject: Unsupervised learning
dc.subject: Adversarial machine learning
dc.subject: Contrastive learning
dc.subject: Federated learning
dc.subject: Clustering model
dc.subject: Clusterings
dc.subject: Data set
dc.subject: Evaluation metrics
dc.subject: Geo-spatial analysis
dc.subject: Machine-learning
dc.subject: Mexico
dc.subject: Safety and health
dc.subject: School
dc.subject: Unsupervised machine learning
dc.subject: Health risks
dc.title: School Clustering Through Machine Learning and Geospatial Analysis
dc.type: Conference paper
dc.rights.accessrights: info:eu-repo/semantics/restrictedAccess
dc.publisher.program: Ingeniería de Sistemas
dc.type.spa: Documento de conferencia
dc.identifier.doi: 10.1007/978-3-031-80017-7_6
dc.relation.citationvolume: 2298 CCIS
dc.relation.citationstartpage: 86
dc.relation.citationendpage: 104
dc.publisher.faculty: Facultad de Ingenierías
dc.affiliation: Flores-Villamil C.A., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico
dc.affiliation: Luna-García H., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico
dc.affiliation: Ramírez-Villegas M., Universidad de Medellin, Antioquia, Medellin, Colombia
dc.affiliation: Espino-Salinas C.H., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico
dc.affiliation: Mauricio-González A., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico
dc.affiliation: Arceo-Olague J.G., Universidad Autonoma de Zacatecas, Zacatecas, Zacatecas, Mexico
dc.relation.references: Amram O., Abernethy R., Brauer M., Davies H., Allen R.W., Proximity of public elementary schools to major roads in Canadian urban areas, Int. J. Health Geogr., (2011)
dc.relation.references: Grineski S.E., Clark S.E., Collins T.W., School-based exposure to hazardous air pollutants and grade point average: A multi-level study, Environ. Res., 147, pp. 164-171, (2016)
dc.relation.references: Kweon B.-S., Mohai P., Lee S., Sametshaw A.M., Proximity of public schools to major highways and industrial facilities, and students’ school performance and health hazards, Environ. Plan. B Urban Anal. City Sci., 45, 2, pp. 312-329, (2018)
dc.relation.references: Garcia-Zarate M.A., Arellano E., Villada-Canela M., Barajas-Carrillo V.W., Risk scenarios, based on the IDLH of benzene, The Population near Gas Stations of Three Cities in Baja California, Mexico, J. Earth Environ. Sci., 4, 1, pp. 1-7, (2016)
dc.relation.references: Pacheco Martinez N.J., Evaluación del impacto de la infraestructura física educativa en la educación, RIDE Revista Iberoamericana Para La Investigación Y El Desarrollo Educativo, (2021)
dc.relation.references: Espinosa-Zuniga J.J., Aplicación de metodología CRISP-DM para segmentación geográfica de una base de datos pública, Ingeniería Investigación Y Tecnología, Vol, 1, (2020)
dc.relation.references: Xia J., Wang J., Chen H., Zhuang J., Cao Z., Chen P., An unsupervised machine learning approach to evaluate sports facilities condition in primary school, PLoS One, (2022)
dc.relation.references: Rahim R., Santoso J.T., Jumini S., Bhawika G.W., Susilo D., Unsupervised data mining technique for clustering library in Indonesia, Libr. Philos. Pract., (2021)
dc.relation.references: Chrusciel J., et al., Making sense of the French public hospital system: A network-based approach to hospital clustering using unsupervised learning methods, BMC Health Serv. Res., 21, 1244, (2021)
dc.relation.references: Inventario nacional de fenómenos geológicos, Escala 1, 250
dc.relation.references: Pedregosa F., et al., Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 12, pp. 2825-2830, (2011)
dc.relation.references: Balusamy B., Abirami N., Kadry S., Gandomi A.H., Big Data: Concepts, (2021)
dc.relation.references: de Vries N.J., Olech L.P., Moscato P., Introducing clustering with a focus in marketing and consumer analysis, Business and Consumer Analytics: New Ideas, Springer, (2019)
dc.relation.references: Sarkar D., Bali R., Sharma T., Practical Machine Learning with Python, Apress, (2018)
dc.type.version: info:eu-repo/semantics/publishedVersion
dc.identifier.reponame: reponame:Repositorio Institucional Universidad de Medellín
dc.identifier.repourl: repourl:https://repository.udem.edu.co/
dc.identifier.instname: instname:Universidad de Medellín
dc.contributor.event: 5th Latin American Conference on Geographical Information Systems, GIS-LATAM 2024

