INSTITUTIONAL REPOSITORY


School Clustering Through Machine Learning and Geospatial Analysis

Date
2025
Author
Flores-Villamil C.A.; Luna-García H.; Ramírez-Villegas M.; Espino-Salinas C.H.; Mauricio-González A.; Arceo-Olague J.G.

Publisher
ANTACOM A.C.; SIP-IPN; UPIITA-IPN
Abstract
Schools and their populations are exposed to safety and health risks when they are located near hazardous features, whether natural or infrastructural. In Mexico, an official standard specifies which features to consider when selecting a site for a school and how close those features may be to the site without posing a threat. This work applies geospatial analysis and unsupervised machine learning to identify the hazardous features near each school in Mexico and to group the schools according to patterns in the data. To this end, a dataset containing Mexico's schools and the hazardous features near each school was built by spatially combining multiple official datasets. The K-Modes partitional clustering algorithm was then applied to this dataset. Multiple clustering models were built by testing various values of K (the number of clusters), and their quality was measured with internal clustering evaluation metrics. The highest-quality model grouped Mexico's schools into 11 clusters, each characterized by the hazardous feature or features most common among its schools. The evaluation metrics for this model were a Silhouette Score of 0.72, a Calinski-Harabasz Index of 32863.96, and a Davies-Bouldin Index of 0.72, indicating strong clustering results with well-separated and cohesive clusters. The study followed the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology, which organizes data mining projects into multiple phases and tasks. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
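The model-selection loop described in the abstract (fit K-Modes for several values of K, then compare an internal quality metric) can be illustrated with a minimal sketch. This is not the authors' implementation: the three hazard columns, the toy records, the random initialisation, and the use of Hamming distance for the silhouette are all assumptions made for illustration, and the Calinski-Harabasz and Davies-Bouldin indices reported in the abstract are omitted.

```python
import random
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def column_modes(rows):
    """Most frequent value of each attribute, i.e. the cluster mode."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

def k_modes(records, k, n_iter=20, seed=0):
    """Plain K-Modes: assign each record to its nearest mode, then update modes."""
    rng = random.Random(seed)
    # Start from k distinct records so no two initial modes coincide.
    modes = rng.sample(sorted(set(records)), k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: hamming(r, modes[j]))
                      for r in records]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the previous mode if a cluster empties out
                modes[j] = column_modes(members)
    cost = sum(hamming(r, modes[lab]) for r, lab in zip(records, labels))
    return labels, modes, cost

def silhouette(records, labels):
    """Mean silhouette coefficient, computed with Hamming distance."""
    scores = []
    for i, r in enumerate(records):
        by_cluster = {}
        for j, other in enumerate(records):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(hamming(r, other))
        own = by_cluster.pop(labels[i], None)
        if own is None or not by_cluster:
            scores.append(0.0)  # singleton cluster, or only one cluster
            continue
        a = sum(own) / len(own)                                  # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())    # separation
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

def best_k(records, candidates):
    """Fit one model per candidate K and keep the best silhouette."""
    scores = {k: silhouette(records, k_modes(records, k)[0]) for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical columns: (near_river, near_gas_station, near_power_line).
records = ([("yes", "no", "no")] * 4
           + [("no", "yes", "no")] * 4
           + [("no", "no", "yes")] * 4)

if __name__ == "__main__":
    k, scores = best_k(records, [2, 3])
    print("selected K:", k, "silhouette per K:", scores)
```

Each candidate K must not exceed the number of distinct records, because the initial modes are sampled from them. For a dataset the size of a national school registry, the pairwise silhouette above would be too slow as written; a sampled estimate or an established library would be the more practical choice.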
URI
http://hdl.handle.net/11407/8811
Collections
  • Indexados Scopus [2005]

Related items

Showing items related by title, author, creator and subject.

  • Towards Educational Sustainability: An AI System for Identifying and Preventing Student Dropout

    Brand C E.J; Ramirez G.M; Diaz J; Moreira F. (Education Society of IEEE (Spanish Chapter); Ingeniería de Sistemas; Facultad de Ingenierías, 2024)
    The design and development of a web application to identify a high or low probability of student dropout at the National Learning Service (SENA) in Colombia, aiming to streamline the process of identifying and supporting ...
  • Understanding how collaborative governance mediates rural tourism and sustainable territory development: a systematic literature review

    Valderrama E.-L; Polanco J.-A. (Education Society of IEEE (Spanish Chapter); Administración de Empresas; Facultad de Ciencias Económicas y Administrativas, 2024)
  • Toward Educational Sustainability: An AI System for Identifying and Preventing Student Dropout

    Brand C E.J; Ramirez G.M; Diaz J; Moreira F. (Education Society of IEEE (Spanish Chapter); Ingeniería de Sistemas; Facultad de Ingenierías, 2024)
    The design and development of a web application to identify a high or low probability of student dropout at the National Learning Service (SENA) in Colombia, aiming to streamline the process of identifying and supporting ...