Item
Open Access
Proof of Concept of Machine Learning to Identify Colombian Companies Close to Exporting
| dc.contributor.advisor | Cruz Castro, Daniel Leonardo | |
| dc.creator | Sánchez Rojas, Jose Vicente | |
| dc.creator | Escárraga Vargas, Marco Jeisson | |
| dc.creator.degree | Master in Business Analytics | |
| dc.date.accessioned | 2025-08-06T12:25:58Z | |
| dc.date.available | 2025-08-06T12:25:58Z | |
| dc.date.created | 2025-07-18 | |
| dc.description | In a highly competitive legal environment such as Colombia's, where leading intellectual property firms struggle to win new clients because companies remain loyal to their legal providers, a law firm with more than 70 years of experience seeks to strengthen its international growth strategy through alliances with foreign firms. This strategy rests on identifying Colombian companies with high export potential in order to offer them legal advice before they consolidate ties with competing firms. The project proposes a predictive analytics solution, based on advanced machine learning techniques, to anticipate which companies that currently do not export will begin exporting in the following year. To that end, an integrated database was built from financial sources (EMIS) and foreign trade sources (DIAN), applying temporal feature engineering and a rigorous data cleaning and filtering process. The final model was developed using an ensemble (stacking) strategy and stratified cross-validation, maximizing metrics such as recall and AUC-PR under strong class imbalance. The results demonstrated the feasibility of identifying precursor signals of exporting in historical data, making it possible to prioritize prospects with high internationalization potential. This tool is a strategic contribution for the firm, since it enables client prospecting based on data rather than only on intuition or pre-existing relationships. It also empirically validates a hypothesis inspired by theories of firm internationalization and demonstrates the value of predictive analytics in legal settings, which are traditionally analog. In conclusion, this proof of concept lays the foundation for a future operational implementation that can significantly improve the firm's commercial efficiency and global positioning. | |
| dc.description.abstract | In a highly competitive legal environment such as Colombia’s, where leading intellectual property (IP) firms face challenges in acquiring new clients due to long-standing loyalty to incumbent legal providers, a law firm with over 70 years of experience seeks to strengthen its international growth strategy through alliances with foreign firms. This strategy is based on identifying Colombian companies with high export potential in order to approach them with legal advisory services before they form partnerships with competing firms. This project proposes a predictive analytical solution based on advanced machine learning techniques to anticipate which companies, currently non-exporters, are likely to begin exporting in the following year. To achieve this, a consolidated database was built using financial information (EMIS) and international trade data (DIAN), applying temporal feature engineering and a rigorous data cleaning and filtering process. The final model was developed using a stacked ensemble learning strategy and stratified cross-validation, maximizing metrics such as recall and AUC-PR in the context of a highly imbalanced classification problem. Results demonstrated the feasibility of detecting pre-export behavioral patterns in historical data, enabling the firm to prioritize high-potential prospects for internationalization. This tool represents a strategic advantage, allowing the firm to shift from intuition-based client acquisition to a data-driven prospecting approach. Furthermore, it empirically validates a hypothesis grounded in internationalization theory and showcases the value of predictive analytics in traditionally analog legal sectors. In conclusion, this proof of concept lays the groundwork for future operational implementation that could significantly improve the firm’s commercial efficiency and global positioning. | |
| dc.format.extent | 146 pp | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.doi | https://doi.org/10.48713/10336_46218 | |
| dc.identifier.uri | https://repository.urosario.edu.co/handle/10336/46218 | |
| dc.language.iso | spa | |
| dc.publisher | Universidad del Rosario | |
| dc.publisher.department | Escuela de Administración | |
| dc.publisher.department | Escuela de Ingeniería, Ciencia y Tecnología | |
| dc.publisher.program | Maestría en Business Analytics | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
| dc.rights.accesRights | info:eu-repo/semantics/closedAccess | |
| dc.rights.acceso | Blocked (reference text only) | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
| dc.source.instname | instname:Universidad del Rosario | |
| dc.source.reponame | reponame:Repositorio Institucional EdocUR | |
| dc.subject | Machine Learning | |
| dc.subject | Exports | |
| dc.subject | Intellectual Property | |
| dc.subject | Predictive Model | |
| dc.subject | Client Acquisition | |
| dc.subject | Business Analytics | |
| dc.subject | Business Internationalization | |
| dc.subject.keyword | Machine Learning | |
| dc.subject.keyword | Exportation | |
| dc.subject.keyword | Intellectual Property | |
| dc.subject.keyword | Predictive Model | |
| dc.subject.keyword | Client Acquisition | |
| dc.subject.keyword | Business Analytics | |
| dc.subject.keyword | Business Internationalization | |
| dc.title | Prueba de Concepto de Machine Learning para Identificar Empresas Colombianas Próximas a Exportar | |
| dc.title.TranslatedTitle | Proof of Concept of Machine Learning to Identify Colombian Companies Close to Exporting | |
| dc.type | masterThesis | |
| dc.type.hasVersion | info:eu-repo/semantics/acceptedVersion | |
| dc.type.spa | Master's thesis | |
| local.department.report | Escuela de Administración | |
| local.regiones | Bogotá | |
Files
Original bundle (2 files)
- Name: Bibliografia_Prueba_de_Concepto_de_Machine_Learning_Sanchez_Rojas_Jose_Vicente.ris
  Size: 8.03 KB
- Name: Prueba_de_Concepto_de_Machine_Learning_Sanchez_Rojas_Jose_Vicente_Marco_Jeisson_Escarraga_Vargas.pdf
  Size: 2.02 MB
  Format: Adobe Portable Document Format