Minería de opinión: un análisis en tiempo real de tweets para Zacatecas

César Alberto Collazos Ordóñez; Pedro Daniel Alaniz Lumbreras; Julián González Trinidad

dc.contributor	947649	es_ES
dc.contributor.advisor	Huizilopoztli Luna García	es_ES
dc.contributor.advisor	José María Celaya Padilla	es_ES
dc.contributor.author	César Alberto Collazos Ordóñez	es_ES
dc.contributor.author	Pedro Daniel Alaniz Lumbreras	es_ES
dc.contributor.author	Julián González Trinidad	es_ES
dc.coverage.spatial	Zacatecas, México	es_ES
dc.creator	Reveles Gómez, Luis Carlos
dc.date.accessioned	2022-02-03T19:26:17Z
dc.date.available	2022-02-03T19:26:17Z
dc.date.issued	2021-06-03
dc.identifier	info:eu-repo/semantics/publishedVersion	es_ES
dc.identifier.uri	http://ricaxcan.uaz.edu.mx/jspui/handle/20.500.11845/2900
dc.identifier.uri	http://dx.doi.org/10.48779/ricaxcan-20
dc.description	The Twitter social network has become an excellent tool for learning in real time the opinions that users express on a wide variety of topics. The formal analysis of tweet text is the subject of numerous studies, which have driven the rise of emerging technologies such as Opinion Mining. Opinion mining encompasses sentiment analysis, which refers to the use of natural language processing to identify and extract subjective information from texts [1]. By definition, sentiment analysis seeks to build automatic tools capable of extracting subjective information in order to create structured and actionable knowledge [2]. In other words, it is the task of automatically classifying documents in bulk according to the positive or negative connotation of the language they use. This work performs sentiment analysis of Twitter comments georeferenced to the city of Zacatecas, framed as a classification of tweets labeled with their polarity. It covers cleaning the tweet text, extracting features of the text such as positive and negative polarity, and applying machine learning, in particular supervised learning algorithms, to perform the classification. Of the algorithms used, Random Forest achieved the best accuracy at 0.977, followed by Decision Trees at 0.9735 and SVM at 0.9551. From these results it can be concluded that the improvement in accuracy was achieved thanks to the features that were progressively added; the results also show that the supervised learning algorithms classify the tweets adequately.	es_ES
dc.description.abstract	La red social Twitter se ha convertido en una excelente herramienta para conocer en tiempo real las opiniones que los usuarios expresan sobre una gran variedad de temas. El análisis formal de los textos de los tweets es objeto de numerosos estudios, que han impulsado la aparición de tecnologías emergentes como la Minería de Opinión, en la que está inmerso el análisis de sentimientos, el cual se refiere al uso del procesamiento del lenguaje natural para identificar y extraer información subjetiva de los textos [1]. Por definición, el análisis de sentimientos busca generar herramientas automáticas capaces de extraer información subjetiva para crear conocimiento estructurado y procesable [2]. En otras palabras, se trata de una tarea de clasificación masiva y automática de documentos en función de la connotación positiva o negativa del lenguaje utilizado en ellos. Este trabajo se centra en realizar análisis de sentimientos de comentarios de Twitter georreferenciados a la ciudad de Zacatecas, planteado como una clasificación de los tweets etiquetados con su polaridad. Comprende la limpieza del texto de los tweets, la extracción de características propias del texto, como la polaridad positiva y negativa, y el uso de machine learning, en especial de algoritmos de aprendizaje supervisado, para realizar la clasificación. De los algoritmos utilizados, Random Forest obtuvo el mejor accuracy con 0.977, seguido de Árboles de Decisión con 0.9735 y SVM con 0.9551. Con los resultados obtenidos se puede concluir que la mejora del accuracy se logró gracias a las características que se fueron agregando; además, los resultados muestran que los algoritmos de aprendizaje supervisado clasifican los tweets de manera adecuada.	es_ES
dc.language.iso	spa	es_ES
dc.publisher	Universidad Autónoma de Zacatecas	es_ES
dc.relation.isbasedon	Maestro en Ciencias del Procesamiento de la Información	es_ES
dc.relation.uri	generalPublic	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject.classification	INGENIERIA Y TECNOLOGIA [7]	es_ES
dc.subject.other	Análisis y Procesamiento de Datos	es_ES
dc.subject.other	Inteligencia Artificial	es_ES
dc.subject.other	Tweets	es_ES
dc.subject.other	Zacatecas	es_ES
dc.title	Minería de opinión: un análisis en tiempo real de tweets para Zacatecas	es_ES
dc.type	info:eu-repo/semantics/masterThesis	es_ES
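
The abstract describes two preprocessing stages before classification: cleaning the tweet text and extracting polarity-related features. The following is a minimal Python sketch of what such stages might look like, not the thesis's actual implementation. The function names `clean_tweet` and `polarity_features`, the cleaning rules, and the tiny word lists are all hypothetical and chosen only for illustration; the record does not specify which rules or lexicons the author used.

```python
import re

# Hypothetical mini-lexicons for illustration only; the record does not
# say which word lists (if any) the thesis relied on.
POSITIVE_WORDS = {"bueno", "excelente", "feliz", "gracias"}
NEGATIVE_WORDS = {"malo", "terrible", "triste", "inseguro"}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# Anything that is not a lowercase Spanish letter or whitespace.
NON_LETTER_RE = re.compile(r"[^a-záéíóúüñ\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet and strip URLs, mentions, '#', and non-letters."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @user mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the sign
    text = NON_LETTER_RE.sub(" ", text)  # drop digits, punctuation, emoji
    return " ".join(text.split())      # collapse repeated whitespace


def polarity_features(text: str) -> dict:
    """Count tokens and lexicon hits in the cleaned tweet."""
    tokens = clean_tweet(text).split()
    return {
        "n_tokens": len(tokens),
        "n_positive": sum(t in POSITIVE_WORDS for t in tokens),
        "n_negative": sum(t in NEGATIVE_WORDS for t in tokens),
    }


if __name__ == "__main__":
    sample = "¡Excelente día en #Zacatecas! @amigo https://t.co/abc"
    print(clean_tweet(sample))
    print(polarity_features(sample))
```

Feature dictionaries of this kind could then be turned into numeric vectors and passed to supervised classifiers such as the Random Forest, Decision Tree, and SVM models named in the abstract. The reported accuracies (0.977, 0.9735, 0.9551) come from the thesis's own data and features, and this sketch makes no attempt to reproduce them.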