Please use this identifier to cite or link to this item: https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/16755
Title: Machine Learning-Based Text Categorization With Bag of Words
Authors: Singh, Uday Kumar
Prabhu Shankar, B
Chinnaiyan, R
Jain, Neeraj
Keywords: Word Cloud
Bag-of-Words (BOW)
Feature Extraction
Machine Learning
MATLAB
Natural Language Processing
Preprocessing
Support Vector Machine
Text Analytics Toolbox
Text Categorization
Vector Space Representation
Issue Date: 2024
Publisher: Lecture Notes in Electrical Engineering
Springer Science and Business Media Deutschland GmbH
Citation: Vol. 1194; pp. 577-587
Abstract: Text categorization, a fundamental task in natural language processing (NLP), plays a pivotal role in organizing and managing the ever-expanding volume of textual data across various domains. This research explores the application of machine learning techniques, with a specific emphasis on the Bag-of-Words (BOW) model, to automate the categorization of text documents. The BOW model is a straightforward yet effective representation method that transforms text data into numerical vectors, disregarding word order and focusing solely on word frequency. The BOW approach provides a text representation that can be applied to any kind of text organization task. The method is based on the BOW concept and is applied to material available on Wikipedia, Gmail, Kaggle (https://www.kaggle.com/datasets), and other sites. The proposed approach is used to create a Vector Space Representation, which is then used to train a Support Vector Machine classifier. The aim is to collect and organize document records from openly accessible datasets obtained through social networking. The text results show the contrast between the raw data and the cleaned data, as displayed in the word cloud. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
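Illustrative note: the abstract describes BOW as mapping text to numerical vectors of word frequencies while ignoring word order. The following is a minimal sketch of that idea in plain Python. It is not the authors' implementation, which per the keywords used MATLAB and the Text Analytics Toolbox. The function names (`tokenize`, `build_vocabulary`, `bag_of_words`), the tokenization rule, and the two sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Return a sorted list of every distinct token in the corpus."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def bag_of_words(text, vocabulary):
    """Map a text to a vector of word counts, one entry per vocabulary term."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]


# Hypothetical two-document corpus, used only to show the representation.
docs = ["The cat sat on the mat", "The dog chased the cat"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
```

Each vector has one count per vocabulary term, so shuffling the words of a document leaves its vector unchanged. Vectors of this kind could then serve as input features for a classifier such as a Support Vector Machine.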
URI: https://doi.org/10.1007/978-981-97-2839-8_40
https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/16755
ISBN: 9789819728381
ISSN: 1876-1100
Appears in Collections: Conference Papers