Please use this identifier to cite or link to this item: https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/16085
Full metadata record
DC Field                      Value                                                                   Language
dc.contributor.author         Indra Kumar, R                                                          -
dc.contributor.author         Ezil Sam Leni, A                                                        -
dc.date.accessioned           2024-07-22T03:50:47Z                                                    -
dc.date.available             2024-07-22T03:50:47Z                                                    -
dc.date.issued                2024-05-01                                                              -
dc.identifier.citation        53p.                                                                    en_US
dc.identifier.uri             https://gnanaganga.inflibnet.ac.in:8443/jspui/handle/123456789/16085    -
dc.description.abstract       In the realm of computer vision and natural language processing, the task of generating descriptive captions for images has garnered significant attention. This project explores the efficacy of transformer-based encoder-decoder models in addressing the challenge of image captioning. Leveraging state-of-the-art techniques, we develop a novel approach to generate contextually relevant and coherent captions for a diverse range of visual content. Drawing upon a rich dataset of images and corresponding captions, we employ a transformer-based architecture to learn the intricate relationship between visual features and textual descriptions. Through extensive experimentation and evaluation, we assess the performance of our model in terms of caption quality, semantic coherence, and generalization across different domains. The project adopts a collaborative and interdisciplinary approach, bringing together expertise from computer vision, natural language processing, and machine learning. By leveraging insights from these diverse fields, we aim to push the boundaries of image captioning and pave the way for a more advanced and nuanced understanding of visual content. The findings of this project hold promise for various applications, including accessibility, content indexing, and human-computer interaction. By enabling automated generation of descriptive captions, our model has the potential to enhance user experience, improve content accessibility, and facilitate richer interaction with visual data. As we navigate the frontier of image captioning with transformer-based encoder-decoder models, this project contributes to the ongoing dialogue in the field and underscores the transformative potential of artificial intelligence in bridging the gap between visual and textual modalities.    en_US
dc.language.iso               en                                                                      en_US
dc.publisher                  Alliance College of Engineering and Design, Alliance University         en_US
dc.relation.ispartofseries    CSE_G16_2024 [20030141CSE021]                                           -
dc.subject                    Automated Image Captioning                                              en_US
dc.subject                    Transformer-Based Encoder-Decoder Models                                en_US
dc.subject                    Leveraging State-Of-The-Art Techniques                                  en_US
dc.title                      Optimize Image Caption Generation Techniques                            en_US
dc.type                       Other                                                                   en_US
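
The abstract above describes a transformer-based encoder-decoder architecture that maps visual features to caption tokens. The dissertation PDF is restricted access, so its code is not available here; the following is only a minimal illustrative sketch of that kind of architecture in PyTorch. The ViT-style patch encoder, class name ImageCaptioner, and all hyperparameters are assumptions for illustration, not the authors' implementation.

    # Minimal sketch of a transformer-based encoder-decoder image captioner.
    # Assumes PyTorch; names and hyperparameters are illustrative only and do
    # not reproduce the dissertation's restricted-access implementation.
    import torch
    import torch.nn as nn

    class ImageCaptioner(nn.Module):
        def __init__(self, vocab_size, d_model=256, nhead=8, num_layers=3,
                     patch_size=16, image_size=224, max_len=40):
            super().__init__()
            num_patches = (image_size // patch_size) ** 2
            # Encoder: ViT-style patch embedding + transformer encoder over patches.
            self.patch_embed = nn.Conv2d(3, d_model, kernel_size=patch_size, stride=patch_size)
            self.patch_pos = nn.Parameter(torch.zeros(1, num_patches, d_model))
            enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
            # Decoder: token embedding + transformer decoder cross-attending to image features.
            self.token_embed = nn.Embedding(vocab_size, d_model)
            self.token_pos = nn.Parameter(torch.zeros(1, max_len, d_model))
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, images, captions):
            # images: (B, 3, H, W); captions: (B, T) token ids used for teacher forcing.
            patches = self.patch_embed(images).flatten(2).transpose(1, 2)  # (B, N, d_model)
            memory = self.encoder(patches + self.patch_pos)
            T = captions.size(1)
            tgt = self.token_embed(captions) + self.token_pos[:, :T]
            causal_mask = nn.Transformer.generate_square_subsequent_mask(T)
            out = self.decoder(tgt, memory, tgt_mask=causal_mask)
            return self.lm_head(out)  # (B, T, vocab_size) next-token logits

    # Toy forward pass with random data, just to show the expected tensor shapes.
    model = ImageCaptioner(vocab_size=10000)
    images = torch.randn(2, 3, 224, 224)
    captions = torch.randint(0, 10000, (2, 20))
    print(model(images, captions).shape)  # torch.Size([2, 20, 10000])

In this kind of setup the decoder is trained with cross-entropy against the next caption token, and captions are generated at inference time by decoding autoregressively (e.g. greedy or beam search) from a start token while cross-attending to the encoded image patches.
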
Appears in Collections: Dissertations - Alliance College of Engineering & Design

Files in This Item:
File                                    Size       Format
CSE_G16_2024.pdf (Restricted Access)    1.85 MB    Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.