Image Caption Generator with Music on Deep Learning Using Python


Ayesha Sen
Techno College of Engineering Agartala, Maheshkhala, Tripura, India
Sankha Subhra Debnath
Techno College of Engineering Agartala, Maheshkhala, Tripura, India


Background: Most existing research relies on holistic methods, which can lose details of important aspects of the scene. Previous research has also focused only on images that people show to a computer camera. When you look at a picture in someone else's camera, you leave a note that this person no longer feels anything. To play music, a user currently needs a separate voice-activated device and must speak the name of the specific song they want to hear.

Objective: The proposed system does not require the user to speak or to store anything on a voice device. The user shows an image to the camera, and a song is played according to the generated text. The goal of this work is to make a person feel positive, happy, and inspired.

Methodology: Music is part of daily life and has many uses. With the rapid increase in the number and size of music databases in recent years, the development of accurate music information retrieval (MIR) tools has become an important topic in computer science. An image caption generator with music generates captions for given images and plays songs based on the captions produced by computer vision. Captioning involves several steps: understanding the visual representation of objects, establishing relationships between objects, and creating linguistically and semantically correct captions.
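The caption-to-song step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a caption is matched to a song, so the keyword table, the playlist paths, and the function names caption_to_mood and pick_song are all hypothetical, and simple keyword overlap stands in for whatever matching the system actually uses. The caption itself is assumed to come from a separate captioning model that is not shown here.

```python
import re

# Hypothetical keyword-to-mood table; the source does not specify one.
MOOD_KEYWORDS = {
    "happy": {"smiling", "laughing", "party", "beach", "playing"},
    "calm": {"sitting", "reading", "sleeping", "lake", "sunset"},
    "energetic": {"running", "dancing", "jumping", "riding"},
}

# Hypothetical playlist; the file paths are placeholders.
PLAYLIST = {
    "happy": "songs/happy.mp3",
    "calm": "songs/calm.mp3",
    "energetic": "songs/energetic.mp3",
}

# Mood used when no keyword in the caption matches (an assumption).
DEFAULT_MOOD = "calm"


def caption_to_mood(caption: str) -> str:
    """Return the mood whose keywords overlap most with the caption."""
    words = set(re.findall(r"[a-z]+", caption.lower()))
    scores = {mood: len(words & keys) for mood, keys in MOOD_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_MOOD


def pick_song(caption: str) -> str:
    """Map a generated caption to a song path via its inferred mood."""
    return PLAYLIST[caption_to_mood(caption)]
```

For example, pick_song("Two children are laughing at a party") selects the placeholder entry for the "happy" mood. A real system would replace the keyword table with a learned mapping and pass the selected path to an audio player.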

Result and discussion: The proposed system outperforms existing techniques in processing mid-zone images in indoor scenes and in playing music that matches the image scene. Since human communication usually occurs through natural language, developing systems that play music and generate explanations that humans can understand is a challenging task in machine learning and artificial intelligence.

Future Work: This article focuses on recognition and writing through deep learning. In the future, machine-readable quotes and strings could also be used when creating captions with music. Sometimes the computer could also give instructions when the user needs information about the cause of their grief.

January 28, 2022