Image captioning research aims to annotate an image with a sentence that describes it. The task is easy for humans but hard for machines, which must make sense of what is present in the image but not explicitly seen. Facebook and Google, for example, use image recognition to monitor where you are, what you do, and other activities. In the proposed work, natural language processing and deep learning, applied through object detection and text generation, are used to generate a description of an image automatically. The architecture comprises a dense attention model with a CNN encoder and an RNN decoder; an average BLEU score of 51.77 was observed. © 2021 IEEE.
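
To make the reported metric concrete, the following is a minimal sketch of a sentence-level BLEU score on a 0-100 scale, using clipped n-gram precision up to 4-grams and the standard brevity penalty. The abstract does not state which BLEU variant, n-gram order, smoothing, or tokenization produced the 51.77 average, so those choices, along with the function names `ngrams` and `bleu` and the example captions, are assumptions made purely for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU on a 0-100 scale (illustrative only)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate shorter than n tokens
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated n-grams are not over-rewarded.
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0  # no smoothing: one zero precision gives zero BLEU
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty against the reference closest in length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical captions, not taken from the paper's data.
reference = "a dog runs on the grass".split()
generated = "a dog runs on grass".split()
score = bleu(generated, [reference])
```

An average such as the one reported would then be the mean of these per-caption scores over a test set, although a corpus-level BLEU that pools n-gram counts across all captions is an equally plausible reading.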