A neural network that generates captions for images using a CNN and an RNN, with beam search.
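To illustrate the beam search step, here is a minimal sketch in plain Python. It is not this project's implementation: `beam_search`, `toy_next_token_probs`, and the `TOY_MODEL` probability table are hypothetical names invented for the example, and the table stands in for the RNN's next-word predictions.

```python
import math


def beam_search(next_token_probs, start_token, end_token, beam_width=3, max_len=20):
    """Return the highest-scoring token sequence found by beam search.

    next_token_probs(sequence) must return a dict {token: probability}
    for the token that follows `sequence`.
    """
    # Each beam entry is (cumulative log-probability, token sequence).
    beams = [(0.0, [start_token])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == end_token:
                # Finished captions are carried forward unchanged.
                candidates.append((score, seq))
                continue
            for token, prob in next_token_probs(seq).items():
                if prob > 0:
                    candidates.append((score + math.log(prob), seq + [token]))
        if not candidates:
            break
        # Keep only the `beam_width` best partial captions.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for _, seq in beams):
            break
    return beams[0][1]


# Hypothetical stand-in for the RNN: fixed next-word probabilities.
TOY_MODEL = {
    ("<start>",): {"a": 0.6, "the": 0.4},
    ("<start>", "a"): {"dog": 0.55, "cat": 0.45},
    ("<start>", "the"): {"dog": 0.9, "cat": 0.1},
}


def toy_next_token_probs(seq):
    # Any sequence not in the table is followed by the end token.
    return TOY_MODEL.get(tuple(seq), {"<end>": 1.0})


greedy = beam_search(toy_next_token_probs, "<start>", "<end>", beam_width=1)
wider = beam_search(toy_next_token_probs, "<start>", "<end>", beam_width=2)
print(greedy)
print(wider)
```

With a beam width of 1 the search is greedy and commits to "a" because it is the most likely first word. With a width of 2 it also keeps "the" alive and ends up with the caption whose overall probability is higher.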
Examples
Image credits: Towards Data Science
Recommended system requirements for training the model:
- A good CPU and a GPU with at least 8 GB of memory
- At least 8 GB of RAM
- An active internet connection so that Keras can download the InceptionV3/VGG16 model weights
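The weight download happens the first time a pretrained model is created. The sketch below shows one common way to do that with InceptionV3 and turn it into an image feature extractor. It is an assumption about how the CNN is used, not a copy of this project's code, and `build_feature_extractor` is a hypothetical name.

```python
def build_feature_extractor():
    """Return InceptionV3 with its final classification layer removed."""
    # Imported lazily so that defining this function does not require Keras.
    from keras.applications.inception_v3 import InceptionV3
    from keras.models import Model

    # weights="imagenet" downloads the pretrained weights on first use
    # and caches them locally for later runs.
    base = InceptionV3(weights="imagenet")

    # Use the output of the second-to-last layer (global average pooling)
    # as the image representation instead of the class probabilities.
    return Model(inputs=base.input, outputs=base.layers[-2].output)
```

Calling `build_feature_extractor()` once while online is enough to fetch the weights. VGG16 can be loaded the same way from `keras.applications.vgg16`.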
Required Python libraries, with the version numbers used while building and testing this project:
- Python - 3.6.7
- NumPy - 1.16.4
- TensorFlow - 1.13.1
- Keras - 2.2.4
- NLTK - 3.2.5
- Pillow (PIL) - 4.3.0
- Matplotlib - 3.0.3
- tqdm - 4.28.1
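To confirm that an environment matches the versions listed above, a small check like the following may help. It is only a sketch: `PINNED`, `find_mismatches`, and `installed_versions` are hypothetical names, and the keys are the PyPI distribution names (PIL is installed as `Pillow`). The Python version itself is not checked.

```python
# Versions listed above, keyed by PyPI distribution name.
PINNED = {
    "numpy": "1.16.4",
    "tensorflow": "1.13.1",
    "Keras": "2.2.4",
    "nltk": "3.2.5",
    "Pillow": "4.3.0",
    "matplotlib": "3.0.3",
    "tqdm": "4.28.1",
}


def find_mismatches(pinned, installed):
    """Return {name: (wanted, found)} for every package whose version differs.

    `found` is None when the package is not installed.
    """
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


def installed_versions(names):
    """Look up the installed version of each distribution (None if missing)."""
    try:
        # Standard library on Python 3.8 and later.
        from importlib.metadata import version, PackageNotFoundError
        lookup, missing = version, PackageNotFoundError
    except ImportError:
        # Fallback for Python 3.6/3.7; pkg_resources ships with setuptools.
        import pkg_resources
        lookup = lambda n: pkg_resources.get_distribution(n).version
        missing = pkg_resources.DistributionNotFound

    found = {}
    for name in names:
        try:
            found[name] = lookup(name)
        except missing:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED, installed_versions(PINNED)).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every listed library is installed at the stated version.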
Here are some direct download links:
- Flickr8k_Dataset
- Flickr8k_text

Download link credits: Jason Brownlee
Note: If your machine does not meet the recommended specifications, you can also build your model on an online platform such as Google Colab or Kaggle, as I did.
- Show and Tell: A Neural Image Caption Generator - Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan
- Where to put the Image in an Image Caption Generator - Marc Tanti, Albert Gatt, Kenneth P. Camilleri
- How to Develop a Deep Learning Photo Caption Generator from Scratch
