Image Captioning using Luong Attention and SentencePiece Tokenizer

  • The HTML version of the Jupyter Notebook can be accessed here.
  • The YouTube demo can be found here.
Fig. 1. Sequence-to-sequence architecture from [4].
Fig. 2. Bahdanau attention from [6].
Fig. 3. Luong attention from [7].
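As a rough illustration of the Luong attention named in Fig. 3, here is a minimal NumPy sketch of the dot-product score variant described in [7]. It is not taken from the notebook: the function name `luong_dot_attention`, the array shapes, and the random inputs are all assumptions made for illustration. In a captioning setup the encoder states would presumably be image feature vectors rather than source-word states.

```python
import numpy as np


def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def luong_dot_attention(decoder_state, encoder_states):
    """Dot-product attention (hypothetical helper, for illustration only).

    decoder_state:  shape (hidden,), the current decoder hidden state.
    encoder_states: shape (num_positions, hidden), one vector per encoder position.
    Returns the context vector (hidden,) and the attention weights (num_positions,).
    """
    # Score each encoder position by its dot product with the decoder state.
    scores = encoder_states @ decoder_state
    # Normalize the scores into a probability distribution over positions.
    weights = softmax(scores)
    # The context vector is the weighted average of the encoder states.
    context = weights @ encoder_states
    return context, weights


# Assumed toy sizes: 5 encoder positions, hidden size 8.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))
decoder_state = rng.normal(size=8)

context, weights = luong_dot_attention(decoder_state, encoder_states)
print(weights.shape, context.shape)
```

In [7] the context vector is then concatenated with the decoder state and passed through a learned layer before the next token is predicted. That step is omitted here to keep the sketch short.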
  1. Szegedy, Christian, et al. “Rethinking the inception architecture for computer vision.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
  2. Kudo, Taku, and John Richardson. “SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing.” arXiv preprint arXiv:1808.06226 (2018).
  3. Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” Advances in neural information processing systems 27 (2014): 3104–3112.
  5. Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. “Neural machine translation by jointly learning to align and translate.” arXiv preprint arXiv:1409.0473 (2014).
  7. Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. “Effective approaches to attention-based neural machine translation.” arXiv preprint arXiv:1508.04025 (2015).




A graduate student from University of Texas at Arlington

Preetham Ganesh
