An image captioning model with a ResNet-50 encoder and a Transformer decoder, trained on MS COCO. Served via FastAPI with a drag-and-drop frontend; Docker support is included for CPU deployment.
python docker natural-language-processing computer-vision deep-learning transformers cnn pytorch image-captioning resnet ms-coco uvicorn torchvision fastapi torch-vision encoder-decod
Updated Mar 13, 2026 - Jupyter Notebook