"Visual Semantic Relatedness Dataset for Image Captioning. (arXiv:2301.08784v1 [cs.CL])" — A textual visual context dataset for captioning, in which the publicly available dataset COCO Captions has been extended with information about the scene (such as objects in the image).