Fahim Farook

"Multimodal Event Transformer for Image-guided Story Ending Generation. (arXiv:2301.11357v1 [cs.CV])": The paper proposes a multimodal event transformer, an event-based reasoning framework for image-guided story ending generation. It constructs visual and semantic event graphs from the story plot and the ending image, and uses event-based reasoning to mine implicit information within a single modality.

Paper: http://arxiv.org/abs/2301.11357

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>