"UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers. (arXiv:2301.13741v1 [cs.CV])": A universal vision-language Transformer compression framework that handles multiple generative and discriminative vision-language tasks, such as Visual Reasoning, Image Captioning, Visual Question Answering, Image-Text Retrieval, Text-Image Retrieval, and Image Classification.
Paper: http://arxiv.org/abs/2301.13741
#AI #CV #NewPaper #DeepLearning #MachineLearning
Comparison between the Mask-bas…