I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.

I'm currently working on my second novel, which is complete but in the editing stage. I wrote my first novel over 20 years ago but didn't write much until now.

I post about #Coding, #Flutter, #Writing, #Movies and #TV. I'll also talk about #Technology, #Gadgets, #MachineLearning, #DeepLearning and a few other things as the fancy strikes ...

Lived in: 🇱🇰🇸🇦🇺🇸🇳🇿🇸🇬🇲🇾🇦🇪🇫🇷🇪🇸🇵🇹🇶🇦🇨🇦

Fahim Farook

"RGB Arabic Alphabets Sign Language Dataset. (arXiv:2301.11932v1 [cs.CV])" — An Arabic Alphabet Sign Language (AASL) dataset comprising 7,856 raw and fully labelled RGB images of the Arabic sign language alphabet, which might be the first such publicly available dataset.

Paper: http://arxiv.org/abs/2301.11932

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Sample images from the dataset

Fahim Farook

"Input Perturbation Reduces Exposure Bias in Diffusion Models. (arXiv:2301.11706v1 [cs.LG])" — An exploration of the fact that the long sampling chain in Denoising Diffusion Probabilistic Models (DDPM) leads to an error accumulation phenomenon similar to the exposure bias problem in autoregressive text generation.

Paper: http://arxiv.org/abs/2301.11706

#AI #CV #NewPaper #DeepLearning #MachineLearning

The inputs and the prediction t…

Fahim Farook

"Image Restoration with Mean-Reverting Stochastic Differential Equations. (arXiv:2301.11699v1 [cs.LG])" — A stochastic differential equation (SDE) approach for general-purpose image restoration which can restore images without relying on any task-specific prior knowledge.

Paper: http://arxiv.org/abs/2301.11699

#AI #CV #NewPaper #DeepLearning #MachineLearning

An overview of our proposed con…

Fahim Farook

"Accelerating Guided Diffusion Sampling with Splitting Numerical Methods. (arXiv:2301.11558v1 [cs.CV])" — A solution to speeding up guided diffusion image generation based on operator splitting methods, motivated by the finding that classical high-order numerical methods are unsuitable for the conditional function.

Paper: http://arxiv.org/abs/2301.11558

#AI #CV #NewPaper #DeepLearning #MachineLearning

Generated samples of a classifi…

Fahim Farook

"3DShape2VecSet: A 3D Shape Representation for Neural Fields and Generative Diffusion Models. (arXiv:2301.11445v1 [cs.CV])" — A novel shape representation for neural fields designed for generative diffusion models, which can encode 3D shapes given as surface models or point clouds, and represents them as neural fields.

Paper: http://arxiv.org/abs/2301.11445

#AI #CV #NewPaper #DeepLearning #MachineLearning

Left: Shape autoencoding result…

Fahim Farook

"Improving Cross-modal Alignment for Text-Guided Image Inpainting. (arXiv:2301.11362v1 [cs.CV])" — A model for text-guided image inpainting that improves cross-modal alignment (CMA) using cross-modal alignment distillation and in-sample distribution distillation.

Paper: http://arxiv.org/abs/2301.11362

#AI #CV #NewPaper #DeepLearning #MachineLearning

Different categories of vision-…

Fahim Farook

"Rethinking 1x1 Convolutions: Can we train CNNs with Frozen Random Filters? (arXiv:2301.11360v1 [cs.CV])" — An exploration of whether Convolutional Neural Networks (CNNs) really need to learn the weights of vast numbers of convolutional operators.

Paper: http://arxiv.org/abs/2301.11360

#AI #CV #NewPaper #DeepLearning #MachineLearning

Validation accuracy of LCResNet…

Fahim Farook

"Multimodal Event Transformer for Image-guided Story Ending Generation. (arXiv:2301.11357v1 [cs.CV])" — A multimodal event transformer: an event-based reasoning framework for image-guided story ending generation that constructs visual and semantic event graphs from the story plot and the ending image, and uses event-based reasoning to mine implicit information within a single modality.

Paper: http://arxiv.org/abs/2301.11357

#AI #CV #NewPaper #DeepLearning #MachineLearning

Given a multi-sentence story pl…

Fahim Farook

"Animating Still Images. (arXiv:2209.10497v2 [cs.CV] UPDATED)" — A method for imparting motion to a still 2D image which uses deep learning to segment part of the image as the subject, uses inpainting to complete the background, and then adds animation to the subject.

Paper: http://arxiv.org/abs/2209.10497

#AI #CV #NewPaper #DeepLearning #MachineLearning

Interactive segmentation: Green…

Fahim Farook

"VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media. (arXiv:2208.09021v3 [cs.CV] UPDATED)" — An extension of the popular Vision-and-Language Transformer (ViLT) to improve performance on vision-and-language (VL) tasks that involve more complex text inputs than image captions while having minimal impact on training and inference efficiency.

Paper: http://arxiv.org/abs/2208.09021
Code: https://github.com/gchochla/vault

#AI #CV #NewPaper #DeepLearning #MachineLearning

VAuLT propagates representation…

Fahim Farook

"SIViDet: Salient Image for Efficient Weaponized Violence Detection. (arXiv:2207.12850v4 [cs.CV] UPDATED)" — A new dataset that contains videos depicting weaponized violence, non-weaponized violence, and non-violent events; and a proposal for a novel data-centric method that arranges video frames into salient images while minimizing information loss for comfortable inference by SOTA image classifiers.

Paper: http://arxiv.org/abs/2207.12850
Code: https://github.com/Ti-Oluwanimi/Violence_Detection

#AI #CV #NewPaper #DeepLearning #MachineLearning

Salient Image: A sequence of vi…

Fahim Farook

"BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning. (arXiv:2206.08657v3 [cs.CV] UPDATED)" — A proposal for multiple bridge layers that build a connection between the top layers of uni-modal encoders and each layer of the cross-modal encoder.

Paper: http://arxiv.org/abs/2206.08657
Code: https://github.com/microsoft/BridgeTower

#AI #CV #NewPaper #DeepLearning #MachineLearning

(a) – (d) are four categories o…

Fahim Farook

"Text-To-4D Dynamic Scene Generation. (arXiv:2301.11280v1 [cs.CV])" — A method for generating three-dimensional dynamic scenes from text descriptions, using a 4D dynamic Neural Radiance Field (NeRF) that is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model.

Paper: http://arxiv.org/abs/2301.11280

#AI #CV #NewPaper #DeepLearning #MachineLearning

Samples generated by MAV3D alon…

Fahim Farook

"BiBench: Benchmarking and Analyzing Network Binarization. (arXiv:2301.11233v1 [cs.CV])" — A rigorously designed benchmark with in-depth analysis for network binarization, in which the authors examine the requirements of binarization in real production settings and define evaluation tracks and metrics for a comprehensive investigation.

Paper: http://arxiv.org/abs/2301.11233

#AI #CV #NewPaper #DeepLearning #MachineLearning

Evaluation tracks of BiBench. O…

Fahim Farook

"Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models. (arXiv:2301.11189v1 [eess.IV])" — A non-binary discriminator that is conditioned on quantized local image representations obtained via VQ-VAE autoencoders, for lossy image compression.

Paper: http://arxiv.org/abs/2301.11189

#AI #CV #NewPaper #DeepLearning #MachineLearning

Comparison of distortion vs. st…

Fahim Farook

"Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring. (arXiv:2301.11116v1 [cs.CV])" — A look at temporal modeling in the context of image-to-video knowledge transferring, which is the key point for extending image-text pretrained models to the video domain.

Paper: http://arxiv.org/abs/2301.11116

#AI #CV #NewPaper #DeepLearning #MachineLearning

(a) Illustration of temporal mo…

Fahim Farook

"Explaining Visual Biases as Words by Generating Captions. (arXiv:2301.11104v1 [cs.LG])" — Diagnosing the potential biases in image classifiers by leveraging two types (generative and discriminative) of pre-trained vision-language models to describe the visual bias as a word.

Paper: http://arxiv.org/abs/2301.11104

#AI #CV #NewPaper #DeepLearning #MachineLearning

Concept of the proposed bias-to…

Fahim Farook

"Vision-Language Models Performing Zero-Shot Tasks Exhibit Gender-based Disparities. (arXiv:2301.11100v1 [cs.CV])" — An exploration of the extent to which zero-shot vision-language models exhibit gender bias for different vision tasks.

Paper: http://arxiv.org/abs/2301.11100

#AI #CV #NewPaper #DeepLearning #MachineLearning

(a) The average precision (AP) …

Fahim Farook

"simple diffusion: End-to-end diffusion for high resolution images. (arXiv:2301.11093v1 [cs.CV])" — An attempt to improve denoising diffusion for high resolution images while keeping the model as simple as possible and obtaining performance comparable to latent diffusion-based approaches.

Paper: http://arxiv.org/abs/2301.11093

#AI #CV #NewPaper #DeepLearning #MachineLearning

A dslr photo of a frog wearing…

Fahim Farook

"Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning. (arXiv:2301.11063v1 [cs.CV])" — A method to reduce the parameters and FLOPs for computational efficiency in deep learning models by introducing accuracy and efficiency coefficients to control the trade-off between the accuracy of the network and its computing efficiency.

Paper: http://arxiv.org/abs/2301.11063

#AI #CV #NewPaper #DeepLearning #MachineLearning

Reward at varying Accuracies an…