I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.

I'm currently working on my second novel, which is complete but still in the editing stage. I wrote my first novel over 20 years ago but then didn't write much until now.

I post about #Coding, #Flutter, #Writing, #Movies and #TV. I'll also talk about #Technology, #Gadgets, #MachineLearning, #DeepLearning and a few other things as the fancy strikes ...

Lived in: 🇱🇰🇸🇦🇺🇸🇳🇿🇸🇬🇲🇾🇦🇪🇫🇷🇪🇸🇵🇹🇶🇦🇨🇦

Fahim Farook

"Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models. (arXiv:2301.13826v1 [cs.CV])" - A method that intervenes in the generative process of diffusion models on the fly at inference time to improve the faithfulness of the generated images. It guides the model to refine the cross-attention units so that they attend to all subject tokens in the text prompt and strengthens (or excites) their activations, encouraging the model to generate all subjects described in the prompt.

Paper: http://arxiv.org/abs/2301.13826
Code: https://github.com/AttendAndExcite/Attend-and-Excite

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Given a pre-trained text-to-ima…

Fahim Farook

"Grounding Language Models to Images for Multimodal Generation. (arXiv:2301.13823v1 [cs.CL])" - An efficient method to ground pretrained text-only language models to the visual domain, enabling them to process and generate arbitrarily interleaved image-and-text data. This approach apparently works with any off-the-shelf language model.

Paper: http://arxiv.org/abs/2301.13823

#AI #CV #NewPaper #DeepLearning #MachineLearning

Our method grounds a language m…

Fahim Farook

"UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers. (arXiv:2301.13741v1 [cs.CV])" - A universal vision-language Transformer compression framework which can handle multiple generative and discriminative vision-language tasks such as Visual Reasoning, Image Caption, Visual Question Answer, Image-Text Retrieval, Text-Image Retrieval, and Image Classification.

Paper: http://arxiv.org/abs/2301.13741

#AI #CV #NewPaper #DeepLearning #MachineLearning

Comparison between the Mask-bas…

Fahim Farook

"DisDiff: Unsupervised Disentanglement of Diffusion Probabilistic Models. (arXiv:2301.13721v1 [cs.CV])" - A new task, the unsupervised disentanglement of diffusion probabilistic models (DPMs), which aims to take advantage of their remarkable modeling ability.

Paper: http://arxiv.org/abs/2301.13721

#AI #CV #NewPaper #DeepLearning #MachineLearning

Illustration of disentanglement…

Fahim Farook

"Learning Data Representations with Joint Diffusion Models. (arXiv:2301.13622v1 [cs.LG])" - A joint diffusion model that simultaneously learns meaningful internal representations fit for both generative and predictive tasks, and that shows superior performance across various tasks, including generative modeling, semi-supervised classification, and domain adaptation.

Paper: http://arxiv.org/abs/2301.13622

#AI #CV #NewPaper #DeepLearning #MachineLearning

Data representation zt in a UNe…

Fahim Farook

"NP-Match: Towards a New Probabilistic Model for Semi-Supervised Learning. (arXiv:2301.13569v1 [cs.CV])" - Adapting neural processes (NPs) for semi-supervised image classification tasks to arrive at a solution with much less computational overhead, which can save time at both the training and the testing phases.

Paper: http://arxiv.org/abs/2301.13569
Code: https://github.com/jianf-wang/np-match

#AI #CV #NewPaper #DeepLearning #MachineLearning

Overview of NP-Match: it contai…

Fahim Farook

"Domain-Generalizable Multiple-Domain Clustering. (arXiv:2301.13530v1 [cs.LG])" - Given unlabeled samples from multiple source domains, an attempt to learn a shared classifier that assigns the examples to various clusters and can then be used to predict cluster assignments in a previously unseen domain.

Paper: http://arxiv.org/abs/2301.13530

#AI #CV #NewPaper #DeepLearning #MachineLearning

Problem statement - given unlab…

Fahim Farook

"Fourier Sensitivity and Regularization of Computer Vision Models. (arXiv:2301.13514v1 [cs.CV])" - A principled study of the frequency sensitivity characteristics of deep neural networks, motivated by recent work showing that deep neural networks latch on to the Fourier statistics of training data and show increased sensitivity to Fourier-basis directions in the input.

Paper: http://arxiv.org/abs/2301.13514

#AI #CV #NewPaper #DeepLearning #MachineLearning

Computing Fourier-sensitivity. …

Fahim Farook

"Conversational Automated Program Repair" - A method to help developers automatically generate patches for bugs with Large Language Models (LLMs), using a conversational approach for patch generation and validation.

Paper: https://arxiv.org/abs/2301.13246

#AI #NewPaper #MachineLearning #SoftwareEngineering

Overview of conversational APR …

Fahim Farook

"Continuous Spatiotemporal Transformers. (arXiv:2301.13338v1 [cs.LG])" - A new transformer architecture designed for modeling continuous systems, which guarantees a continuous and smooth output via optimization in Sobolev space.

Paper: http://arxiv.org/abs/2301.13338

#AI #CV #NewPaper #DeepLearning #MachineLearning

Diagram of CST’s workflow. (A) …

Fahim Farook

"ERA-Solver: Error-Robust Adams Solver for Fast Sampling of Diffusion Probabilistic Models. (arXiv:2301.12935v2 [cs.LG] UPDATED)" - An error-robust Adams solver (ERA-Solver), which utilizes the implicit Adams numerical method that consists of a predictor and a corrector.

Paper: http://arxiv.org/abs/2301.12935
Note: The PDF for this version of the paper is currently not available
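
For readers unfamiliar with the predictor/corrector idea mentioned above, here is a minimal textbook sketch. It is not ERA-Solver and not code from the paper: it assumes a plain scalar ODE dy/dt = f(t, y), a 2-step Adams-Bashforth predictor, and a trapezoidal (1-step Adams-Moulton) corrector. The function name `adams_predictor_corrector` and all of its parameters are hypothetical, chosen only for illustration.

```python
import math


def adams_predictor_corrector(f, t0, y0, h, steps):
    """Integrate dy/dt = f(t, y) for `steps` steps of size `h` (steps >= 1).

    Generic illustration only (an assumption, not the paper's solver):
    an explicit Adams-Bashforth step predicts y at the next time point,
    then an implicit Adams-Moulton (trapezoidal) formula corrects it,
    with the prediction standing in for the unknown value.
    """
    ts = [t0]
    ys = [y0]

    # Bootstrap the second point with one Heun (improved Euler) step,
    # because the 2-step predictor needs two previous values.
    k1 = f(t0, y0)
    k2 = f(t0 + h, y0 + h * k1)
    ts.append(t0 + h)
    ys.append(y0 + 0.5 * h * (k1 + k2))

    for n in range(1, steps):
        f_prev = f(ts[n - 1], ys[n - 1])
        f_curr = f(ts[n], ys[n])
        t_next = ts[n] + h

        # Predictor: explicit 2-step Adams-Bashforth.
        y_pred = ys[n] + h * (1.5 * f_curr - 0.5 * f_prev)

        # Corrector: trapezoidal rule evaluated at the predicted value.
        y_corr = ys[n] + 0.5 * h * (f_curr + f(t_next, y_pred))

        ts.append(t_next)
        ys.append(y_corr)

    return ts, ys


# Usage: dy/dt = -y with y(0) = 1, integrated to t = 1.
ts, ys = adams_predictor_corrector(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
print(abs(ys[-1] - math.exp(-1.0)))  # small discretization error
```

The corrector reuses the predicted value instead of solving the implicit equation exactly, which is the usual way to keep each step cheap.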

#AI #CV #NewPaper #DeepLearning #MachineLearning

We adopt the pretrained diffusi…

Fahim Farook

"PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks. (arXiv:2301.12914v2 [cs.CV] UPDATED)" - A method for artificially boosting the size of existing datasets, which can be used to improve the performance of lightweight networks.

Paper: http://arxiv.org/abs/2301.12914

#AI #CV #NewPaper #DeepLearning #MachineLearning

Overview of PromptMix

Fahim Farook

"MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models. (arXiv:2210.01820v2 [cs.CV] UPDATED)" - A family of neural networks that builds on top of mobile convolution (i.e., inverted residual blocks) and attention, which not only enhances the network representation capacity but also produces better downsampled features.

Paper: http://arxiv.org/abs/2210.01820
Code: https://github.com/google-research/deeplab2

#AI #CV #NewPaper #DeepLearning #MachineLearning

Block comparison. (a) The MBCon…

Fahim Farook

"DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models. (arXiv:2212.08861v2 [cs.CV] UPDATED)" - A guidance method for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models.

Paper: http://arxiv.org/abs/2212.08861

#AI #CV #NewPaper #DeepLearning #MachineLearning

Qualitative comparisons of synt…

Fahim Farook

"Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis. (arXiv:2212.05032v2 [cs.CV] UPDATED)" - Improving the compositional skills of text-to-image models; specifically, obtaining more accurate attribute binding and better image compositions by incorporating linguistic structures into the diffusion guidance process, based on the controllable properties of manipulating cross-attention layers in diffusion-based models.

Paper: http://arxiv.org/abs/2212.05032
Code: https://github.com/weixi-feng/structured-diffusion-guidance

#AI #CV #NewPaper #DeepLearning #MachineLearning

Three challenging phenomena in …

Fahim Farook

"Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models. (arXiv:2211.17091v2 [cs.CV] UPDATED)" - A generative SDE with score adjustment using an auxiliary discriminator, with the goal of improving the original generative process of a pre-trained diffusion model by estimating the gap between the pre-trained score estimation and the true data score.

Paper: http://arxiv.org/abs/2211.17091

#AI #CV #NewPaper #DeepLearning #MachineLearning

Comparison of the denoising pro…

Fahim Farook

"Learning on tree architectures outperforms a convolutional feedforward network. (arXiv:2211.11378v3 [cs.CV] UPDATED)" - A 3-layer tree architecture inspired by experimental-based dendritic tree adaptations is developed and applied to the offline and online learning of the CIFAR-10 database to show that this architecture outperforms the achievable success rates of the 5-layer convolutional LeNet.

Paper: http://arxiv.org/abs/2211.11378

#AI #CV #NewPaper #DeepLearning #MachineLearning

Comparison of offline and onlin…

Fahim Farook

"Continual Learning by Modeling Intra-Class Variation. (arXiv:2210.05398v2 [cs.LG] UPDATED)" - An examination of memory-based continual learning which identifies that large variation in the representation space is crucial for avoiding catastrophic forgetting.

Paper: http://arxiv.org/abs/2210.05398
Code: https://github.com/yulonghui/moca

#AI #CV #NewPaper #DeepLearning #MachineLearning

Average intra-class angle devia…

Fahim Farook

"Scalable and Equivariant Spherical CNNs by Discrete-Continuous (DISCO) Convolutions. (arXiv:2209.13603v3 [cs.CV] UPDATED)" - A hybrid discrete-continuous (DISCO) group convolution for spherical convolutional neural networks (CNNs) that is simultaneously equivariant and computationally scalable to high resolution.

Paper: http://arxiv.org/abs/2209.13603

#AI #CV #NewPaper #DeepLearning #MachineLearning

Spherical CNN categorization

Fahim Farook

"Multi-Level Visual Similarity Based Personalized Tourist Attraction Recommendation Using Geo-Tagged Photos. (arXiv:2109.08275v2 [cs.MM] UPDATED)" - A geo-tagged photo based tourist attraction recommendation system which utilizes the visual contents of photos and interaction behavior data to obtain the final embeddings of users and tourist attractions, which are then used to predict the visit probabilities.

Paper: http://arxiv.org/abs/2109.08275
Code: https://github.com/revaludo/MEAL

#AI #CV #NewPaper #DeepLearning #MachineLearning

An illustration of the multi-le…