Posts: 1368
Following: 141
Followers: 868
I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.

I'm currently working on my second novel, which is complete but still in the editing stage. I wrote my first novel over 20 years ago but then didn't write much until now.

I post about #Coding, #Flutter, #Writing, #Movies and #TV. I'll also talk about #Technology, #Gadgets, #MachineLearning, #DeepLearning and a few other things as the fancy strikes ...

Lived in: 🇱🇰🇸🇦🇺🇸🇳🇿🇸🇬🇲🇾🇦🇪🇫🇷🇪🇸🇵🇹🇶🇦🇨🇦

Fahim Farook

"Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning. (arXiv:2302.14794v1 [cs.CV])" - Rather than using a frozen language model to communicate visual concepts, this method uses a meta-mapper to act as a bridge between large-scale vision and language models.

Paper: http://arxiv.org/abs/2302.14794

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"TextIR: A Simple Framework for Text-based Editable Image Restoration. (arXiv:2302.14736v1 [cs.CV])" - Restoring damaged images by using text descriptions to specify how the damaged areas should be filled in.

Paper: http://arxiv.org/abs/2302.14736

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Towards Enhanced Controllability of Diffusion Models. (arXiv:2302.14368v1 [cs.CV])" - Creating a diffusion model that is easier to edit/style based on input images by conditioning the model on a spatial content mask and a flattened style embedding.

Paper: http://arxiv.org/abs/2302.14368

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"One-Shot Video Inpainting. (arXiv:2302.14362v1 [cs.CV])" - A method to inpaint videos where, instead of providing a mask for each frame, you only need to provide the object mask for the initial frame of the video sequence.

Paper: http://arxiv.org/abs/2302.14362

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Deep Learning for Identifying Iran's Cultural Heritage Buildings in Need of Conservation Using Image Classification and Grad-CAM. (arXiv:2302.14354v1 [cs.CV])" - Using Convolutional Neural Networks (CNNs) to identify damage and defects in cultural heritage buildings.

Paper: http://arxiv.org/abs/2302.14354

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Accuracy and Fidelity Comparison of Luna and DALL-E 2 Diffusion-Based Image Generation Systems. (arXiv:2301.01914v2 [cs.CV] UPDATED)" - Comparing the accuracy and fidelity of images generated by DALL-E 2 and Luna, which is Stable Diffusion-based.

Paper: http://arxiv.org/abs/2301.01914
Luna code: https://github.com/slowy07/luna

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Diffusion Posterior Sampling for General Noisy Inverse Problems. (arXiv:2209.14687v3 [stat.ML] UPDATED)" - Extending diffusion solvers to efficiently handle general noisy (non)linear inverse problems by approximating posterior sampling.

Paper: http://arxiv.org/abs/2209.14687
Code: https://github.com/dps2022/diffusion-posterior-sampling

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Subspace Diffusion Generative Models. (arXiv:2205.01490v2 [cs.LG] UPDATED)" - Restricting diffusion via projections onto subspaces to reduce computational time and cost without affecting the overall quality of the generated images.

Paper: http://arxiv.org/abs/2205.01490
Code: https://github.com/bjing2016/subspace-diffusion

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Large Scale Visual Food Recognition. (arXiv:2103.16107v3 [cs.CV] UPDATED)" - A food dataset with 2,000 categories and over 1 million images that can be used for food recognition.

Paper: http://arxiv.org/abs/2103.16107
Code: https://github.com/Liuyuxinict/prenet/

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Directed Diffusion: Direct Control of Object Placement through Attention Guidance. (arXiv:2302.13153v1 [cs.CV])" - Controlling object placement in diffusion models by way of attention guidance.

Paper: http://arxiv.org/abs/2302.13153

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages" - Measuring the formality of text generated in different languages by multilingual generative language models.

Paper: https://arxiv.org/abs/2302.12299

#AI #NewPaper #DeepLearning #MachineLearning #Language

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation. (arXiv:2209.04145v6 [cs.CV] UPDATED)" - Using 2D images as a stepping stone for creating 3D shapes and eliminating the need for paired text-shape data.

Paper: http://arxiv.org/abs/2209.04145
Code: https://github.com/liuzhengzhe/ISS-Image-as-Stepping-Stone-for-Text-Guided-3D-Shape-Generation

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Modulating Pretrained Diffusion Models for Multimodal Image Synthesis. (arXiv:2302.12764v1 [cs.CV])" - Multimodal Conditioning Modules (MCM) that enable conditional image synthesis with pretrained diffusion models, so that you can generate images using not just a text prompt but also additional input such as a segmentation map or a sketch.

Paper: http://arxiv.org/abs/2302.12764

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Surface Recognition for e-Scooter Using Smartphone IMU Sensor. (arXiv:2302.12720v1 [eess.SP])" - Detecting whether an e-scooter is on a paved road or a sidewalk using the Inertial Measurement Unit (IMU) sensors on a smartphone.

Paper: http://arxiv.org/abs/2302.12720

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Data-driven Approach for Automatically Correcting Faulty Road Maps. (arXiv:2211.06544v2 [cs.CV] UPDATED)" - A method to fix faulty road maps (specifically roads displayed on maps) using machine learning.

Paper: http://arxiv.org/abs/2211.06544
Code: https://github.com/soojunghong/image_inpainting_model_for_lane_geomery_discovery

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors. (arXiv:2210.05950v2 [cs.CV] UPDATED)" - Better image inpainting by detecting structures in the source image using techniques such as edge detection.

Paper: http://arxiv.org/abs/2210.05950
Code: https://github.com/dqiaole/zits_inpainting

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Designing an Encoder for Fast Personalization of Text-to-Image Models. (arXiv:2302.12228v1 [cs.CV])" - A method to teach text-to-image models new concepts in seconds.

Paper: http://arxiv.org/abs/2302.12228

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Aligning Text-to-Image Models using Human Feedback. (arXiv:2302.12192v1 [cs.LG])" - A fine-tuning method for better aligning generated images with the input text prompt when using diffusion models.

Paper: http://arxiv.org/abs/2302.12192

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation. (arXiv:2302.11950v1 [cs.CV])" - Simulating the effects of skincare products on your skin (specifically the pores) to gauge the product's efficacy.

Paper: http://arxiv.org/abs/2302.11950

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Region-Aware Diffusion for Zero-shot Text-driven Image Editing. (arXiv:2302.11797v1 [cs.CV])" - A region-aware text-guided image editing method which aims to replace one entity with another.

What I always wonder with these approaches is whether you can replace a larger entity with a smaller one, or vice versa (say, a horse with a cat), in a way that looks realistic.

Paper: http://arxiv.org/abs/2302.11797
Code: https://github.com/haha-lisa/RDM-Region-Aware-Diffusion-Model

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>