I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.

I'm currently working on my second novel, which is complete but still in the editing stage. I wrote my first novel over 20 years ago but didn't write much after that until now.

I post about #Coding, #Flutter, #Writing, #Movies and #TV. I'll also talk about #Technology, #Gadgets, #MachineLearning, #DeepLearning and a few other things as the fancy strikes ...

Lived in: 🇱🇰🇸🇦🇺🇸🇳🇿🇸🇬🇲🇾🇦🇪🇫🇷🇪🇸🇵🇹🇶🇦🇨🇦

Fahim Farook

"ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation. (arXiv:2209.04145v6 [cs.CV] UPDATED)" — Using 2D images as a stepping stone for creating 3D shapes and eliminating the need for paired text-shape data.

Paper: http://arxiv.org/abs/2209.04145
Code: https://github.com/liuzhengzhe/ISS-Image-as-Stepping-Stone-for-Text-Guided-3D-Shape-Generation

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>

Fahim Farook

"Modulating Pretrained Diffusion Models for Multimodal Image Synthesis. (arXiv:2302.12764v1 [cs.CV])" — Multimodal Conditioning Modules (MCM) that enable conditional image synthesis with pretrained diffusion models, so you can generate images using not just a text prompt but also additional input such as a segmentation map or a sketch.

Paper: http://arxiv.org/abs/2302.12764

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Surface Recognition for e-Scooter Using Smartphone IMU Sensor. (arXiv:2302.12720v1 [eess.SP])" — Detecting whether an e-scooter is on a paved road or a sidewalk using the Inertial Measurement Unit (IMU) sensors on a smartphone.

Paper: http://arxiv.org/abs/2302.12720

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Data-driven Approach for Automatically Correcting Faulty Road Maps. (arXiv:2211.06544v2 [cs.CV] UPDATED)" — A method to fix faulty road maps (specifically roads displayed on maps) using machine learning.

Paper: http://arxiv.org/abs/2211.06544
Code: https://github.com/soojunghong/image_inpainting_model_for_lane_geomery_discovery

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors. (arXiv:2210.05950v2 [cs.CV] UPDATED)" — Better image inpainting by detecting structures in the source image using techniques such as edge detection.

Paper: http://arxiv.org/abs/2210.05950
Code: https://github.com/dqiaole/zits_inpainting

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Designing an Encoder for Fast Personalization of Text-to-Image Models. (arXiv:2302.12228v1 [cs.CV])" — A method to teach text-to-image models new concepts in seconds.

Paper: http://arxiv.org/abs/2302.12228

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Aligning Text-to-Image Models using Human Feedback. (arXiv:2302.12192v1 [cs.LG])" — A fine-tuning method for better aligning generated images to the input text prompt when using diffusion models.

Paper: http://arxiv.org/abs/2302.12192

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation. (arXiv:2302.11950v1 [cs.CV])" — Simulating the effects of skincare products on your skin (specifically the pores) to gauge the product's efficacy.

Paper: http://arxiv.org/abs/2302.11950

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Region-Aware Diffusion for Zero-shot Text-driven Image Editing. (arXiv:2302.11797v1 [cs.CV])" — A region-aware text-guided image editing method which aims to replace one entity with another.

What I always wonder with these approaches is whether you can replace a larger entity with a smaller one, or vice versa (say, a horse with a cat), in a way that looks realistic.

Paper: http://arxiv.org/abs/2302.11797
Code: https://github.com/haha-lisa/RDM-Region-Aware-Diffusion-Model

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Controlled and Conditional Text to Image Generation with Diffusion Prior. (arXiv:2302.11710v1 [cs.CV])" — Using a Diffusion Prior to constrain generation to a specific domain in a memory- and compute-efficient way, without altering the larger Diffusion Decoder.

Paper: http://arxiv.org/abs/2302.11710

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Composer: Creative and Controllable Image Synthesis with Composable Conditions. (arXiv:2302.09778v2 [cs.CV] UPDATED)" — A way to flexibly control the output image from diffusion models to modify the layout or style of the final image.

Paper: http://arxiv.org/abs/2302.09778

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Entity-Level Text-Guided Image Manipulation. (arXiv:2302.11383v1 [cs.CV])" — A more accurate/efficient method for text-guided image editing/manipulation?

Paper: http://arxiv.org/abs/2302.11383

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities. (arXiv:2302.11154v1 [cs.CV])" — Creating a general visual recognition model that is not task- or domain-specific.

Paper: http://arxiv.org/abs/2302.11154

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Teachable Reality: Prototyping Tangible Augmented Reality with Everyday Objects by Leveraging Interactive Machine Teaching. (arXiv:2302.11046v1 [cs.HC])" — An Augmented Reality (AR) prototyping tool for creating interactive AR applications using everyday objects.

Paper: http://arxiv.org/abs/2302.11046

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"'The Taurus': Cattle Breeds & Diseases Identification Mobile Application using Machine Learning. (arXiv:2302.10920v1 [cs.LG])" — A cross-platform mobile application to identify cattle breeds, analyze and identify the diseases the cattle suffer from, and provide solutions for the identified diseases.

Paper: http://arxiv.org/abs/2302.10920

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness. (arXiv:2302.10893v1 [cs.LG])" — Reducing bias in generative text-to-image models based on instructions.

Paper: http://arxiv.org/abs/2302.10893
Code: https://github.com/ml-research/Fair-Diffusion

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

Yesterday's #StableDiffusion prompt was: "Secret forest with hidden nooks" or "Fantasy forest with hidden nooks" ... I did some for each 🙂

These were done using several different models and resolutions, so there's quite a range to the images.

I liked the results so much that I'll probably be doing a few more of these over the next few days ... Or, variations thereof.

Fahim Farook

"Learning 3D Photography Videos via Self-supervised Diffusion on Single Images. (arXiv:2302.10781v1 [cs.CV])" — Transforming static images into videos with additional effects using a diffusion model to handle the inpainting.

Paper: http://arxiv.org/abs/2302.10781

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"RealFusion: 360° Reconstruction of Any Object from a Single Image. (arXiv:2302.10663v1 [cs.CV])" — Creating a 360-degree photographic model of an object from a single image of it by fitting a neural radiance field to the image.

Paper: http://arxiv.org/abs/2302.10663

#AI #CV #NewPaper #DeepLearning #MachineLearning


Fahim Farook

"Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels. (arXiv:2302.10586v1 [cs.CV])" — A three-stage training strategy for conditional image generation and classification in semi-supervised learning.

Paper: http://arxiv.org/abs/2302.10586

#AI #CV #NewPaper #DeepLearning #MachineLearning
