I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.

I'm currently working on my second novel, which is complete but still in the editing stage. I wrote my first novel over 20 years ago but didn't write much after that until now.

I post about #Coding, #Flutter, #Writing, #Movies and #TV. I'll also talk about #Technology, #Gadgets, #MachineLearning, #DeepLearning and a few other things as the fancy strikes ...

Lived in: 🇱🇰🇸🇦🇺🇸🇳🇿🇸🇬🇲🇾🇦🇪🇫🇷🇪🇸🇵🇹🇶🇦🇨🇦

Fahim Farook

"Adding Conditional Control to Text-to-Image Diffusion Models. (arXiv:2302.05543v1 [cs.CV])" — A method to control pretrained large diffusion models to support additional input conditions which can be used to augment existing generative models such as StableDiffusion by enabling conditional inputs like edge maps, segmentation maps, keypoints, etc.

Paper: http://arxiv.org/abs/2302.05543
Code: https://github.com/lllyasviel/ControlNet

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Control Stable Diffusion with C…

Fahim Farook

"MaskSketch: Unpaired Structure-guided Masked Image Generation. (arXiv:2302.05496v1 [cs.CV])" — An image generation method that uses a guiding sketch to generate realistic images that match the structure of the sketch.

Paper: http://arxiv.org/abs/2302.05496

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Given an input sketch and its c…

Fahim Farook

"Element-Wise Attention Layers: an option for optimization. (arXiv:2302.05488v1 [cs.LG])" — An attention mechanism that adapts dot-product attention, which relies on matrix multiplications, to work element-wise. On the Fashion MNIST dataset it reportedly reaches 92% of the accuracy of a VGG-like counterpart while using 97% fewer parameters.

Paper: http://arxiv.org/abs/2302.05488

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
The outputs of the attention mo…

Fahim Farook

I haven't been posting any #StableDiffusion images since I was busy coding and writing and doing other stuff ...

But I've been generating images. These are from prompts ranging from "a cross between an egg and a rabbit" to "a creature with a large white egg as its body, white rabbit ears, a whiskered face, and rabbit paws and legs, blue sky with clouds, lots of trees in the background" ...

I was trying to get one image to match a story I was writing but I had to try various prompts to get something I was even happy with. And in the end, I didn't use any of these images 😛

#AIArt #StableDiffusion #DeepLearning #MachineLearning #CV #AI
Prompt: “a cross between an egg…
Prompt: “a creature with a larg…
Prompt: “a creature with a larg…
Prompt: “a creature with a larg…

Fahim Farook

"Plausible May Not Be Faithful: Probing Object Hallucination in Vision-Language Pre-training. (arXiv:2210.07688v2 [cs.CL] UPDATED)" — A study of the object hallucination problem in large-scale Vision-Language Pre-trained (VLP) models from multiple aspects.

Paper: http://arxiv.org/abs/2210.07688
Code: No code in linked repo

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Comparison of image captioning …

Fahim Farook

"Dive into Deep Learning. (arXiv:2106.11342v4 [cs.LG] UPDATED)" — An open-source book on Deep Learning built on Jupyter Notebooks, so it contains interactive examples. Freely available and well worth checking out.

Paper: http://arxiv.org/abs/2106.11342
Code: https://github.com/d2l-ai/d2l-en
Book: https://d2l.ai/

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Image of book cover for Dive in…

Fahim Farook

"Scaling Vision Transformers to 22 Billion Parameters. (arXiv:2302.05442v1 [cs.CV])" — A recipe for highly efficient and stable training of a 22B-parameter Vision Transformer (ViT), surpassing the previous largest known model at 4B parameters.

Paper: http://arxiv.org/abs/2302.05442

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Dense prediction from frozen Vi…

Fahim Farook

"Rumor Classification through a Multimodal Fusion Framework and Ensemble Learning. (arXiv:2302.05289v1 [cs.CV])" — A set of advanced image features, inspired by the field of image quality assessment, for assessing message veracity in social networks. The framework exploits all message features by exploring various machine learning models.

Paper: http://arxiv.org/abs/2302.05289

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
An overview of the proposed rum…

Fahim Farook

"Archaeological Sites Detection with a Human-AI Collaboration Workflow. (arXiv:2302.05286v1 [cs.CV])" — Using pre-trained semantic segmentation deep learning models to detect archaeological sites within the Mesopotamian floodplains environment.

Paper: http://arxiv.org/abs/2302.05286
Code: https://github.com/mister-magpie/tell_segmentation

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Investigation area. Orange dots…

Fahim Farook

"CEN-HDR: Computationally Efficient neural Network for real-time High Dynamic Range imaging. (arXiv:2302.05213v1 [cs.CV])" — A new computationally efficient neural network based on a light attention mechanism and sub-pixel convolution operations for real-time HDR imaging.

Paper: http://arxiv.org/abs/2302.05213
Code: https://github.com/steven-tel/CEN-HDR

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Qualitative comparison of the p…

Fahim Farook

"DOMINO: Domain-aware Loss for Deep Learning Calibration. (arXiv:2302.05142v1 [cs.CV])" — A domain-aware loss function to calibrate deep learning models so as to avoid the potential dangers of uncalibrated models in medical imaging.

Paper: http://arxiv.org/abs/2302.05142
Code: https://github.com/lab-smile/DOMINO

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Confusion matrices on testing s…

Fahim Farook

"Example-Based Sampling with Diffusion Models. (arXiv:2302.05116v1 [cs.GR])" — A generic way to produce 2-D point sets that imitate existing samplers, learned from observed point sets using a diffusion model. It addresses the problem of applying convolutional layers to point sets by leveraging neighborhood information from an optimal transport matching to a uniform grid, which allows fast convolutions on grids and supports example-based learning of non-uniform sampling patterns.

Paper: http://arxiv.org/abs/2302.05116

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Learning Rank-1 realizations us…

Fahim Farook

The images show two separate iterations of the same app — the first is a SwiftUI app for doing #StableDiffusion image generation using #CoreML. That took several weeks of work to create and it never really worked right — I could only do single selection of images, couldn’t drag and drop more than one image, and it took a fair amount of time to implement even trivial stuff.

The second is the same app implemented using AppKit. It took essentially one day (yesterday). Multiselection was built-in, drag and drop (for multiple items) was a couple of lines of code, and I implemented a bunch of other things I wanted to do fairly easily.

On top of that, the app feels faster, lighter, and more responsive than the SwiftUI version for the exact same task, using the exact same models.

SwiftUI still has a long way to go, if it ever gets there at all …

#Coding #macOS #Swift #SwiftUIvsAppKit
Screenshot of an image generati…
Screenshot of an image generati…

Fahim Farook

Opera is going to use #ChatGPT to shorten articles (so that people have to read less and think even less), Microsoft is building ChatGPT into their Office tools (again, less thinking), and we already have ChatGPT in search engines ...

So this is how the world ends: not with a bang, but with everybody growing stupider because we are too lazy to think for ourselves?

I guess SkyNet did arrive but not in the way we thought … 😛

#TheAIRevolution #MisinformationEngine
Engadget article — Opera is add…

Fahim Farook

For those of us outside the Dole/Chiquita hegemony, bananas can have a variety of tastes — sweet, sour, bland and all the other stuff in between 😛 So when you say “water bugs taste like banana”, which banana are we talking about?

https://en.wikipedia.org/wiki/Banana
Chefs who cook with insects rep…

Fahim Farook

I really miss having @qikipedia posting facts over here. They did for a week or so, and then they went back to only posting on Twitter 😞

So I guess I’ll have to repeat their facts since I do enjoy them … but then there is no quoting posts and I have to resort to screenshots .. Ah, well 🙂
There is a termite city underne…

Fahim Farook

"UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models. (arXiv:2302.04867v1 [cs.LG])" — A unified corrector (UniC) that can be applied after any existing DPM sampler to increase the order of accuracy without extra model evaluations, plus a unified predictor (UniP), derived as a byproduct, that supports arbitrary order.

Paper: http://arxiv.org/abs/2302.04867
Code: https://github.com/wl-zhao/UniPC

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
The main idea of UniPC. We prov…

Fahim Farook

"Trading Information between Latents in Hierarchical Variational Autoencoders. (arXiv:2302.04855v1 [stat.ML])" — A generalization of VAEs to application domains beyond generative modeling (e.g., representation learning, clustering, or lossy data compression) by introducing an objective function that allows practitioners to trade off between the information content ("bit rate") of the latent representation and the distortion of reconstructed data.

Paper: http://arxiv.org/abs/2302.04855

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Left: trade-off between perform…

Fahim Farook

"Robot Synesthesia: A Sound and Emotion Guided AI Painter. (arXiv:2302.04850v1 [cs.CV])" — An approach for using sound and speech to guide a robotic painting process by encoding the simulated paintings and input sounds into the same latent space.

Paper: http://arxiv.org/abs/2302.04850
Code: https://github.com/pschaldenbrand/Frida

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
Our robot painting sound guidan…

Fahim Farook

"Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation. (arXiv:2302.04841v1 [cs.CV])" — A study of the training dynamics of textual inversion, with the aim of speeding it up.

Paper: http://arxiv.org/abs/2302.04841

#AI #CV #NewPaper #DeepLearning #MachineLearning

<<Find this useful? Please boost so that others can benefit too 🙂>>
A summary of our key findings: …