I'm a bit of an eclectic mess 🙂 I've been a programmer, journalist, editor, TV producer, and a few other things.
I'm currently working on my second novel, which is complete but still in the editing stage. I wrote my first novel over 20 years ago but didn't write much until now.
"ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation. (arXiv:2209.04145v6 [cs.CV] UPDATED)" — Using 2D images as a stepping stone for creating 3D shapes and eliminating the need for paired text-shape data.
"Modulating Pretrained Diffusion Models for Multimodal Image Synthesis. (arXiv:2302.12764v1 [cs.CV])" — Multimodal Conditioning Modules (MCM) for enabling conditional image synthesis using pretrained diffusion models so that you can generate images using not just a text prompt, but additional input such as a segmentation map or a sketch.
"Surface Recognition for e-Scooter Using Smartphone IMU Sensor. (arXiv:2302.12720v1 [eess.SP])" — Detecting whether an e-scooter is on a paved road or a sidewalk using the Inertial Measurement Unit (IMU) sensors on a smartphone.
"ZITS++: Image Inpainting by Improving the Incremental Transformer on Structural Priors. (arXiv:2210.05950v2 [cs.CV] UPDATED)" — Better image inpainting by detecting structures in the source image using techniques such as edge detection.
"Designing an Encoder for Fast Personalization of Text-to-Image Models. (arXiv:2302.12228v1 [cs.CV])" — A method to teach text-to-image models new concepts in seconds.
"Aligning Text-to-Image Models using Human Feedback. (arXiv:2302.12192v1 [cs.LG])" — A fine-tuning method for better aligning generated images to the input text prompt when using diffusion models.
"Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation. (arXiv:2302.11950v1 [cs.CV])" — Simulating the effects of skincare products on your skin (specifically the pores) to gauge efficacy of the product.
"Region-Aware Diffusion for Zero-shot Text-driven Image Editing. (arXiv:2302.11797v1 [cs.CV])" — A region-aware text-guided image editing method which aims to replace one entity with another.
What I always wonder with these approaches is whether you can replace a larger entity with a smaller one, or vice versa (say, a horse with a cat), in a way that looks realistic.
"Controlled and Conditional Text to Image Generation with Diffusion Prior. (arXiv:2302.11710v1 [cs.CV])" — Using a Diffusion Prior to constrain the generation to a specific domain without altering the larger Diffusion Decoder in a memory and compute efficient way.
"Composer: Creative and Controllable Image Synthesis with Composable Conditions. (arXiv:2302.09778v2 [cs.CV] UPDATED)" — A way to flexibly control the output image from diffusion models to modify the layout or style of the final image.
"Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities. (arXiv:2302.11154v1 [cs.CV])" — Creating a non-task/domain specific, general visual recognition model.
"'The Taurus': Cattle Breeds & Diseases Identification Mobile Application using Machine Learning. (arXiv:2302.10920v1 [cs.LG])" — A cross-platform mobile application to identify cattle breeds, easily analyze and identify the diseases which cattle suffer from, and to provide solutions the identified diseases.
"Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness. (arXiv:2302.10893v1 [cs.LG])" — Reducing bias in generative text-to-image models based on instructions.
"Learning 3D Photography Videos via Self-supervised Diffusion on Single Images. (arXiv:2302.10781v1 [cs.CV])" — Transforming static images into videos with additional effects using a diffusion model to handle the inpainting.
"RealFusion: 360{\deg} Reconstruction of Any Object from a Single Image. (arXiv:2302.10663v1 [cs.CV])" — Creating a 360-degree photographic model of an object from a single image of it by fitting a neural radiance field to the image.
"Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels. (arXiv:2302.10586v1 [cs.CV])" — A three-stage training strategy for conditional image generation and classification in semi-supervised learning.