Fahim Farook

Decided to give DeepFloyd a try today on macOS.

The good news? It works … kinda 😛

The bad news? It doesn’t work all the way … as was to be expected 🙂

I took the following code from their GitHub repo (https://github.com/deep-floyd/IF) and modified it to run on an Apple Silicon (M1) Mac. Here’s the actual code I ran:

from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch

# stage 1
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-M-v1.0").to("mps")

# stage 2
stage_2 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-II-M-v1.0", text_encoder=None).to("mps")

# stage 3
safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules).to("mps")

prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'

# text embeds
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)

generator = torch.manual_seed(0)

# stage 1
image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_I.png")

# stage 2
image = stage_2(image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_II.png")

# stage 3
image = stage_3(prompt=prompt, image=image, generator=generator, noise_level=100).images
image[0].save("./if_stage_III.png")

You have to make sure that diffusers, transformers, and accelerate are fully up-to-date (at least, that was necessary in my own trial). The larger models probably work too, but they took too long to download and test, so I opted for the smallest models.
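
If you want to see which versions of those three packages you have before trying this, something like the following should do it. This is just a quick sketch using the standard library; the helper name installed_versions is my own invention, not part of any of these libraries, and it only reports what is installed rather than telling you whether it is the latest release.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None. (Hypothetical helper,
    written for this post; not part of diffusers or any other library.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The three packages mentioned above.
    for name, ver in installed_versions(["diffusers", "transformers", "accelerate"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

If any of them look old or are missing, upgrading with pip install --upgrade diffusers transformers accelerate is the usual route.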

Stage I and II generated images but stage III errored out. I will need to figure out what happened there later …

Resulting images are attached …

#DeepLearning #MachineLearning #DeepFloyd #ImageGeneration


Stage I image — 64 x 64 in size…
Stage II image — 256 x 256 in s…