Decided to give DeepFloyd a try today on macOS.
The good news? It works … kinda 😛
The bad news? It doesn’t work all the way … as was to be expected 🙂
I took the following code from their GitHub repo (https://github.com/deep-floyd/IF) and modified it to run on an Apple Silicon (M1) Mac. Here’s the actual code I ran:
from diffusers import DiffusionPipeline
from diffusers.utils import pt_to_pil
import torch
# stage 1: load the smallest base model and move it to the Apple Silicon GPU ("mps")
stage_1 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-I-M-v1.0").to("mps")
# stage 2: load the smallest stage II model; text_encoder=None because the embeddings from stage 1 are reused
stage_2 = DiffusionPipeline.from_pretrained("DeepFloyd/IF-II-M-v1.0", text_encoder=None).to("mps")
# stage 3: load the Stable Diffusion x4 upscaler, sharing stage 1's safety modules
safety_modules = {"feature_extractor": stage_1.feature_extractor, "safety_checker": stage_1.safety_checker, "watermarker": stage_1.watermarker}
stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules).to("mps")
prompt = 'a photo of a kangaroo wearing an orange hoodie and blue sunglasses standing in front of the eiffel tower holding a sign that says "very deep learning"'
# text embeds: encode the prompt once and reuse the result in stages 1 and 2
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)
# fixed seed so runs are repeatable
generator = torch.manual_seed(0)
# stage 1: generate the base image and save it
image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_I.png")
# stage 2: upscale the stage 1 output and save it
image = stage_2(image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds, generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("./if_stage_II.png")
# stage 3: upscale again (this is the step that errored out for me)
image = stage_3(prompt=prompt, image=image, generator=generator, noise_level=100).images
image[0].save("./if_stage_III.png")
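The script above hard-codes "mps", which only makes sense on an Apple Silicon Mac. If you want to try the same script elsewhere, something like the following might help. It is only a rough sketch: pick_device is a name I made up for illustration, and the mps → cuda → cpu fallback order is my own assumption, not anything from the DeepFloyd repo.

```python
def pick_device():
    """Return "mps", "cuda", or "cpu", depending on what this machine offers."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all, so CPU is the only thing to report.
        return "cpu"
    # torch.backends.mps is missing on older PyTorch builds, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
# The pipelines could then be moved with .to(device) instead of .to("mps").
```

I have not tested the pipelines on anything other than "mps", so treat the other branches as untested.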
You have to make sure that diffusers, transformers, and accelerate are fully up to date (at least that was the case in my own trial). The larger models probably work too, but they took too long to download and test, so I opted for the smallest models.
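A quick way to see which versions of those three packages you actually have installed is a small check like this one. It is just a sketch using the standard library; installed_versions is a helper name I invented, and it only reports versions, it does not tell you whether they are new enough.

```python
from importlib.metadata import PackageNotFoundError, version

# The three packages mentioned above.
PACKAGES = ("diffusers", "transformers", "accelerate")


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If anything looks old or missing, `pip install --upgrade diffusers transformers accelerate` should bring all three up to date.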
Stages I and II generated images, but stage III errored out. I will need to figure out what happened there later …
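For the debugging later, one option is to wrap each stage so that a failure gets written to a log file instead of killing the whole script. This is a hypothetical helper, not something from the DeepFloyd or diffusers code: run_stage and the if_errors.log file name are both my own inventions, and it does nothing to fix the stage III error itself.

```python
import traceback


def run_stage(name, fn, log_path="if_errors.log"):
    """Call fn(); on failure, append the traceback to log_path and return None."""
    try:
        return fn()
    except Exception:
        # Keep the full traceback so the failure can be examined afterwards.
        with open(log_path, "a", encoding="utf-8") as log:
            log.write(f"--- {name} failed ---\n")
            log.write(traceback.format_exc())
            log.write("\n")
        print(f"{name} failed; traceback appended to {log_path}")
        return None
```

With the script above, stage III could then be called as run_stage("stage III", lambda: stage_3(prompt=prompt, image=image, generator=generator, noise_level=100)), and a None result would mean the traceback is waiting in the log.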
Resulting images are attached …
#DeepLearning #MachineLearning #DeepFloyd #ImageGeneration