Project 5

For this part, I ran the model on the three prompts twice, using a different number of inference steps each time (20, then 50). The outputs are shown below, both the generated images and their upsampled versions. The second set of images (50 steps) looks slightly sharper than the first (20 steps). All images match their prompts well, although the second oil painting looks more like an oil painting to me than the first. I used a seed of 180.

Implementing the Forward Process

Here, I implemented the forward function, which takes an image im and a timestep t as input and generates a noisy image. Below are the outputs for the test image (Campanile) at timesteps 250, 500, and 750.
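
As a rough illustration of the forward process described above, the sketch below noises an image using x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The linear beta schedule, the helper name make_alphas_cumprod, and the random stand-in image are assumptions for illustration only; a pretrained model would supply its own cumulative-alpha schedule.

```python
import numpy as np

def make_alphas_cumprod(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear beta schedule; a pretrained model ships its own."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward(im, t, alphas_cumprod, rng):
    """Return (x_t, eps): the image noised to timestep t and the noise used."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(im.shape)
    x_t = np.sqrt(abar_t) * im + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps

rng = np.random.default_rng(180)
alphas_cumprod = make_alphas_cumprod()
im = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # stand-in for the test image
noisy = {t: forward(im, t, alphas_cumprod, rng)[0] for t in (250, 500, 750)}
```

Because the cumulative alpha shrinks as t grows, the image at t = 750 keeps far less of the original signal than the one at t = 250.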

Classical Denoising

Here, I simply used Gaussian blur filtering to try to remove the noise. The results of filtering the three noised images from the previous section are shown below. I used blur parameters of 7, 13, and 17, respectively.
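
A minimal sketch of Gaussian-blur denoising in NumPy follows. It assumes the blur parameter acts as an odd kernel size; the sigma value, the synthetic gradient image, and the function names are hypothetical and not taken from the original implementation.

```python
import numpy as np

def gaussian_kernel1d(kernel_size, sigma):
    """Normalized 1-D Gaussian kernel (kernel_size assumed odd)."""
    ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    return k / k.sum()

def gaussian_blur(im, kernel_size, sigma):
    """Separable Gaussian blur of an (H, W) image with reflect padding."""
    k = gaussian_kernel1d(kernel_size, sigma)
    pad = kernel_size // 2
    padded = np.pad(im, pad, mode="reflect")
    # Blur along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))     # synthetic stand-in image
noisy = clean + 0.5 * rng.standard_normal(clean.shape)  # additive Gaussian noise
blurred = gaussian_blur(noisy, kernel_size=7, sigma=2.0)
```

Blurring averages out the noise but also smears edges, which is why this baseline struggles at the higher noise levels.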

One-Step Denoising

Here, I implemented one-step denoising and have displayed its outputs on the three noised images (250, 500, 750) below. Each set of three shows the original image, the noised image, and the denoised image side by side.
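
One-step denoising can be sketched as inverting the forward equation once a noise estimate is available. In the snippet below, the true noise stands in for the network's prediction, and the value of abar_t and the function name estimate_clean_image are hypothetical.

```python
import numpy as np

def estimate_clean_image(x_t, eps_hat, abar_t):
    """Solve x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps for x0."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # stand-in clean image
abar_t = 0.3                                 # hypothetical cumulative alpha at t
eps = rng.standard_normal(x0.shape)
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

# The true noise is used here in place of a trained model's prediction.
x0_hat = estimate_clean_image(x_t, eps, abar_t)
```

With a real model the noise estimate is imperfect, and the error grows with the noise level because the division by sqrt(abar_t) amplifies it.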

Iterative Denoising

Here, I implemented iterative denoising. Below are the results of every fifth denoising loop, followed by the final results (the iteratively denoised image, the one-step denoised image, and the noised image).
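
The loop below is a sketch of one common way to do strided iterative denoising: at each step it estimates the clean image and blends it with the current noisy image using the DDPM posterior mean. The schedule, the stride of 30, the oracle noise predictor, and the omission of the added-variance term are all assumptions made to keep the example small and deterministic.

```python
import numpy as np

def iterative_denoise(x, i_start, timesteps, alphas_cumprod, predict_noise,
                      snapshot_every=5):
    """Denoise from timesteps[i_start] down to timesteps[-1] (variance term omitted)."""
    snapshots = []
    for i in range(i_start, len(timesteps) - 1):
        t, t_prev = timesteps[i], timesteps[i + 1]
        abar_t, abar_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
        alpha = abar_t / abar_prev
        beta = 1.0 - alpha
        eps_hat = predict_noise(x, t)
        x0_hat = (x - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)
        # Posterior mean: weighted blend of the clean estimate and the current image.
        x = (np.sqrt(abar_prev) * beta / (1.0 - abar_t)) * x0_hat \
            + (np.sqrt(alpha) * (1.0 - abar_prev) / (1.0 - abar_t)) * x
        if (i - i_start) % snapshot_every == 0:
            snapshots.append((t, x.copy()))  # keep every fifth loop for display
    return x, snapshots

betas = np.linspace(1e-4, 0.02, 1000)     # hypothetical schedule
alphas_cumprod = np.cumprod(1.0 - betas)
timesteps = list(range(990, -1, -30))     # 990, 960, ..., 0

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 16, 16))  # stand-in clean image

def oracle_noise(x, t):
    """Stand-in for a trained model: recovers the exact noise using the known x0."""
    a = alphas_cumprod[t]
    return (x - np.sqrt(a) * x0) / np.sqrt(1.0 - a)

i_start = 10
a = alphas_cumprod[timesteps[i_start]]
x_start = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * rng.standard_normal(x0.shape)
result, snapshots = iterative_denoise(x_start, i_start, timesteps,
                                      alphas_cumprod, oracle_noise)
```

Starting the same loop at i_start = 0 from pure Gaussian noise corresponds to sampling an image from scratch.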

Diffusion Model Sampling

Here, I generated 5 high-quality images from scratch by using i_start = 0 and passing in random noise.

Classifier Free Guidance

Here, I implemented iterative denoising with classifier-free guidance (CFG). Five sampled high-quality photos are shown below.
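
The core of classifier-free guidance is a single extrapolation between the unconditional and conditional noise estimates. The sketch below shows only that combination step; the guidance scale of 7 and the random arrays standing in for model outputs are hypothetical.

```python
import numpy as np

def cfg_noise(eps_cond, eps_uncond, guidance_scale):
    """Push the estimate away from the unconditional one, toward the conditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
eps_cond = rng.standard_normal((3, 8, 8))    # stand-in: model output with the prompt
eps_uncond = rng.standard_normal((3, 8, 8))  # stand-in: model output with an empty prompt
guided = cfg_noise(eps_cond, eps_uncond, guidance_scale=7.0)
```

A scale of 0 gives the unconditional estimate, 1 gives the conditional estimate, and values above 1 extrapolate past it, which tends to trade diversity for prompt fidelity.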

Image to Image Translation

I created edits of the test image at different noise levels, using the iterative CFG denoising function to denoise. I also did this for two of my generated images.

Editing Hand-Drawn and Web Images

Here I again did image-to-image translation, but with hand-drawn and web images, as shown below.

Inpainting

Here I used a mask to create new content in part of the image. I did this for three images and used two different masks. The last two use my own custom mask, which changes the center of the image.
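
One way to implement masked generation like this is to overwrite everything outside the mask with a freshly noised copy of the original after each denoising step, so only the masked region is free to change. The sketch below shows that constraint step with a centered square mask; the mask size, the cumulative-alpha value, and the function names are hypothetical.

```python
import numpy as np

def center_mask(height, width, box):
    """Mask that is 1 inside a centered box (region to regenerate) and 0 elsewhere."""
    mask = np.zeros((height, width))
    top, left = (height - box) // 2, (width - box) // 2
    mask[top:top + box, left:left + box] = 1.0
    return mask

def inpaint_constraint(x_t, x_orig, mask, abar_t, rng):
    """Keep x_t inside the mask; outside, use the original noised to the same level."""
    eps = rng.standard_normal(x_orig.shape)
    noised_orig = np.sqrt(abar_t) * x_orig + np.sqrt(1.0 - abar_t) * eps
    return mask * x_t + (1.0 - mask) * noised_orig

rng = np.random.default_rng(0)
x_orig = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # stand-in original image
x_t = rng.standard_normal(x_orig.shape)            # stand-in current denoising state
mask = center_mask(64, 64, box=24)
abar_t = 0.9999                                    # hypothetical: a nearly clean timestep
out = inpaint_constraint(x_t, x_orig, mask, abar_t, rng)
```

Applying this after every step of the denoising loop keeps the unmasked pixels tied to the original while the model fills in the center.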

Visual Anagrams

Here I generated three images that look like one image normally and another when viewed upside-down.
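
A common formulation of visual anagrams averages two noise estimates: one for the upright image under the first prompt, and one for the vertically flipped image under the second prompt, flipped back before averaging. The sketch below assumes that formulation; the two predictor callables are hypothetical stand-ins for prompt-conditioned model calls.

```python
import numpy as np

def anagram_noise(x_t, t, predict_noise_upright, predict_noise_flipped):
    """Average the upright estimate with the flipped-image estimate, flipped back."""
    eps_upright = predict_noise_upright(x_t, t)
    flipped = np.flip(x_t, axis=-2)  # flip along the height axis
    eps_flipped = np.flip(predict_noise_flipped(flipped, t), axis=-2)
    return 0.5 * (eps_upright + eps_flipped)

rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 8, 8))
pattern = rng.standard_normal((3, 8, 8))

# Hypothetical stand-ins for the model conditioned on each of the two prompts.
upright_model = lambda x, t: x
flipped_model = lambda x, t: pattern

eps = anagram_noise(x_t, 500, upright_model, flipped_model)
```

Using the averaged estimate at every denoising step pushes the image toward satisfying one prompt upright and the other when flipped.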

Hybrid Images

Here I generated three images that look like one image up close and another when viewed from far away.
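
Hybrid images of this kind are often produced by combining the low frequencies of one prompt's noise estimate with the high frequencies of another's. The sketch below assumes that approach with a Gaussian low-pass filter; the kernel size of 33, the sigma of 2, and the random arrays standing in for model outputs are hypothetical.

```python
import numpy as np

def gaussian_blur(x, kernel_size=33, sigma=2.0):
    """Separable Gaussian blur over the last two axes of a (..., H, W) array."""
    ax = np.arange(kernel_size) - (kernel_size - 1) / 2.0
    k = np.exp(-0.5 * (ax / sigma) ** 2)
    k /= k.sum()
    pad = kernel_size // 2
    pad_width = [(0, 0)] * (x.ndim - 2) + [(pad, pad), (pad, pad)]
    padded = np.pad(x, pad_width, mode="reflect")
    out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), -1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), -2, out)

def hybrid_noise(eps_far, eps_near, kernel_size=33, sigma=2.0):
    """Low frequencies from eps_far (seen from afar), high frequencies from eps_near."""
    low = gaussian_blur(eps_far, kernel_size, sigma)
    high = eps_near - gaussian_blur(eps_near, kernel_size, sigma)
    return low + high

rng = np.random.default_rng(0)
eps_far = rng.standard_normal((3, 64, 64))   # stand-in: estimate for the far-away prompt
eps_near = rng.standard_normal((3, 64, 64))  # stand-in: estimate for the up-close prompt
eps = hybrid_noise(eps_far, eps_near)
```

When both inputs are the same estimate, the low and high bands sum back to the original, so the combination only changes the result where the two prompts disagree.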

Project 5 - Diffusion - Part B

Part 1 - Training a Single-Step Denoising UNet

In this part, I trained a single-step denoising UNet. I first implemented the UNet and then used it to train a denoiser. Once that was done, I ran out-of-distribution tests to see how the denoiser performs on images with noise levels it was not trained on.
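
To make the training setup concrete, the sketch below builds noisy inputs as z = x + sigma * eps and measures an L2 loss against the clean images. No UNet is included; an identity "denoiser" is used as a baseline, and the training sigma, the evaluation sigmas, and the batch shape are hypothetical values chosen for illustration.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """Noisy training input: z = x + sigma * eps with eps ~ N(0, I)."""
    return x + sigma * rng.standard_normal(x.shape)

def l2_loss(pred, target):
    """Mean squared error between a prediction and the clean target."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(16, 1, 28, 28))  # stand-in batch of clean images

train_sigma = 0.5                                   # hypothetical training noise level
eval_sigmas = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0]   # hypothetical out-of-distribution sweep

z_train = add_noise(x, train_sigma, rng)
# Baseline: loss of returning the noisy input unchanged, roughly sigma squared.
baseline = {s: l2_loss(add_noise(x, s, rng), x) for s in eval_sigmas}
```

A trained denoiser should beat this baseline at the training noise level, and the out-of-distribution sweep shows how quickly that advantage degrades at other levels.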

Part 2 - Training a Diffusion Model

In this part, I trained two diffusion models: one that is time-conditioned and one that is additionally class-conditioned. Results are below.
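
As a sketch of what the conditioning inputs for such models might look like, the snippet below builds a normalized timestep and a one-hot class vector, and zeroes the class vector for a random fraction of the batch so the model also learns an unconditional estimate. The number of classes, the number of timesteps, the dropout probability, and the function name make_conditioning are assumptions, not details from the original implementation.

```python
import numpy as np

def make_conditioning(labels, t, num_classes=10, num_steps=300, p_uncond=0.1, rng=None):
    """Return (class_vec, t_norm) for a batch; class_vec is zeroed with prob. p_uncond."""
    if rng is None:
        rng = np.random.default_rng()
    one_hot = np.eye(num_classes)[np.asarray(labels)]  # (B, num_classes)
    keep = rng.random(len(one_hot)) >= p_uncond        # False means drop the class signal
    class_vec = one_hot * keep[:, None]
    t_norm = (np.asarray(t, dtype=float) / num_steps)[:, None]  # (B, 1), scaled to [0, 1]
    return class_vec, t_norm

labels = np.array([0, 3, 9])
t = np.array([0, 150, 300])
class_vec, t_norm = make_conditioning(labels, t, p_uncond=0.0,
                                      rng=np.random.default_rng(0))
```

Dropping the class signal during training is what makes classifier-free guidance possible at sampling time, since the same network can then produce both conditional and unconditional estimates.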

Bells and Whistles - CS 180 Logo

I created a custom prompt embedding and then generated a few possible logos for CS 180, which are displayed below. These were generated using iterative CFG denoising.

Bells and Whistles - Sample GIFs