Here are the final results of the model I used for the project in IFT6266, i.e. image generation conditioned on a contour and a caption. The architecture was inspired by the PixelCNN of Oord et al., 2016. However, it made the simplifying assumption that all pixels, and all channels, of the 32×32 center image were not […] Continue reading "Final Blog Post"
After training for about a hundred epochs, the PixelCNN/autoencoder managed to achieve pretty good results. Unfortunately, it suffers from the same problem as the autoencoder models, i.e. the generated images stay blurry. I added a residual block to the network, since I calculated that with my current architecture, at least 14 residual blocks were needed for […] Continue reading "More news on the latest model"
While looking for ways to reduce the training time, I looked at others' models and found that some of my classmates who got pretty good results had surprisingly simple models. For example, the other student in the class who trained a PixelCNN (Sherjil Ozair) assumed that the pixels in the 32×32 patch were all […] Continue reading "PixelCNN/Autoencoder – A simpler yet more effective model"
I tried the mean squared error as a loss function, but have not observed any improvement yet. This may indicate that the model is indeed underfitting the data. I will run as many epochs as possible until the end of the project to see whether it manages to generate something. The way this autoregressive model learns is also probably […] Continue reading "PixelCNN – Updates"
A first version of PixelCNN was trained. The model uses the same kind of architecture as in the original article by Oord et al., 2016, i.e. it does not yet use the captions or the gated convolutional layers. The model is built as follows: 1 convolutional layer with a 7×7 kernel and the mask […] Continue reading "PixelCNN"
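To make the masked 7×7 convolution more concrete, here is a minimal NumPy sketch of the two spatial masks described in Oord et al., 2016, where type A is applied to the first layer and type B to later ones. The excerpt above is cut off before naming the mask, so both are shown. The function name `pixelcnn_mask` is hypothetical, and the sketch only covers the spatial part of the mask (the per-channel ordering from the paper is left out).

```python
import numpy as np

def pixelcnn_mask(kernel_size, mask_type):
    """Return a (k, k) binary spatial mask for a PixelCNN-style convolution.

    Type 'A' hides the center pixel, type 'B' keeps it.
    Channel-wise masking is not handled here (simplifying assumption).
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    k = kernel_size
    c = k // 2
    mask = np.zeros((k, k), dtype=np.float32)
    mask[:c, :] = 1.0   # every row above the center row
    mask[c, :c] = 1.0   # pixels left of the center on the center row
    if mask_type == "B":
        mask[c, c] = 1.0  # type B may also see the current position
    return mask

mask_a = pixelcnn_mask(7, "A")
mask_b = pixelcnn_mask(7, "B")
```

Multiplying the convolution weights elementwise by such a mask before each forward pass keeps every output pixel from depending on pixels below it or to its right.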
The main purpose of this blog is to report the results of the main project of a Deep Learning course given at the Université de Montréal. The goal is to generate the 32×32 center part of a 64×64 image by conditioning on the contour of the picture and a caption. More details of […] Continue reading "Conditional Image Generation"
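As a rough illustration of the task setup, the NumPy sketch below splits a 64×64 image into the contour (the model input) and the 32×32 center (the generation target). The function name `split_contour_center`, the random test image, and the choice to zero-fill the hole are assumptions made for illustration, not taken from the course material.

```python
import numpy as np

def split_contour_center(image, size=32):
    """Split an (H, W, C) image into its contour and its size x size center."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    center = image[top:top + size, left:left + size].copy()
    contour = image.copy()
    # Assumption: the missing center is represented by zeros.
    contour[top:top + size, left:left + size] = 0
    return contour, center

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
contour, center = split_contour_center(image)
```

The caption would be supplied to the model as a separate conditioning input alongside `contour`.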