The main purpose of this blog is to report the results of the main project of a Deep Learning course given at the Université de Montréal. The goal is to generate the 32 x 32 center of a 64 x 64 image by conditioning on the contour of the picture and a caption. More details on this project are given on this page.
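To make the task concrete, here is a minimal sketch of how a 64 x 64 image could be split into the contour (the conditioning input) and the 32 x 32 center (the target to generate). The function name `split_center`, the zero-filling of the masked region, and the random stand-in image are assumptions for illustration, not part of the course's provided code.

```python
import numpy as np

def split_center(image, crop=32):
    """Split an image into its contour (center zeroed out) and its center crop.

    `split_center` is a hypothetical helper; zero-filling the hole is an
    assumption, other fill values would work as well.
    """
    h, w = image.shape[:2]
    top = (h - crop) // 2
    left = (w - crop) // 2
    # The target: the central crop x crop region.
    center = image[top:top + crop, left:left + crop].copy()
    # The conditioning input: the full image with the center masked out.
    contour = image.copy()
    contour[top:top + crop, left:left + crop] = 0
    return contour, center

# Random stand-in for a real 64 x 64 RGB image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
contour, center = split_center(image)
```

A model for this task would then take `contour` (and the caption) as input and be trained to predict `center`.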
Given that PixelCNN can generate images with very sharp details and converges quickly during training, my plan is to experiment with this model first. Then, I will attempt to improve it by parallelizing it and by conditioning on the captions as well, as was done very recently by Reed et al. 2017, with really impressive results. Finally, given that GANs seem to be, in the literature, the models that give the most convincing results, I will experiment with them as well if time permits.