What I do

I'm doing research in Machine Learning. More specifically, I am interested in algorithms for learning hierarchical representations of data. So-called "deep architectures" are one class of such models that we study at Yoshua Bengio's LISA lab.

Why are we interested in deep architectures?

I've compiled an annotated reading list on deep architectures. It is not meant to be comprehensive, but I tried to include most of the work that is relevant to the subject.

My work, as revealed by my publications (especially the more recent ones), can be summarized as follows: I am trying to "poke" a variety of deep architectures in many different ways in order to understand how and why they work. I have been especially interested in understanding the effect of unsupervised pre-training. To that end, we have advanced several hypotheses about its regularization and optimization effects (AISTATS'09 and work in progress). We have also put forward the hypothesis that pre-training can be harmful in certain scenarios and that there is a need for more semi-supervised (pre-)training (ICML'07).
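
To make the idea of unsupervised pre-training concrete, here is a minimal NumPy sketch of greedy layer-wise pre-training followed by a supervised stage. It is illustrative only: the tied-weight sigmoid autoencoders, the layer sizes, learning rates and synthetic data are my own assumptions for the sketch, not the setup used in the papers mentioned above.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy data: 500 examples with 20 inputs and a binary label (all assumed).
X = rng.rand(500, 20)
y = (X[:, :10].sum(1) > X[:, 10:].sum(1)).astype(float)

def pretrain_layer(data, n_hidden, epochs=100, lr=0.1):
    """Fit one layer as a tied-weight sigmoid autoencoder (reconstruct its input)."""
    n_in = data.shape[1]
    W = rng.randn(n_in, n_hidden) * 0.1
    b, c = np.zeros(n_hidden), np.zeros(n_in)
    for _ in range(epochs):
        h = sigmoid(data @ W + b)           # encode
        r = sigmoid(h @ W.T + c)            # decode with tied weights
        d_out = (r - data) * r * (1 - r)    # squared-error delta at the output
        d_hid = (d_out @ W) * h * (1 - h)   # backprop into the hidden layer
        dW = data.T @ d_hid + d_out.T @ h   # encoder + decoder contributions
        W -= lr * dW / len(data)
        b -= lr * d_hid.mean(0)
        c -= lr * d_out.mean(0)
    return W, b

# Greedy stage: each layer models the representation produced by the one below.
W1, b1 = pretrain_layer(X, 15)
H1 = sigmoid(X @ W1 + b1)
W2, b2 = pretrain_layer(H1, 10)
H2 = sigmoid(H1 @ W2 + b2)

# Supervised stage: for brevity only a logistic output layer is trained here;
# in practice the whole stack would then be fine-tuned jointly by backprop.
w, b = np.zeros(H2.shape[1]), 0.0
for _ in range(200):
    p = sigmoid(H2 @ w + b)
    w -= 0.5 * H2.T @ (p - y) / len(y)
    b -= 0.5 * (p - y).mean()

print("training accuracy:", ((sigmoid(H2 @ w + b) > 0.5) == y).mean())
```

The point of the sketch is the initialization story: the supervised stage starts from weights shaped by the unlabelled data rather than from random values, which is where the regularization and optimization questions come in.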

I'm also interested in better understanding the solution learned by a deep network: to this end, I've been looking at ways to "visualize" an arbitrary unit of a deep network. Our latest tech report analyzes this problem in depth and concludes that in many cases it is indeed possible to visualize filter-like features for units in the second and third layers.
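
One simple way to do this kind of visualization is gradient ascent in input space on a chosen unit's activation, keeping the input at a fixed norm. The sketch below is only illustrative: the tiny two-layer sigmoid network, its random weights and the hyperparameters are placeholders I made up; in an actual experiment the weights would come from a trained deep network.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in network with random weights (assumed MNIST-sized inputs).
n_in, n_h1, n_h2 = 784, 500, 500
W1, b1 = rng.randn(n_in, n_h1) * 0.01, np.zeros(n_h1)
W2, b2 = rng.randn(n_h1, n_h2) * 0.01, np.zeros(n_h2)

def unit_and_grad(x, unit):
    """Activation of one 2nd-layer unit and its gradient w.r.t. the input."""
    h1 = sigmoid(x @ W1 + b1)
    h2 = sigmoid(h1 @ W2[:, unit] + b2[unit])
    d_a2 = h2 * (1.0 - h2)                       # through the output sigmoid
    d_a1 = d_a2 * W2[:, unit] * h1 * (1.0 - h1)  # through the hidden layer
    return h2, W1 @ d_a1

def visualize(unit, steps=200, lr=1.0):
    """Gradient ascent on the unit's activation, input kept on the unit sphere."""
    x = rng.randn(n_in)
    x /= np.linalg.norm(x)
    for _ in range(steps):
        _, grad = unit_and_grad(x, unit)
        x += lr * grad
        x /= np.linalg.norm(x)      # project back to a fixed norm
    return x                        # the "filter-like" image for this unit

img = visualize(unit=3)
print(img.shape)                    # (784,) -- reshape to 28x28 to inspect
```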

Here's a summary of stuff that's ongoing, planned or simply very optimistic:

The ultimate goal is to write a thesis (hopefully by the end of 2010), and the plan is to wrap these ideas into one coherent story.

It is perhaps instructive to know what I don't do: robots! But I do find that robots are cool.