Hao Wu

PhD Candidate, Khoury College of Computer Sciences
Northeastern University

I am a Ph.D. candidate in the Khoury College of Computer Sciences at Northeastern University, where I am advised by Prof. Jan-Willem van de Meent. Previously, I received an M.Sc. in Computer Science from the University of Virginia and an M.Sc. in Applied Mathematics from the University of Washington.

My main research focus is deep probabilistic modeling and probabilistic programming. I am interested in developing scalable inference methods for deep generative models, with the goal of learning representations that uncover the hierarchical structure of data across a variety of problem domains.

News

May 2021 : Our paper on Conjugate Energy-Based Models will appear at ICML 2021.

Mar 2021 : I am excited to be joining the MIT-IBM Watson AI Lab for an internship in summer 2021, advised by Soumya Ghosh.

Dec 2020 : Two extended abstracts will appear at AABI 2021: one on Conjugate Energy-Based Models and one on Nested Variational Inference.

Research

Conjugate Energy-Based Models
We propose conjugate energy-based models (CEBMs), a class of deep latent-variable models with a tractable posterior. CEBMs have similar use cases to variational autoencoders, in the sense that they learn an unsupervised mapping between data and latent variables. However, these models omit a generator network, which allows them to learn more flexible notions of similarity between data points. Our experiments demonstrate that CEBMs achieve competitive results in image modeling, predictive power of the latent space, and out-of-distribution detection on a variety of datasets. [Read More]
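
To make the conjugacy idea concrete, here is a minimal sketch (in PyTorch, not the paper's released code) of how a tractable posterior can arise when a network maps data to the natural parameters of an exponential-family distribution over latents. The diagonal-Gaussian latent and all names (ConjugateEnergyEncoder, nat1, nat2) are illustrative assumptions, not the paper's exact parameterization.

```python
# Sketch only: a neural network produces natural parameters of a Gaussian over z,
# so p(z | x) is available in closed form without a decoder network.
import torch
import torch.nn as nn

class ConjugateEnergyEncoder(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, hidden=256):
        super().__init__()
        # Neural sufficient statistics of the data.
        self.net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Map features to the two natural parameters of a diagonal Gaussian.
        self.nat1 = nn.Linear(hidden, z_dim)   # eta_1 = mu / sigma^2
        self.nat2 = nn.Linear(hidden, z_dim)   # eta_2 = -1 / (2 sigma^2)

    def posterior(self, x):
        h = self.net(x)
        eta1 = self.nat1(h)
        eta2 = -nn.functional.softplus(self.nat2(h))  # keep eta_2 strictly negative
        # Convert natural parameters back to mean / std in closed form.
        var = -0.5 / eta2
        mu = eta1 * var
        return torch.distributions.Normal(mu, var.sqrt())

# Usage: the closed-form posterior plays the role of a VAE encoder.
model = ConjugateEnergyEncoder()
x = torch.randn(8, 784)
q = model.posterior(x)   # tractable p(z | x) under this sketch's assumptions
z = q.rsample()          # latent representation of x
```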
Amortized Population Gibbs Samplers with Neural Sufficient Statistics
We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frame structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can be used to train highly structured deep generative models in an unsupervised manner, and that they achieve substantial improvements in inference accuracy relative to standard autoencoding variational methods. [Read More]
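
As a rough illustration of the block-wise update scheme, the sketch below (not the paper's implementation) performs one sweep over latent blocks and trains each conditional proposal on self-normalized importance weights, which corresponds to a stochastic estimate of the inclusive KL gradient. The function apg_sweep and the arguments log_joint, proposals, and z_blocks are hypothetical placeholders; the batch dimension stands in for the population of importance samples.

```python
# Sketch only: one block-wise sweep of proposal updates for a single data batch.
import torch

def apg_sweep(x, z_blocks, proposals, log_joint, optimizer):
    """Update each block of latents in turn and train its conditional proposal."""
    for b, q_b in enumerate(proposals):
        dist = q_b(x, z_blocks, b)                 # conditional proposal q(z_b | x, z_{-b})
        z_new = dist.sample()                      # propose a new value for block b
        # Importance weights compare the joint density against the proposal density.
        log_w = log_joint(x, z_blocks, b, z_new) - dist.log_prob(z_new).sum(-1)
        w = torch.softmax(log_w.detach(), dim=0)   # self-normalize over the batch
        # Inclusive-KL-style objective: weighted negative log-density of the proposal.
        loss = -(w * dist.log_prob(z_new).sum(-1)).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        z_blocks[b] = z_new.detach()               # keep the proposed block for the next update
    return z_blocks
```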