Hao Wu

Machine Learning Research Engineer at Verses.AI

Welcome to my website! I am a researcher in Machine Learning, with expertise in Deep Generative Modeling, Amortized Variational Inference, and Probabilistic Programming. Broadly, I am interested in developing machine learning systems that generalize from sparse data, perform data-efficient inference, and uncover structure in data. Check out my Research Statement for details.

I received my Ph.D. in Computer Science from Northeastern University, where I was advised by Prof. Jan-Willem van de Meent. Before that, I earned an M.Sc. in Computer Science from the University of Virginia and an M.Sc. in Applied Mathematics from the University of Washington.

News

Aug 2023 : I successfully defended my PhD! I will be joining Verses.AI as a Machine Learning Research Engineer soon. Looking forward to the next journey!

May 2022 : I am excited to be joining Google Research for an internship, working on variational inference for deep switching dynamical systems!

Selected Publications

Nested Variational Inference
Conference on Neural Information Processing Systems (NeurIPS), 2021
Heiko Zimmermann, Hao Wu, Babak Esmaeili, Sam Stites, Jan-Willem van de Meent
We develop nested variational inference (NVI), a family of methods that learn proposals for nested importance samplers by minimizing a forward or reverse KL divergence at each level of nesting. NVI is applicable to many commonly-used importance sampling strategies and provides a mechanism for learning intermediate densities, which can serve as heuristics to guide the sampler. In our experiments we observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size. [Paper]
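As a rough sketch of the per-level objective (the notation below is illustrative, not the paper's exact formulation): at each level of nesting, a proposal q_k is trained toward an intermediate target \pi_k by minimizing either a forward or a reverse KL divergence.

% Illustrative per-level NVI objective; \pi_k and q_k are assumed names
\[
\mathcal{L}_k^{\text{fwd}} = \mathrm{KL}\big(\pi_k \,\|\, q_k\big)
\qquad \text{or} \qquad
\mathcal{L}_k^{\text{rev}} = \mathrm{KL}\big(q_k \,\|\, \pi_k\big).
\]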
Learning Proposals for Probabilistic Programs with Inference Combinators
Uncertainty in Artificial Intelligence (UAI), 2021
Sam Stites*, Heiko Zimmermann*, Hao Wu, Eli Sennesh, Jan-Willem van de Meent
We develop operators for construction of proposals in probabilistic programs, which we refer to as inference combinators. Inference combinators define a grammar over importance samplers that compose primitive operations such as application of a transition kernel and importance resampling. Proposals in these samplers can be parameterized using neural networks, which in turn can be trained by optimizing variational objectives. The result is a framework for user-programmable variational methods that are correct by construction and can be tailored to specific models. [Paper]
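For intuition, the composed samplers remain properly weighted: each sample (program trace) carries an importance weight equal to the ratio of an unnormalized target density to the proposal density, so expectations under the target can be estimated from weighted samples. A minimal sketch, with illustrative density names \gamma (unnormalized target, normalizer Z) and q (proposal):

% Properly weighted importance sampling over traces \tau (names are illustrative)
\[
w(\tau) = \frac{\gamma(\tau)}{q(\tau)},
\qquad
\mathbb{E}_{q}\big[w(\tau)\, f(\tau)\big] = Z\, \mathbb{E}_{\pi}\big[f(\tau)\big],
\quad \pi = \gamma / Z.
\]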
Conjugate Energy-Based Models
International Conference on Machine Learning (ICML), 2021
Hao Wu*, Babak Esmaeili*, Michael Wick, Jean-Baptiste Tristan, Jan-Willem van de Meent
We propose conjugate energy-based models (CEBMs), a class of deep latent-variable models with a tractable posterior. CEBMs have use cases similar to those of variational autoencoders, in the sense that they learn an unsupervised mapping between data and latent variables. However, these models omit a generator network, which allows them to learn more flexible notions of similarity between data points. Our experiments demonstrate that CEBMs achieve competitive results in terms of image modelling, predictive power of the latent space, and out-of-distribution detection on a variety of datasets. [Paper]
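As a hedged sketch of where the tractable posterior comes from (notation assumed, not the paper's exact parameterization): a neural network maps data to natural parameters that pair with the sufficient statistics of an exponential-family prior, so the posterior over latent variables stays in the same family and is available in closed form.

% Illustrative conjugate energy-based model; \eta_\theta and t are assumed names
\[
p_\theta(x, z) \propto \exp\!\big(\eta_\theta(x)^{\top} t(z)\big)\, p(z),
\qquad
p_\theta(z \mid x) \propto \exp\!\big(\eta_\theta(x)^{\top} t(z)\big)\, p(z).
\]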
Amortized Population Gibbs Samplers with Neural Sufficient Statistics
International Conference on Machine Learning (ICML), 2020
Hao Wu, Heiko Zimmermann, Eli Sennesh, Tuan Anh Le, Jan-Willem van de Meent
We develop amortized population Gibbs (APG) samplers, a class of scalable methods that frame structured variational inference as adaptive importance sampling. APG samplers construct high-dimensional proposals by iterating over updates to lower-dimensional blocks of variables. We train each conditional proposal by minimizing the inclusive KL divergence with respect to the conditional posterior. To appropriately account for the size of the input data, we develop a new parameterization in terms of neural sufficient statistics. Experiments show that APG samplers can be used to train highly-structured deep generative models in an unsupervised manner. [Paper]
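A minimal sketch of the per-block training objective (notation assumed): for a block of latent variables z_b, with the remaining blocks z_{-b} held fixed, the conditional proposal q_\phi is trained by minimizing the inclusive KL divergence to the conditional posterior.

% Illustrative per-block APG objective; z_b, z_{-b}, and q_\phi are assumed names
\[
\mathcal{L}_b(\phi) = \mathbb{E}\Big[\mathrm{KL}\big(p(z_b \mid z_{-b}, x) \,\|\, q_\phi(z_b \mid z_{-b}, x)\big)\Big].
\]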
Structured Disentangled Representations
Artificial Intelligence and Statistics (AISTATS), 2019
Babak Esmaeili, Hao Wu, Sarthak Jain, Alican Bozkurt, N. Siddharth, Brooks Paige, Dana H. Brooks, Jennifer Dy, Jan-Willem van de Meent
Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and, as a result, are not able to reliably disentangle factors of variation. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks. [Paper]
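For intuition, a sketch of the kind of two-level decomposition involved (notation assumed, and assuming a fully factorized prior): grouping the latent variables into blocks z = (z_1, ..., z_G) splits the aggregate KL term into a between-block total correlation, within-block total correlations, and dimension-wise KL terms, which can then be weighted separately.

% Illustrative decomposition of the aggregate KL; block index g and dimension index d are assumed names
\[
\mathrm{KL}\big(q(z)\,\|\,p(z)\big)
= \mathrm{KL}\Big(q(z)\,\Big\|\,\prod_{g} q(z_g)\Big)
+ \sum_{g} \mathrm{KL}\Big(q(z_g)\,\Big\|\,\prod_{d} q(z_{g,d})\Big)
+ \sum_{g,d} \mathrm{KL}\big(q(z_{g,d})\,\|\,p(z_{g,d})\big).
\]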