Statistical Machine Learning Group

Research group

University College London

We are a research group at UCL’s Centre for Artificial Intelligence. Our research expertise is in data-efficient machine learning, probabilistic modeling, and autonomous decision making. Applications focus on robotics, climate science, and sustainable development.

If you are interested in joining the team, please check out our openings.

Meet the Team

Principal Investigators

Marc Deisenroth

DeepMind Chair of Machine Learning and Artificial Intelligence

Machine learning, Gaussian processes, Reinforcement learning, Robotics, Meta learning

Administrators

Alice Winters

Administrator

Research Fellows

So Takao

Senior Research Fellow

Machine learning, Climate science, Fluid mechanics, Geometric mechanics

Yasemin Bekiroğlu

Senior Research Fellow

Machine learning, Robotics

PhD Students

Daniel Ramos Macedo Antunes De Souza

PhD Student

Machine learning, Gaussian processes

Jackie Kay

PhD Student

Machine learning, Robotics, Fairness, Ethical AI, Reinforcement Learning

Jacob Menick

PhD Student

Machine learning, Generative models, Large-scale deep learning, Variational inference, Information theory, Sparsity

Jake Cunningham

PhD Student

Machine learning

Mihaela Rosca

PhD Student

Generative models, Reinforcement learning, Natural language processing, Scalable and safe machine learning

Samuel Cohen

PhD Student

Machine learning, Optimal transport, Gaussian processes

Sicelukwanda Zwane

PhD Student

Machine learning, Robotics, Transfer Learning, Reinforcement Learning

Yicheng Luo

PhD Student

Meta-learning, Probabilistic Programming, Reinforcement Learning, Deep Generative Models

Project Students

Bengt Lofgren

MSc Project Student

Christopher Tan

MEng Project Student

Maria Kapros

MEng Project Student

Rares-Ioan Iordan

MSc Project Student

Ronald MacEachern

MSc Project Student

Sean Nassimiha

MSc Project Student

William Bankes

MSc Project Student

Affiliates

Christina Winkler

PhD Student @ TU Munich

Fabian Paischer

PhD Student @ JKU Linz

K. S. Sesh Kumar

Research Associate

Machine learning, Discrete optimization, Differential privacy, Submodularity

Mathieu Alain

PhD Student

Mirgahney H. Mohamed

PhD Student

Computer vision, Uncertainty estimation

Oscar Key

PhD Student

Probabilistic modeling, Approximate inference, Machine learning, Climate science

Rendani Mbuvha

Lecturer

Alumni

Alexander Terenin

PhD (10/2018-11/2021)

Machine learning, Bayesian theory, Geometric machine learning

Benjamin Chamberlain

PhD (10/2014-08/2018)

Machine learning, Community detection, Representation of graphs, Hyperbolic embeddings

Hugh Salimbeni

PhD (10/2015-10/2019)

Machine learning, Deep probabilistic models, Approximate inference

James Wilson

PhD (10/2017-08/2022)

Machine learning, Gaussian processes, Bayesian optimization, Practical approximate inference

Janith Petangoda

PhD (10/2017-07/2022)

Machine learning, Meta learning, Differential geometry, Reinforcement learning

K. S. Sesh Kumar

Research Associate

Machine learning, Discrete optimization, Differential privacy, Submodularity

Riccardo Moriconi

PhD (10/2016-02/2021)

Machine learning, Gaussian processes, Bayesian optimization

Sanket Kamthe

PhD (10/2016-03/2021)

Machine learning, Reinforcement learning, Optimal control, Copulas

Simon Olofsson

PhD (06/2016-03/2020)

Machine learning, Bayesian optimization, Mechanistic models, Model discrimination

Sophie Ostler

Administrator (08/2021-08/2022)

Steindór Sæmundsson

PhD (11/2016-11/2021)

Machine learning, Gaussian processes, Meta learning, Structural priors, Variational inference

Recent Blog Posts

Iterative State Estimation in Non-linear Dynamical Systems Using Approximate Expectation Propagation

State estimation in nonlinear systems is difficult due to the non-Gaussianity of posterior state distributions. For linear systems, an exact solution is attained by running the Kalman filter/smoother. However, for nonlinear systems, one typically relies on either crude Gaussian approximations by linearising the system (e.

So Takao

Last updated on Jun 29, 2022

Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

Gaussian processes are machine learning models capable of learning unknown functions with uncertainty. Motivated by a desire to deploy Gaussian processes in novel areas of science, we present a new class of Gaussian processes that model random vector fields on Riemannian manifolds that is (1) mathematically sound, (2) constructive enough for use by machine learning practitioners and (3) trainable using standard methods such as inducing points.

So Takao, Alexander Terenin

Last updated on Jun 28, 2022

Symbolic Parallel Adaptive Importance Sampling for Probabilistic Program Analysis

Probabilistic software analysis methods extend classic static analysis techniques to consider the effects of probabilistic uncertainty, whether explicitly embedded within the code – as in probabilistic programs – or externalized in a probabilistic input distribution.

Yicheng Luo

Last updated on Aug 22, 2021

Riemannian Convex Potential Maps

Modeling distributions on Riemannian manifolds is a crucial component in understanding non-Euclidean data that arises, e.g., in physics and geology. We propose a class of flows that uses convex potentials from Riemannian optimal transport.

Samuel Cohen

Last updated on Nov 23, 2021

Discretization Drift in Two-Player Games

In this work, we quantify the discretisation error induced by gradient descent in two-player games, and use that to understand and improve such games, including Generative Adversarial Networks. Two-player games: Many machine learning applications involve not one single model, but two models which get trained jointly.

Mihaela Rosca

Last updated on Jul 29, 2021

Recent News

Dr. Wilson

Dr. James Wilson successfully passed his PhD viva

Marc Deisenroth

Last updated on Aug 2, 2022

Dr. Petangoda

Dr. Janith Petangoda successfully passed his PhD viva

Marc Deisenroth

Last updated on Jul 5, 2022

Paper accepted at TMLR

The Graph Cut Kernel for Ranked Data has been accepted at TMLR

Marc Deisenroth

Last updated on Jun 29, 2022

Paper accepted at TMLR

Paper on Iterative State Estimation in Non-linear Dynamical Systems Using Approximate Expectation Propagation accepted at TMLR

Marc Deisenroth

Last updated on Jun 29, 2022

Paper accepted at JMLR

Paper on Cauchy–Schwarz Regularized Autoencoder accepted at JMLR

Marc Deisenroth

Last updated on Jun 13, 2022

Recent Publications

One-Shot Transfer of Affordance Regions? AffCorrs!

In this work, we tackle one-shot visual search of object parts. Given a single reference image of an object with annotated affordance …

Denis Hadjivelichkov, Sicelukwanda Zwane, Lourdes Agapito, Marc P. Deisenroth, Dimitrios Kanoulas

The Graph Cut Kernel for Ranked Data

Many algorithms for ranked data become computationally intractable as the number of objects grows due to the complex geometric …

Michelangelo Conserva, Marc P. Deisenroth, K. S. Sesh Kumar

Iterative State Estimation in Non-linear Dynamical Systems Using Approximate Expectation Propagation

Bayesian inference in non-linear dynamical systems seeks to find good posterior approximations of a latent state given a sequence of …

Sanket Kamthe, So Takao, Shakir Mohamed, Marc P. Deisenroth

Cauchy-Schwarz Regularized Autoencoder

Recent work in unsupervised learning has focused on efficient inference and learning in latent variables models. Training these models …

Linh Tran, Maja Pantic, Marc P. Deisenroth

Recent & Upcoming Talks

Dmitry Berenson: Learning Where to Trust Unreliable Dynamics Models for Motion Planning and Manipulation

The world outside our labs seldom conforms to the assumptions of our models. This is especially true for dynamics models used in …

Yasemin Bekiroğlu

Last updated on Sep 26, 2022

Dan Roy: Admissibility is Bayes Optimality with Infinitesimals

We give an exact characterization of admissibility in statistical decision problems in terms of Bayes optimality in a so-called …

Marc Deisenroth

Last updated on Aug 30, 2022

Benjamin Chamberlain: A Continuous Perspective on Graph Neural Networks

In this talk I will discuss several recent papers that develop new graph neural networks by considering their relation to continuous …

Marc Deisenroth

Last updated on Jul 7, 2022

Michalis Titsias: Functional Regularisation for Continual Learning with Gaussian Processes

We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of …

Marc Deisenroth

Last updated on Jun 13, 2022

Ollie Hamelijnck: Spatio-Temporal Variational Gaussian Processes

We introduce a scalable approach to Gaussian process inference that combines spatio-temporal filtering with natural gradient …

So Takao

Last updated on Mar 25, 2022

Featured Publications

Sanket Kamthe, So Takao, Shakir Mohamed, Marc P. Deisenroth

2022-06-13 Transactions on Machine Learning Research

Iterative State Estimation in Non-linear Dynamical Systems Using Approximate Expectation Propagation

Bayesian inference in non-linear dynamical systems seeks to find good posterior approximations of a latent state given a sequence of observations. Gaussian filters and smoothers, including the (extended/unscented) Kalman filter/smoother, which are commonly used in engineering applications, yield Gaussian posteriors on the latent state. While they are computationally efficient, they are often criticised for their crude approximation of the posterior state distribution. In this paper, we address this criticism by proposing a message passing scheme for iterative state estimation in non-linear dynamical systems, which yields more informative (Gaussian) posteriors on the latent states. Our message passing scheme is based on expectation propagation (EP). We prove that classical Rauch–Tung–Striebel (RTS) smoothers, such as the extended Kalman smoother (EKS) or the unscented Kalman smoother (UKS), are special cases of our message passing scheme. Running the message passing scheme more than once can lead to significant improvements of the classical RTS smoothers, so that more informative state estimates can be obtained. We address potential convergence issues of EP by generalising our state estimation framework to damped updates and the consideration of general alpha-divergences.

Michael J. Hutchinson, Alexander Terenin, Viacheslav Borovitskiy, So Takao, Yee Whye Teh, Marc P. Deisenroth

2021-12-06 Advances in Neural Information Processing Systems (NeurIPS)

Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

James T. Wilson, Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, Marc P. Deisenroth

2021-06-04 Journal of Machine Learning Research

Pathwise Conditioning of Gaussian Processes

As Gaussian processes are used to answer increasingly complex questions, analytic solutions become scarcer and scarcer. Monte Carlo methods act as a convenient bridge for connecting intractable mathematical expressions with actionable estimates via sampling. Conventional approaches for simulating Gaussian process posteriors view samples as draws from marginal distributions of process values at finite sets of input locations. This distribution-centric characterization leads to generative strategies that scale cubically in the size of the desired random vector. These methods are prohibitively expensive in cases where we would, ideally, like to draw high-dimensional vectors or even continuous sample paths. In this work, we investigate a different line of reasoning: rather than focusing on distributions, we articulate Gaussian conditionals at the level of random variables. We show how this pathwise interpretation of conditioning gives rise to a general family of approximations that lend themselves to efficiently sampling Gaussian process posteriors. Starting from first principles, we derive these methods and analyze the approximation errors they introduce. We, then, ground these results by exploring the practical implications of pathwise conditioning in various applied settings, such as global optimization and reinforcement learning.
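
For readers unfamiliar with the idea, the pathwise view of conditioning mentioned in the abstract can be written down compactly. The block below is a sketch of the standard identity (often called Matheron's rule) and its Gaussian process form; the symbols $X$, $y$, $\sigma^2$, $\varepsilon$ and $K$ are notation assumed here for illustration and are not taken from this page.

```latex
% Matheron's rule for jointly Gaussian random variables a and b:
% conditioning is expressed as an update applied to a joint sample (a, b).
\[
  (a \mid b = \beta) \overset{d}{=} a + \operatorname{Cov}(a, b)\,\operatorname{Cov}(b, b)^{-1}(\beta - b)
\]

% Applied to a GP prior f with noisy observations
% y = f(X) + \varepsilon, \varepsilon \sim \mathcal{N}(0, \sigma^2 I):
\[
  (f \mid y)(\cdot) \overset{d}{=} f(\cdot) + K_{(\cdot) X}\,(K_{XX} + \sigma^2 I)^{-1}\,\bigl(y - f(X) - \varepsilon\bigr)
\]
```

Read this way, a posterior sample is a prior sample plus a data-dependent correction, which is what allows the prior draw to be approximated separately from the update term.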

Samuel Cohen, Giulia Luise, Alexander Terenin, Brandon Amos, Marc Deisenroth

2021-04-13 Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)

Aligning Time Series on Incomparable Spaces

Dynamic time warping (DTW) is a useful method for aligning, comparing and combining time series, but it requires them to live in comparable spaces. In this work, we consider a setting in which time series live on different spaces without a sensible ground metric, causing DTW to become ill-defined. To alleviate this, we propose Gromov dynamic time warping (GDTW), a distance between time series on potentially incomparable spaces that avoids the comparability requirement by instead considering intra-relational geometry. We demonstrate its effectiveness at aligning, combining and comparing time series living on incomparable spaces. We further propose a smoothed version of GDTW as a differentiable loss and assess its properties in a variety of settings, including barycentric averaging, generative modeling and imitation learning.
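
As background for the abstract above, here is a minimal sketch of classical DTW, the baseline that GDTW generalises; it is not the method proposed in the paper. The function name `dtw_distance` and the default absolute-difference cost are assumptions chosen for illustration. Note that the cost function compares a point of one series directly with a point of the other, which is exactly the comparability requirement the abstract refers to.

```python
import math


def dtw_distance(x, y, cost=lambda a, b: abs(a - b)):
    """Classical dynamic time warping cost between two sequences.

    `cost` is a ground metric between one element of `x` and one element
    of `y` (assumed here to be the absolute difference of scalars).
    """
    n, m = len(x), len(y)
    # acc[i][j] holds the minimal accumulated cost of aligning x[:i] with y[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(x[i - 1], y[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            acc[i][j] = step + min(
                acc[i - 1][j],      # x advances, y[j-1] is matched again
                acc[i][j - 1],      # y advances, x[i-1] is matched again
                acc[i - 1][j - 1],  # both advance
            )
    return acc[n][m]


if __name__ == "__main__":
    # The repeated 2 in the second series is absorbed by the warping.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The dynamic program runs in time proportional to the product of the two sequence lengths. When the two series live in different spaces, no sensible `cost` of this form is available, which is the setting the paper addresses.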

Andreas Hochlehnert, Alexander Terenin, Steindór Sæmundsson, Marc Deisenroth

2021-04-13 Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)

Learning Contact Dynamics using Physically Structured Neural Networks

Learning physically structured representations of dynamical systems that include contact between different objects is an important problem for deep learning based approaches in robotics. Black-box neural networks can learn to approximately represent discontinuous dynamics, but typically require impractical quantities of data, and often suffer from pathological behaviour when forecasting for longer time horizons. In this work, we use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects. We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations in settings which are traditionally difficult for black-box approaches and recent physics inspired neural networks. Our results indicate that an idealised form of touch feedback—which is heavily relied upon by biological systems—is a key component of making this learning problem tractable. Together with the inductive biases introduced through the network architectures, our techniques enable accurate learning of contact dynamics from physical data.
