My research aims to build a unified algorithmic framework in which a robot efficiently infers the optimal value function from a bounded set of interactions with both humans and the environment. It ties together insights from motion planning and imitation learning, applying them to robots deployed in the wild.


2017 - 2019


Blending MPC & Value Function Approximation

We present a framework for improving on model predictive control (MPC) with model-free reinforcement learning (RL). The key insight is to view MPC as constructing a series of local Q-function approximations. By appropriately blending these Q-function approximations over time, we can systematically trade off model errors against learned value errors.
Papers: ICLR'21
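
To give a flavor of the blending idea, here is a minimal sketch of one way local Q-function estimates of different horizons can be combined with a learned value function, loosely in the spirit of TD(λ). The function name, the exponential weighting, and the inputs are illustrative assumptions, not the exact estimator from the ICLR'21 paper.

```python
import numpy as np

def blended_q_estimate(rollout_rewards, rollout_values, lam=0.9, gamma=0.99):
    """Blend k-step model-based returns with a learned value function (sketch).

    rollout_rewards: model-predicted rewards r_1..r_H along the MPC plan
    rollout_values:  learned value estimates V(s_1)..V(s_H) along the same rollout
    """
    H = len(rollout_rewards)
    k_step_estimates = []
    running_return = 0.0
    for k in range(H):
        # k-step Q estimate: model-predicted rewards up to step k,
        # bootstrapped with the learned value at the resulting state.
        running_return += (gamma ** k) * rollout_rewards[k]
        k_step_estimates.append(running_return + (gamma ** (k + 1)) * rollout_values[k])

    # Exponential weighting over horizons: small lam leans on the learned
    # value (short rollouts), lam near 1 leans on the model (long rollouts).
    weights = np.array([(1.0 - lam) * lam ** k for k in range(H)])
    weights[-1] = lam ** (H - 1)  # remaining mass on the full-horizon estimate
    return float(np.dot(weights, k_step_estimates))
```

The weights sum to one, so the estimate interpolates between trusting the learned value function (small λ) and trusting the model rollout (λ close to 1).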

2017 - 2019


Bayesian Reinforcement Learning

Addressing uncertainty is critical for autonomous systems to robustly adapt to the real world. We formulate the problem of model uncertainty as a Bayes-Adaptive Markov Decision Process (BAMDP), where an agent maintains a posterior distribution over latent model parameters given a history of observations and maximizes its expected long-term reward with respect to this belief distribution. We propose algorithms to solve continuous BAMDPs efficiently.
Papers: ICLR'19, arXiv'18, arXiv'20
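
As a toy illustration of the belief-maintenance part of this formulation, the sketch below keeps a posterior over a discrete set of candidate transition models and picks actions by posterior-weighted Q values. This is only a certainty-equivalent illustration with hypothetical names; the papers address continuous BAMDPs and policies that plan over future beliefs.

```python
import numpy as np

def update_belief(belief, models, s, a, s_next):
    """Bayes rule over candidate models after observing one transition.

    belief: array of prior probabilities, one per candidate model
    models: list of callables m(s_next, s, a) -> transition probability
    """
    likelihood = np.array([m(s_next, s, a) for m in models])
    posterior = belief * likelihood
    return posterior / posterior.sum()

def expected_q_action(belief, q_per_model, s, actions):
    """Certainty-equivalent action choice: maximize the posterior-weighted Q.

    A true BAMDP policy would instead plan over future belief states,
    capturing the value of information from exploratory actions.
    """
    expected_q = {a: sum(b * q[(s, a)] for b, q in zip(belief, q_per_model))
                  for a in actions}
    return max(expected_q, key=expected_q.get)
```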

2016 - 2017


Bayesian Traveler's Problem

Consider a traveler on a graph who must reach a goal (or cover a set of goals) but does not know which edges are traversable. Traversability is revealed only when the traveler attempts an edge (or visits an adjacent vertex). Given a prior over edge traversability, how should the traveler move to minimize expected travel time? Many real robotics applications are instances of this problem, e.g., manipulation under occlusion.
Papers: ISRR'19, Video
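
To make the setup concrete, here is a minimal sketch of the classic optimistic-replanning baseline on such a graph: plan as if every unknown edge is free, and replan whenever an attempted edge turns out to be blocked. The graph API is networkx, `p_free` is a hypothetical per-edge prior used only to sample the ground truth, and the sketch assumes the goal stays reachable; this is a baseline for the problem, not the approach proposed in the ISRR'19 paper.

```python
import random
import networkx as nx

def optimistic_replan(G, start, goal, p_free, rng=random.Random(0)):
    """Walk toward the goal assuming unknown edges are free; replan on blockage.

    G:      weighted nx.Graph whose edges may or may not be traversable
    p_free: prior probability that each edge is traversable, keyed by edge
    """
    status = {}                                   # revealed traversability per edge
    current, cost = start, 0.0
    while current != goal:
        # Plan optimistically through edges not yet known to be blocked.
        path = nx.shortest_path(G, current, goal, weight='weight')
        u, v = path[0], path[1]
        edge = (u, v) if (u, v) in p_free else (v, u)
        if edge not in status:                    # traversability revealed on arrival
            status[edge] = rng.random() < p_free[edge]
        if status[edge]:
            cost += G[u][v]['weight']
            current = v
        else:
            G.remove_edge(u, v)                   # blocked: drop the edge and replan
    return cost
```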