My research spans topics relating to motion planning and machine learning for robots solving complex tasks.
| 2018 - 2019 |
| f-Divergence Minimization |
We view imitation learning as minimizing the divergence between the learner's and the expert's state-action distributions. We propose a general framework for estimating and minimizing any f-divergence. By plugging in different divergences, we recover existing algorithms such as Behavior Cloning (Forward KL), GAIL (Jensen-Shannon), and DAgger (Total Variation). Moreover, we motivate cases where Reverse KL matters and derive new algorithms for minimizing it.
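To make the "plug in a different divergence" idea concrete, here is a minimal sketch for discrete distributions. It assumes the convention D_f(p || q) = sum_x q(x) f(p(x) / q(x)), with p standing in for the expert's distribution and q for the learner's. The function names and the toy numbers are illustrative assumptions, not taken from the work itself, and the sketch only evaluates divergences; it does not estimate or minimize them from samples.

```python
import math


def f_divergence(p, q, f):
    """D_f(p || q) = sum_x q(x) * f(p(x) / q(x)); assumes q(x) > 0 everywhere."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))


def forward_kl(u):
    # f(u) = u log u gives KL(p || q); the value at u = 0 is the limit 0.
    return u * math.log(u) if u > 0 else 0.0


def reverse_kl(u):
    # f(u) = -log u gives KL(q || p); requires u > 0.
    return -math.log(u)


def jensen_shannon(u):
    # f(u) = 0.5 * (u log u - (u + 1) log((u + 1) / 2)) gives the JS divergence.
    return 0.5 * (forward_kl(u) - (u + 1) * math.log((u + 1) / 2))


def total_variation(u):
    # f(u) = 0.5 * |u - 1| gives the total variation distance.
    return 0.5 * abs(u - 1)


# Toy state-action distributions (hypothetical numbers).
expert = [0.5, 0.5]
learner = [0.9, 0.1]

for name, f in [
    ("forward KL", forward_kl),
    ("reverse KL", reverse_kl),
    ("Jensen-Shannon", jensen_shannon),
    ("total variation", total_variation),
]:
    print(name, f_divergence(expert, learner, f))
```

Only the generator function `f` changes between the four cases, which is the sense in which a single framework can cover several objectives.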
| 2018 - Present |
| Human-Centric Imitation Learning |
Typical imitation learning algorithms rely on either interactive feedback or kinesthetic demonstrations, both of which are expensive, repetitive, and often unnatural for an expert to provide. Can we learn from less burdensome expert inputs such as interventions, corrections or hints? We formalize these problems and provide algorithms that learn the correct behavior even with such minimal interaction.
| 2016 - 2017 |
| Imitation of Clairvoyant Oracles |
We look at POMDP problems where the latent space is large, e.g., the space of all possible maps. Directly computing optimal policies for all possible beliefs is intractable. However, during training, we can be clairvoyant, i.e., we know the ground truth MDP and can compute optimal plans. We show how to properly imitate such clairvoyant oracles to get good, and sometimes near-optimal, POMDP policies.
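As a toy illustration of why imitating a clairvoyant oracle is not the same as copying its action in any single world, the sketch below averages the oracle's action values over a belief and picks the best action under that average. This is an assumed simplification for intuition, not the algorithm from the work: the helper names, the two "maps", and the action values are all hypothetical.

```python
def belief_averaged_values(belief, oracle_values):
    """Average the oracle's per-world action values under the belief.

    belief: dict mapping world name -> probability.
    oracle_values: dict mapping world name -> list of action values that the
        clairvoyant oracle assigns when it knows that world is the true one.
    """
    n_actions = len(next(iter(oracle_values.values())))
    return [
        sum(belief[w] * oracle_values[w][a] for w in belief)
        for a in range(n_actions)
    ]


def imitation_action(belief, oracle_values):
    """Pick the action with the highest belief-averaged oracle value."""
    avg = belief_averaged_values(belief, oracle_values)
    return max(range(len(avg)), key=lambda a: avg[a])


# Hypothetical values: action 0 is best in map_a, action 1 is best in map_b,
# and action 2 is a reasonable hedge in both.
oracle_values = {
    "map_a": [1.0, 0.0, 0.5],
    "map_b": [0.0, 1.0, 0.6],
}

confident = {"map_a": 0.9, "map_b": 0.1}
uncertain = {"map_a": 0.5, "map_b": 0.5}

print(imitation_action(confident, oracle_values))
print(imitation_action(uncertain, oracle_values))
```

Under the uncertain belief the hedging action wins even though the oracle would never choose it in either world, which is one reason naive imitation of a clairvoyant expert needs care.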