BPN915: Control of Microrobots with Tiny Sequence Models


Designing comprehensive task schedulers and specialized low-level controllers for robots in complex environments often requires detailed knowledge of the system and its environment, which can lead to long design times. For a microrobot, real-life experiments to develop, tune, and validate such controllers can be costly. To address these issues, we propose distilling computation-intensive expert policies (which use Model-Based Reinforcement Learning) into small sequence models trained auto-regressively on model predictive control (MPC) trajectories. Previously, we modeled the long-term dynamics of robots and distilled an MPC policy into a context-unaware neural network policy. The controller that emerged from that simulated environment could adapt to terrain while tracking input commands, using only motor input and a two-axis accelerometer, both on a flat surface and over a step obstacle. Through domain randomization, the controller was successfully deployed on a scaled-up robot platform. We now improve on the complex-terrain adaptability and obstacle-avoidance capability of our previous controller by giving the new policy mechanisms to represent a recurrent internal state. Hyperparameter sweeps of existing offline reinforcement learning methods on OpenAI Gym’s HalfCheetah show that notably smaller transformers, without design changes, achieve performance similar to that of large models. Current tests of the in-context learning capability of transformers on stateful function classes (Kalman filters) suggest promise for a scaled-up control problem. We also explore how effectively inference-speed optimizations within the transformer architecture, and sequence models other than transformers, can improve sequential control. Preliminary results suggest that replacing select operations with hardware-aware analogues does not greatly affect accuracy in controlling simple stateful signals.
We aim to develop an end-to-end controller, from sensing to control, for microrobots (hexapod, ionocraft, jumper) that accomplishes tasks such as walking around obstacles using on-board processing on the Single Chip microMote (SCuM, BPN803).
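The distillation idea described above can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration and is not taken from the project: the plant is a toy double integrator, the expert is a finite-horizon LQR (standing in for an MPC expert), and the student is a linear map over a short window of past positions and actions fitted by least squares (standing in for a small sequence model). The student never sees velocity, so it has to recover it from its context window.

```python
import numpy as np

# Hypothetical toy plant: discrete-time double integrator (position, velocity).
DT = 0.1
A = np.array([[1.0, DT], [0.0, 1.0]])
B = np.array([[0.5 * DT**2], [DT]])
Q = np.diag([1.0, 0.1])   # state cost (assumed)
R = np.array([[0.01]])    # control cost (assumed)
CONTEXT = 4               # student sees the last 4 positions and 3 actions


def expert_gain(horizon=20):
    """First-step gain of a finite-horizon LQR (unconstrained linear MPC)."""
    P = Q.copy()
    K = np.zeros((1, 2))
    for _ in range(horizon):  # backward Riccati recursion
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


def features(positions, actions, t):
    """Context window at step t: positions t-CONTEXT+1..t, actions up to t-1.
    Steps before t=0 are padded as 'at rest': first position, zero action."""
    steps = range(t - CONTEXT + 1, t + 1)
    pos = [positions[max(i, 0)] for i in steps]
    act = [actions[i] if i >= 0 else 0.0 for i in list(steps)[:-1]]
    return np.array(pos + act)


def collect(K, rng, episodes=50, steps=60, noise=0.5):
    """Roll out the expert with injected action noise; label with clean actions."""
    X, y = [], []
    for _ in range(episodes):
        x = np.array([rng.uniform(-1.0, 1.0), 0.0])  # start at rest
        positions, actions = [], []
        for t in range(steps):
            positions.append(x[0])
            u_star = float(-(K @ x)[0])              # expert label (uses full state)
            X.append(features(positions, actions, t))
            y.append(u_star)
            u = u_star + noise * rng.standard_normal()  # executed, noisy action
            actions.append(u)
            x = A @ x + B[:, 0] * u
    return np.array(X), np.array(y)


def run_student(w, p0, steps=60):
    """Closed-loop rollout where the student only observes position."""
    x = np.array([p0, 0.0])
    positions, actions = [], []
    for t in range(steps):
        positions.append(x[0])
        u = float(features(positions, actions, t) @ w)
        actions.append(u)
        x = A @ x + B[:, 0] * u
    return x


rng = np.random.default_rng(0)
K = expert_gain()
X, y = collect(K, rng)
# The window features are linearly dependent (deterministic dynamics), so an
# explicit rcond makes lstsq return the minimum-norm exact fit.
w, *_ = np.linalg.lstsq(X, y, rcond=1e-8)
train_mse = float(np.mean((X @ w - y) ** 2))
final_state = run_student(w, p0=1.0)
```

Printing `train_mse` and `final_state` shows how closely the student imitates the expert and whether it still drives the toy plant to the origin from position alone. The action noise during data collection is a design choice of this sketch: it widens the set of states the student is trained on while the labels stay clean expert actions.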

Research currently funded by: Member Fees

Kesava Viswanadha
Nelson Lojo
Derrick Han Sun
Aviral Mishra
Rushil Desai
Zhongyu Li
Publication date: February 13, 2024
Publication type: BSAC Project Materials (Current)
PREPUBLICATION DATA - ©University of California 2024
