BPN915: Control of Microrobots with Reinforcement Learning

Abstract: 

Developing task schedulers and low-level end-to-end controllers for microrobots operating in complex environments often demands extensive knowledge of the system and its environment, leading to prolonged design cycles for specialized controllers. To speed up the generation of general controllers without requiring domain-specific expertise, we propose using model-based reinforcement learning (MBRL) trained in simulated environments. Our research advances microrobot control through two key approaches: modeling the long-term dynamics of robots, and distilling computationally intensive model predictive control (MPC) into reactive neural network policies. Accurate long-term control is achieved by reframing proprietary state-action data, thereby minimizing online computation and replanning requirements. MPC distillation involves extensive offline planning and prioritizes states that contribute to goal attainment, rather than weighting all past actions uniformly, erroneous ones included. Previous results demonstrate that controllers trained in simulated environments adapt to varying terrains while tracking input commands, on both flat surfaces and step-response obstacles, using only motor input and a two-axis accelerometer. The controller has been successfully deployed on a scaled-up robot platform via domain randomization. Real-time implementation on microrobots, specifically the Single Chip microMote (SCuM, BPN803), requires a smaller, more efficient controller with reduced computational and memory demands. We have trained a separate transformer model on the controller data, resulting in a network that uses 60% less memory while retaining 80% of the original network’s accuracy. Our current efforts focus on verifying the controller for quadrupedal microrobots through both sim-to-sim and sim-to-real approaches, using the scaled-up robot platform. These efforts include tasks such as walking with different gaits, balancing over varied terrain, and recovering from disturbances. Our ultimate goal is to develop an end-to-end controller capable of sensing and controlling quadrupedal microrobots to accomplish tasks using onboard processing capabilities.
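
As an illustration only, the MPC distillation idea described in the abstract can be sketched on a toy problem. Everything in the sketch is an assumption made for the example and is not taken from the project: the 1-D integrator dynamics, the random-shooting planner, the linear policy, and all names (GOAL, step, mpc_action, collect, fit_linear_policy). It shows an expensive planner run offline, each recorded action weighted by the progress it made toward the goal, and a cheap reactive policy fitted to the weighted data.

```python
import random

GOAL = 0.0  # hypothetical target position for the toy task


def step(x, u):
    """Toy 1-D integrator dynamics x' = x + u (an assumption, not a robot model)."""
    return x + u


def mpc_action(x, rng, horizon=5, samples=64):
    """Random-shooting MPC: return the first action of the cheapest sampled plan."""
    best_cost, best_u = float("inf"), 0.0
    for _ in range(samples):
        plan = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        xi, cost = x, 0.0
        for u in plan:
            xi = step(xi, u)
            cost += (xi - GOAL) ** 2
        if cost < best_cost:
            best_cost, best_u = cost, plan[0]
    return best_u


def collect(rng, episodes=40, steps=8):
    """Run the planner offline and weight each sample by its progress to the goal."""
    data = []
    for _ in range(episodes):
        x = rng.uniform(-3.0, 3.0)
        for _ in range(steps):
            u = mpc_action(x, rng)
            x_next = step(x, u)
            progress = abs(x - GOAL) - abs(x_next - GOAL)
            weight = max(progress, 0.0)  # actions that moved away get zero weight
            data.append((x, u, weight))
            x = x_next
    return data


def fit_linear_policy(data):
    """Weighted least squares for the gain k in u = k * x (closed form, no bias)."""
    num = sum(w * x * u for x, u, w in data)
    den = sum(w * x * x for x, u, w in data)
    return num / den


rng = random.Random(0)  # fixed seed so the sketch is repeatable
dataset = collect(rng)
k = fit_linear_policy(dataset)


def policy(x):
    """Distilled reactive policy: one multiply and a clamp, no online planning."""
    return max(-1.0, min(1.0, k * x))
```

At run time the distilled policy replaces the sampling loop with a single multiplication, which is the kind of saving that matters on a memory-constrained platform. A neural network policy would be fitted the same way, using the progress weights as per-sample loss weights.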

Research currently funded by: Member Fees

Project ended: 03/01/2025

Author: 
Kesava Viswanadha
Zhongyu Li
Emily Tan
Nelson Lojo
Derrick Han Sun
Aviral Mishra
Rushil Desai
Publication date: 
March 1, 2025
Publication type: 
BSAC Project Materials (Final/Archive)
Citation: 
PREPUBLICATION DATA - ©University of California 2025
