University of California, Los Angeles
February 2026
What are the benefits of hierarchies in offline RL?
Goal-conditioned hierarchies achieve state-of-the-art performance on long-horizon tasks, but they come with substantial design complexity and typically require a generative model over the subgoal space, which is expensive to train. We do a deep dive into a state-of-the-art hierarchical method for offline goal-conditioned RL and identify a simple yet key reason for its success: it's just easier to train policies for short-horizon goals!
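
To make the decomposition concrete, here is a minimal, hypothetical sketch of a two-level goal-conditioned hierarchy: a high-level policy proposes a nearby subgoal, and the low-level policy only has to reach that subgoal rather than the distant final goal. The class names, the subgoal horizon `k`, and the placeholder heuristics are illustrative assumptions standing in for learned components, not the actual method discussed in this post.

```python
import numpy as np

class HighLevelPolicy:
    """Maps (state, final_goal) -> a subgoal roughly k steps away."""
    def __init__(self, k: int = 25):
        self.k = k  # subgoal horizon (hypothetical choice)

    def subgoal(self, state: np.ndarray, goal: np.ndarray) -> np.ndarray:
        # Placeholder: interpolate toward the goal instead of sampling from
        # a learned generative model over the subgoal space.
        return state + (goal - state) * min(1.0, self.k / 100.0)

class LowLevelPolicy:
    """Maps (state, subgoal) -> action; only needs to solve a short horizon."""
    def act(self, state: np.ndarray, subgoal: np.ndarray) -> np.ndarray:
        # Placeholder: greedy unit step toward the subgoal.
        direction = subgoal - state
        norm = np.linalg.norm(direction)
        return direction / norm if norm > 0 else np.zeros_like(direction)

def hierarchical_act(state, goal, high: HighLevelPolicy, low: LowLevelPolicy):
    """Compose the two levels: pick a nearby subgoal, then act toward it."""
    sg = high.subgoal(state, goal)
    return low.act(state, sg)

if __name__ == "__main__":
    high, low = HighLevelPolicy(k=25), LowLevelPolicy()
    state, goal = np.zeros(2), np.array([10.0, 0.0])
    print(hierarchical_act(state, goal, high, low))
```

The point of the structure is that the low-level policy is always conditioned on goals that are only a few steps away, which is exactly the "short-horizon goals are easier to learn" effect we dig into below.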