arxiv:2603.02765

Next Embedding Prediction Makes World Models Stronger

Published on Mar 3 · Submitted by Natyren on Mar 4 · T-Tech


Authors: George Bredis, Nikita Balagansky, Daniil Gavrilov, Ruslan Rakhimov

Abstract

AI-generated summary: NE-Dreamer uses a temporal transformer to predict next-step encoder embeddings for model-based reinforcement learning without requiring decoders or auxiliary supervision.

Capturing temporal dependencies is critical for model-based reinforcement learning (MBRL) in partially observable, high-dimensional domains. We introduce NE-Dreamer, a decoder-free MBRL agent that leverages a temporal transformer to predict next-step encoder embeddings from latent state sequences, directly optimizing temporal predictive alignment in representation space. This approach enables NE-Dreamer to learn coherent, predictive state representations without reconstruction losses or auxiliary supervision. On the DeepMind Control Suite, NE-Dreamer matches or exceeds the performance of DreamerV3 and leading decoder-free agents. On a challenging subset of DMLab tasks involving memory and spatial reasoning, NE-Dreamer achieves substantial gains. These results establish next-embedding prediction with temporal transformers as an effective, scalable framework for MBRL in complex, partially observable environments.


Community

GeorgeBredis (paper author, paper submitter) · about 22 hours ago

Most world models learn representations by reconstructing pixels. But reconstruction isn’t necessarily aligned with control.

In this paper we explore a different idea:
➡️ predict the next encoder embedding instead of reconstructing the observation.

Using a next-embedding prediction objective and a temporal transformer over latents, NE-Dreamer learns temporally predictive latent states and significantly improves performance on hard navigation tasks.
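To make the objective concrete, here is a minimal sketch of next-embedding prediction, not the paper's implementation. It assumes a single-head causal self-attention with identity projections as a stand-in for the temporal transformer, and a cosine-distance loss as a stand-in for whatever alignment loss NE-Dreamer actually uses (this page does not specify it). The names `causal_attention`, `cosine_distance`, and `next_embedding_loss` are hypothetical, and the toy vectors are made up for illustration.

```python
import math


def causal_attention(latents):
    """Single-head causal self-attention with identity projections.

    Stand-in for a temporal transformer (assumption): the output at
    step t depends only on latents at steps 0..t.
    """
    dim = len(latents[0])
    outputs = []
    for t, query in enumerate(latents):
        # Scaled dot-product scores against the current and earlier steps only.
        scores = [
            sum(q * k for q, k in zip(query, latents[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(x - peak) for x in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * latents[s][i] for s, w in enumerate(weights))
            for i in range(dim)
        ])
    return outputs


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def next_embedding_loss(predictions, embeddings):
    """Mean distance between the prediction at step t and the encoder
    embedding at step t + 1. The last prediction has no target."""
    pairs = list(zip(predictions[:-1], embeddings[1:]))
    return sum(cosine_distance(p, e) for p, e in pairs) / len(pairs)


# Toy sequence: three latent states and three encoder embeddings (made up).
latents = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

predictions = causal_attention(latents)
loss = next_embedding_loss(predictions, embeddings)
print(f"next-embedding loss: {loss:.4f}")
```

The key detail is the one-step shift in `next_embedding_loss`: no pixels are reconstructed, and the only training signal is how well each causal prediction lines up with the following step's encoder embedding.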