Apache-2.0 · Bots

RLinf

Production-grade reinforcement learning infrastructure for embodied and agentic AI.

3.2K stars · 0.4K forks · Apache-2.0 license · verified 2026-06-04
$ pip install rlinf
Overview

What is RLinf?

RLinf is a flexible and scalable open-source RL infrastructure designed for Embodied and Agentic AI. It supports real-world robot RL on Franka, XSquare Turtle2, and DOS-W1 arms, multiple simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa), and state-of-the-art VLA model fine-tuning (Pi0, Pi0.5, GR00T, OpenVLA). It also extends to agentic AI with support for Search-R1, rStar2, and multi-agent RL.

Unified RL across simulation and real hardware

RLinf supports 10+ simulation backends (ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, Calvin, etc.) and real-world robots (Franka, XSquare Turtle2, DOS-W1) with the same API.

You can prototype in simulation and deploy on real hardware without rewriting your RL pipeline.
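
The idea of one pipeline for both simulation and hardware can be illustrated with a small sketch. Nothing below is RLinf's actual API: `Env`, `ToySimEnv`, `ToyRobotEnv`, and `rollout` are hypothetical names, and the only assumption is that both backends expose the same `reset`/`step` interface.

```python
from typing import Callable, Protocol, Tuple


class Env(Protocol):
    """Hypothetical interface shared by simulated and real backends."""

    def reset(self) -> float: ...
    def step(self, action: float) -> Tuple[float, float, bool]: ...


class ToySimEnv:
    """Stand-in for a simulator: the state moves by the action; goal is state >= 1."""

    def __init__(self) -> None:
        self.state = 0.0

    def reset(self) -> float:
        self.state = 0.0
        return self.state

    def step(self, action: float) -> Tuple[float, float, bool]:
        self.state += action
        done = self.state >= 1.0
        return self.state, (1.0 if done else 0.0), done


class ToyRobotEnv(ToySimEnv):
    """Stand-in for hardware: same interface, but actions are clipped to a safe range."""

    def step(self, action: float) -> Tuple[float, float, bool]:
        return super().step(max(-0.1, min(0.1, action)))


def rollout(env: Env, policy: Callable[[float], float], max_steps: int = 50) -> float:
    """Backend-agnostic rollout: the same code runs on either environment."""
    obs, total = env.reset(), 0.0
    for _ in range(max_steps):
        obs, reward, done = env.step(policy(obs))
        total += reward
        if done:
            break
    return total
```

Calling `rollout(ToySimEnv(), policy)` and `rollout(ToyRobotEnv(), policy)` exercises the same loop; only the environment object changes, which is the property the text describes.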

State-of-the-art VLA RL fine-tuning

Fine-tune Pi0, Pi0.5, GR00T, OpenVLA, LingBot-VLA and other VLA models using RL algorithms like GRPO, PPO, and DAPO.

VLA models are typically trained with imitation learning only. RLinf enables RL-based post-training that can surpass demonstration quality.
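
As a rough illustration of one of the algorithms named above, GRPO scores each sampled rollout relative to the other rollouts in its group instead of using a learned value function. The sketch below shows only that normalization step in plain Python. `group_relative_advantages` is a hypothetical helper, not RLinf code, and the use of the population standard deviation plus a small `eps` is an assumption; implementations differ on these details.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    # eps keeps the division finite when every rollout got the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of the same task: two succeeded, two failed.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Successful rollouts receive positive advantages and failed ones negative, so the policy update pushes probability toward the behaviors that beat the group average.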

Real-world online RL with HG-DAgger

Human-Gated DAgger (HG-DAgger) enables safer online RL on real robots: a human supervisor decides when the policy's actions are executed and when human corrections take over.

Online RL on real hardware is dangerous without safety mechanisms. HG-DAgger provides a practical bridge between human demonstrations and autonomous RL.
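
The gating described above can be sketched as a small wrapper around the policy. This is a minimal illustration under stated assumptions, not RLinf's implementation: `GatedCollector` and `human_override` are hypothetical names, observations and actions are reduced to single floats, and the human is modeled as a callback that returns a corrective action or `None`.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class GatedCollector:
    """Hypothetical helper: executes policy actions unless a human overrides them."""

    policy: Callable[[float], float]
    # Returns a corrective action when the human takes over, else None.
    human_override: Callable[[float, float], Optional[float]]
    autonomous: List[Tuple[float, float]] = field(default_factory=list)
    corrections: List[Tuple[float, float]] = field(default_factory=list)

    def act(self, obs: float) -> float:
        proposed = self.policy(obs)
        correction = self.human_override(obs, proposed)
        if correction is None:
            # The human did not intervene: run the policy's own action.
            self.autonomous.append((obs, proposed))
            return proposed
        # The human's action is executed and stored as a supervised label.
        self.corrections.append((obs, correction))
        return correction
```

Keeping autonomous steps and corrections in separate buffers lets a training loop weight human-corrected data differently from the policy's own experience.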

Agentic AI support

Extends beyond robotics to support RL for language agents — Search-R1, rStar2, coding agents, and multi-agent systems.

RLinf is one of the few frameworks that bridges embodied RL and agentic RL in a single codebase.
Install

One command to start

$ pip install rlinf
Use cases

What teams use it for

RL-based post-training for VLA policies

After collecting demonstration data and training a VLA policy with imitation learning, use RLinf to fine-tune the policy with RL for higher success rates.

Real-world robot learning with human safety oversight

Deploy RLinf on a Franka arm with HG-DAgger for safe online learning — the human intervenes when the policy makes unsafe moves, and the system learns from both successes and corrections.

Multi-agent embodied RL research

Use RLinf's multi-agent support to study coordination between multiple robots performing collaborative tasks in simulation.

Ecosystem

Tags & capabilities

bot, open source, robotics, messaging
Comparison

How it stacks up

Choose RLinf for production RL across robots and agents

vs specialized RL libraries

Stable-Baselines3 is simpler for standard RL benchmarks but lacks robot integration. RLinf provides the full stack from simulation to real hardware to agentic AI.

FAQ

Questions

What RL algorithms does RLinf support?

RLinf supports IQL, GRPO, PPO, DAPO, Reinforce++, SAC, CrossQ, RLPD, SAC-Flow, DSRL, and RECAP/CFG among others.

What robots are supported for real-world RL?

Franka Arm (with RealSense, ZED cameras, Franka Hand, Robotiq gripper), XSquare Turtle2 dual-arm, and DOS-W1. More robots are being added.

Can I use RLinf without real hardware?

Yes, RLinf supports 10+ simulation backends including ManiSkill, LIBERO, MetaWorld, IsaacLab, RoboCasa, Calvin, and more — all accessible with the same API.

Decision brief

Should you use RLinf?

Best for
  • Robotics researchers running RL experiments across simulation and real hardware
  • Teams fine-tuning VLA models with reinforcement learning
  • Developers building agentic AI systems with RL-based training
Not for
  • Beginners looking for a simple out-of-the-box robot control interface (start with LeRobot)
Trust and freshness
  • Verified 2026-06-04
  • License: Apache-2.0
  • Repo: RLinf/RLinf
  • Open-source signal
Deployment

cloud

Permission surface

messages, hardware

Decision signals

No extra signals recorded

Agent packet

Structured decision data for RLinf

This packet is the compact machine-readable view agents should use before following source links or taking action.

Capabilities

robotics, messaging

Constraints

open source

Deployment

cloud

Permission surface

messages, hardware

Recommended workflows

Robotics or embodied agent workflow

Overview

What RLinf does

What it is

RLinf is a flexible and scalable RL infrastructure supporting 10+ simulation backends, real-world robot control, VLA model fine-tuning, and agentic AI. It implements major RL algorithms (PPO, GRPO, SAC, DAPO, IQL, CrossQ, RLPD) with a unified API that works identically across simulation and real hardware. Its real-world RL stack includes HG-DAgger for safe online training, and its agentic AI module extends RL to language agents.

Why it matters

Reinforcement learning for embodied AI has been held back by the gap between simulation research and real-world deployment. RLinf bridges this gap by providing the same API across 10+ simulators and multiple real robot platforms. It also bridges the gap between robotics RL and agentic RL — a convergence that is increasingly important as VLA models and language agents share architectures and training techniques.

How to evaluate it

RLinf provides a modular architecture where environments, policies, and algorithms are swappable components. An experiment is configured via YAML or Python dict, specifying the simulator backend (or real robot), the policy model (from MLP to VLA), and the RL algorithm. For real-world RL, the HG-DAgger loop runs a policy on hardware, a human supervisor monitors and intervenes via a GUI, and the system logs both autonomous and human-corrected episodes for training.
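
To make the "swappable components" idea concrete, here is a minimal sketch of an experiment described as a Python dict. The key names and values are purely illustrative and are not RLinf's configuration schema; `validate` and `swap` are hypothetical helpers. Consult the official repository for the real format.

```python
from typing import Any, Dict

# Hypothetical experiment config; keys and values are illustrative only.
experiment: Dict[str, Any] = {
    "env": {"backend": "maniskill", "real_robot": False},
    "policy": {"model": "mlp"},
    "algorithm": {"name": "ppo"},
}

REQUIRED_SECTIONS = ("env", "policy", "algorithm")


def validate(config: Dict[str, Any]) -> None:
    """Fail early if a swappable component is missing from the config."""
    missing = [s for s in REQUIRED_SECTIONS if s not in config]
    if missing:
        raise ValueError(f"missing config sections: {missing}")


def swap(config: Dict[str, Any], section: str, **overrides: Any) -> Dict[str, Any]:
    """Return a copy of the config with one component replaced or updated."""
    updated = {name: dict(values) for name, values in config.items()}
    updated[section].update(overrides)
    return updated


validate(experiment)
# Same policy and algorithm, different environment backend.
on_hardware = swap(experiment, "env", backend="franka", real_robot=True)
```

Because `swap` copies the config before changing it, the simulation and hardware variants of an experiment can live side by side and differ only in the section that was overridden.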

Facts

Known metadata and operating surface

These fields are separated from editorial interpretation so agents can reason over facts and missing checks.

  • Resource type: bot
  • Category: Bots
  • Maturity: active
  • Difficulty: Unknown
  • License: Apache-2.0
  • Pricing: open source
  • Verified: 2026-06-04
  • Source confidence: high
  • Risk level: elevated
Fit matrix

Where RLinf fits in an agent stack

strong

Robotics or embodied agent workflow

RLinf has multiple signals for robotics or embodied agent workflow, including matching tags, capabilities, category, or positioning.

  • Separate simulator claims from hardware claims and verify safety boundaries before real-world operation.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Coding agent workflow

RLinf has at least one signal for coding agent workflow, but should be checked against a real task before adoption.

  • Run a small repository change and inspect the diff, tests, and rollback path.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Memory or RAG workflow

RLinf has at least one signal for memory or RAG workflow, but should be checked against a real task before adoption.

  • Create, update, retrieve, correct, and delete memory or retrieval objects with real data.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
partial

Reusable skill workflow

RLinf has at least one signal for reusable skill workflow, but should be checked against a real task before adoption.

  • Run one skill end to end and check whether it produces evidence or structured output.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
weak

Browser automation

RLinf is not primarily positioned for browser automation in the current metadata.

  • Run one non-sensitive website task and inspect clicks, waits, retries, and changed URLs.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
weak

Connector or protocol layer

RLinf is not primarily positioned as a connector or protocol layer in the current metadata.

  • Connect one low-risk service, then inspect schemas, auth scope, errors, and logs.
  • Confirm official docs, current maintenance, license, and runtime constraints before production use.
Inputs and outputs

What an agent should inspect

Likely inputs

  • Repositories, files, issues, terminal output, and test results
  • Prompts, messages, documents, images, or model inputs
  • Official setup instructions and a small real workflow

Likely outputs

  • Diffs, commits, explanations, test results, or review notes
  • A decision on whether this resource fits the target workflow
Evidence

Sources, claims, and missing checks

Claims are marked separately from source links so future crawlers and reviewers can update them without rewriting the page.

verified

RLinf is listed as open source.

License metadata: Apache-2.0
verified

RLinf has a recorded GitHub repository: RLinf/RLinf.

Resource facts and GitHub source link.
inferred

RLinf supports these recorded deployment modes: cloud.

OpenAgent decision signal metadata.
inferred

RLinf is tagged with robotics, messaging capabilities.

OpenAgent capability taxonomy.
Missing checks
  • Dedicated docs link is missing.
  • Repository freshness has not been recorded.
Next action

How to start evaluating RLinf

Inspect repository

Check license, recent activity, issues, examples, and security-sensitive code paths.

Open Homepage

Start from the official source before adopting third-party instructions.

Install RLinf

Install RLinf from PyPI.

$ pip install rlinf
Compare

Alternatives and nearby resources

Use related resources to compare category fit, license, deployment model, and first-workflow behavior.
