
OpenHands

OpenHands (originally OpenDevin, renamed in September 2024) is the leading open-source autonomous software-engineering agent. It launched within days of Cognition's Devin demo (March 2024) and has grown into a research platform with more than 35k GitHub stars, led by Xingyao Wang and Graham Neubig at All-Hands AI and CMU.

Architecture

+-------------------+
|   LLM (Claude /   |
|   GPT / local)    |
+---------+---------+
          |
+---------v---------+
|  Agent loop with  |
|  ReAct + memory   |
+---------+---------+
          |
+---------v---------+
|  Action runtime   |
| (Docker container)|
| - bash terminal   |
| - browser         |
| - file editor     |
| - python REPL     |
+-------------------+
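
The middle box of the diagram, a loop that alternates between reasoning and acting while accumulating memory, can be sketched roughly as follows. This is a minimal illustration and not OpenHands' actual code: `agent_loop`, `scripted_policy`, and `fake_runtime` are hypothetical names, the policy is a scripted stand-in for an LLM call, and the runtime returns a canned string instead of executing anything.

```python
from typing import Callable

# (thought, action, argument) triple; a hypothetical shape chosen for this sketch.
Step = tuple[str, str, str]


def agent_loop(policy: Callable[[list[str]], Step],
               runtime: Callable[[str, str], str],
               task: str,
               max_steps: int = 10) -> list[str]:
    """Alternate between the policy (LLM stand-in) and the runtime until 'finish'."""
    memory = [f"task: {task}"]
    for _ in range(max_steps):
        thought, action, argument = policy(memory)
        memory.append(f"thought: {thought}")
        if action == "finish":
            memory.append(f"finish: {argument}")
            break
        observation = runtime(action, argument)
        memory.append(f"action: {action} {argument}")
        memory.append(f"observation: {observation}")
    return memory


def scripted_policy(memory: list[str]) -> Step:
    """Stand-in for an LLM call: run one command, then finish."""
    if not any(entry.startswith("observation:") for entry in memory):
        return ("I should list the files first.", "bash", "ls")
    return ("I have what I need.", "finish", "done")


def fake_runtime(action: str, argument: str) -> str:
    """Stand-in for the sandbox: returns a canned result instead of executing."""
    return "README.md" if (action, argument) == ("bash", "ls") else ""


trace = agent_loop(scripted_policy, fake_runtime, "inspect the repo")
for entry in trace:
    print(entry)
```

The `max_steps` cap is a design choice of the sketch: it keeps a policy that never emits `finish` from looping forever.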

The agent acts inside a sandboxed Docker container that provides a real Linux desktop, a browser (Chromium + Playwright), a full bash shell, and a structured file editor. The action space resembles that of computer-use agents but is tuned for software engineering: open a file, edit lines, run a command, browse documentation, finish.
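
One way to picture such an action space is as a small set of typed records plus a dispatcher that turns each action into an observation. The sketch below is an assumption-laden illustration and not the OpenHands API: the class names (`OpenFile`, `EditLines`, `RunCommand`, `Finish`) and `apply_action` are hypothetical, files live in an in-memory dictionary rather than a container, commands are echoed rather than executed, and browsing is omitted.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class OpenFile:
    path: str


@dataclass
class EditLines:
    path: str
    start: int  # 1-indexed, inclusive
    end: int    # 1-indexed, inclusive
    new_text: str


@dataclass
class RunCommand:
    command: str


@dataclass
class Finish:
    message: str


# Hypothetical tagged union of the actions named in the text (browsing omitted).
Action = Union[OpenFile, EditLines, RunCommand, Finish]


def apply_action(files: dict[str, str], action: Action) -> str:
    """Apply one action to an in-memory file table and return an observation."""
    if isinstance(action, OpenFile):
        return files.get(action.path, f"error: {action.path} not found")
    if isinstance(action, EditLines):
        lines = files.get(action.path, "").split("\n")
        # Replace the inclusive 1-indexed range [start, end] with the new text.
        lines[action.start - 1:action.end] = action.new_text.split("\n")
        files[action.path] = "\n".join(lines)
        return f"edited {action.path}"
    if isinstance(action, RunCommand):
        # A real runtime would execute this inside the sandbox; here it is only echoed.
        return f"(not executed) $ {action.command}"
    if isinstance(action, Finish):
        return f"finished: {action.message}"
    raise TypeError(f"unknown action: {action!r}")


files = {"a.py": "x = 1\ny = 2\nz = 3"}
print(apply_action(files, EditLines("a.py", 2, 2, "y = 20")))
print(apply_action(files, OpenFile("a.py")))
```

Keeping every action a plain record makes each step easy to log and replay, which matters when agent runs are later inspected or benchmarked.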

Agent variants

OpenHands ships several agents:

  • CodeActAgent: the default; communicates via executable code blocks (the "CodeAct" paper).
  • BrowsingAgent: pure web research.
  • DummyAgent: for benchmarking.
  • A plug-in interface for custom agents.

Benchmarks

OpenHands holds leading open-source SWE-bench Verified scores: roughly 55% with Claude 3.5 Sonnet by mid-2025, surpassing many closed systems. It also competes on WebArena, GAIA, and HumanEval-Plus.

Distinctive design choices

  1. CodeAct paradigm: instead of emitting structured tool JSON, the agent writes Python that calls primitives. This unifies the action space and lets the LLM use its strongest skill (code) for control flow.
  2. Microagents: task-specific sub-agents loaded on demand.
  3. Multi-LLM: switches between models per step (a cheap model for routine steps, an expensive one for hard steps).
  4. Open weights everywhere: works with local Llama, DeepSeek, or Qwen models served via vLLM or Ollama.
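
The first design choice can be illustrated with a toy executor: the model's reply contains a block of Python, and that block is run in a namespace where the primitives are ordinary functions. This is a sketch under stated assumptions, not OpenHands' implementation: the `<execute>` delimiter, `run_code_action`, and the `read_file` primitive are all hypothetical, and `exec` is used directly here, which is only safe inside a sandbox.

```python
import contextlib
import io
import re

# Hypothetical delimiter for code actions; a real agent defines its own format.
EXECUTE_RE = re.compile(r"<execute>(.*?)</execute>", re.DOTALL)


def run_code_action(reply: str, primitives: dict) -> str:
    """Run the first <execute> block in a model reply and return captured stdout."""
    match = EXECUTE_RE.search(reply)
    if match is None:
        return "no code action found"
    namespace = dict(primitives)  # copy so the caller's table is not mutated
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # Unsafe outside a sandbox: exec runs arbitrary code.
            exec(match.group(1), namespace)
    except Exception as exc:
        return buffer.getvalue() + f"error: {exc!r}"
    return buffer.getvalue()


# Hypothetical primitive backed by an in-memory file table.
FILES = {"notes.txt": "alpha\nbeta\ngamma"}


def read_file(path: str) -> str:
    return FILES[path]


reply = """I will number the lines of the file.
<execute>
text = read_file("notes.txt")
for i, line in enumerate(text.splitlines(), start=1):
    print(i, line)
</execute>"""

observation = run_code_action(reply, {"read_file": read_file})
print(observation)
```

The loop inside the `<execute>` block is the point of the example: with a JSON tool schema, iterating over lines would take one tool call per step, whereas here the control flow is ordinary code written by the model in a single action.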

Comparison

System                    Open?     Sandbox    SWE-bench Verified (≈mid-2025)
Devin (Cognition)         No        Yes        ~50%
Claude Code (Anthropic)   No        Local FS   ~55%
OpenAI Codex CLI (2025)   Partial   Local FS   ~55%
OpenHands                 Yes       Yes        ~55%
Aider + Claude            Yes       No         ~30–45%

Modern relevance

OpenHands is the canonical open-source point of comparison in SWE-agent papers published since 2024. Its Docker-based runtime and CodeAct paradigm are widely copied; many corporate "internal Devin" systems are forks or clones.

Citation

Wang, X. et al. (2024). OpenHands: An Open Platform for AI Software Developers as Generalist Agents. arXiv:2407.16741.

Related terms: Devin / AI Software Engineer, Aider, OpenAI Codex (2025 generation), SWE-Bench, Computer-Use Agents
