Skill libraries are an agent architecture in which the agent treats its own learned procedures as a growing, queryable library of reusable code. The seminal example is Voyager (Wang et al., NVIDIA + Caltech, 2023), an LLM-driven agent that plays Minecraft open-endedly.
Voyager's architecture
+------------------+   propose   +-----------------+
|    Automatic     | ----------> |    Iterative    |
|    Curriculum    |             |    Prompting    |
+------------------+             | (writes JS code)|
                                 +--------+--------+
                                          |
                                          v
                                 +-----------------+
                                 |  Minecraft env  |
                                 |  (executes JS)  |
                                 +--------+--------+
                                          |
                                          v
                                add successful skills
                                          |
                                          v
                                 +-----------------+
                                 |  Skill Library  |
                                 |  (vector DB of  |
                                 |  JS functions)  |
                                 +-----------------+
Three components:
- Automatic curriculum: GPT-4 proposes increasingly hard goals based on the agent's current state ("collect wood" → "build crafting table" → "craft iron pickaxe" → "diamond pickaxe").
- Iterative prompting: GPT-4 writes JavaScript code (Mineflayer API) to achieve each goal. The code is executed in the game, and execution errors and environment feedback are fed back to GPT-4 so it can refine the program.
- Skill library: successful programs are stored, indexed by an embedding of their docstring, and retrieved when a later task looks relevant.
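The interaction between the three components can be sketched as a small loop. This is a minimal illustration, not Voyager's actual code: `learningLoop`, `proposeTask`, `writeCode`, and `runInEnv` are hypothetical names, and the stubs at the bottom stand in for GPT-4 and the Minecraft environment.

```javascript
// Hypothetical sketch of the curriculum -> prompting -> library loop.
// proposeTask, writeCode and runInEnv are injected, so nothing here
// depends on a real LLM or on the Mineflayer API.
async function learningLoop({ proposeTask, writeCode, runInEnv, library, rounds = 3, maxTries = 4 }) {
  for (let r = 0; r < rounds; r++) {
    const task = await proposeTask(library); // automatic curriculum
    let feedback = null;
    for (let attempt = 0; attempt < maxTries; attempt++) {
      const code = await writeCode(task, library, feedback); // iterative prompting
      const result = await runInEnv(code);
      if (result.ok) {
        library.push({ task, code }); // only successful programs become skills
        break;
      }
      feedback = result.error; // the failure is passed to the next attempt
    }
  }
  return library;
}

// Stubs: the first draft of every task fails, the revised draft succeeds.
const tasks = ["collect wood", "build crafting table", "craft iron pickaxe"];
const demoRun = learningLoop({
  proposeTask: (library) => tasks[library.length],
  writeCode: (task, library, feedback) => (feedback ? `fixed:${task}` : `draft:${task}`),
  runInEnv: (code) => (code.startsWith("fixed:") ? { ok: true } : { ok: false, error: "missing item" }),
  library: [],
});
demoRun.then((library) => console.log(library.map((skill) => skill.task)));
```

In this sketch the curriculum stub simply picks the next task by library size, so a task that never succeeds would be proposed again in the following round.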
A skill
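/**
 * Mine a wooden log; precondition: empty hands; postcondition: oak_log in inventory.
 */
async function mineWoodLog(bot) {
  // findBlock matches on numeric block IDs (or a predicate), so resolve the name first.
  const oakLogId = bot.registry.blocksByName.oak_log.id;
  const tree = bot.findBlock({ matching: oakLogId, maxDistance: 32 });
  if (!tree) throw new Error("No tree nearby");
  // Requires the mineflayer-tool plugin; assumes the block is already within reach.
  await bot.tool.equipForBlock(tree);
  await bot.dig(tree);
}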
Each skill is stored with the embedding of its docstring as the lookup key. When a future task involves wood, this function is retrieved and made available to the LLM as a callable tool.
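Storage and retrieval by docstring can be illustrated with a toy library. This is a sketch under loud assumptions: `SkillLibrary`, `embed`, and `cosine` are hypothetical names, and the word-count "embedding" is a stand-in for the embedding model and vector database a real system would use.

```javascript
// Toy embedding: lowercase word counts. A real system would call an
// embedding model here; this stand-in only keeps the example runnable.
function embed(text) {
  const vec = new Map();
  for (const word of text.toLowerCase().match(/[a-z_]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

class SkillLibrary {
  constructor() {
    this.skills = [];
  }
  // The docstring embedding is the key; the source code is the value.
  add(name, docstring, code) {
    this.skills.push({ name, docstring, code, key: embed(docstring) });
  }
  // Return the k skills whose docstrings are most similar to the query.
  retrieve(query, k = 1) {
    const q = embed(query);
    return this.skills
      .map((skill) => ({ name: skill.name, code: skill.code, score: cosine(q, skill.key) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, k);
  }
}

const lib = new SkillLibrary();
lib.add("mineWoodLog", "Mine a wooden log from a nearby tree", "/* skill source */");
lib.add("craftTable", "Craft a crafting table from planks", "/* skill source */");
lib.add("smeltIron", "Smelt iron ore in a furnace", "/* skill source */");

const top = lib.retrieve("mine a log from a tree")[0];
console.log(top.name);
```

The retrieved `code` strings would then be placed in the prompt so the model can call those functions from the program it writes next.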
Results
Voyager:
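- Obtained 3.3× more unique items and unlocked key tech-tree milestones up to 15.3× faster than prior baselines; it was the only method in the comparison to reach the diamond tier (the top of Minecraft's tool tech tree).
- Discovered 63 unique items within 160 prompting iterations, building its skill library along the way.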
- Showed transfer: a skill library built in one Minecraft world could be reused to solve new tasks in a fresh world.
- Demonstrated lifelong learning without weight updates.
Why it matters
Voyager is the cleanest demonstration of continual learning without gradient descent. The agent improves over time by:
- Storing successful behaviour as code.
- Retrieving relevant code as context for future tasks.
- Composing new skills from old ones.
This is conceptually similar to a programmer building up a personal utility library.
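Composition is the step that makes the library compound, and it can be sketched with a mock world. Everything here is illustrative: `makeMockBot` and the skill bodies are hypothetical and track only an inventory object, not the Mineflayer API.

```javascript
// Mock world state standing in for a Minecraft bot (not the Mineflayer API).
function makeMockBot() {
  return { inventory: { oak_log: 0, oak_planks: 0, crafting_table: 0 } };
}

// Two primitive skills assumed to be in the library already.
async function mineWoodLog(bot) {
  bot.inventory.oak_log += 1;
}

async function craftPlanks(bot) {
  if (bot.inventory.oak_log < 1) throw new Error("need a log");
  bot.inventory.oak_log -= 1;
  bot.inventory.oak_planks += 4; // one log yields four planks
}

// A new skill written by calling the stored ones instead of
// re-deriving how to find wood or craft planks.
async function craftCraftingTable(bot) {
  while (bot.inventory.oak_planks < 4) {
    await mineWoodLog(bot);
    await craftPlanks(bot);
  }
  bot.inventory.oak_planks -= 4; // a crafting table costs four planks
  bot.inventory.crafting_table += 1;
}

const demoBot = makeMockBot();
const demoDone = craftCraftingTable(demoBot).then(() => demoBot.inventory);
demoDone.then((inventory) => console.log(inventory));
```

Once `craftCraftingTable` succeeds it can itself be stored, so later skills can call it in the same way.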
Modern relevance
Skill-library patterns now appear in:
- Anthropic Skills (2025): bundles of reusable instructions and code that Claude loads on demand.
- Coding agents: Aider's aider.code-md repo conventions, Cursor's notepads, OpenAI Codex's AGENTS.md.
- Agent memory systems: Generative Agents (Park et al. 2023) store reflections as a skill library of insights.
- Robotics: RoboCat, RT-2, and other VLAs use skill libraries with action heads.
Limitations
- Skill quality control is hard: a bad skill that gets stored pollutes the library and can be retrieved again later.
- Embedding-based retrieval can fetch superficially similar but inappropriate skills.
- The library can grow until retrieval becomes the bottleneck (similar to long-context problems in memory management).
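One common way to limit pollution is to gate admission on a check. The following is a minimal sketch of that idea under stated assumptions, not a description of how any particular system does it: `admitSkill`, `makeState`, and the postcondition function are hypothetical, and the "world" is a plain object.

```javascript
// Admit a skill only if its postcondition holds on several fresh mock states.
function admitSkill(library, skill, makeState, postcondition, trials = 3) {
  for (let i = 0; i < trials; i++) {
    const state = makeState();
    try {
      skill.run(state);
    } catch (err) {
      return false; // a crash counts as a failed trial
    }
    if (!postcondition(state)) return false;
  }
  library.push(skill);
  return true;
}

const library = [];
const makeState = () => ({ oak_log: 0 });
const hasLog = (state) => state.oak_log >= 1;

const good = { name: "mineWoodLog", run: (state) => { state.oak_log += 1; } };
// Broken on purpose: writes to the wrong key, so the postcondition never holds.
const bad = { name: "mineWoodLogBroken", run: (state) => { state.oak_logs = 1; } };

const goodAdmitted = admitSkill(library, good, makeState, hasLog);
const badAdmitted = admitSkill(library, bad, makeState, hasLog);
console.log(goodAdmitted, badAdmitted, library.map((skill) => skill.name));
```

A gate like this only catches failures the postcondition can express; it does nothing about a skill that passes its check but is retrieved for the wrong task.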
Citation
Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. TMLR 2024. arXiv:2305.16291.
Related terms: Memory and Context Management, Tool Use, Vector Database, Self-Reflection
Discussed in:
- Chapter 15: Modern AI