
Introduction

Autoresearch is Andrej Karpathy’s framework for autonomous AI-driven ML research. The idea is simple: point an AI agent (Claude Code) at a small but real LLM training setup and let it experiment autonomously overnight. The agent modifies the model code, trains for 5 minutes, checks whether the result improved, keeps or discards the change, and repeats, running ~12 experiments per hour and ~100 overnight. This guide walks you through setting up autoresearch on a Vast.ai GPU instance with Claude Code as the autonomous research agent.

Prerequisites

Install the Vast CLI if you haven’t already:
pip install vastai
vastai set api-key YOUR_API_KEY

Rent a GPU Instance and Set Up

Autoresearch requires a single NVIDIA GPU with 80GB VRAM (H100 or A100 80GB). It needs CUDA 12.8+ and about 50GB of disk for the repo, data, and dependencies.
Use the Autoresearcher template to launch a pre-configured instance with uv, Claude Code, and autoresearch already installed.

Learn more about templates

Templates are reusable configurations that bundle a Docker image, environment variables, and startup scripts into a one-click launch.
Search for available instances:
vastai search offers 'gpu_ram>=70 num_gpus=1 cuda_vers>=12.8 disk_space>=50 reliability>0.95' -o 'dph+'
Pick an instance ID from the results and rent it using the template:
vastai create instance INSTANCE_ID \
  --template_hash 934769670bfd9bc5e05d8696ef340c2b \
  --disk 50
Wait for the instance to be ready, then SSH in:
vastai show instances
ssh -p PORT root@HOST_IP
The template installs everything on first boot (~10 minutes). You can monitor progress with tail -f /var/log/provisioning.log.
The template automatically configures Claude Code permissions (Read, Edit, Write, Bash) in .claude/settings.json so it can run experiments without prompting; no manual setup is needed. Once provisioning completes, skip ahead to Launch Autonomous Research.
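If you ever need to recreate that permissions file by hand (for example, on an instance launched without the template), it looks roughly like this. The exact allow-list entries are an assumption based on the tools named above; consult the Claude Code settings documentation for the authoritative schema:

```shell
# Write a minimal Claude Code settings file allowing the four tools above.
# The entries shown are assumptions; the template's actual file may differ.
mkdir -p .claude
cat > .claude/settings.json <<'EOF'
{
  "permissions": {
    "allow": ["Read", "Edit", "Write", "Bash"]
  }
}
EOF
```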

Launch Autonomous Research

Start Claude Code

cd /workspace/autoresearch
claude
When Claude Code starts, log in to your Anthropic account:
/login
This will give you a URL to open in your browser. Follow the prompts to authenticate, then you’re ready to go. Kick off the research loop:
Hi have a look at program.md and let's kick off a new experiment! let's do the setup first.
Claude will:
  1. Read program.md for the research guidelines
  2. Create a fresh git branch (e.g. autoresearch/mar10)
  3. Run the baseline experiment
  4. Begin the autonomous loop, modifying train.py, training for 5 minutes, evaluating, keeping improvements, discarding regressions
  5. Log all results to results.tsv
Claude runs indefinitely until manually stopped. Each experiment takes ~5 minutes, so you can expect ~12 experiments/hour and ~100 experiments overnight. Each iteration also uses Claude API tokens.
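The keep-or-discard decision at the heart of the loop (step 4 above) boils down to a numeric comparison of validation losses. A minimal shell sketch, where the loss values are placeholders and the git commands in the comments are hypothetical:

```shell
best_loss=3.214   # best validation loss so far (placeholder value)
new_loss=3.187    # result of the latest 5-minute run (placeholder value)

# Keep the edit if validation loss improved; otherwise roll it back.
if awk -v new="$new_loss" -v best="$best_loss" 'BEGIN { exit !(new < best) }'; then
  echo "keep"      # e.g. commit the change to the experiment branch
else
  echo "discard"   # e.g. git checkout -- train.py
fi
```

awk is used for the comparison because plain shell arithmetic cannot compare floating-point numbers.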

What Claude can modify

Claude has full freedom to edit train.py: the model architecture, optimizer, hyperparameters, batch size, model size, and training loop. The only constraints are:
  • prepare.py is read-only; the evaluation harness and data loading are fixed
  • No new packages; only the dependencies already in pyproject.toml
  • 5-minute time budget; every experiment runs for exactly 5 minutes

Monitoring progress

In another tmux pane (Ctrl+b then %), you can watch the experiment log:
watch -n 30 cat /workspace/autoresearch/results.tsv
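To spot the best run at a glance, you can sort the log by loss. This sketch assumes results.tsv is tab-separated with a numeric validation loss in the third column; the real column layout may differ, so it builds a sample file for illustration:

```shell
# Hypothetical results.tsv layout: name, description, validation loss.
printf 'exp1\tbaseline\t3.42\nexp2\tlr-tweak\t3.31\nexp3\twider-mlp\t3.55\n' > /tmp/results.tsv

# Numeric sort on column 3: lowest-loss experiment comes first.
sort -t"$(printf '\t')" -k3 -n /tmp/results.tsv | head -n 1
```

Swap in the real column number once you've inspected the header of your results.tsv.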
Or check the git log to see what Claude has tried:
cd /workspace/autoresearch
git log --oneline -20

Cleanup

When you’re done, download your results and destroy the instance:
# From your local machine — copy results
scp -P PORT root@HOST_IP:/workspace/autoresearch/results.tsv ./results.tsv

# Destroy the instance
vastai destroy instance INSTANCE_ID
Destroying an instance permanently deletes all data on it. Make sure to copy any results you want to keep before destroying.

Additional Resources