Introducing Hermes Dreaming: Reviewable Self-Improvement for Hermes Agent
Hermes Dreaming is a staged, artifact-first self-improvement engine for Hermes Agent. It proposes changes as reviewable artifacts you can diff, validate, apply, or discard — turning self-improvement into a receipt trail instead of silent mutation.
Hermes Dreaming v0.1.0 adds a focused layer on top of Hermes Agent's existing self-improvement bones — memory, skills, user notes, and facts. It is a staged plugin workflow for proposing changes, reviewing them as artifacts, validating them, applying them deliberately, or discarding them cleanly.
This is not a replacement for Hermes self-improvement. It is a receipt layer for it.
The real problem with agent self-improvement is not intelligence. It is trust. Anyone can say an agent is improving — the hard part is making the change legible before it lands.
Why Build Hermes Dreaming
Hermes already assumes that long-running agents need memory, skills, facts, and evolving context. The next problem is making that evolution easier to review before it lands. Hermes Dreaming exists so operators can answer a set of questions with confidence:
- What changed?
- Where did the proposal come from?
- What file is it going to touch?
- Can I inspect it?
- Can I validate it?
- Can I back up the existing state first?
- Can I throw the whole thing away if it smells wrong?
Staged Change Beats Silent Mutation
The phrase "self-improving agent" gets more serious once the agent has real state — and Hermes does. That power needs a staged path.
For operators, the next level is not just more autonomy. It is reviewable autonomy: proposed improvements should arrive as artifacts, with provenance, validation, backups, and a clean way to say no before anything touches live state.
Hermes Dreaming turns self-improvement into a receipt trail:
- It scans explicit sources.
- It stages proposed changes.
- It writes an artifact bundle.
- It lets you diff, validate, apply, or discard the result.
There is no mystery step between "the agent noticed something useful" and "the operator approved the change."
The Command Surface Is Boring On Purpose
Hermes Dreaming is a standalone, open-source staged self-improvement engine that also ships as a Hermes plugin. The core command surface is intentionally simple:
dreaming create --live-root ./live --artifact-root ./artifacts --source ./sources
dreaming diff ./artifacts/<artifact-id>
dreaming validate ./artifacts/<artifact-id> --live-root ./live
dreaming apply ./artifacts/<artifact-id> --live-root ./live --backup-root ./backups --approve all
dreaming discard ./artifacts/<artifact-id> --archive-root ./archive
dreaming status --artifact-root ./artifacts
| Command | What it does |
|---|---|
| create | Scans the sources you explicitly provide and stages a dream artifact |
| diff | Shows the report and staged proposals |
| validate | Checks the artifact before it is allowed to touch live state |
| apply | Writes approved proposals and backs up existing files first |
| discard | Archives the artifact without mutating the live workspace |
| status | Shows the staged artifacts sitting under the artifact root |
The important detail: --source is explicit and repeatable. You point Dreaming at the source material — it does not inhale your repo and start making lifestyle choices. Autonomy with a review path is how you get durable systems instead of accidental messes.
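To make "explicit and repeatable" concrete, here is a minimal Python sketch that builds a `dreaming create` command line with one `--source` flag per path. The helper name `build_create_argv` is hypothetical, and the sketch assumes "repeatable" means the flag is passed once per source, which the release notes imply but do not spell out.

```python
from __future__ import annotations

import shlex
from typing import Sequence


def build_create_argv(live_root: str, artifact_root: str, sources: Sequence[str]) -> list[str]:
    """Build an argv list for `dreaming create` (hypothetical helper).

    Assumption: --source is repeated once per path.
    """
    if not sources:
        # Nothing is scanned implicitly, so an empty source list is an error here.
        raise ValueError("pass at least one explicit source")
    argv = ["dreaming", "create", "--live-root", live_root, "--artifact-root", artifact_root]
    for source in sources:
        argv += ["--source", source]
    return argv


# Print the command for review instead of running it.
print(shlex.join(build_create_argv("./live", "./artifacts", ["./sources", "./notes"])))
```

Printing the command before running it keeps the same habit the tool encourages: look first, then act.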
The Artifact Is The Product
The most important part of Hermes Dreaming is not the command name — it is the artifact. Each run produces a staged directory:
manifest.json # what run you are looking at
REPORT.md # human-readable summary
sources.jsonl # what got scanned
proposals.jsonl # the proposed mutations
That bundle is the receipt. It is the difference between "the agent learned" and "here is the proposed change, here is where it came from, here is what it wants to touch, and here is your chance to say no."
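Because the bundle is plain files, it can be inspected with a few lines of code. The sketch below loads the four files named above into one object. It assumes `manifest.json` holds a single JSON object and that each non-blank line of the `.jsonl` files is one JSON object; the field names inside those records are not documented here, so the code treats them as opaque dictionaries. `ArtifactBundle` and `load_bundle` are illustrative names, not part of Hermes Dreaming.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any


def read_jsonl(path: Path) -> list[dict[str, Any]]:
    """Parse one JSON value per non-blank line."""
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                records.append(json.loads(line))
    return records


@dataclass
class ArtifactBundle:
    """In-memory view of one staged artifact directory (hypothetical type)."""
    manifest: dict[str, Any]
    report: str
    sources: list[dict[str, Any]]
    proposals: list[dict[str, Any]]


def load_bundle(artifact_dir: Path) -> ArtifactBundle:
    """Read the four bundle files; raises if any of them is missing."""
    return ArtifactBundle(
        manifest=json.loads((artifact_dir / "manifest.json").read_text(encoding="utf-8")),
        report=(artifact_dir / "REPORT.md").read_text(encoding="utf-8"),
        sources=read_jsonl(artifact_dir / "sources.jsonl"),
        proposals=read_jsonl(artifact_dir / "proposals.jsonl"),
    )
```

A reviewer could call `load_bundle(Path("./artifacts/<artifact-id>"))` and count or filter `proposals` before deciding whether to apply anything.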
Not magic. Control.
Offline-First Is Not A Downgrade
The default provider path is intentionally legible. The offline marker workflow looks for explicit DREAM: lines in the source bundle, so you can test the core loop without a cloud model, an API key, or an opaque inference layer in the middle.
DREAM: memory: Keep updates short and concrete.
DREAM: user: Prefer concise status updates.
DREAM: fact: {"type": "preference", "key": "tone", "value": "casual"}
DREAM: skill: path=skills/review.md | Preserve review gates and backups.
This does not make it less useful — it makes it inspectable. Once the workflow is legible, you can swap in more capable providers later. The release already includes an optional OpenAI-compatible provider path, but the core idea does not depend on pretending a model is magic. The model can propose. The workflow still governs.
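The marker format is simple enough to parse by hand, which is part of its appeal. The following is a minimal sketch of such a parser, inferred only from the four example lines above: it assumes the shape `DREAM: <kind>: <payload>`, that `fact` payloads are JSON, and that `skill` payloads look like `path=<file> | <text>`. The function `parse_marker` and the returned dictionary keys are hypothetical; the actual parser in Hermes Dreaming may accept more or less than this.

```python
from __future__ import annotations

import json
from typing import Any, Optional

PREFIX = "DREAM:"
KINDS = {"memory", "user", "fact", "skill"}  # kinds seen in the examples above


def parse_marker(line: str) -> Optional[dict[str, Any]]:
    """Return a parsed marker, or None if the line is not a usable DREAM: line."""
    stripped = line.strip()
    if not stripped.startswith(PREFIX):
        return None
    # Split on the first colon only, so colons inside JSON payloads survive.
    kind, sep, payload = stripped[len(PREFIX):].partition(":")
    kind, payload = kind.strip(), payload.strip()
    if not sep or kind not in KINDS or not payload:
        return None
    if kind == "fact":
        try:
            return {"kind": kind, "value": json.loads(payload)}
        except json.JSONDecodeError:
            return None
    if kind == "skill":
        head, bar, text = payload.partition("|")
        head = head.strip()
        if not bar or not head.startswith("path="):
            return None
        return {"kind": kind, "path": head[len("path="):].strip(), "text": text.strip()}
    return {"kind": kind, "text": payload}
```

Lines that do not match return `None` rather than a best guess, which fits the offline-first idea: only explicit markers become proposals.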
It Also Ships As A Hermes Plugin
Hermes Dreaming is standalone, but built for Hermes operators. Install it as a plugin:
hermes plugins install asimons81/hermes-dreaming --enable
hermes dreaming --help
There is also a bundled Hermes skill for the staged review workflow:
hermes-dreaming:dreaming
The CLI is not just a dev convenience — it is the operational interface. If an agent is going to touch memory, skills, user notes, or facts, the operator should have a command surface that makes the lifecycle obvious:
scan -> stage -> diff -> validate -> apply or discard
That is the whole thesis in one line.
What This Is Not
- Not broad external sync
- Not gateway plumbing
- Not a dashboard
- Not a promise that your agent wakes up a genius from recursively staring at its own files
- Not trying to be mystical
The first release is an artifact-first MVP with explicit apply and discard semantics, validation, backups, offline marker parsing, an optional OpenAI-compatible provider, tests around the core model and CLI flow, and enough repo hygiene to be safe for public review. That is the right shape for v0.1.0: small surface, hard edges, receipts everywhere.
Why Operators Should Care
Most agent demos over-index on capability — can it write code, call tools, make plans, run overnight? Useful questions. But long-running agents eventually hit a deeper one: what happens when the system needs to change itself?
That is where trust gets real. A self-improving agent gets far more useful when it can show its work in a form the operator can inspect, validate, apply, or discard. The bar looks more like release engineering than mythology:
- Stage the change.
- Show the diff.
- Validate the artifact.
- Back up the live state.
- Apply only what was approved.
- Discard the rest without drama.
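The "back up, then apply only what was approved" steps can be sketched in a few lines of Python. This is an illustration of the pattern, not the Hermes Dreaming implementation: the proposal shape (`id`, `path`, `content`) and the function `apply_approved` are assumptions made up for the example.

```python
from __future__ import annotations

import shutil
from pathlib import Path
from typing import Iterable


def apply_approved(
    proposals: Iterable[dict],
    approved_ids: set[str],
    live_root: Path,
    backup_root: Path,
) -> list[str]:
    """Write approved proposals under live_root, copying any existing file to backup_root first.

    Assumed proposal shape (hypothetical): {"id": str, "path": str, "content": str}.
    """
    live_resolved = live_root.resolve()
    applied = []
    for proposal in proposals:
        if proposal["id"] not in approved_ids:
            continue  # unapproved proposals never touch live state
        target = (live_root / proposal["path"]).resolve()
        if live_resolved not in target.parents:
            raise ValueError(f"proposal {proposal['id']} escapes the live root")
        if target.exists():
            backup = backup_root / proposal["path"]
            backup.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(target, backup)  # back up before overwriting
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(proposal["content"], encoding="utf-8")
        applied.append(proposal["id"])
    return applied
```

The ordering is the point: the backup copy is written before the live file changes, and anything outside the approved set is skipped rather than partially applied.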
The Point
Hermes Dreaming makes Hermes-style self-improvement more legible. It does not replace the existing self-improvement story — it gives operators a plugin-shaped review workflow around it, with a staged artifact you can inspect before the change lands.
That sounds small until you have been burned by tools that mutate state silently, overclaim their intelligence, or make rollback feel like digging through a landfill with tweezers. Dreaming does not promise magic — it promises a workflow you can trust because you can actually see it.
Controlled mutation with receipts beats clever bullshit every time.
Resources
- Repository: github.com/asimons81/hermes-dreaming
- Package: hermes-dreaming
- Current release: v0.1.0