
How I Solved the Biggest Problem in AI Agents: Memory
If you've been building with AI agents for any length of time, you've hit the wall.
You spend an afternoon setting up a great agent. It knows your context, your projects, your preferences. It's sharp. It's useful. You close the session, come back the next morning, and it's gone. Not the agent itself - just everything it knew. You're starting over. Again.
That problem drove me crazy for days. And it's the question I get asked more than anything else now that I've written about building an 8-agent AI team. People want to know: how do your agents actually remember things? How does your main agent wake up every session and already know what's going on?
Here's what I built, and why it works.
The Problem Is Worse Than You Think
Most people frame the memory problem as "agents forget stuff." That's true, but it understates how bad it actually is.
The real problem has three parts:
1. No persistence between sessions. Every new conversation is a blank slate. The agent has no idea what you worked on yesterday, what decisions you made last week, or what context matters right now. You end up re-explaining everything, every time.
2. Context windows fill up fast. Even if you try to dump everything into the system prompt, you hit limits quickly. And a bloated context means slower, more expensive calls - with the most important information buried under noise.
3. No structure to what gets remembered. Even with tools like long-term memory APIs, you end up with a blob of text that the agent has to hunt through. It's not organized around how the agent actually needs to use the information.
The result is an agent that feels smart in the moment but has no continuity. No identity. No real relationship with your work over time.
I needed something different.
The Architecture: Three Tiers of Memory
The solution I landed on is a three-tier file-based memory system. Everything lives in plain text files in a structured directory. No vector databases, no embeddings, no external memory APIs. Just files, organized by how often and how urgently the agent needs them.
Tier 1: Hot Memory (Loads Every Session)
This is the small set of files that loads automatically at the start of every single session. It has to be small - because it counts against the context window - but it has to cover the essentials.
My hot memory tier includes:
- SOUL.md - The agent's identity. Who it is, what it values, how it operates, what it's accountable for. This is the most important file in the whole system. An agent without a stable identity is just a stateless chatbot. With it, the agent has consistent behavior, consistent judgment, and consistent voice across every session.
- AGENTS.md - Operating rules. How the agent should work, what it should check, how it should route different kinds of tasks. Think of it as the agent's standing orders.
- MEMORY.md - Hot context. The stuff that matters right now. Current priorities, recent decisions, open loops, things the agent needs to know walking into any session. This file has a hard limit of 50 lines. Not a soft limit - a mechanical one. If it goes over, overflow gets moved elsewhere. Keeping it tight means it stays useful.
- session-briefing.md - A short summary of what happened recently, written by the system during nightly consolidation. 20 lines max. The agent reads this and immediately knows where things stand.
That's it for Tier 1. Everything else is on-demand.
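Mechanically, loading the hot tier is nothing more than concatenating those four files at session start. Here's a minimal sketch - the file names come from the list above, but the directory layout, function name, and section headers are my own assumptions:

```python
from pathlib import Path

# Hypothetical memory root; the article doesn't specify an exact layout.
MEMORY_ROOT = Path("agent")

# Tier 1: the hot files that load at the start of every session, in order.
HOT_FILES = ["SOUL.md", "AGENTS.md", "MEMORY.md", "session-briefing.md"]

def load_hot_tier(root: Path = MEMORY_ROOT) -> str:
    """Concatenate the hot-tier files into one context block.

    Missing files are skipped rather than raising, so a fresh setup
    still boots with whatever files exist so far.
    """
    sections = []
    for name in HOT_FILES:
        path = root / name
        if path.exists():
            sections.append(f"## {name}\n{path.read_text().strip()}")
    return "\n\n".join(sections)
```

The returned string would be prepended to the system prompt. Because the hot tier is small and fixed, the context cost is predictable every session.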
Tier 2: Searchable Memory (On-Demand)
This is the bulk of the memory system. It includes:
- Daily logs (memory/daily/YYYY-MM-DD.md) - raw, append-only captures of what happened each day
- Entity files - structured knowledge about specific people, projects, and systems
- Knowledge files - learned patterns, decisions, and context that builds up over time
- Session logs - a historical record of conversations
The agent doesn't load all of this every session. It searches for what it needs, when it needs it. Before answering a question about a decision we made last month, the agent searches memory. Before starting work on a project, it reads that project's status file.
The key insight here is that searchable is not the same as loaded. You don't need everything in context - you need the right thing in context at the right moment.
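Because everything is plain text, "searchable" doesn't require any retrieval infrastructure. The article doesn't prescribe a specific search tool, so this is an illustrative sketch: a case-insensitive substring scan over the memory directory, returning only the matching lines so the agent pulls the right thing into context without loading whole files:

```python
import re
from pathlib import Path

def search_memory(query: str, root: Path = Path("memory"), max_hits: int = 5):
    """Scan Tier 2 markdown files for lines matching the query.

    Returns (file, line_number, line) tuples, with newer files first
    (date-named daily logs sort chronologically, so reverse order
    surfaces recent entries). Function name and signature are my own.
    """
    hits = []
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    for path in sorted(root.rglob("*.md"), reverse=True):
        for i, line in enumerate(path.read_text().splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), i, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits
```

In practice the agent would run something like this before answering questions about past decisions, then read only the files the hits point at.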
Tier 3: Archive (Cold Storage)
Old information that might matter someday but definitely doesn't matter today. Resolved issues, superseded decisions, old drafts. It's there if you need it, but it never loads automatically.
Daily Logs: The Foundation of Everything
The daily log is the simplest and most important piece of the whole system.
Every significant thing that happens gets appended to memory/daily/YYYY-MM-DD.md. Decisions made. Tasks completed. Problems encountered. Context the agent will need later.
These logs are append-only and immutable. You don't edit them. You don't clean them up. They're a raw record of what happened, in order, as it happened.
This matters because it gives you a ground truth. The agent's other memory files can be wrong - MEMORY.md might be stale, a project status file might not have been updated. But the daily log doesn't lie. It's what was actually written down at the time.
The logs are also how the rest of the memory system gets fed.
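The append-only discipline is easy to enforce in code: give the agent one write path that only ever appends, and no edit or delete path at all. A minimal sketch, with the date-named file layout from above (the bullet-per-entry format is my assumption):

```python
from datetime import date
from pathlib import Path

def log_event(text: str, root: Path = Path("memory/daily")) -> Path:
    """Append one entry to today's log.

    Deliberately append-only: there is no function to edit or remove
    past entries, so the log stays a trustworthy record.
    """
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{date.today().isoformat()}.md"
    with path.open("a") as f:
        f.write(f"- {text}\n")
    return path
```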
The Dream Cycle: Nightly Consolidation
Here's where it gets interesting.
Every night, a process I call the Dream Cycle runs. It goes through the day's raw logs and distills them into structured knowledge. Key decisions get captured. Important context gets surfaced. MEMORY.md gets pruned and updated. The session briefing for the next morning gets written.
I'm not going to give you the full spec on how this works - that's one of the things I'm keeping close. But the concept is simple: raw daily chaos goes in, structured knowledge comes out.
The result is that every morning, the agent wakes up with an accurate, current picture of where things stand. It doesn't have to search through weeks of logs to figure out what's relevant. The Dream Cycle already did that work.
Think of it like this: the daily log is the inbox, and the Dream Cycle is the filing system that processes the inbox every night.
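The full Dream Cycle isn't spelled out here, but one mechanical piece is stated outright: MEMORY.md has a hard 50-line cap, and overflow gets moved elsewhere rather than deleted. A sketch of just that pruning step - the overflow destination, and the choice to keep the newest lines (assuming entries are appended chronologically), are my own assumptions:

```python
from pathlib import Path

MEMORY_LIMIT = 50  # the hard cap on MEMORY.md from Tier 1

def prune_memory(memory: Path, overflow: Path, limit: int = MEMORY_LIMIT) -> int:
    """Enforce the line cap on MEMORY.md.

    Lines beyond the cap (oldest first, assuming chronological
    appends) move to an overflow file instead of being deleted:
    nothing is lost, it just stops loading every session.
    Returns the number of lines moved.
    """
    lines = memory.read_text().splitlines()
    if len(lines) <= limit:
        return 0
    spill, keep = lines[:-limit], lines[-limit:]
    with overflow.open("a") as f:
        f.write("\n".join(spill) + "\n")
    memory.write_text("\n".join(keep) + "\n")
    return len(spill)
```

Run nightly, a step like this is what keeps the hot tier mechanically small instead of relying on discipline alone.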
Why Files Beat Vector Databases
I get this question a lot. Why not use embeddings? Why not use a proper vector store with semantic search?
Here's why files work better for this use case:
Predictability. I know exactly what's in context at any moment. With embeddings, you're trusting a similarity function to surface the right information. Sometimes it does. Sometimes it retrieves something technically similar but contextually irrelevant. Files give you explicit control.
Debuggability. When something goes wrong - when the agent makes a bad decision or misses something important - I can open the file and see exactly what it knew. I can trace the problem. You can't do that with an embedding store.
Structure matches usage. SOUL.md isn't a blob of text to search through - it's a structured document designed to be read from top to bottom, every session. The agent's identity doesn't need to be retrieved; it needs to be internalized. Files support that in a way that vector search doesn't.
No infrastructure overhead. No database to maintain, no embeddings to regenerate, no API calls to an external service. Just files. Fast, cheap, reliable.
The tradeoff is that you have to design the structure yourself. You can't just dump information in and expect the retrieval to sort it out. But that design work is also the work of making the system actually useful.
What This Looks Like in Practice
My main agent, Oscar, starts every session by loading the hot tier. He knows who he is (SOUL.md). He knows his operating rules (AGENTS.md). He knows what's happening right now (MEMORY.md and session-briefing.md).
Before working on any specific project, he reads that project's status file. Before answering questions about past decisions, he searches memory. Before responding to anything sensitive, he checks the relevant context.
The result is an agent that genuinely has continuity. He remembers that we had a conversation two weeks ago about a specific project and what we decided. He knows that a particular approach was tried and didn't work. He knows the current state of every active project without me having to brief him every session.
That's not magic. That's just a well-designed file system and the discipline to maintain it.
Getting Started
If you want to build something like this, start small. You don't need the full three-tier system on day one.
Start with three files:
- An identity file (SOUL.md) - who is this agent, what does it value, how does it operate
- A context file (MEMORY.md) - what matters right now, kept ruthlessly short
- A daily log - append to it every session, don't clean it up
That alone will give you dramatically better continuity than starting from scratch every time.
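If it helps to make that concrete, here's one way to scaffold the starter layout. The three files match the list above; the stub headings inside each file are placeholders of my own, not a prescribed schema:

```python
from datetime import date
from pathlib import Path

def init_agent_memory(root: Path) -> None:
    """Create the minimal starter layout: identity file, context
    file, and a daily log directory. Existing files are left alone
    so re-running is safe."""
    (root / "memory" / "daily").mkdir(parents=True, exist_ok=True)
    stubs = {
        "SOUL.md": "# Identity\n\nWho this agent is, what it values, how it operates.\n",
        "MEMORY.md": "# Hot context\n\nWhat matters right now. Keep it ruthlessly short.\n",
    }
    for name, stub in stubs.items():
        path = root / name
        if not path.exists():
            path.write_text(stub)
    # Today's log starts empty; sessions append to it and never edit it.
    (root / "memory" / "daily" / f"{date.today().isoformat()}.md").touch()
```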
From there, you can add searchable memory, structured project files, and eventually something like the Dream Cycle as the volume of information grows.
More to Explore
If you haven't read it yet, check out How We Built an 8-Agent AI Team - that post covers the team structure this memory system is built to support.
And if you're looking for skills and tools to build out your own agent setup, Clelp.ai is where I'd start. It's a directory of AI tools, MCP servers, and Claude skills - built specifically for people who are actually building with agents, not just reading about them.
The memory problem is solvable. You just have to treat it like an engineering problem, not a feature request.
Jason Haugh is the founder of the Oscar Sterling Agency and the creator of Clelp.ai.
