What's New in the Docly AI Agent
A look at the latest updates to Docly's built-in AI agent: a redesigned architecture for lower token cost, real-time usage tracking, multimodal image support, and a smoother editing experience.
The Docly AI agent has received a significant round of updates over the past few months. This post covers the highlights, from an architectural redesign that cuts token costs to quality-of-life improvements that make the agent more reliable and more pleasant to work with.
Redesigned Agent Architecture
The biggest change under the hood is a new round-based architecture with isolated conversation rounds, artifacts, and prompt caching.
Why it matters:
- Lower token usage. Each round carries only the context it needs instead of the entire conversation history. Combined with prompt caching, this significantly reduces the number of tokens sent per request.
- Artifacts. The agent now tracks proposed file changes as structured artifacts that persist across rounds, so the model doesn't have to re-read files it already knows about.
- Prompt caching. Static parts of the system prompt — tool definitions, rules, schema descriptions — are cached and reused across rounds, cutting redundant tokens on every call.
The result is a faster, cheaper agent loop with no loss in quality.
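To make the round-based idea concrete, here is a minimal Python sketch of how a single round's request might be assembled. It is illustrative only and not Docly's actual implementation: `STATIC_PREFIX`, `Artifact`, `build_round_request`, and the `cacheable` flag are hypothetical names, and cost is approximated by counting whitespace-separated words rather than real tokens.

```python
from dataclasses import dataclass

# Hypothetical static prompt parts (tool definitions, rules, schema descriptions).
# They are identical on every call, which is what makes them cacheable.
STATIC_PREFIX = [
    "TOOLS: read_file, replace_text, finish",
    "RULES: propose changes, never write directly",
]


@dataclass
class Artifact:
    """A proposed file change that persists across rounds (hypothetical shape)."""
    path: str
    summary: str


def build_round_request(task: str, artifacts: list[Artifact]) -> list[dict]:
    """Assemble the messages for one isolated round.

    Only the static prefix, the current artifacts, and the round's own task
    are included; earlier conversation turns are deliberately left out.
    """
    messages = [{"text": part, "cacheable": True} for part in STATIC_PREFIX]
    for artifact in artifacts:
        messages.append(
            {"text": f"ARTIFACT {artifact.path}: {artifact.summary}", "cacheable": False}
        )
    messages.append({"text": f"TASK: {task}", "cacheable": False})
    return messages


def uncached_words(messages: list[dict]) -> int:
    """Rough cost proxy: words in parts that cannot be served from the cache."""
    return sum(len(m["text"].split()) for m in messages if not m["cacheable"])


artifacts = [Artifact("docs/intro.docly", "added a summary section")]
round_1 = build_round_request("fix the typo in the title", artifacts)
round_2 = build_round_request("rename the file", artifacts)

# The cacheable prefix is identical in both rounds, so it could be reused.
print(round_1[:2] == round_2[:2])
print(uncached_words(round_1), uncached_words(round_2))
```

In this sketch the per-round cost depends only on the artifacts and the current task, not on how many rounds came before, which is the property the list above describes.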
The internal code was also restructured: the main RunAgentLoop orchestrator was decomposed into focused helper methods (CallAIWithRetry, HandleReplaceOnlyResult, ExecuteToolCommands, HandleFinishOrStatus), and eight helper classes were extracted from AgentController into dedicated files under Agent/. This makes the agent codebase easier to maintain and extend.
Workspace Agent Improvements
The workspace agent — the mode where the AI operates on files in your workspace — has gained several practical improvements:
- Jump straight to the agent from any folder. Previously you had to navigate through the old report modal. Now you can open the workspace agent directly from the folder menu.
- Continue button at max iterations. When the agent hits its iteration limit, it now asks you to continue rather than silently stopping. This lets you keep a long-running task going without starting over.
- Pending changes for all file operations. Rename, delete, edit, move, and copy now all work correctly on files the agent has proposed but you haven't approved yet. The agent keeps track of pending changes in memory, so it can chain operations (create a file → rename it → edit it) before you review.
- Better error handling for replacements. When the agent edits a file and the target text isn't found, the error message now tells the agent exactly what went wrong and what to do — no more silent failures where the agent assumes everything worked.
- Multi-occurrence warnings. If a replaceStr matches more than one place in the document, the agent is warned that all occurrences were replaced, so it can correct unintended changes.
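The pending-change and replacement behavior described above can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the agent's real code: `PendingWorkspace`, `ReplaceResult`, and the parameter names `target` and `replacement` are hypothetical, and the messages are invented examples of the kind of feedback the agent might receive.

```python
from dataclasses import dataclass, field


@dataclass
class ReplaceResult:
    ok: bool
    message: str


@dataclass
class PendingWorkspace:
    """Hypothetical in-memory overlay of proposed, not-yet-approved files."""
    pending: dict[str, str] = field(default_factory=dict)

    def create(self, path: str, content: str) -> None:
        self.pending[path] = content

    def rename(self, old_path: str, new_path: str) -> None:
        # Works on a file that exists only as a pending proposal.
        self.pending[new_path] = self.pending.pop(old_path)

    def replace(self, path: str, target: str, replacement: str) -> ReplaceResult:
        if not target:
            return ReplaceResult(False, "The target text must not be empty.")
        content = self.pending[path]
        count = content.count(target)
        if count == 0:
            # Explicit, actionable error instead of a silent no-op.
            return ReplaceResult(
                False,
                f"'{target}' was not found in {path}; re-read the file and "
                "retry with the exact text.",
            )
        self.pending[path] = content.replace(target, replacement)
        if count > 1:
            return ReplaceResult(
                True,
                f"Warning: replaced all {count} occurrences of '{target}' in "
                f"{path}; verify that every change was intended.",
            )
        return ReplaceResult(True, "Replaced 1 occurrence.")


# Chain operations on a file the user has not approved yet:
# create, then rename, then edit.
ws = PendingWorkspace()
ws.create("notes.txt", "draft draft final")
ws.rename("notes.txt", "report.txt")
missing = ws.replace("report.txt", "summary", "overview")
multi = ws.replace("report.txt", "draft", "revised")
print(missing.message)
print(multi.message)
```

Returning a result object rather than raising keeps both the failure and the warning visible to the caller, so the agent can react to either one in its next step.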
AI Usage Tracking
You can now see what the AI costs in real time:
- Live token counter in the agent header — updates as the agent works, so you always know where you stand.
- AI Usage dashboard in CloudPortal — monthly breakdown per model, per user. Accessible from the agent header via a direct link.
- Monthly cost limits — set a spending cap per workspace to prevent surprises.
- CloudAdmin overview — administrators get a global, push-based live view of AI usage across all controllers.
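As a rough illustration of how per-model, per-user tracking and a monthly cap could fit together, consider the Python sketch below. Everything in it is an assumption made for the example: the `UsageTracker` class, its method names, the model names, and the prices are hypothetical and do not reflect Docly's actual data model or pricing.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Hypothetical per-workspace tracker; prices and limit are made-up inputs."""
    monthly_limit: float  # spending cap in an arbitrary currency unit
    price_per_1k_tokens: dict[str, float]
    # (month, model, user) -> tokens used
    tokens: dict[tuple[str, str, str], int] = field(
        default_factory=lambda: defaultdict(int)
    )

    def record(self, month: str, model: str, user: str, token_count: int) -> None:
        self.tokens[(month, model, user)] += token_count

    def breakdown(self, month: str) -> dict[tuple[str, str], int]:
        """Tokens per (model, user) for one month, as a dashboard might show."""
        return {
            (model, user): count
            for (m, model, user), count in self.tokens.items()
            if m == month
        }

    def monthly_cost(self, month: str) -> float:
        return sum(
            count / 1000 * self.price_per_1k_tokens[model]
            for (model, _user), count in self.breakdown(month).items()
        )

    def can_spend(self, month: str) -> bool:
        """False once the month's cost has reached the cap."""
        return self.monthly_cost(month) < self.monthly_limit


tracker = UsageTracker(
    monthly_limit=1.0,
    price_per_1k_tokens={"model-a": 0.5, "model-b": 0.25},
)
tracker.record("2025-01", "model-a", "alice", 1000)
tracker.record("2025-01", "model-b", "bob", 2000)
tracker.record("2025-02", "model-a", "alice", 1000)

print(tracker.breakdown("2025-01"))
print(tracker.monthly_cost("2025-01"), tracker.can_spend("2025-01"))
print(tracker.monthly_cost("2025-02"), tracker.can_spend("2025-02"))
```

Keying usage by month, model, and user means the live counter, the monthly breakdown, and the cap check can all be derived from the same records.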
Multimodal Image Support
The agent can now see images. When you attach an image to your message, the agent receives it as part of the conversation context and can reason about it — useful for discussing layouts, debugging visual issues, or describing what a screenshot shows.
Embedded file data in .docly documents is automatically stripped when the agent reads files, keeping the context window lean while still letting you work with visual content interactively.
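The stripping step might look something like the Python sketch below. Note the assumption: the post does not describe how .docly documents store embedded files, so this example supposes they appear as base64 data URIs. The `strip_embedded_data` function and its placeholder text are hypothetical.

```python
import re

# Assumption: embedded files appear as base64 data URIs. The real .docly
# format may store them differently.
DATA_URI = re.compile(r"data:([\w.+-]+/[\w.+-]+);base64,[A-Za-z0-9+/=]+")


def strip_embedded_data(text: str) -> str:
    """Replace each embedded payload with a short placeholder naming its MIME type."""
    return DATA_URI.sub(lambda m: f"[embedded {m.group(1)} removed]", text)


doc = 'Logo: <img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUg=="> end'
print(strip_embedded_data(doc))
```

Leaving a placeholder rather than deleting the payload outright lets the model know that an image exists at that position without paying for its encoded bytes.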
UX Polish
A collection of smaller fixes that add up to a smoother experience:
- Thinking bubble with elapsed time — see how long the agent has been working instead of staring at a spinner.
- Model selector remembers your choice — no more re-selecting after navigation or page refresh.
- Session isolation — thinking bubbles and state no longer leak between chat sessions.
- Overflow fix — long unbroken strings (URLs, base64 data) no longer blow out the chat layout.
- Attachments restored on session reload — re-opening a previous session now shows attachments correctly instead of raw text.
- Old/new name display for file operations — rename and move operations clearly show both the original and new path in the file list.
What's Next
We're continuing to refine the agent — better context management, smarter tool use, and broader language model support are all in progress. If you're using the Docly AI agent today, these updates are already live in your workspace.
Have feedback or questions? Reach out to us at support@docly.no.