Back to Blog
ai-agents
coding-agents
pi-agent
hermes-agent
codex
claude-code
mcplato

Pi, Hermes, Codex, Claude Code, and MCPlato: Which Agent Fits Your Work?

A practical, scenario-based comparison of Pi Agent, Hermes Agent, Codex, Claude Code, and MCPlato across control, workflow fit, long-running tasks, and permission strategy.

Published on 2026-05-27

The useful question is not, "Which AI agent is strongest?"

It is: "Which agent fits this job, this environment, and this level of risk?"

Pi Agent, Hermes Agent, Codex, Claude Code, and MCPlato are all called agents, but they are not trying to be the same product. Pi is a minimal terminal coding harness. Hermes is a memory-and-automation-heavy assistant framework. Codex is a managed coding workflow across local and cloud surfaces. Claude Code is a mature agentic coding loop with strong repo workflows. MCPlato is an AI workspace for research, reports, office work, local materials, multi-session execution, and background tasks.

During research, the GitHub API returned 56,110 stars and 6,677 forks for earendil-works/pi, 169,745 stars and 28,286 forks for NousResearch/hermes-agent, and 86,227 stars and 12,601 forks for openai/codex.123 Treat those numbers as repository attention signals, not active-user counts.

This is a practical comparison, not a product ranking.

Product fit at a glance

ProductBest fitWhy people choose itMain trade-off
Pi AgentTerminal-native power users, agent builders, minimal harness usersSmall surface area, direct file/bash tools, interactive and JSON/RPC/SDK modes, session tree and forkingYou own governance, extensions, and long-running workflow discipline
Hermes AgentAlways-on assistants, memory experiments, automations, bot-like gatewaysPersistent memory, self-improvement framing, skill creation, 70+ built-in tools4, subagents and scheduled/background automationsMemory, compression, and learning loops add state complexity and failure modes
CodexCoding workflows across CLI, IDE, desktop, cloud, GitHubStrong sandbox and approval documentation, cloud tasks, MCP, web search, image inputs, exec scriptingPrimarily a coding workflow, not a general office or multi-app workspace
Claude CodeRepo maintenance, refactoring, CI, code review, subagent/skill workflowsMature agentic coding loop across terminal, IDEs, desktop/web, GitHub/GitLab, Slack, MCP, Agent SDKLess hackable than a minimal harness and still needs explicit governance
MCPlatoResearch, reports, office workflows, local materials, multi-app tasks, async AI coworker patternsAI workspace, AI Partner, multi-session orchestration, local-first connected materials, artifacts, scheduled/background tasks, permissioned executionHeavier than a minimal terminal harness; not the fastest path for one-off shell coding

Scenario fit map for Pi, Hermes, Codex, Claude Code, and MCPlatoScenario fit map for Pi, Hermes, Codex, Claude Code, and MCPlato

Figure 1: Think in scenarios and work surfaces, not in a single universal leaderboard.

Why Pi is getting attention

Pi's appeal is easy to understand if you have been frustrated by heavyweight agent products.

The canonical project is earendil-works/pi, with the public website at pi.dev and the npm package @earendil-works/pi-coding-agent reported as version 0.75.5 during research.56 Its positioning is deliberately narrow: a minimal terminal coding harness with default tools such as read, write, edit, and bash, plus optional read-only search/navigation tools.

That minimalism solves several user pain points:

  1. Too many agents hide the control plane. Pi exposes a smaller, more inspectable tool loop.
  2. Power users want composability. Interactive use, print/JSON mode, RPC, and SDK entry points make Pi feel like a building block, not only an app.
  3. Long sessions need branching. Pi's session tree, fork/clone flow, compaction, and JSONL session record match how developers actually explore alternatives.
  4. Some users do not want popups as product philosophy. Pi does not default to built-in MCP, subagents, permission popups, plan mode, or background bash. Those belong to extensions/packages rather than the core.

The weakness is the same as the strength: Pi is not trying to be a managed governance layer. If you need permission policy, background execution recovery, team review, or non-code office workflows out of the box, you will need to build or add that layer yourself.

The five choosing principles

1. Choose by job, not by "strongest agent"

A strong coding agent is not automatically a strong research assistant. A flexible memory agent is not automatically safe for production repositories. A workspace agent is not automatically the fastest terminal tool.

Use the job first:

JobGood default fitWhy
Build or customize a terminal coding harnessPiMinimal core, direct tools, SDK/RPC-friendly shape
Run an always-on personal assistant or bot gatewayHermesMemory, skills, automations, voice/gateway/MCP-oriented surface
Delegate coding work across CLI, IDE, cloud, and GitHubCodexMultiple coding entry points plus documented sandbox and approval modes
Maintain a serious repo with refactors, CI, subagents, and review loopsClaude CodeMature code-agent workflows, permissions/settings, skills, subagents, CI/Slack surfaces
Produce sourced reports, office artifacts, multi-app work, and background researchMCPlatoWorkspace, connected materials, multi-session orchestration, artifacts, scheduled/background tasks

This is where MCPlato fits naturally: not as "the best agent," but as the better default when the work spans documents, browser research, local materials, office outputs, multiple sessions, and asynchronous follow-through. If the task is simply "edit this file from the terminal," Pi or a coding-native tool may be a cleaner fit.

2. Control versus managed workflow is a real trade-off

The market is splitting into two useful extremes.

At one end, Pi gives expert users a compact harness. You can see the pieces, wire your own extensions, and keep the agent close to the shell. That is excellent for agent builders and terminal power users.

At the other end, Codex, Claude Code, and MCPlato provide more managed product surfaces. Codex documents sandbox modes such as read-only, workspace-write, and danger-full-access, plus approval policies such as untrusted, on-request, and never; its default posture is described as workspace-write with network off.7 Claude Code's quickstart states that it asks permission before modifying files and its settings/permissions documentation gives teams ways to tune behavior.89 MCPlato exposes public workspace concepts such as AI Partner, Desktop AI Engine, connected materials, ClawMode, scheduled/background tasks, decision traces, diary, and four permission levels.1011

Hermes sits in a different place: it offers broad autonomy and extensibility, but the state model is more complex. Its docs emphasize self-improvement, persistent memory, skill creation, CLI/gateway/voice/MCP, background tasks, scheduled automations, and subagents.4 That makes it promising for long-lived assistants, but not automatically safer. Memory and compression issues, including discussions such as issue #33256, are reminders that persistent agent state needs careful review rather than blind trust.12

The best choice depends on whether you want to assemble the control plane or use a product that already gives you one.

3. Long-running work needs checkpoints, recovery, and artifacts

Short coding tasks can survive as a chat. Long-running work cannot.

A long-running agent task should have:

  • a prompt contract;
  • a curated context/environment;
  • permission boundaries;
  • checkpoints;
  • reviewable artifacts;
  • a recovery or continuation path.

Long-task control stack for AI agentsLong-task control stack for AI agents

Figure 2: Long-running agent work is safer when control is layered instead of hidden inside a single chat thread.

Each product approaches this differently:

  • Pi gives useful primitives such as session trees, forks, clones, compaction, and JSONL records. Great for controlled exploration; less complete as a finished operations layer.
  • Hermes aims at durable memory and scheduled/background automations. Powerful for continuity; riskier when memory quality, compression, or self-improvement feedback loops are not inspected.
  • Codex supports local and cloud coding tasks, MCP, web search, image inputs, and scripted execution across its coding surfaces.13
  • Claude Code adds subagents with independent context/tool access, skills, MCP, GitHub Actions/GitLab CI, Slack, and scheduled/routine-oriented workflows in its documentation.141516
  • MCPlato is strongest when long work is not only code: research branches, document drafting, browser/material review, image or office artifact production, and background tasks can live as workspace-level workstreams rather than one overloaded chat.

A practical rule: if the task will last more than one session, require an artifact and a checkpoint plan before letting the agent run far.

4. The best agent is the one that fits your environment

Interfaces matter because they shape mistakes.

Your daily environmentPreferWatch out for
Terminal and scriptsPiAdd your own permission and recovery discipline
Code editor + repo + cloud task queueCodexKeep non-code workflows elsewhere
Terminal/IDE/CI/chatops engineering loopClaude CodeSet repo rules, tool permissions, and review checkpoints
Assistant framework, gateways, voice, memory, automationHermesAudit memory and scheduled behavior carefully
Desktop knowledge work across files, browser, office artifacts, and multiple sessionsMCPlatoUse curated connected materials; do not overuse it for tiny shell-only tasks

This is also the simplest way to avoid tool sprawl. Do not force every job through the newest agent. Put each tool where its interface is already natural.

5. Permission strategy must match risk

The agent with the most autonomy is not always the agent with the best permission model for your task.

A lightweight permission strategy works well:

Risk levelExamplesRecommended policy
LowRead files, summarize docs, search approved materialsAllow with logging
MediumEdit drafts, create reports, run local scriptsAllow in workspace or sandbox, require artifacts
HighDelete, deploy, publish, send external messages, access sensitive systemsRequire explicit confirmation and evidence

Codex's public sandbox and approval docs make this discussion explicit.7 Claude Code's docs emphasize permissions/settings rather than a single sandbox promise.9 Pi's minimal default means permission strategy is often your wrapper's responsibility. Hermes users should be extra cautious with background automations and persistent memory. MCPlato is best used with a workspace-level risk boundary: connect only the materials needed, pick an appropriate permission level, and make the final artifact reviewable before external action.

Product highlights and honest limitations

Pi Agent: minimalism as a feature

Pi is compelling because it refuses to become a full workspace. Its default tool set is small, its session mechanics are developer-friendly, and its multiple entry points make it attractive for people building their own agent workflows.

Choose Pi when you want control, hackability, and terminal-native iteration. Do not choose it expecting polished governance, office workflow coverage, or autonomous background operations out of the box.

Hermes Agent: long-lived assistant energy

Hermes is the most ambitious in memory and self-improvement language. Persistent memory, skill creation, gateways, voice, MCP, subagents, and scheduled/background automations make it attractive if you want an assistant that survives across tasks.4

Choose Hermes when you are comfortable managing stateful autonomy. Avoid treating its learning loop as inherently reliable. Memory is useful only when it is inspectable, correctable, and bounded.

Codex: managed coding across surfaces

Codex is the strongest fit when the unit of work is software engineering and you want one system across desktop, IDE, CLI, cloud/web, and GitHub @codex flows.131718 Its sandbox and approval vocabulary is especially helpful for teams that need to discuss risk concretely.

Choose Codex for coding work with managed execution choices. Do not expect it to replace a general workspace for office documents, research synthesis, or multi-app knowledge work.

Claude Code: mature agentic coding loop

Claude Code is less about being a tiny harness and more about being a full professional coding companion. Its public docs cover terminal use, IDE integrations, desktop/web surfaces, MCP, GitHub Actions/GitLab CI, subagents, skills, settings, Slack, and Agent SDK entry points.14191516

Choose Claude Code for serious repository maintenance and engineering workflows. The limitation is that maturity does not remove the need for governance: teams still need permissions, coding standards, test requirements, and review checkpoints.

MCPlato: workspace-first AI work

MCPlato is not trying to beat Pi at being a tiny terminal harness. Its public positioning is an AI workspace with AI Partners, Desktop AI Engine, async workflows, local-first connected materials, multi-session orchestration, multi-window work, virtual partner/Sprite concepts, artifact discipline, scheduled/background tasks, ClawMode, permissioned observable execution, decision trace, and diary.10

Choose MCPlato when the deliverable is a report, comparison, research brief, office artifact, multi-app workflow, or long-running background task. It is especially useful when work needs several sessions: one for research, one for drafting, one for image generation, one for source cleanup, and a coordinating partner that keeps track of what is done.

The limitation is complexity. If your job is a one-file terminal edit, a minimal harness may feel faster.

A practical selection strategy

Use a small portfolio instead of searching for one universal agent:

  1. Default to Pi for small terminal-native experiments and custom harness building.
  2. Use Codex or Claude Code when the center of gravity is a repository, tests, pull requests, and CI.
  3. Use Hermes for experimental always-on assistants, memory, gateway, and automation scenarios where you can audit state.
  4. Use MCPlato when the work crosses research, local materials, browser context, office artifacts, multiple sessions, or background follow-through.
  5. Escalate permissions only when the artifact is inspectable. Read first, draft second, write third, publish/deploy/send last.

The winning pattern is not maximum autonomy. It is bounded autonomy matched to the job.

Conclusion

Pi's rise makes sense: many technical users want a smaller, more legible harness after dealing with heavier agent products. Hermes shows the appeal and risk of persistent assistant state. Codex and Claude Code show how quickly coding agents are becoming full engineering workflows. MCPlato points at a different category: the AI workspace for knowledge work, artifacts, local materials, and parallel execution.

None is universally best. The right agent is the one whose interface, permission model, and recovery story match the work you are actually doing.

References

Footnotes

  1. Pi canonical GitHub repository, earendil-works/pi. https://github.com/earendil-works/pi

  2. Hermes Agent GitHub repository, NousResearch/hermes-agent. https://github.com/NousResearch/hermes-agent

  3. OpenAI Codex GitHub repository. https://github.com/openai/codex

  4. Hermes Agent documentation. https://hermes-agent.nousresearch.com/docs/ 2 3

  5. Pi official website. https://pi.dev/

  6. npm package @earendil-works/pi-coding-agent. https://www.npmjs.com/package/@earendil-works/pi-coding-agent

  7. OpenAI Codex sandbox documentation. https://developers.openai.com/codex/sandbox 2

  8. Claude Code quickstart documentation. https://code.claude.com/docs/en/quickstart

  9. Claude Code settings documentation. https://code.claude.com/docs/en/settings 2

  10. MCPlato official website. https://mcplato.com/en/ 2

  11. MCPlato pricing information. https://mcplato.com/pricing

  12. Hermes Agent GitHub issue #33256. https://github.com/NousResearch/hermes-agent/issues/33256

  13. OpenAI Codex documentation. https://developers.openai.com/codex 2

  14. Claude Code overview documentation. https://code.claude.com/docs/en/overview 2

  15. Claude Code sub-agents documentation. https://code.claude.com/docs/en/sub-agents 2

  16. Claude Code skills documentation. https://code.claude.com/docs/en/skills 2

  17. OpenAI Codex CLI documentation. https://developers.openai.com/codex/cli

  18. OpenAI Codex IDE documentation. https://developers.openai.com/codex/ide

  19. Claude Code MCP documentation. https://code.claude.com/docs/en/mcp