
The 2026 H1 Agent Stack: Models, Harnesses, Runtimes, and AI Workspaces

A concise 2026 H1 landscape of AI agents, coding agents, harnesses, runtimes, browser and sandbox infrastructure, observability, governance, and AI workspaces — with MCPlato positioned as part of the workspace layer.

Published on 2026-05-29

The agent race in 2026 H1 no longer looks like a simple model leaderboard.

Better models still matter. Claude 4, Claude Sonnet 4.5, Claude Opus 4.8, Gemini 2.5 Pro, DeepSeek R1/V3.1, Qwen3-Coder, and Mistral Magistral all pushed the base layer forward in reasoning, coding, context, and tool use [1-8]. But the competitive question has changed:

Who can put those models into reliable work?

That means harnesses, runtimes, browsers, sandboxes, evals, observability, governance, permissions, and user-facing workspaces. The model is the engine. The agent product is the vehicle. The harness and workspace decide whether the vehicle can run inside a real company without losing state, authority, or trust.

The layered 2026 H1 agent stack

A useful way to read the market is as a stack, not a directory of logos.

[Image: A layered 2026 H1 agent stack from foundation models to AI workspace]

Figure 1: The 2026 H1 agent stack is moving upward from model capability into execution, observability, governance, and workspace continuity.

| Layer | What it contributes | Representative examples |
| --- | --- | --- |
| Foundation models | Reasoning, coding, long context, computer/tool use, planning | Claude 4 / Sonnet 4.5 / Opus 4.8, Gemini 2.5 Pro, DeepSeek R1/V3.1, Qwen3-Coder, Mistral Magistral |
| Agent products | Packaged workflows for coding, research, app building, operations, and enterprise processes | Claude Code, OpenAI Codex, GitHub Copilot coding agent, Cursor, Devin, Jules, Replit Agent, Lovable, Bolt.new, Manus, Perplexity Labs |
| Harness / runtime | State, retries, human-in-the-loop, orchestration, memory, structured tool calls | LangGraph/LangChain, LlamaIndex, AutoGen, CrewAI, OpenAI Agents SDK, Vercel AI SDK, Mastra, PydanticAI, Agno, Letta |
| Browser and sandbox infra | Safe execution environments, browser automation, code sandboxes, task isolation | Browserbase, Stagehand, Playwright MCP, E2B, Daytona, Temporal, Arcade, Composio |
| Observability and evals | Traces, cost, latency, regression tests, prompt/tool debugging, production review | LangSmith, Langfuse, Helicone, model and agent benchmarks |
| Enterprise governance | Visibility, access control, policy, agent inventory, auditability, compliance workflows | Microsoft Copilot Studio, Salesforce Agentforce, ServiceNow AI Control Tower, MCP-based integration patterns |
| AI workspace | The user-facing place where multi-step work, files, sessions, artifacts, and decisions persist | MCPlato, Dust, Hebbia, workspace-style agent platforms |

The important point is not that every product must cover every layer. It is that serious agent work now needs all of them somewhere in the system.

Product clusters, not a raw directory

1. Coding agents became the first mass-market agent category

Coding agents are the clearest proof that agents can move beyond chat. Claude Code became generally available alongside Claude 4 and is documented as an agentic coding tool for terminal and development workflows [1, 9]. OpenAI Codex, GitHub Copilot coding agent, Cursor, Devin, Google Jules, and Replit Agent all point in the same direction: developers want agents that can inspect repositories, edit files, run commands, open pull requests, and continue work across local and cloud contexts [10-15].

This cluster is ahead because software work already has useful guardrails: files, diffs, tests, logs, branches, CI, and review. The lesson for the rest of the market is not “everything should be coding.” It is that agents need reviewable artifacts and verification loops.
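A verification loop of this kind can be sketched in a few lines of Python. This is only an illustration: `verification_loop`, `toy_propose`, and `toy_verify` are hypothetical names, the "model" is a hard-coded stand-in, and the "test suite" checks a single property. The point is the shape of the loop: each attempt is kept as a reviewable record, and failed checks feed back into the next proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One reviewable artifact plus the result of checking it."""
    candidate: str
    passed: bool
    feedback: str


@dataclass
class LoopResult:
    accepted: Optional[str]
    attempts: List[Attempt] = field(default_factory=list)


def verification_loop(
    propose: Callable[[str], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> LoopResult:
    """Propose an artifact, verify it, and pass failures into the next try."""
    result = LoopResult(accepted=None)
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose(feedback)
        passed, feedback = verify(candidate)
        result.attempts.append(Attempt(candidate, passed, feedback))
        if passed:
            result.accepted = candidate
            break
    return result


# Toy stand-in for a model: it corrects its output once it sees feedback.
def toy_propose(feedback: str) -> str:
    if feedback:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


# Toy stand-in for a test suite: it checks one property of the candidate.
def toy_verify(candidate: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(candidate, namespace)  # tolerable here only because inputs are fixed toy strings
    ok = namespace["add"](2, 3) == 5
    return ok, ("" if ok else "add(2, 3) should equal 5")
```

Calling `verification_loop(toy_propose, toy_verify)` accepts the second candidate and leaves both attempts in `result.attempts`, so a reviewer can see what was tried and why the first version was rejected.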

2. App builders and general agents turned prompts into workflows

Lovable, Bolt.new, Replit Agent, and Manus are examples of products centered on producing apps, websites, or executable work; Perplexity describes Labs as a creation feature for projects such as reports, dashboards, and lightweight apps [16-19]. OpenAI's developer documentation describes computer-use and agent-building primitives, including a visual browser tool surface, so its agent direction is better read as part of the same workflow shift than as a simple chat feature [20, 21].

These products compress the distance between intent and artifact. Their challenge is the same challenge facing the broader agent market: once the task becomes long-running, multi-step, or externally visible, the product needs state, permissions, rollback, and a clear handoff from generated draft to production asset.

3. Enterprise agents are shifting from adoption to control

Enterprise agent platforms are now talking less like demo tools and more like operating systems for governed automation. Microsoft Copilot Studio emphasizes capabilities for scaling agent adoption [22, 23]. Salesforce Agentforce 3 highlights visibility and control through a Command Center, MCP support, lower latency, and industry actions [24]. ServiceNow positions AI Control Tower as a product for managing the AI lifecycle and governing agents, models, and workflows [25].

Zapier Agents, Lindy, Gumloop, Dust, and Hebbia sit closer to business-team workflow automation and knowledge work [26-30]. They matter because agent adoption is not only an engineering problem. Sales, finance, legal, operations, recruiting, research, and support teams also need agent systems that can use tools without quietly bypassing policy.

4. Frameworks and runtimes became the agent middle layer

LangGraph/LangChain, LangSmith, LlamaIndex, AutoGen, CrewAI, OpenAI Agents SDK, Vercel AI SDK, Mastra, PydanticAI, Agno, and Letta represent the build layer beneath packaged products [31-42].

This layer is where durable state, memory, tool routing, human approval, structured outputs, and multi-agent orchestration become reusable primitives. It is also where many teams discover that “agent” is not one abstraction. A retrieval assistant, a coding worker, a browser operator, a finance analyst, and a customer-service agent need different runtime contracts.

5. Infra and observability became production requirements

Browserbase, Stagehand, Playwright MCP, E2B, Daytona, Temporal, Arcade, and Composio are not peripheral tools. They are part of the agent control plane [43-50].

Agents need browsers because much of the working web still lacks clean APIs. They need sandboxes because code and tools must run in isolated environments. They need durable workflow engines because long tasks fail and resume. They need integration gateways because credentials, permissions, and action scopes should not be improvised inside a prompt.
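The "fail and resume" requirement can be illustrated with a minimal checkpointing sketch. This uses only the Python standard library and a JSON file; it is a stand-in for the idea, not a description of how Temporal or any other workflow engine works. The names `run_with_checkpoints` and `demo` are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], Dict]]


def run_with_checkpoints(steps: List[Step], checkpoint: Path) -> Dict:
    """Run named steps in order, persisting state after each one.

    If the checkpoint file already exists, completed steps are skipped,
    so a crashed run resumes instead of starting over.
    """
    if checkpoint.exists():
        saved = json.loads(checkpoint.read_text())
    else:
        saved = {"done": [], "state": {}}

    for name, fn in steps:
        if name in saved["done"]:
            continue  # completed in an earlier run
        saved["state"] = fn(saved["state"])
        saved["done"].append(name)
        checkpoint.write_text(json.dumps(saved))
    return saved["state"]


def demo() -> Tuple[Dict, List[str]]:
    """Simulate a task whose second step crashes once, then succeeds."""
    calls: List[str] = []
    flaky = {"failed_once": False}

    def fetch(state: Dict) -> Dict:
        calls.append("fetch")
        return {**state, "rows": 3}

    def summarize(state: Dict) -> Dict:
        calls.append("summarize")
        if not flaky["failed_once"]:
            flaky["failed_once"] = True
            raise RuntimeError("simulated crash")
        return {**state, "summary": f"{state['rows']} rows"}

    steps: List[Step] = [("fetch", fetch), ("summarize", summarize)]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "run.json"
        try:
            run_with_checkpoints(steps, path)
        except RuntimeError:
            pass  # the first run dies mid-task
        final = run_with_checkpoints(steps, path)  # the second run resumes
    return final, calls
```

In `demo()`, `fetch` runs only once even though the task is started twice, because its result was checkpointed before the crash. A production engine adds much more (retries, timers, versioning, distributed workers), but the contract is the same: completed work is recorded outside the process that performed it.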

LangSmith, Langfuse, and Helicone show the same maturation from the observability side [32, 51, 52]. If an agent is touching customer data, production systems, or expensive model calls, teams need traces, evals, cost visibility, latency visibility, and regression checks.

Five trends to watch

1. Model-only differentiation is fading into runtime differentiation

The best models are converging on strong coding, tool use, long context, and planning. Anthropic reports Claude 4 coding results and Claude Code availability, while Gemini 2.5 Pro emphasizes coding and long-context capability, DeepSeek V3.1 frames itself as a step toward the agent era, and Qwen3-Coder highlights large-scale code-agent training environments [1, 4, 6, 7].

That makes the runtime more important, not less. When multiple base models can reason well enough, teams choose the stack that can preserve state, call tools safely, evaluate outcomes, and keep humans in control.

2. Observability is becoming the production gate

The question “Did the model answer?” is too weak for agents. Production teams need to know:

  • Which tools were called?
  • What state changed?
  • What evidence supports completion?
  • How much did the run cost?
  • Where did latency appear?
  • Which prompt, model, tool, or environment change caused a regression?

This is why LangSmith, Langfuse, Helicone, benchmark suites, and enterprise command centers are becoming part of the buying discussion. A company cannot govern what it cannot see.
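As a rough illustration, the questions above map onto a small trace record. The sketch below is not the data model of LangSmith, Langfuse, or Helicone; `ToolCall`, `RunTrace`, and `config_diff` are hypothetical names, and the fields are assumptions chosen to mirror the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToolCall:
    tool: str
    latency_ms: float
    cost_usd: float
    state_changes: List[str] = field(default_factory=list)


@dataclass
class RunTrace:
    run_id: str
    config: Dict[str, str]  # e.g. prompt version, model, environment
    calls: List[ToolCall] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)

    def summary(self) -> Dict:
        """Answer the production questions for a single run."""
        slowest = max(self.calls, key=lambda c: c.latency_ms, default=None)
        return {
            "tools_called": [c.tool for c in self.calls],
            "state_changes": [s for c in self.calls for s in c.state_changes],
            "evidence": list(self.evidence),
            "total_cost_usd": round(sum(c.cost_usd for c in self.calls), 6),
            "slowest_tool": slowest.tool if slowest else None,
        }


def config_diff(
    before: RunTrace, after: RunTrace
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """List config keys that differ between a passing run and a regressed run."""
    keys = set(before.config) | set(after.config)
    return {
        k: (before.config.get(k), after.config.get(k))
        for k in sorted(keys)
        if before.config.get(k) != after.config.get(k)
    }
```

Comparing the `config` of a passing run with that of a regressed run narrows the search to whatever changed, whether that is the prompt, the model, a tool, or the environment. That comparison is only possible if every run records its configuration in the first place.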

3. Browser and code sandboxes are becoming first-class infra

Computer-use agents and coding agents need safe operating surfaces. Browserbase and Stagehand focus on browser automation for AI agents; Playwright MCP exposes browser control through MCP; E2B and Daytona focus on isolated execution environments; Temporal frames durable execution for agentic AI workflows [43-47, 53].

This is one of the most important shifts of 2026 H1: the “agent environment” is becoming a product category. The environment is where autonomy becomes either useful or dangerous.

4. Governance and protocols are becoming default expectations

MCP is important because it gives the market a shared language for connecting models to tools and context [54, 55]. But protocols do not remove governance requirements. They make governance more urgent: once tools are easier to connect, teams need clearer policies for who can connect them, what actions are allowed, how credentials are scoped, and how activity is audited.
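One way to picture such a policy layer is a default-deny gate that sits in front of tool calls and records every decision. The sketch below is hypothetical and is not part of the MCP specification or any vendor product; `ToolPolicy`, the role names, and the `tool:action` grant strings are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ToolPolicy:
    # role -> set of "tool:action" strings that the role may invoke
    grants: Dict[str, Set[str]]
    audit_log: List[dict] = field(default_factory=list)

    def authorize(self, role: str, tool: str, action: str) -> bool:
        """Allow a call only if it was explicitly granted, and log every decision."""
        allowed = f"{tool}:{action}" in self.grants.get(role, set())
        self.audit_log.append(
            {"role": role, "tool": tool, "action": action, "allowed": allowed}
        )
        return allowed


policy = ToolPolicy(
    grants={
        "analyst": {"crm:read"},
        "ops": {"crm:read", "crm:update"},
    }
)
```

With this policy, `policy.authorize("analyst", "crm", "update")` returns `False`, and the denial still lands in `policy.audit_log`. Logging denials as well as approvals is the design choice that matters: it is what lets a team audit what an agent attempted, not only what it was allowed to do.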

Salesforce Agentforce, ServiceNow AI Control Tower, and Microsoft Copilot Studio all reflect this enterprise reality [23-25]. Agent adoption now depends on visibility, policy, permissions, and operational ownership, not only prompt quality.

5. Async multi-session workspace is the missing user layer

A single chat thread is a poor container for long work. Real agent work often branches: one session researches, another drafts, another tests, another reviews, another waits for a scheduled follow-up. Users need a place where those workstreams, files, decisions, and artifacts remain inspectable.

This is where MCPlato fits naturally. MCPlato is best understood as an AI workspace layer: an environment for local materials, multiple sessions, background or scheduled work, artifacts, and permissioned, observable execution [56]. It should not be treated as a universal replacement for coding agents, enterprise control towers, or browser infrastructure. Its role is different: helping users organize and supervise AI work that spans documents, research, browser context, office outputs, and asynchronous follow-through.

In other words, MCPlato belongs on the workspace layer of the agent stack: close to the user, close to the materials, and above the lower-level runtime and infra components that make execution possible.

A practical decision framework

[Image: A decision matrix for choosing agent products by autonomy horizon and governance needs]

Figure 2: Agent stack choices should be based on autonomy horizon and governance pressure, not on a single universal ranking.

Use five questions before choosing an agent stack.

| Question | If the answer is “yes,” prioritize |
| --- | --- |
| Will the agent modify code, data, records, or external systems? | Sandbox, permissions, audit logs, review gates, rollback paths |
| Will the task run longer than one prompt or one session? | Durable state, checkpoints, background execution, workspace continuity |
| Will the agent use browsers or execute code? | Browser automation infra, isolated sandboxes, credential boundaries |
| Will multiple teams rely on the output? | Observability, evals, cost tracking, policy, ownership |
| Will users need to supervise many parallel workstreams? | AI workspace, multi-session orchestration, artifacts, summaries, handoff discipline |
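The five questions can also be written down as a small lookup, which makes the checklist easy to reuse in a planning script or review template. In this sketch the question keys (`modifies_systems`, `long_running`, and so on) are hypothetical labels; the priority lists are copied from the table above.

```python
from typing import Dict, List

# Keys are illustrative labels for the five questions; values come from the table.
PRIORITIES: Dict[str, List[str]] = {
    "modifies_systems": [
        "sandbox", "permissions", "audit logs", "review gates", "rollback paths",
    ],
    "long_running": [
        "durable state", "checkpoints", "background execution", "workspace continuity",
    ],
    "uses_browser_or_code": [
        "browser automation infra", "isolated sandboxes", "credential boundaries",
    ],
    "multi_team_output": [
        "observability", "evals", "cost tracking", "policy", "ownership",
    ],
    "parallel_workstreams": [
        "AI workspace", "multi-session orchestration", "artifacts", "summaries",
        "handoff discipline",
    ],
}


def prioritize(answers: Dict[str, bool]) -> List[str]:
    """Return the capabilities to prioritize for every question answered yes."""
    unknown = set(answers) - set(PRIORITIES)
    if unknown:
        raise ValueError(f"unknown question keys: {sorted(unknown)}")
    picked: List[str] = []
    for question, items in PRIORITIES.items():
        if answers.get(question, False):
            for item in items:
                if item not in picked:  # keep order, drop duplicates
                    picked.append(item)
    return picked
```

For example, `prioritize({"long_running": True, "multi_team_output": True})` returns the durable-state and observability items in table order. Unanswered questions are treated as "no", which keeps the output conservative rather than assuming requirements nobody stated.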

A simple mapping helps:

  • Short coding task: start with a coding-native agent such as Claude Code, Codex, Cursor, Jules, Devin, Replit Agent, or GitHub Copilot coding agent.
  • App prototype: consider Lovable, Bolt.new, Replit Agent, or similar builder surfaces, then add review before production use.
  • Business workflow automation: look at Copilot Studio, Agentforce, ServiceNow, Zapier Agents, Lindy, Gumloop, Dust, or Hebbia depending on data, governance, and domain fit.
  • Custom agent product: assemble runtime and infra pieces such as LangGraph, LlamaIndex, CrewAI, OpenAI Agents SDK, Vercel AI SDK, MCP, Browserbase, E2B, Temporal, Composio, Langfuse, Helicone, and LangSmith.
  • Cross-material knowledge work: use an AI workspace pattern, where MCPlato is a relevant example, especially when the work spans local materials, research, artifacts, multiple sessions, and permissioned execution.

Conclusion

The 2026 H1 agent landscape is not a battle between “models” and “products.” It is the emergence of a full stack.

Models provide the reasoning substrate. Agent products package common jobs. Harnesses and runtimes keep work stateful. Browser and sandbox infrastructure make tool use safer. Observability and evals make execution inspectable. Governance makes autonomy acceptable in organizations. AI workspaces give users a place to coordinate long-running work.

The winners will not simply be the teams with the biggest model benchmark number. They will be the teams that can turn model intelligence into reliable, reviewable, permissioned workflows.

References

  1. Anthropic, “Introducing Claude 4,” https://www.anthropic.com/news/claude-4

  2. Anthropic, “Claude Sonnet 4.5,” https://www.anthropic.com/news/claude-sonnet-4-5

  3. Anthropic, “Claude Opus 4.8,” https://www.anthropic.com/news/claude-opus-4-8

  4. Google, “Gemini 2.5 Pro coding performance,” https://developers.googleblog.com/en/gemini-2-5-pro-io-improved-coding-performance/

  5. DeepSeek, “DeepSeek-R1 release,” https://api-docs.deepseek.com/news/news250120

  6. DeepSeek, “DeepSeek-V3.1 release,” https://api-docs.deepseek.com/news/news250821

  7. Qwen, “Qwen3-Coder,” https://qwenlm.github.io/blog/qwen3-coder/

  8. Mistral AI, “Magistral,” https://mistral.ai/news/magistral

  9. Anthropic, “Claude Code overview,” https://code.claude.com/docs/en/overview

  10. OpenAI Codex developer documentation, https://developers.openai.com/codex

  11. GitHub, “GitHub Copilot coding agent in public preview,” https://github.blog/changelog/2025-05-19-github-copilot-coding-agent-in-public-preview/

  12. Cursor changelog, https://cursor.com/changelog

  13. Cognition, “Devin 2,” https://cognition.ai/blog/devin-2

  14. Google, “Jules now available,” https://blog.google/innovation-and-ai/models-and-research/google-labs/jules-now-available/

  15. Replit, “Introducing Agent 3,” https://replit.com/blog/introducing-agent-3-our-most-autonomous-agent-yet

  16. Lovable, https://lovable.dev/

  17. Bolt.new, https://bolt.new/

  18. Manus, https://manus.im/

  19. Perplexity, “Getting started with Labs,” https://www.perplexity.ai/hub/getting-started

  20. OpenAI developer documentation, “Computer use,” https://developers.openai.com/api/docs/guides/tools-computer-use

  21. OpenAI developer documentation, “Agents,” https://developers.openai.com/api/docs/guides/agents

  22. Microsoft Copilot Studio release plan, https://learn.microsoft.com/en-us/power-platform/release-plan/2025wave2/microsoft-copilot-studio/

  23. Microsoft, “6 core capabilities to scale agent adoption in 2026,” https://www.microsoft.com/en-us/microsoft-copilot/blog/copilot-studio/6-core-capabilities-to-scale-agent-adoption-in-2026/

  24. Salesforce, “Salesforce launches Agentforce 3,” https://www.salesforce.com/ap/news/press-releases/2025/06/24/salesforce-launches-agentforce-3-to-solve-the-biggest-blockers-to-scaling-ai-agents-visibility-and-control/

  25. ServiceNow, “AI Control Tower,” https://www.servicenow.com/products/ai-control-tower.html

  26. Zapier, “AI agents survey,” https://zapier.com/blog/ai-agents-survey/

  27. Lindy Agents, https://www.lindy.ai/agents

  28. Gumloop, https://www.gumloop.com/

  29. Dust documentation, “Welcome to Dust,” https://docs.dust.tt/docs/welcome-to-dust

  30. Hebbia product, https://www.hebbia.com/product

  31. LangChain, “LangChain and LangGraph 1.0,” https://www.langchain.com/blog/langchain-langgraph-1dot0

  32. LangSmith platform, https://www.langchain.com/langsmith-platform

  33. LlamaIndex, “Introducing LlamaIndex 0.11,” https://www.llamaindex.ai/blog/introducing-llamaindex-0-11

  34. Microsoft Research, AutoGen, https://www.microsoft.com/en-us/research/project/autogen/

  35. CrewAI, “CrewAI OSS 1.0,” https://blog.crewai.com/crewai-oss-1-0-we-are-going-ga/

  36. OpenAI Agents SDK, https://openai.github.io/openai-agents-python/

  37. Vercel AI SDK documentation, https://ai-sdk.dev/docs/introduction

  38. Vercel, “Agentic infrastructure,” https://vercel.com/blog/agentic-infrastructure

  39. Mastra, https://mastra.ai/

  40. PydanticAI documentation, https://pydantic.dev/docs/ai/

  41. Agno documentation, https://docs.agno.com/introduction

  42. Letta, “Letta v1 agent,” https://www.letta.com/blog/letta-v1-agent

  43. Browserbase for AI, https://www.browserbase.com/industry/ai

  44. Browserbase Stagehand, https://www.browserbase.com/stagehand

  45. Microsoft Playwright MCP, https://github.com/microsoft/playwright-mcp

  46. E2B Enterprise, https://e2b.dev/enterprise

  47. Daytona sandboxes, https://www.daytona.io/docs/en/sandboxes/

  48. Temporal AI solutions, https://temporal.io/solutions/ai

  49. Arcade, https://www.arcade.dev/

  50. Composio, https://composio.dev/

  51. Langfuse documentation, https://langfuse.com/docs

  52. Helicone, https://www.helicone.ai/

  53. Temporal, Agentic AI, https://temporal.io/ai/agentic-ai

  54. Anthropic, “Model Context Protocol,” https://www.anthropic.com/news/model-context-protocol

  55. Model Context Protocol, “2026 MCP Roadmap,” https://blog.modelcontextprotocol.io/posts/2026-mcp-roadmap/

  56. MCPlato, https://mcplato.com/en/