Microsoft's AI Committee: Copilot Cowork Pits GPT and Claude Against Each Other Before Anything Reaches You

Microsoft's Copilot Cowork is now in Frontier preview — and its Model Council and Critique features run GPT and Claude in parallel, peer-reviewing each other's work before you ever see the output. The multi-model enterprise era has a product now.

Sable

Tool & Practice Writer

The Setup

Microsoft's Copilot Cowork, now available in early access via the Frontier program, is the first enterprise-grade AI product to deploy GPT and Claude not as alternatives — but as collaborators in the same workflow.

That's a meaningful distinction. Most multi-model strategies ask users to pick a model. Cowork bakes both in and lets them work on the same task simultaneously.

What Model Council and Critique Actually Do

Model Council is the simpler feature. It shows you outputs from different models side by side — where they agree, where they diverge, what each brings distinctively. Think of it as a structured A/B for knowledge work: you get two perspectives on the same prompt, in the same interface, without switching tabs.

Critique is more architecturally interesting. One model generates a draft; a second model — from a different Frontier lab — acts as a peer reviewer before the output reaches you. The generation and evaluation functions are explicitly separated.

Microsoft says this has produced measurable results: Researcher (the deep research product built on Cowork's infrastructure) now scores 13.8% higher on the DRACO benchmark — the industry's standard for deep research quality in accuracy, completeness, and objectivity.

The Enterprise Angle

For buyers, the most important thing here isn't the feature set — it's the architectural posture.

Microsoft is explicitly committing to vendor optionality as a core feature. Every model runs inside your Microsoft 365 tenant, inside enterprise data protection boundaries. You don't have to trust OpenAI and Anthropic independently — you trust Microsoft's container. The models compete; your compliance posture doesn't have to.

Capital Group, one of the early design partners, framed it plainly: this isn't about generating content. It's about *taking action* — scheduling, planning, executing workflows. The fact that two competing AI models are doing the reasoning underneath is infrastructure, not a choice users have to manage.

What It Doesn't Do Yet

Cowork is still in Frontier (Microsoft's early access program). That means:

Not generally available — requires Frontier enrollment
Feature parity across model providers is not confirmed
Enterprise customization depth isn't fully documented yet
The DRACO benchmark improvement is self-reported

For IT decision-makers, this is a watch-and-pilot situation, not a rip-and-replace.

Assessment

What works: The multi-model-by-default architecture is genuinely novel for enterprise software. Running Critique before output reaches the user addresses a real hallucination-reduction use case. The M365 integration means deployment doesn't require new infrastructure.

What to watch: Frontier access gates the product. Benchmark claims need independent validation. And Microsoft's ability to maintain competitive balance between OpenAI (in which it's a major investor) and Anthropic will face scrutiny over time.

Bottom line: Multi-model is the right direction for enterprise AI. Microsoft is the first to ship it at this scale. The architecture is sound; the execution is early.

Sable's Rating: 7.5/10 — Structurally smart, commercially credible, operationally premature. Check back at GA.

---

*Source: Microsoft 365 Blog — Copilot Cowork: Now Available in Frontier*

← All Tools