Model Comparisons · August 4, 2025 · 5 min read

Claude Code vs Cursor vs Copilot Workspace: the AI coding war in data

I used all three on the same 20 real coding tasks. Claude Code completed 17. Cursor completed 15. Copilot Workspace completed 11. But completion rate isn't the whole story. I also tracked "time to working code" and "bugs introduced."

I spent three weeks using Claude Code, Cursor, and GitHub Copilot Workspace on the same 20 real coding tasks. Not toy examples. Real tasks from my actual projects.

The completion rate headline is clear. But the full picture is more interesting.

The 20 tasks

| Category | Tasks | Complexity |
|----------|-------|------------|
| Build a new feature | 5 | High |
| Fix a bug | 5 | Medium-High |
| Refactor existing code | 4 | Medium |
| Write tests | 3 | Medium |
| Database migration + code update | 3 | High |

Languages: TypeScript (12 tasks), Python (6 tasks), SQL (2 tasks).

Completion rate

| Tool | Completed | Partial | Failed | Rate |
|------|-----------|---------|--------|------|
| Claude Code | 17 | 2 | 1 | 85% |
| Cursor | 15 | 3 | 2 | 75% |
| Copilot Workspace | 11 | 5 | 4 | 55% |

Sources: My testing, July-August 2025.

Claude Code completed 17 of 20 tasks. Cursor completed 15. Copilot Workspace completed 11.

The gap between Claude Code and Copilot Workspace (85% vs 55%) is large: 30 percentage points. On complex, multi-file tasks, Claude Code's agentic approach (reading the codebase, planning, executing across files) consistently outperformed Copilot's suggestion-based approach.
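For anyone who wants to rerun the comparison on their own task log, here is a minimal sketch of how the rates in the table can be derived from the raw counts. It assumes a partial result counts as not completed, which matches the percentages above; the dictionary layout and function name are illustrative, not part of any tool.

```python
# Outcome counts per tool, copied from the completion-rate table.
results = {
    "Claude Code": {"completed": 17, "partial": 2, "failed": 1},
    "Cursor": {"completed": 15, "partial": 3, "failed": 2},
    "Copilot Workspace": {"completed": 11, "partial": 5, "failed": 4},
}


def completion_rate(counts: dict) -> float:
    """Share of tasks fully completed (assumption: partials do not count)."""
    total = sum(counts.values())
    return counts["completed"] / total


rates = {tool: completion_rate(counts) for tool, counts in results.items()}
for tool, rate in rates.items():
    print(f"{tool}: {rate:.0%}")

# Gap between the best and worst tool, in percentage points.
gap_points = (rates["Claude Code"] - rates["Copilot Workspace"]) * 100
print(f"Gap: {gap_points:.0f} points")
```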

Time to working code

| Task type | Claude Code (avg) | Cursor (avg) | Copilot Workspace (avg) |
|-----------|-------------------|--------------|-------------------------|
| New feature | 18 min | 24 min | 42 min |
| Bug fix | 8 min | 12 min | 22 min |
| Refactor | 14 min | 16 min | 28 min |
| Write tests | 6 min | 9 min | 15 min |
| DB migration | 22 min | 28 min | 45 min |
| Overall avg | 14 min | 18 min | 31 min |

Claude Code is fastest in every category. Its lead is smallest on refactoring (14 min vs 16 min for Cursor) and largest on new features (18 min vs 42 min for Copilot Workspace).

The "time to working code" metric includes iteration. If the first attempt has bugs, the time includes the fix cycle. Claude Code's first attempts tend to be closer to correct, which reduces the iteration time.
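The "Overall avg" row is consistent with a task-weighted mean of the per-category averages. The sketch below is one way to reproduce it, assuming each category is weighted by its task count from the task table; the function and variable names are illustrative. Under that assumption the results are 13.5, 17.75, and 30.6 minutes, which round to the 14, 18, and 31 shown in the table.

```python
# Number of tasks per category, from the task table.
task_counts = {
    "New feature": 5,
    "Bug fix": 5,
    "Refactor": 4,
    "Write tests": 3,
    "DB migration": 3,
}

# Average minutes to working code per category, from the timing table.
avg_minutes = {
    "Claude Code": {"New feature": 18, "Bug fix": 8, "Refactor": 14,
                    "Write tests": 6, "DB migration": 22},
    "Cursor": {"New feature": 24, "Bug fix": 12, "Refactor": 16,
               "Write tests": 9, "DB migration": 28},
    "Copilot Workspace": {"New feature": 42, "Bug fix": 22, "Refactor": 28,
                          "Write tests": 15, "DB migration": 45},
}


def weighted_average(per_category: dict, counts: dict) -> float:
    """Mean minutes per task, weighting each category by its task count."""
    total_minutes = sum(per_category[cat] * n for cat, n in counts.items())
    return total_minutes / sum(counts.values())


overall = {tool: weighted_average(mins, task_counts)
           for tool, mins in avg_minutes.items()}
for tool, minutes in overall.items():
    print(f"{tool}: {minutes:.2f} min per task")
```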

Bugs introduced

| Tool | Tasks with bugs in first attempt | Bugs per task (avg) | Avg severity (1-3) |
|------|----------------------------------|---------------------|--------------------|
| Claude Code | 8 of 20 (40%) | 0.7 | 1.4 (mostly minor) |
| Cursor | 11 of 20 (55%) | 1.1 | 1.6 (minor to moderate) |
| Copilot Workspace | 14 of 20 (70%) | 1.8 | 1.9 (moderate) |

Claude Code introduces the fewest bugs. 40% of tasks had at least one bug in the first attempt, vs 70% for Copilot.

More telling: Claude Code's bugs tend to be minor (missing edge case, wrong variable name). Copilot's bugs tend to be structural (wrong approach, misunderstood the task).
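The two bug columns can be combined into a rough "bugs per affected task" figure. The sketch below assumes that "bugs per task" is averaged over all 20 tasks, including the bug-free ones; the table does not say so explicitly, so treat the output as an illustration of the arithmetic rather than a measured result. The names are hypothetical.

```python
TOTAL_TASKS = 20

# (tasks with a first-attempt bug, average bugs per task) from the bug table.
bug_stats = {
    "Claude Code": (8, 0.7),
    "Cursor": (11, 1.1),
    "Copilot Workspace": (14, 1.8),
}


def implied_totals(tasks_with_bugs: int, bugs_per_task: float,
                   total_tasks: int = TOTAL_TASKS) -> tuple:
    """Return (implied total bugs, bugs per task that had at least one bug).

    ASSUMPTION: bugs_per_task is averaged over every task, not only the
    tasks that had bugs.
    """
    total_bugs = round(bugs_per_task * total_tasks)
    return total_bugs, total_bugs / tasks_with_bugs


implied = {tool: implied_totals(*stats) for tool, stats in bug_stats.items()}
for tool, (total_bugs, per_affected) in implied.items():
    print(f"{tool}: {total_bugs} bugs, {per_affected:.2f} per affected task")
```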

Where each tool excels

| Scenario | Best tool | Why |
|----------|-----------|-----|
| Multi-file feature | Claude Code | Agent reads full codebase, plans across files |
| Quick inline edit | Cursor | Tab-complete in the editor is fast |
| Understanding existing code | Claude Code | Reads the whole project context |
| Writing tests from existing code | Claude Code | Better at understanding what to test |
| Exploring unfamiliar codebase | Cursor | Inline AI chat with file context |
| Simple bug fixes | Cursor | Quick inline suggestions |
| Boilerplate generation | Copilot Workspace | Good at repetitive patterns |

Claude Code's strength is agentic, multi-step coding. It reads your codebase, plans changes, and executes across multiple files. That's where the 30-percentage-point completion lead over Copilot Workspace comes from.

Cursor's strength is interactive, in-editor assistance. For quick edits and exploration, the inline AI is faster than switching to a terminal tool.

Copilot's strength is... name recognition and GitHub integration. On raw capability, it trails.

The UX comparison

| Aspect | Claude Code | Cursor | Copilot Workspace |
|--------|-------------|--------|-------------------|
| Interface | Terminal | IDE (VS Code fork) | Web + IDE |
| Workflow | Agent (command, wait, review) | Interactive (inline, chat) | Plan + Execute |
| Context awareness | Reads full project | Indexes project | Limited to open files |
| Cost | Pay per token (~$5-50/task) | $20/month | $10/month |
| Learning curve | Medium (terminal comfort) | Low (familiar IDE) | Low |

Sources: Product documentation, my experience.

The cost models are different. Claude Code charges per token, so heavy use can cost $50+ per day. Cursor is $20/month flat. Copilot is $10/month.

For occasional coding, Cursor's flat rate is better value. For intensive, all-day coding on complex projects, Claude Code's per-token pricing adds up but the productivity gains likely offset it.
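To make the trade-off concrete, here is a rough sketch that compares a per-task cost against a flat subscription. The prices come from the table above; the workload of 20 tasks per month is a hypothetical number chosen for illustration, and the sketch ignores any value from time saved.

```python
# Prices from the UX comparison table.
CURSOR_MONTHLY = 20.0
COPILOT_MONTHLY = 10.0
CLAUDE_PER_TASK_LOW = 5.0
CLAUDE_PER_TASK_HIGH = 50.0


def per_token_monthly_cost(tasks_per_month: int, cost_per_task: float) -> float:
    """Monthly spend when every task is billed individually."""
    return tasks_per_month * cost_per_task


def break_even_tasks(flat_monthly: float, cost_per_task: float) -> float:
    """Tasks per month at which per-task billing equals the flat fee."""
    return flat_monthly / cost_per_task


# HYPOTHETICAL workload, not a figure from the testing.
tasks_per_month = 20

low = per_token_monthly_cost(tasks_per_month, CLAUDE_PER_TASK_LOW)
high = per_token_monthly_cost(tasks_per_month, CLAUDE_PER_TASK_HIGH)
print(f"Claude Code at {tasks_per_month} tasks/month: ${low:.0f} to ${high:.0f}")
print(f"Cursor flat: ${CURSOR_MONTHLY:.0f}, Copilot flat: ${COPILOT_MONTHLY:.0f}")

# Below this many tasks, per-task billing at the low end is cheaper than Cursor.
cheap_break_even = break_even_tasks(CURSOR_MONTHLY, CLAUDE_PER_TASK_LOW)
print(f"Break-even vs Cursor at $5/task: {cheap_break_even:.0f} tasks/month")
```

Even at the low end of the per-task range, per-token billing overtakes Cursor's flat fee after a handful of tasks per month, so the case for it rests on time saved rather than on price.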

My daily driver

After three weeks, I settled on: Claude Code for complex tasks (new features, multi-file changes) and Cursor for quick edits and exploration.

I stopped using Copilot Workspace. The quality gap is too large to justify, even at $10/month.

The AI coding tool market isn't "one winner." It's multiple tools for different moments. My workflow now involves two tools, and I think that's where most developers will end up.

Twenty tasks. Three tools. Forty-seven hours of testing. I've never had more data about my own coding productivity.


-- dataku
