Skip to content
AI Lab Notes
Go back

AI Coding Agents: Tools, Workflows, and What Actually Works

The AI coding assistant space has exploded. Beyond GitHub Copilot’s autocomplete, there are now terminal agents that can refactor entire codebases, IDE extensions with autonomous task execution, and workflow patterns that make AI pair programming genuinely productive. This post maps the landscape and shares what actually works in day-to-day development.

Table of contents

Open Table of contents

Two Categories: CLI Agents and IDE Assistants

The first distinction to understand is between terminal-based agents and IDE-integrated assistants. They are complementary, not competing.

CLI agents run in your terminal. You give them a prompt, they read your codebase, edit files, run commands, and iterate until the task is done. They excel at large refactors, multi-file changes, and autonomous work.

IDE assistants live inside your editor. They provide autocomplete, inline suggestions, quick fixes, and visual diffs. They excel at the moment-to-moment coding experience — the next line, the quick refactor, the inline explanation.

The most productive setup uses both: an IDE assistant for real-time coding and a CLI agent for bigger tasks.

CLI Agents: The Landscape

Claude Code

Claude Code is Anthropic’s official terminal agent. It connects to Claude’s frontier models and can read files, edit code, run shell commands, and interact with git.

What it does well:

Tradeoffs:

Best for: Complex reasoning tasks, large refactors, and any work that benefits from the strongest available model.

Aider

Aider is an open-source terminal coding assistant with excellent git integration. It maps your entire codebase, makes changes, and creates clean commits automatically.

pip install aider-chat

# Start with Claude
aider --model claude-3-5-sonnet

# Or with a local model via Ollama
aider --model ollama/gemma3:27b

What it does well:

Tradeoffs:

Best for: Git-disciplined workflows, structured refactoring, and developers who want clean commit history.

OpenCode

OpenCode is an open-source (MIT) CLI coding agent built with model flexibility in mind. It supports 75+ providers including local models through Ollama.

# Install
go install github.com/opencode-ai/opencode@latest

# Configure with Ollama
opencode --provider ollama --model codestral:22b

What it does well:

Tradeoffs:

Best for: Developers who want model flexibility, cost control, or fully offline operation.

Gemini CLI

Gemini CLI is Google’s open-source terminal agent. Its standout feature is access to Gemini’s 1M token context window for free.

What it does well:

Tradeoffs:

Best for: Free exploration of large codebases, understanding unfamiliar projects, and supplementary research.

IDE Assistants

Cursor

Cursor is a VS Code fork with deep AI integration. It offers inline code generation, multi-file editing with visual diffs, background agents, and parallel agent sessions.

Known issue (as of early 2026): Cursor has a persistent terminal hanging bug caused by a race condition during shell integration initialization. Custom shell configurations (Oh My Zsh, Powerlevel10k) are particularly affected. The bug was acknowledged in late 2025 but has not been fixed.

Best for: Visual editing workflows, developers who prefer GUI-driven AI interaction.

Continue

Continue is an open-source VS Code extension that supports any LLM provider, including local models via Ollama. It provides code completion, inline chat, and codebase-aware assistance.

Best for: Privacy-conscious developers, local model users, full customization.

Windsurf

Windsurf is another VS Code fork with an AI “Cascade” agent and autocomplete. Note: Windsurf was acquired by Cognition (makers of Devin) in mid-2025, and its long-term direction is uncertain.

Best for: Developers who want a polished IDE experience at a lower price point than Cursor.

Cline and Roo Code

Cline and Roo Code are VS Code extensions that provide autonomous agent capabilities with approval gates. Roo Code adds multi-role AI teams (security reviewer, performance analyst, architect).

Best for: Developers who want IDE-integrated autonomous agents without switching to a separate terminal.

The Multi-Tool Approach

No single tool is best at everything. Here is a practical allocation:

TaskBest Tool
Complex reasoning, large refactorsClaude Code
Quick edits, autocomplete, visual diffsCursor or Continue
Clean git commits, structured refactoringAider
Model flexibility, local models, zero costOpenCode
Free high-context explorationGemini CLI
CI/CD automation, batch operationsClaude Code headless (-p)

You do not need all of these. Start with one CLI agent (Claude Code if you have a subscription, Aider or OpenCode otherwise) and one IDE assistant (Continue for local models, Cursor for cloud models). Add more as your workflow demands it.

Workflow Patterns That Actually Work

The tools matter less than how you use them. Here are the patterns that deliver the most value:

1. TDD-First Prompting

This is the single highest-leverage pattern for working with AI coding agents. Tests give the AI a binary success signal and prevent hallucination.

The cycle:

  1. Describe what you want with concrete inputs and expected outputs
  2. Have the AI write tests first
  3. Commit the tests
  4. Have the AI write implementation code to pass the tests
  5. Run tests, iterate until green

Weak prompt: “Implement a function that validates email addresses.”

Strong prompt: “Write a validateEmail function. Test cases: user@example.com returns true, 'invalid' returns false, user@.com returns false. Write the tests first, then implement to pass them. Run tests after each change.”

The test-first constraint forces the AI to think about edge cases upfront and gives it an objective way to verify its own work.

2. Writer-Reviewer Separation

Never let the same context both write and review code. This is the AI equivalent of not proofreading your own writing.

With Claude Code, you can use sub-agents for this — define a security-reviewer agent that reviews code for vulnerabilities using a fresh context, separate from the agent that wrote the code.

Cross-model review is even more effective: have one model write and a different model critique. Different models catch different categories of errors.

3. Planner-Worker Architecture

For large tasks, separate planning from execution:

  1. Planner explores the codebase, identifies what needs to change, and creates a task list
  2. Workers execute individual tasks from the list without needing to understand the full picture
  3. Judge reviews completed work and decides whether to continue or iterate

This maps naturally to Claude Code’s Agent Teams feature, where a lead agent coordinates parallel workers.

4. One Feature Per Session

Long sessions degrade quality. The model accumulates stale context, earlier mistakes compound, and the effective instruction-following drops.

Better approach:

If you are working on a multi-day feature, structure it as a series of focused sessions rather than one marathon.

5. Escape Velocity with Context Files

The most underappreciated technique: write a CLAUDE.md (or equivalent) file in your project root that tells the AI agent how to work in your codebase.

Include:

Skip:

A well-maintained context file eliminates repeated explanations and measurably improves output quality. Research from Arize showed a 5-10% accuracy improvement from prompt optimization alone — no architecture changes needed.

Safety and Oversight

Giving an AI agent access to your terminal is powerful and risky. A few non-negotiable practices:

Always Work in Git

Every change the agent makes should be in a git repository. If something goes wrong, git checkout . reverts everything. Commit frequently — after each successful sub-task, not just at the end.

Use Deny Lists

Configure your agent to block reads of sensitive files (.env, SSH keys, credentials) and block dangerous commands (rm -rf /, mkfs, curl | bash). Most agents support this:

{
  "permissions": {
    "deny": [
      "Read(.env)",
      "Read(.env.*)",
      "Read(/home/*/.ssh/*)",
      "Bash(sudo rm -rf *)"
    ]
  }
}

Scope Repository Access

When giving agents GitHub access, use fine-grained personal access tokens scoped to specific repositories with minimal permissions (Contents read/write, Pull requests read/write). Never give an agent your full-access token.

Review Before Pushing

The agent should commit, but you should review before pushing. Use git log --oneline and git diff HEAD~3 to review what the agent did before it reaches the remote.

The Honest Assessment

AI coding agents are genuinely useful in 2026, but they are not magic:

The developers getting the most value are the ones who treat AI agents as powerful but fallible collaborators, not as autonomous replacements. Write clear specifications, maintain strong test suites, review everything, and use the right tool for each task.


Share this post on:

Previous Post
Running Local LLMs on Linux: A Practical Guide
Next Post
Hello World: Welcome to AI Lab Notes