Coding Agents

Intro

Coding agents are software tools that run an LLM inside an action loop to complete engineering tasks end to end: read code, propose a plan, edit files, run commands, inspect failures, and iterate until a stopping condition is met. They shift AI from suggestion mode (autocomplete predicts the next line) to execution mode (the agent delivers a tested change across multiple files). In practice, this improves team throughput only when three conditions hold: the agent follows repository conventions via instruction files, verifies its own changes via build, test, and lint commands, and exposes enough execution detail for humans to trust and correct the result. A concrete example: an agent tasked with adding pagination to the /orders endpoint might read the existing controller, add query parameters, update the repository layer, write an integration test, run dotnet test, fix a failing assertion, and re-run until the suite is green, all without human intervention.
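The second condition above (the agent verifies its own changes) can be sketched as a small verification gate. This is a minimal illustration, not how any particular agent implements it: the names CheckResult, run_checks, and all_green are hypothetical, and the demo commands are placeholders standing in for a repository's real build, test, and lint commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]], cwd: str = ".") -> list[CheckResult]:
    """Run each check command and record whether it exited with status 0."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
        results.append(
            CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
        )
    return results


def all_green(results: list[CheckResult]) -> bool:
    """True only when every check passed."""
    return all(r.passed for r in results)


if __name__ == "__main__":
    # Placeholder commands so the sketch runs anywhere; a real repository
    # would list its own commands here (for example, dotnet test).
    demo_checks = {
        "build": [sys.executable, "-c", "print('build ok')"],
        "test": [sys.executable, "-c", "import sys; sys.exit(1)"],
    }
    for result in run_checks(demo_checks):
        print(result.name, "PASS" if result.passed else "FAIL")
```

Keeping the captured output alongside the pass/fail flag matters because the failure text is what the agent (or a reviewing human) reads to decide the next step.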

A useful distinction in daily engineering:

How the Agent Loop Works

flowchart TD
    U[User prompt] --> P[Plan task and choose actions]
    P --> R[Read repo files and context]
    R --> E[Edit code and configs]
    E --> V[Run checks tests or build]
    V --> D{Pass criteria met}
    D -->|No| P
    D -->|Yes| O[Return result and rationale]

The key mechanism is iterative tool use, not one-shot generation. The model decides what to do next from observed outputs (test failures, lint errors, command logs), then re-plans. This is what separates agents from chat: a chat assistant suggests code you paste; an agent edits the file, runs the test, sees it fail, reads the error, fixes the code, and re-runs. Better agents expose this loop clearly (step-by-step logs, approval gates) so developers can intervene before incorrect edits cascade. The main failure mode is an agent that loops 20 or more times without converging, burning tokens and potentially making the codebase worse with each iteration.
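The loop and its non-convergence failure mode can be sketched in a few lines. This is an illustrative outline under stated assumptions, not any tool's actual implementation: propose_and_apply is a hypothetical stand-in for the model planning and editing files, run_checks is a hypothetical callable that returns an empty string on success or the failure text otherwise, and the budget values are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    converged: bool
    iterations: int
    reason: str
    history: list[str] = field(default_factory=list)


def run_agent_loop(
    propose_and_apply: Callable[[str], None],
    run_checks: Callable[[], str],
    max_iterations: int = 8,
    max_repeats: int = 2,
) -> LoopResult:
    """Iterate plan -> edit -> verify until checks pass or a budget is hit."""
    feedback = ""  # failure text from the previous iteration
    history: list[str] = []
    repeats = 0  # consecutive iterations that reproduced the same failure
    for i in range(1, max_iterations + 1):
        propose_and_apply(feedback)
        failure = run_checks()
        history.append(failure or "pass")
        if not failure:
            return LoopResult(True, i, "checks passed", history)
        # An identical failure twice in a row suggests the agent is stuck.
        repeats = repeats + 1 if failure == feedback else 0
        if repeats >= max_repeats:
            return LoopResult(False, i, "stalled on the same failure", history)
        feedback = failure
    return LoopResult(False, max_iterations, "iteration budget exhausted", history)


if __name__ == "__main__":
    state = {"attempts": 0}

    def fake_edit(feedback: str) -> None:
        state["attempts"] += 1  # pretend each attempt changes the code

    def fake_checks() -> str:
        if state["attempts"] >= 3:
            return ""
        return f"assertion failed (attempt {state['attempts']})"

    print(run_agent_loop(fake_edit, fake_checks))
```

The two stop conditions address different problems: the iteration budget bounds cost, while the repeated-failure check hands control back to a human as soon as the agent stops making progress.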

Major Tools

Claude Code (Anthropic)

Claude Code is a terminal-first agentic environment that can also integrate with IDE workflows. It uses Claude models and executes a tool-use loop around file operations and shell commands. It supports MCP servers for external capabilities and reusable skills, and it offers hooks to run custom automation around agent actions. Project instructions are typically stored in CLAUDE.md; teams that maintain a cross-tool AGENTS.md commonly reference it from there so behavior stays consistent across sessions.

Cursor

Cursor is a VS Code-based IDE with three integrated modes: tab completion, chat, and agent mode. The agent can inspect project files, apply edits, and run commands while preserving editor-native workflows such as navigation and refactoring. Cursor supports multiple model providers and uses rules files in .cursor/rules/ (.mdc with frontmatter), superseding the legacy .cursorrules approach.

GitHub Copilot

GitHub Copilot spans IDE extension workflows, Copilot Chat, CLI support, and newer agent-style experiences in GitHub environments. It is strongest when teams already standardize on GitHub for source control and pull-request operations. Repository-level behavior can be guided with .github/copilot-instructions.md, so generated changes align with team architecture, testing policy, and naming standards.

Cline

Cline is an open-source VS Code extension focused on transparent agentic execution. It supports multiple LLM providers through user-managed credentials and exposes action-by-action behavior so developers can approve or redirect work. Teams often use .clinerules to persist local project guidance.
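The approve-or-redirect pattern described here can be illustrated generically. The sketch below is not Cline's implementation or API: ProposedAction, run_with_approval, and the action kinds are hypothetical names chosen for the example, and the policy of auto-approving reads is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ProposedAction:
    kind: str    # "read", "edit", or "run" (hypothetical action kinds)
    target: str  # file path or shell command


def run_with_approval(
    actions: Iterable[ProposedAction],
    approve: Callable[[ProposedAction], bool],
    execute: Callable[[ProposedAction], None],
    auto_approve_kinds: frozenset[str] = frozenset({"read"}),
) -> list[tuple[ProposedAction, str]]:
    """Execute actions in order, asking for approval unless auto-approved.

    Stops at the first rejection so the developer can redirect the agent.
    """
    log: list[tuple[ProposedAction, str]] = []
    for action in actions:
        if action.kind in auto_approve_kinds or approve(action):
            execute(action)
            log.append((action, "executed"))
        else:
            log.append((action, "rejected"))
            break
    return log


if __name__ == "__main__":
    plan = [
        ProposedAction("read", "src/orders.py"),
        ProposedAction("edit", "src/orders.py"),
        ProposedAction("run", "pytest -q"),
    ]
    # Approve everything except shell commands in this demo.
    for action, status in run_with_approval(
        plan, approve=lambda a: a.kind != "run", execute=lambda a: None
    ):
        print(status, action.kind, action.target)
```

Stopping at the first rejection, rather than skipping the rejected step and continuing, reflects the idea that later actions usually depend on earlier ones.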

Aider

Aider is an open-source terminal coding assistant with a strong git-aware workflow. It is optimized for patch-style iteration in existing repositories and works with many model providers. Configuration can be centralized in .aider.conf.yml, which helps teams keep consistent defaults for model choice, test commands, and editing behavior.

Windsurf (Codeium)

Windsurf is a Codeium IDE centered on agentic development via Cascade and assisted editing via Supercomplete. It combines planning, editing, and conversational interaction in one interface, with project rules typically encoded in .windsurfrules. It is positioned as an integrated IDE workflow rather than a terminal-first agent.

Opencode

Opencode is an open-source coding agent that runs in terminal and extended app/editor contexts. It emphasizes provider flexibility, MCP server integration, and a skills system for repeatable workflows. It uses AGENTS.md for project instructions, which makes it easier to align automation behavior with repository policy.

Amazon Q Developer

Amazon Q Developer provides AI coding assistance in IDE and CLI experiences with an AWS-centric operating model. Beyond generation and chat, it focuses on modernization and transformation workflows for enterprise codebases (for example, migration and refactoring assistance tied to AWS services).
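The instruction and configuration files named in the tool descriptions above can be collected into a small lookup that reports which ones a repository already contains. This is a convenience sketch only: the paths are the ones mentioned in this section, while INSTRUCTION_FILES and find_instruction_files are hypothetical names, and a tool may read additional locations not listed here.

```python
from pathlib import Path

# Tool -> conventional instruction/config paths, as named in this section.
INSTRUCTION_FILES = {
    "Claude Code": ["CLAUDE.md", "AGENTS.md"],
    "Cursor": [".cursor/rules", ".cursorrules"],
    "GitHub Copilot": [".github/copilot-instructions.md"],
    "Cline": [".clinerules"],
    "Aider": [".aider.conf.yml"],
    "Windsurf": [".windsurfrules"],
    "Opencode": ["AGENTS.md"],
}


def find_instruction_files(repo_root: str) -> dict[str, list[str]]:
    """Return, per tool, the listed paths that exist under repo_root."""
    root = Path(repo_root)
    found: dict[str, list[str]] = {}
    for tool, candidates in INSTRUCTION_FILES.items():
        present = [c for c in candidates if (root / c).exists()]
        if present:
            found[tool] = present
    return found


if __name__ == "__main__":
    for tool, paths in find_instruction_files(".").items():
        print(f"{tool}: {', '.join(paths)}")
```

Running it at a repository root gives a quick view of which agents already have project guidance and which would start without any.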

Pitfalls

Tradeoffs

| Decision | Option A | Option B | Practical tradeoff |
| --- | --- | --- | --- |
| Interaction model | Terminal agents | IDE agents | Terminal gives scriptability and explicit command logs; IDE gives lower context-switching and faster interactive editing |
| Product model | Open-source tools | Commercial tools | Open-source gives transparency and provider control; commercial tools give polished UX, managed infra, and enterprise support |
| Model strategy | Single-model stack | Multi-model stack | Single-model simplifies behavior and tuning; multi-model improves task fit and cost optimization but adds configuration complexity |
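The configuration complexity in the model-strategy row becomes concrete once a routing table exists. The sketch below is purely illustrative: the task kinds, model identifiers, and token limits are hypothetical placeholders, not real model names or recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    model: str              # hypothetical model identifier
    max_output_tokens: int  # hypothetical per-task output budget


# Multi-model stack: each task kind maps to its own model and budget.
ROUTES = {
    "autocomplete": ModelChoice("small-fast-model", 256),
    "refactor": ModelChoice("large-reasoning-model", 4096),
}

# Single-model stack: one choice used for every task.
DEFAULT = ModelChoice("general-model", 2048)


def choose_model(task_kind: str, multi_model: bool = True) -> ModelChoice:
    """Pick a model for a task; unknown task kinds fall back to DEFAULT."""
    if not multi_model:
        return DEFAULT
    return ROUTES.get(task_kind, DEFAULT)


if __name__ == "__main__":
    for kind in ("autocomplete", "refactor", "write-docs"):
        print(kind, "->", choose_model(kind).model)
```

Every entry added to the routing table is another pairing to evaluate and maintain, which is the cost the table sets against better task fit.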

Questions

References


What's Next