From 18,000 Tokens to 300: How We Built a Lightweight Browser Skill

December 2025. TeamDay's agent runtime needed browser automation.

The obvious choice: MCP servers. Playwright MCP (~13.7k tokens), Chrome DevTools MCP (~18k tokens). Both battle-tested, well-documented, widely adopted.

Then we read Mario Zechner's article. And we built something different.

The Token Tax Nobody Talks About

When you connect an MCP server to Claude, every tool description loads into context immediately. Chrome DevTools MCP consumes about 18,000 tokens just for its 26 tool definitions - that's 9% of a 200k-token context window, before you've done anything.

Playwright MCP is slightly better at 13,700 tokens (21 tools), but still takes 6.8% of your budget.

The problem compounds:

Each additional MCP server adds thousands more tokens
Tool descriptions stay loaded whether you use them or not
Complex tools need detailed documentation, making them expensive
Agents get confused when they have too many tools to choose from

The math is brutal. Connect 3-4 MCP servers and you've burned 30-40k tokens on tool definitions. That's context you can't use for actual work.
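To make the arithmetic concrete, here is a small sketch that turns tool-definition token counts into a share of the context window. It assumes a 200,000-token window, which is what the 9% and 6.8% figures above imply; `CONTEXT_WINDOW` and `contextShare` are illustrative names, not part of any SDK.

```typescript
// Assumption: a 200,000-token context window (implied by 18,000 tokens = 9%).
const CONTEXT_WINDOW = 200000;

// Percentage of the context window consumed by a given token count.
function contextShare(tokens: number, windowSize: number = CONTEXT_WINDOW): number {
  return (tokens * 100) / windowSize;
}

// Token counts quoted in this article.
const servers = [
  { name: "Chrome DevTools MCP", tokens: 18000 },
  { name: "Playwright MCP", tokens: 13700 },
];

// Running total: what you pay when several servers are connected at once.
let combinedTokens = 0;
for (const server of servers) {
  combinedTokens += server.tokens;
  console.log(`${server.name}: ${server.tokens} tokens = ${contextShare(server.tokens).toFixed(2)}% of context`);
}
console.log(`Combined: ${combinedTokens} tokens = ${contextShare(combinedTokens).toFixed(2)}% of context`);
```

Swapping in your own servers and window size shows how quickly the fixed cost adds up before any work begins.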

What If You Don't Need MCP at All?

Mario Zechner asked this question and showed an answer: simple bash scripts using Puppeteer Core, documented in ~225 tokens.

His insight: Modern LLMs already know how to code. Instead of 26 specialized tools, give them a few primitives and let them compose solutions.

The key advantages:

Token efficiency: ~300 tokens vs ~18,000 tokens
True composability: Output to files, pipe to other tools
Easy to extend: Add new capabilities in minutes
Profile support: Connect to running browser with your logins
Progressive disclosure: Skills only load when needed

We took this approach and built /browser - a lightweight skill for Claude Code.

What We Built: 7 Scripts, Infinite Possibilities

The Core Primitives

# Start browser once (with your Chrome profile for authenticated sessions)
bun browser-start.ts --profile

# Navigate anywhere
bun browser-navigate.ts https://example.com

# Execute JavaScript in page context
bun browser-eval.ts 'document.title'

# Fast screenshots to file
bun browser-screenshot.ts page.webp

# Interactive element picker (user clicks, agent gets selectors)
bun browser-pick.ts --multi

# Extract HTTP-only cookies
bun browser-cookies.ts --domain=example.com

# Clean shutdown
bun browser-stop.ts

That's it. Seven simple scripts. Each does one thing well. The agent composes them for complex workflows.

Why This Works

Connect to running browser instead of launching new instances:

Instant startup (no browser launch overhead)
Preserve authentication (Google, GitHub, LinkedIn, etc.)
User can interact (click elements, log in manually)
Fast iteration (no cleanup between commands)
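Connecting to a running browser usually means talking to Chrome's remote debugging endpoint. The sketch below assumes Chrome was started with `--remote-debugging-port=9222` (the port is an assumption, not something this article specifies) and looks up the WebSocket URL that `puppeteer.connect({ browserWSEndpoint })` accepts. `findDebuggerEndpoint` and the injected `fetchJson` parameter are hypothetical names; the fetcher is injected so the lookup can be exercised without a live browser.

```typescript
// Minimal shape of the JSON Chrome serves at /json/version.
interface VersionInfo {
  Browser: string;
  webSocketDebuggerUrl: string;
}

// Any function that GETs a URL and returns the parsed JSON body.
type JsonFetcher = (url: string) => Promise<unknown>;

// Look up the DevTools WebSocket endpoint of an already-running Chrome.
// Assumption: Chrome was launched with --remote-debugging-port=<port>.
async function findDebuggerEndpoint(fetchJson: JsonFetcher, port: number = 9222): Promise<string> {
  const info = (await fetchJson(`http://127.0.0.1:${port}/json/version`)) as Partial<VersionInfo> | null;
  if (!info || typeof info.webSocketDebuggerUrl !== "string") {
    throw new Error(`No debuggable browser found on port ${port}`);
  }
  return info.webSocketDebuggerUrl;
}

// With a real browser (not executed here):
//   const ws = await findDebuggerEndpoint((url) => fetch(url).then((r) => r.json()));
//   const browser = await puppeteer.connect({ browserWSEndpoint: ws });
```

Because nothing is launched, each script invocation only pays for a single local HTTP lookup and a WebSocket connection.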

Progressive disclosure:

Skill metadata: ~50 tokens ("you have a browser skill")
Skill documentation: ~300 tokens (loaded when needed)
Total overhead: ~350 tokens, roughly 50x less than Chrome DevTools MCP (60x if you count only the documentation)
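One way to picture progressive disclosure is as a two-tier cost model: every installed skill pays for its metadata, and only the skills an agent actually opens pay for their documentation. The sketch below uses the token figures from the list above; the `Skill` shape and `contextOverhead` are hypothetical and do not reflect how Claude Code stores skills internally.

```typescript
interface Skill {
  name: string;
  metadataTokens: number; // always present in context
  docTokens: number;      // present only after the skill is opened
}

// Sum the context cost of a set of installed skills, given which ones are open.
function contextOverhead(skills: Skill[], opened: Set<string>): number {
  let total = 0;
  for (const skill of skills) {
    total += skill.metadataTokens;
    if (opened.has(skill.name)) total += skill.docTokens;
  }
  return total;
}

const browserSkill: Skill = { name: "browser", metadataTokens: 50, docTokens: 300 };

const idle = contextOverhead([browserSkill], new Set<string>());
const inUse = contextOverhead([browserSkill], new Set<string>(["browser"]));
console.log(`idle: ${idle} tokens, in use: ${inUse} tokens`);
```

An MCP server, by contrast, behaves like a skill whose documentation is always open.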

True composability:

# Extract data to file
bun browser-eval.ts 'data' > output.json

# Screenshot and process
bun browser-screenshot.ts page.webp && magick page.webp page.jpg

# Chain navigation and extraction
bun browser-navigate.ts https://site.com && bun browser-eval.ts 'scrape()'

The Real Test: 548 Slovak Professionals

Leaf NGO needed a list of senior Slovak professionals working abroad for a mentor initiative. The data was in Clay.com, behind authentication and API limits.

The challenge:

Must be authenticated to Clay.com
API undocumented (reverse engineer from network)
Preview limited to 100 results per request
Need to segment by country to work around limits
Extract 500+ profiles with LinkedIn URLs, titles, companies, locations

Traditional approach: Build a full scraper, manage sessions, handle pagination, export data. Hours of work.

With the /browser skill: 45 minutes from start to 548 exported professionals.

How It Worked

1. Start browser with profile (preserves Clay login):

bun browser-start.ts --profile
bun browser-navigate.ts https://app.clay.com

2. Discover the API by inspecting network traffic:

bun browser-eval.ts 'performance.getEntriesByType("resource").map(r => r.name)'

Spotted: https://api.clay.com/v3/actions/run-enrichment
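The resource list returned by that one-liner tends to be dominated by scripts, stylesheets, and images, so a small filter helps the API calls stand out. This is a sketch of one possible heuristic, not the filter we actually ran: `findApiCandidates` is a hypothetical helper, and every sample URL except the run-enrichment endpoint is made up for illustration.

```typescript
// File extensions that almost always indicate static assets rather than API calls.
const STATIC_ASSET = /\.(js|mjs|css|png|jpe?g|gif|svg|webp|ico|woff2?|ttf|map)$/i;

// Reduce raw resource URLs to a sorted list of distinct non-asset endpoints.
function findApiCandidates(resourceUrls: string[]): string[] {
  const seen = new Set<string>();
  for (const raw of resourceUrls) {
    if (!/^https?:\/\//i.test(raw)) continue;        // skip data:, blob:, extension URLs
    const withoutQuery = raw.replace(/[?#].*$/, ""); // drop query strings so repeats collapse
    if (STATIC_ASSET.test(withoutQuery)) continue;
    seen.add(withoutQuery);
  }
  return Array.from(seen).sort();
}

// Illustrative input: only the last two entries resemble what we saw on Clay.
const sampleResources = [
  "https://app.clay.com/static/main.js",
  "https://app.clay.com/static/logo.svg",
  "data:image/png;base64,AAAA",
  "https://api.clay.com/v3/actions/run-enrichment?page=1",
  "https://api.clay.com/v3/actions/run-enrichment?page=2",
];
const candidates = findApiCandidates(sampleResources);
console.log(candidates);
```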

3. Extract authentication cookies:

bun browser-cookies.ts --domain=clay.com
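Once the cookies are out of the browser, replaying them from a script mostly means building a `Cookie` header. The sketch below assumes the cookies are available as objects with `name`, `value`, and `domain` fields (the shape Puppeteer reports); the exact output format of browser-cookies.ts is not shown in this article, and `toCookieHeader` is a hypothetical helper.

```typescript
interface BrowserCookie {
  name: string;
  value: string;
  domain: string; // e.g. ".clay.com" or "app.clay.com"
}

// True if a cookie scoped to cookieDomain would be sent to host.
// Simplification: every cookie is treated as a domain cookie, so host-only
// cookies are matched a little more loosely than a real browser would.
function domainMatches(cookieDomain: string, host: string): boolean {
  const bare = cookieDomain.replace(/^\./, "").toLowerCase();
  const h = host.toLowerCase();
  return h === bare || h.endsWith("." + bare);
}

// Build the value of a Cookie request header for the given host.
function toCookieHeader(cookies: BrowserCookie[], host: string): string {
  return cookies
    .filter((c) => domainMatches(c.domain, host))
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}

// Placeholder cookies; real names and values come from the browser session.
const exampleCookies: BrowserCookie[] = [
  { name: "session", value: "abc", domain: ".clay.com" },
  { name: "ui", value: "1", domain: "app.clay.com" },
  { name: "other", value: "x", domain: ".example.com" },
];
const cookieHeader = toCookieHeader(exampleCookies, "api.clay.com");
console.log(cookieHeader);
```

Treat the resulting header like a password: it carries the same access as the logged-in browser session.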

4. Build country-segmented extraction script:

The agent wrote a specialized script (clay-slovak-experts.ts) that:

Uses authenticated session from browser
Searches for Slovak speakers outside Slovakia
Filters by seniority levels (senior, director, VP, C-level)
Segments by country to work around 100-result preview limit
Extracts from 11 countries: US, UK, Germany, Austria, Czech Republic, Netherlands, Switzerland, Ireland, Canada, Australia, France
Handles pagination and rate limiting
Deduplicates by LinkedIn URL
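The segmentation and deduplication steps can be sketched as follows. This is not the generated clay-slovak-experts.ts, which isn't reproduced here: `Profile`, `CountryFetcher`, and `extractByCountry` are hypothetical names, and the fetcher is injected so the control flow can run without Clay access. The idea is that each per-country query stays under the 100-result preview cap, and results are merged on a normalized LinkedIn URL.

```typescript
interface Profile {
  name: string;
  linkedinUrl: string;
  country: string;
}

// Hypothetical fetcher: returns at most `limit` profiles for one country.
type CountryFetcher = (country: string, limit: number) => Promise<Profile[]>;

// Normalize LinkedIn URLs so trivial variants count as the same profile.
function profileKey(url: string): string {
  return url.trim().toLowerCase().replace(/[?#].*$/, "").replace(/\/+$/, "");
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function extractByCountry(
  countries: string[],
  fetchCountry: CountryFetcher,
  options: { limit?: number; delayMs?: number } = {},
): Promise<Profile[]> {
  const limit = options.limit ?? 100;      // preview cap per request
  const delayMs = options.delayMs ?? 3000; // pause between requests
  const unique = new Map<string, Profile>();
  let first = true;
  for (const country of countries) {
    if (!first && delayMs > 0) await sleep(delayMs); // crude rate limiting
    first = false;
    const batch = await fetchCountry(country, limit);
    for (const profile of batch) {
      const key = profileKey(profile.linkedinUrl);
      if (!unique.has(key)) unique.set(key, profile); // first occurrence wins
    }
  }
  return Array.from(unique.values());
}
```

Keying the map on the normalized URL means a person who shows up under two countries is counted once, under whichever segment was fetched first.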

5. Run extraction:

bun clay-slovak-experts.ts --count=500 --seniority=senior,director,vp

Output:

🇸🇰 Slovak Experts Extractor for Leaf NGO
=========================================
Target: 500 senior Slovak professionals working abroad
Seniority levels: senior, director, vp

🔐 Getting authentication cookies...
📊 Checking total available records...
   Found 1,247 Slovak experts abroad with selected seniority

📥 Extracting 500 records in 5 batches...
   Batch 1/5 (offset 0)... ✅ Got 100 records
   ⏳ Waiting 3s before next batch...
   Batch 2/5 (offset 100)... ✅ Got 100 records
   ...
   Batch 5/5 (offset 400)... ✅ Got 100 records

✅ Extracted 500 Slovak experts
   After deduplication: 548 unique profiles
💾 Saved to: slovak-experts.json
📄 CSV saved to: slovak-experts.csv

📈 Summary:
   Top locations:
   - United States: 198
   - United Kingdom: 142
   - Germany: 87
   - Netherlands: 34
   - Switzerland: 28
   - Austria: 21
   - Czech Republic: 18
   - Canada: 11
   - Ireland: 5
   - Australia: 3
   - France: 1

Total time: 45 minutes (including API discovery, script creation, extraction).

The power: The agent used browser automation as a composable primitive. Navigate, inspect, authenticate, extract - each step clean and simple.

Why This Matters

Token Efficiency That Compounds

Approach	Token Overhead	% of Context
Chrome DevTools MCP	~18,000 tokens	9.0%
Playwright MCP	~13,700 tokens	6.8%
Browser Skill	~300 tokens	0.15%

Savings: 60x less context consumption.

That's 17,700 tokens you can use for actual work - documents, code, reasoning.

Speed Through Simplicity

MCP approach:

Launch new browser for each session
Wait for initialization
Run tool through MCP protocol
Parse structured responses

Skill approach:

Connect to running browser (instant)
Execute script directly (no protocol overhead)
Output to files (compose with standard tools)
Keep browser alive between commands

Real-world difference: Extract 548 professionals in 45 minutes vs. hours of custom integration.

Extensibility by Design

Need a new capability? Write a 50-line script.

Example: Interactive element picker (built in 15 minutes):

// browser-pick.ts
// User clicks elements in browser, agent gets selectors

import puppeteer from 'puppeteer-core';

async function pick() {
    const browser = await puppeteer.connect({...});
    // browser.pages() returns a Promise, so await it before indexing
    const page = (await browser.pages())[0];

    // Inject click listener
    await page.evaluate(() => {
        document.addEventListener('click', (e) => {
            const el = e.target;
            // Return selector, text, attributes, etc.
            console.log({
                tag: el.tagName,
                selector: getSelector(el),
                text: el.textContent,
                ...
            });
        });
    });
}

Try building that as an MCP tool. Then try modifying it for your specific use case.
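The picker snippet above calls `getSelector` without showing it. Below is one plausible sketch of such a helper, not the version shipped with the skill. It is written against a minimal `ElementLike` interface (a hypothetical name) that real DOM elements satisfy structurally, so the logic can be checked outside a browser; in page context you would also want to escape ids with `CSS.escape`.

```typescript
// The subset of the DOM Element interface this helper relies on.
interface ElementLike {
  tagName: string;
  id: string;
  parentElement: ElementLike | null;
  children: ArrayLike<ElementLike>;
}

// Build a CSS selector by walking up from el until an id or the root is reached.
function getSelector(el: ElementLike): string {
  const parts: string[] = [];
  let node: ElementLike | null = el;
  while (node) {
    if (node.id) {
      parts.unshift(`#${node.id}`); // ids are assumed unique, so stop climbing
      break;
    }
    const tag = node.tagName.toLowerCase();
    const parent: ElementLike | null = node.parentElement;
    if (!parent) {
      parts.unshift(tag); // reached the root element
      break;
    }
    // Position among siblings with the same tag (1-based, as :nth-of-type expects).
    let index = 0;
    let sameTagCount = 0;
    for (let i = 0; i < parent.children.length; i++) {
      const sibling = parent.children[i];
      if (sibling && sibling.tagName === node.tagName) {
        sameTagCount++;
        if (sibling === node) index = sameTagCount;
      }
    }
    parts.unshift(sameTagCount > 1 ? `${tag}:nth-of-type(${index})` : tag);
    node = parent;
  }
  return parts.join(" > ");
}
```

Anchoring on the nearest id keeps selectors short, at the cost of breaking on pages that reuse ids or generate them per render.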

Profile Support Changes Everything

With your Chrome profile:

Already logged into Google, GitHub, LinkedIn, Clay, etc.
No credential management
No OAuth flows
No session handling
Just navigate and work

For the Clay extraction: We didn't write authentication code. The browser already had the session. The agent just used it.

The Architecture: Skills > Tools

Anthropic explained this in their agent skills talk:

"Code is not just a use case but the universal interface to the digital world."

Skills are organized folders with procedural knowledge. Progressive disclosure means you can give an agent hundreds of skills without drowning context.

MCP provides connectivity. Skills provide expertise.

The browser skill demonstrates this:

Metadata: "You have browser automation capability"
Documentation: 6,131 characters (~300 tokens when loaded)
Scripts: Only executed when needed, not loaded into context
Composable: Outputs files, works with other skills

When to Use MCP vs. Skills

Use MCP servers when:

You need real-time data streaming (databases, APIs)
External service provides official MCP integration
Tool state must persist across requests
Multiple agents need shared access

Use skills when:

You want minimal token overhead
Composability with bash/files matters
You need to customize behavior frequently
Speed of iteration matters more than a standardized protocol

Use both: MCP for connectivity (databases, Slack, Notion). Skills for expertise (browser automation, data processing, workflows).

Try It Yourself

The browser skill is available in TeamDay. Ask an agent to:

"Navigate to example.com and extract all article titles"
"Screenshot our homepage in both light and dark mode"
"Find the selector for the search button on this page"
"Extract the top 50 posts from Hacker News with scores > 100"

The agent will use the browser skill automatically - connect, navigate, extract, output results.

Token overhead: ~300 tokens when loaded. 60x less than Chrome DevTools MCP.

The Core Insight

Modern LLMs are good enough at coding that simple scripts plus documentation beat complex tooling.

You don't need 26 specialized browser tools. You need 7 primitives and an agent that knows how to compose them.

The result: Less context overhead. Faster execution. Easier customization. Real work done in minutes instead of hours.

548 Slovak professionals for Leaf NGO. Extracted in 45 minutes with 300 tokens of overhead.

That's the power of lightweight skills.

Based on Mario Zechner's insight. Built for TeamDay agents. Used in production for real NGO work.
