MCP Catalogs

webclaw

by 0xMassi · 1,155 · Score 55

A Rust-based fast local-first web content extraction tool with MCP server for AI agents.

web-scraping · ai-llm · developer-tools

Forks: 137
Open issues: 0
Last commit: this month
Indexed: 2d ago

Overview

WebClaw is a comprehensive web scraping tool that extracts clean content from websites and transforms it into markdown, JSON, and LLM-ready formats. Built in Rust for performance, it offers both local processing capabilities and a hosted API option. The project provides multiple interfaces including a CLI, REST API, and an MCP server for direct integration with AI agents. Its architecture separates core extraction logic from fetching layers, enabling flexibility for different use cases while maintaining high performance.
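The separation of extraction from fetching described above can be illustrated with a small sketch. This is not WebClaw's actual code (which is written in Rust); the names `extract_text`, `scrape`, and `fake_fetch` are hypothetical, and the extractor is a deliberately simple stand-in built on Python's standard library. It only shows the shape of the design: extraction is a pure function over HTML, and the fetcher is passed in.

```python
from html.parser import HTMLParser
from typing import Callable


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Core extraction: a pure function from HTML to text, with no network access."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def scrape(url: str, fetch: Callable[[str], str]) -> str:
    """The fetching layer is injected, so it can be swapped without touching extraction."""
    return extract_text(fetch(url))


def fake_fetch(url: str) -> str:
    # Hypothetical stand-in fetcher; a real one would perform an HTTP request.
    return "<html><body><h1>Title</h1><script>var x = 1;</script><p>Hello</p></body></html>"


result = scrape("https://example.com", fake_fetch)
```

Because `scrape` only depends on a callable, the same extraction code could sit behind a local HTTP client or a remote rendering service, which is the kind of flexibility the overview attributes to the architecture.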

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you: AI agent web access for clean page context
you: RAG ingestion from documentation sites
you: Competitor monitoring and content analysis
you: Does WebClaw require an API key for basic use?
you: Can WebClaw handle JavaScript-rendered content?

When to choose this

Choose WebClaw when you need reliable, clean web content extraction for AI agents, especially if you prefer local-first processing and want multiple output formats (markdown, JSON, LLM-ready text).

When NOT to choose this

Don't choose WebClaw if you need to render JavaScript-heavy sites locally (rendering requires the cloud API) or if the AGPL-3.0 license does not fit your commercial redistribution plans.

Tools this server exposes

10 tools extracted from the README
  • scrape

    Extract one URL as markdown, text, JSON, LLM format, or HTML

  • crawl

    Follow same-origin links and extract discovered pages

  • map

    Discover URLs without extracting every page

  • batch

    Scrape multiple URLs in parallel

  • extract

    Convert page content into structured data

  • summarize

    Summarize a page

  • diff

    Compare page content snapshots

  • brand

    Extract colors, fonts, logos, and metadata

  • search

    Search the web and scrape results

  • research

    Multi-source research workflow
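As a rough illustration of how an MCP client invokes one of the tools listed above, the sketch below builds a JSON-RPC 2.0 `tools/call` message, the request shape defined by the Model Context Protocol. The tool name `scrape` comes from the list; the argument names `url` and `format` are assumptions for illustration and are not taken from WebClaw's published input schema, so check the server's `tools/list` response for the real parameters.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any unique value per request works


def tools_call_request(tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` message as sent by MCP clients."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "url" and "format" are assumed argument names, not WebClaw's confirmed schema.
request = tools_call_request(
    "scrape", {"url": "https://example.com", "format": "markdown"}
)
```

In practice an AI assistant's MCP client generates these messages for you; the sketch is only meant to show what a tool call looks like on the wire.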

Comparable tools

firecrawl · scrape-doctor · perplexity-parse · readability-api

Installation

Installation Options

**Agent Setup (Recommended)**

npx create-webclaw

**Homebrew**

brew tap 0xMassi/webclaw
brew install webclaw

**Cargo**

cargo install --git https://github.com/0xMassi/webclaw.git webclaw-mcp
cargo install --git https://github.com/0xMassi/webclaw.git webclaw-cli

**Claude Desktop Configuration**

{
  "mcpServers": {
    "webclaw": {
      "command": "~/.webclaw/webclaw-mcp"
    }
  }
}
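If you already have other MCP servers configured, the entry above needs to be merged into the existing `mcpServers` object rather than replacing it. The following is a minimal sketch of that merge, with hypothetical helper names (`webclaw_entry`, `merge_server`) and a made-up `other` server standing in for an existing configuration. It also expands the leading `~` to an absolute path, on the assumption that a client may launch the command without a shell; the source does not say whether Claude Desktop expands `~` itself.

```python
import json
from pathlib import Path


def webclaw_entry(binary: str = "~/.webclaw/webclaw-mcp") -> dict:
    """Return the server entry with a leading ~ expanded to an absolute path."""
    return {"command": str(Path(binary).expanduser())}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with one MCP server added; other servers are kept."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = entry
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "/usr/bin/other-server"}}}
updated = merge_server(existing, "webclaw", webclaw_entry())
print(json.dumps(updated, indent=2))
```

The merge returns a new dictionary and leaves the input untouched, so the result can be inspected before anything is written back to the configuration file.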

FAQ

Does WebClaw require an API key for basic use?
No. The CLI and MCP server work locally without an account for core extraction. An API key is only needed when using the hosted service.

Can WebClaw handle JavaScript-rendered content?
The local version doesn't execute JavaScript, but the hosted API at webclaw.io can render dynamic content when needed.


Last updated · Auto-generated from public README + GitHub signals.