mcp-server-webscan

by bsmi021 · 12 · Score 37

MCP server for web scanning with page fetching, link extraction, crawling, and sitemap generation.

Tags: web-scraping, developer-tools, productivity
Forks: 11
Open issues: 2
Last commit: 10 mo ago
Indexed: 2d ago

Overview

The MCP Webscan Server is a TypeScript implementation that provides tools for fetching, analyzing, and extracting information from web pages. It offers page fetching with Markdown conversion, link extraction with filtering options, recursive site crawling with depth control, broken link checking, URL pattern matching, and XML sitemap generation. The server communicates over the stdio transport, which makes it compatible with MCP clients such as Claude Desktop. The codebase is organized with a clear separation between services, tools, and types.
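
To make "recursive site crawling with depth control" concrete, here is a minimal sketch of a depth-limited, breadth-first crawl. It is not taken from the mcp-server-webscan source: the names `LinkSource`, `crawl`, `site`, and `lookup` are hypothetical, and an in-memory link table stands in for real page fetching so the example runs without network access.

```typescript
// Hypothetical sketch: depth-limited breadth-first crawl.
// LinkSource and crawl are illustrative names, not part of mcp-server-webscan.
type LinkSource = (url: string) => string[];

function crawl(startUrl: string, maxDepth: number, getLinks: LinkSource): string[] {
  // Track visited URLs so cycles (pages linking back to each other) terminate.
  const visited = new Set<string>([startUrl]);
  let frontier: string[] = [startUrl];

  // Each loop iteration follows links one level further from the start page.
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const url of frontier) {
      for (const link of getLinks(url)) {
        if (!visited.has(link)) {
          visited.add(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return Array.from(visited);
}

// In-memory stand-in for a real site (assumed data, for illustration only).
const site: Record<string, string[]> = {
  "https://example.com/": ["https://example.com/a", "https://example.com/b"],
  "https://example.com/a": ["https://example.com/a/deep", "https://example.com/"],
  "https://example.com/b": [],
  "https://example.com/a/deep": ["https://example.com/a/deeper"],
};
const lookup: LinkSource = (url) => site[url] ?? [];

// With maxDepth = 1, only the start page and its direct links are collected.
const pagesAtDepth1 = crawl("https://example.com/", 1, lookup);
```

A depth of 0 returns only the start page in this sketch. Whether the real server counts depth the same way is an assumption.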

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you: Fetch this web page and convert it to Markdown so the content is easier to analyze.
you: Extract the links from this page and check which ones are broken.
you: Crawl this site and generate an XML sitemap that shows its structure.
you: What is the maximum depth for site crawling?
you: Can I filter links by base URL?

When to choose this

Choose this MCP server when you need to analyze web content, perform site audits, or generate sitemaps through AI agents.

When NOT to choose this

Don't choose this if you need real-time monitoring, have strict performance requirements, or require authentication for accessing protected content.

Tools this server exposes

6 tools extracted from the README
  • fetch-page (url: string, selector?: string)

    Fetches a web page and converts it to Markdown

  • extract-links (url: string, baseUrl?: string, limit?: number)

    Extracts all links from a web page with their text

  • crawl-site (url: string, maxDepth?: number)

    Recursively crawls a website up to a specified depth

  • check-links (url: string)

    Checks for broken links on a page

  • find-patterns (url: string, pattern: string)

    Finds URLs matching a specific pattern

  • generate-site-map (url: string, maxDepth?: number, limit?: number)

    Generates a simple XML sitemap by crawling
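
The tool signatures listed above can be written down as TypeScript types. The sketch below does that and builds the JSON-RPC `tools/call` message that an MCP client sends to invoke a tool. The `ToolArgs` map, `ToolCallRequest`, and `buildToolCall` are hypothetical names introduced for illustration; only the tool names and parameters come from the list, and a real client would normally let an MCP SDK construct and send this message.

```typescript
// Argument shapes transcribed from the tool list; the type names are illustrative.
interface ToolArgs {
  "fetch-page": { url: string; selector?: string };
  "extract-links": { url: string; baseUrl?: string; limit?: number };
  "crawl-site": { url: string; maxDepth?: number };
  "check-links": { url: string };
  "find-patterns": { url: string; pattern: string };
  "generate-site-map": { url: string; maxDepth?: number; limit?: number };
}

// Shape of an MCP tool invocation: a JSON-RPC 2.0 request with method "tools/call".
interface ToolCallRequest<K extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: K; arguments: ToolArgs[K] };
}

// Hypothetical helper: the compiler rejects unknown tool names or mismatched arguments.
function buildToolCall<K extends keyof ToolArgs>(
  id: number,
  name: K,
  args: ToolArgs[K],
): ToolCallRequest<K> {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Example: ask the server to crawl a site two levels deep.
const request = buildToolCall(1, "crawl-site", { url: "https://example.com", maxDepth: 2 });

// Serialized form that would be written to the server's stdin.
const wire = JSON.stringify(request);
```

Keying the argument types by tool name means a typo such as `"crawl-sites"`, or a missing `pattern` for `find-patterns`, fails at compile time rather than at the server.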

Comparable tools

mcp-server-web-scraper, browser-mcp, curl-mcp, http-mcp

Installation

Installing via Smithery

To install Webscan for Claude Desktop automatically via Smithery (https://smithery.ai/server/mcp-server-webscan):

npx -y @smithery/cli install mcp-server-webscan --client claude

Manual Installation

# Clone the repository
git clone <repository-url>
cd mcp-server-webscan

# Install dependencies
npm install

# Build the project
npm run build

Claude Desktop Configuration

{
  "mcpServers": {
    "webscan": {
      "command": "node",
      "args": ["path/to/mcp-server-webscan/build/index.js"],
      "env": {
        "NODE_ENV": "development",
        "LOG_LEVEL": "info"
      }
    }
  }
}

FAQ

What is the maximum depth for site crawling?
The default max depth is 2, but it can be configured between 0-5 for the crawl-site and generate-site-map tools.
Can I filter links by base URL?
Yes, the extract-links tool accepts an optional baseUrl parameter to filter links.
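
The two answers above can be illustrated with a small client-side sketch. It assumes that out-of-range depths are clamped into 0-5 and that `baseUrl` filtering is a simple prefix match. Neither behavior is confirmed by the README (the server may reject invalid values or match differently), and `normalizeMaxDepth` and `filterByBaseUrl` are hypothetical helper names.

```typescript
// Values taken from the FAQ: default depth 2, allowed range 0-5.
const DEFAULT_MAX_DEPTH = 2;
const MIN_DEPTH = 0;
const MAX_DEPTH = 5;

// Assumption: clamp rather than reject. The real server may validate differently.
function normalizeMaxDepth(maxDepth?: number): number {
  if (maxDepth === undefined || Number.isNaN(maxDepth)) {
    return DEFAULT_MAX_DEPTH;
  }
  // Drop any fractional part, then clamp into the documented range.
  return Math.min(MAX_DEPTH, Math.max(MIN_DEPTH, Math.trunc(maxDepth)));
}

// Assumption: "filter by base URL" means keeping links that start with baseUrl.
function filterByBaseUrl(links: string[], baseUrl?: string): string[] {
  if (!baseUrl) {
    return links; // no filter requested
  }
  return links.filter((link) => link.startsWith(baseUrl));
}

const sampleLinks = [
  "https://example.com/docs/intro",
  "https://example.com/blog/post-1",
  "https://other.example.org/page",
];

// Keeps only the link under https://example.com/docs/.
const docsLinks = filterByBaseUrl(sampleLinks, "https://example.com/docs/");
```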


Last updated · Auto-generated from public README + GitHub signals.