mineru-tianshu

by magicyuan876 · 634 · Score 51

Enterprise-grade AI preprocessing platform with MCP protocol for document, image, audio, and video processing.

Tags: ai-llm, developer-tools, productivity
Forks: 88
Open issues: 12
Last commit: 1 mo ago
Indexed: 2d ago

Overview

Tianshu (MinerU) is a comprehensive AI preprocessing platform that converts unstructured data into AI-usable structured formats. It provides document, image, audio, and video processing capabilities with GPU acceleration and MCP protocol integration. The platform features a Vue3 frontend and FastAPI backend with Docker deployment support, making it suitable for enterprise environments.

Try asking AI

After installing, here are 5 things you can ask your AI assistant:

you: Parse this document and convert it to Markdown or JSON.
you: Preprocess these multimodal files for a RAG application.
you: Which tools does this server expose to my AI assistant over MCP?
you: What file formats are supported?
you: How does MCP integration work?

When to choose this

Choose this platform for enterprise document processing pipelines requiring multimodal data preprocessing and MCP integration, especially when working with scientific or professional documents needing high-fidelity conversion.

When NOT to choose this

Avoid this platform if you need lightweight document processing without GPU acceleration or prefer a simpler deployment model. It is likely overkill for basic document conversion.

Tools this server exposes

4 tools extracted from the README
  • parse_document

    Parse documents (PDF, Word, Excel, etc.) into Markdown/JSON format

  • get_task_status

    Check the status of a document processing task

  • list_tasks

    List recent document processing tasks

  • get_queue_stats

    Get statistics about the document processing queue
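
To illustrate how an assistant might invoke the tools listed above, here is a minimal sketch of the MCP `tools/call` requests (JSON-RPC 2.0) a client could send. The tool names come from the list above; the argument names `file_path` and `task_id` are assumptions made for illustration, not taken from the project's README. A real client would discover the actual input schemas by calling `tools/list` first.

```python
import json
from itertools import count
from typing import Any, Optional

# Monotonic JSON-RPC request ids for this illustrative client.
_ids = count(1)


def build_tool_call(name: str, arguments: Optional[dict] = None) -> dict[str, Any]:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Argument names below are hypothetical; check the server's tool schemas.
submit = build_tool_call("parse_document", {"file_path": "report.pdf"})
status = build_tool_call("get_task_status", {"task_id": "<id returned by parse_document>"})
stats = build_tool_call("get_queue_stats")

print(json.dumps(submit, indent=2))
```

The submit-then-poll shape (call `parse_document`, then check `get_task_status`) is inferred from the tool descriptions; the exact response fields are not documented here.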

Comparable tools

mineru, paddleocr, llama-parse, unstructured-api, docling

Installation

Docker Deployment (Recommended)

# One-click setup
make setup

# Or use scripts
./scripts/docker-setup.sh    # Linux/Mac
scripts\docker-setup.bat     # Windows

Local Development

cd backend
bash install.sh
python start_all.py --enable-mcp  # Enable MCP

cd frontend
npm install
npm run dev

Claude Desktop Configuration

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "mineru-tianshu": {
      "url": "http://localhost:8002/sse",
      "transport": "sse"
    }
  }
}
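
If claude_desktop_config.json already lists other servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal sketch of such a merge. The helper name `add_mcp_server` is hypothetical, and the demo writes to a temporary directory instead of the real config location, which varies by operating system.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, url: str, transport: str = "sse") -> dict:
    """Merge one server entry into the config file, keeping existing entries."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = {"url": url, "transport": transport}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "example"}}}', encoding="utf-8")
    merged = add_mcp_server(path, "mineru-tianshu", "http://localhost:8002/sse")
    on_disk = json.loads(path.read_text(encoding="utf-8"))

print(json.dumps(merged, indent=2))
```

Back up the real config file before running anything like this against it, and restart Claude Desktop afterwards so the change is picked up.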

FAQ

What file formats are supported?
The platform supports PDF, Word, Excel, PowerPoint, images (JPG/PNG), audio files (MP3/WAV), video files (MP4/MKV), and bioinformatics formats (FASTA/GenBank).
How does MCP integration work?
The platform exposes tools via Model Context Protocol (MCP), allowing AI assistants like Claude Desktop to directly call document parsing services through configured endpoints.


Last updated · Auto-generated from public README + GitHub signals.