FastAPI-BitNet
by grctest·★ 38·Score 42
FastAPI-based MCP server for Microsoft's BitNet model with session management, chat, and benchmarking capabilities.
Overview
FastAPI-BitNet provides a REST API, built with FastAPI, for managing and interacting with BitNet model instances. It lets developers programmatically control llama-cli and llama-server processes for automated testing, benchmarking, and interactive chat sessions. The server also integrates with VS Code Copilot via the Model Context Protocol (MCP), so the model can be used from within development workflows.
When to choose this
Choose FastAPI-BitNet if you need an MCP interface for Microsoft's BitNet model with comprehensive session management and benchmarking capabilities, particularly when integrating with VS Code.
When NOT to choose this
Not suitable for production environments requiring high availability or load balancing, as it appears to be a single-instance implementation without built-in redundancy.
Tools this server exposes
9 tools extracted from the README:
- create_session: Start a new llama-cli or llama-server session
- stop_session: Stop a running llama-cli or llama-server session
- check_session_status: Check the status of a running session
- chat_with_session: Send a prompt to a running BitNet session and receive a response
- initialize_multiple_instances: Initialize multiple BitNet instances simultaneously
- shutdown_multiple_instances: Shut down multiple BitNet instances in a single API call
- run_benchmark: Run a benchmark test on a GGUF model
- calculate_perplexity: Calculate perplexity scores for a model on test data
- estimate_server_capacity: Estimate maximum number of BitNet instances the server can handle
Note: Tool names inferred from feature descriptions in the README, as no explicit 'Tools' section was found. The README describes functionality for session management, chat operations, and benchmarking, which were mapped to tool names.
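For illustration, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds such a request body; it does not contact the server. The tool name comes from the inferred list above, while the argument keys (`session_id`, `prompt`) are hypothetical, since this listing does not document the tool schemas. A `tools/list` request against the running server would show the real names and parameters.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects each request to carry an id.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names: the actual schema is not given in this listing.
request = build_tool_call(
    "chat_with_session",
    {"session_id": "demo", "prompt": "Hello, BitNet"},
)
print(json.dumps(request, indent=2))
```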
Installation
- Prerequisites: Docker Desktop, Conda, Python 3.10+
- Set up Python environment:
```bash
conda create -n bitnet python=3.11
conda activate bitnet
```
- Install Huggingface CLI:
```bash
pip install -U "huggingface_hub[cli]"
```
- Download BitNet model:
```bash
huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir app/models/BitNet-b1.58-2B-4T
```
- Run with Docker (recommended):
```bash
docker build -t fastapi_bitnet .
docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
```
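Because `docker run -d` returns before the server inside the container has finished starting, it can help to wait until the published port accepts connections before pointing a client at it. The following is a minimal sketch using only the Python standard library; the function name `wait_for_port` and its default timings are illustrative choices, not part of FastAPI-BitNet. It checks TCP reachability only and says nothing about whether the API itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the port mapping from the `docker run` command above, `wait_for_port("localhost", 8080)` would be called right after starting the container.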
Claude Desktop Configuration
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "fastapi-bitnet": {
      "command": "http",
      "args": ["http://localhost:8080/mcp"]
    }
  }
}
FAQ
- What models are supported?
  - Currently supports Microsoft's BitNet-b1.58-2B-4T model in GGUF format.
- How do I integrate with VS Code?
  - Run the server and configure VS Code Copilot to use 'http://localhost:8080/mcp' as an HTTP MCP server.
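One way to register the server with VS Code is a workspace-level MCP configuration file. The sketch below assumes VS Code's `.vscode/mcp.json` format with a top-level `servers` object whose entries carry `type` and `url`; that layout is an assumption drawn from VS Code's MCP support, not from this listing, so verify it against your VS Code version. The helper name `add_mcp_server` is illustrative.

```python
import json
from pathlib import Path


def add_mcp_server(workspace: Path, name: str, url: str) -> Path:
    """Add or update an HTTP MCP server entry in <workspace>/.vscode/mcp.json."""
    config_path = workspace / ".vscode" / "mcp.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)

    # Preserve any servers that are already configured.
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    # Assumed schema: {"servers": {<name>: {"type": "http", "url": <url>}}}
    config.setdefault("servers", {})[name] = {"type": "http", "url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path
```

Calling `add_mcp_server(Path("."), "fastapi-bitnet", "http://localhost:8080/mcp")` from the workspace root would write the entry for the URL given above. An existing file that contains comments will not parse with `json.loads` and would need to be edited by hand.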
Last updated · Auto-generated from public README + GitHub signals.