gemini-skill
by WJZ-P · ★ 813 · Score 52
MCP server that automates Google Gemini through browser automation for image generation, chat, and image extraction.
Overview
Gemini-Skill is an MCP server that controls Google Gemini's web interface via the Chrome DevTools Protocol (CDP). It provides automated access to Gemini features: AI image generation with HD downloads, multi-turn conversations, reference-image uploads for generation, and extraction of images from chat sessions. The server can run in daemon mode, keeping a browser instance alive in the background so that subsequent requests do not have to relaunch the browser.
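The daemon behavior described above can be pictured as an idle timer that is reset on every request. The following is a minimal sketch, assuming the daemon closes the browser after a fixed period without requests; the DAEMON_TTL_MS setting listed under Configuration suggests something along these lines, but the actual logic inside gemini-skill is not documented here. IdleTracker, touch, and isExpired are hypothetical names chosen for illustration.

```javascript
// Hypothetical sketch of idle tracking for a background browser daemon.
// Not taken from gemini-skill's source; all names are illustrative.
class IdleTracker {
  // ttlMs: how long the daemon may sit idle before it counts as expired.
  // now:   injectable clock (defaults to Date.now) so the logic is testable.
  constructor(ttlMs, now = () => Date.now()) {
    if (!Number.isFinite(ttlMs) || ttlMs <= 0) {
      throw new RangeError("ttlMs must be a positive number");
    }
    this.ttlMs = ttlMs;
    this.now = now;
    this.lastUsed = now();
  }

  // Call on every incoming request to reset the idle period.
  touch() {
    this.lastUsed = this.now();
  }

  // True once at least ttlMs has passed since the last request.
  isExpired() {
    return this.now() - this.lastUsed >= this.ttlMs;
  }
}

// Usage with a fake clock and the 30-minute value from the sample .env.
let fakeTime = 0;
const tracker = new IdleTracker(1800000, () => fakeTime);

fakeTime = 1000000;               // about 16.7 minutes idle
console.log(tracker.isExpired()); // false

tracker.touch();                  // a request arrives and resets the timer
fakeTime += 1800000;              // a full 30 minutes pass with no requests
console.log(tracker.isExpired()); // true
```

Injecting the clock keeps the sketch deterministic; a real daemon would more likely pair this check with a timer that closes the browser once the tracker reports expiry.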
When to choose this
Choose this MCP server when you need AI-powered image generation through Gemini within your AI agent workflow and require persistent browser automation with stealth capabilities.
When NOT to choose this
Avoid this if you need direct API access to Gemini without browser automation, or if you require multi-browser parallel processing, which is not yet supported.
Tools this server exposes
12 tools extracted from the README
- gemini_generate_image(prompt, newSession, referenceImages, fullSize, timeout): Generate an image through Gemini from a prompt
- gemini_new_chat(): Start a new blank conversation with Gemini
- gemini_temp_chat(): Enter temporary conversation mode with Gemini
- gemini_switch_model(model): Switch between different Gemini models
- gemini_send_message(message, timeout): Send a text message to Gemini and wait for a reply
- gemini_upload_images(images): Upload images to Gemini as reference for image generation
- gemini_get_images(): Retrieve all image metadata from the current conversation
- gemini_extract_image(imageUrl): Extract an image as base64 and save it locally
- gemini_download_full_size_image(index): Download the full-size high-resolution version of an image
- gemini_share_latest_image(index, timeout, copyToClipboard, closeDialog): Create a public share link for the latest image
- gemini_get_all_text_responses(): Get all text responses from the current conversation
- gemini_get_latest_text_response(): Get the latest text response from Gemini
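An MCP client invokes any of these tools with a JSON-RPC 2.0 request whose method is "tools/call". The sketch below builds such a request for gemini_generate_image. The envelope follows the MCP specification; the argument types (a string prompt, boolean newSession and fullSize) and the sample prompt are assumptions, since the listing gives only parameter names. buildToolCall is a hypothetical helper, not part of gemini-skill.

```javascript
// Hypothetical helper that builds an MCP "tools/call" request.
// The JSON-RPC envelope follows the MCP specification; the argument
// types passed below are assumptions based on the parameter names.
let nextRequestId = 1;

function buildToolCall(toolName, toolArguments = {}) {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: {
      name: toolName,
      arguments: toolArguments,
    },
  };
}

// Example: request one image in a fresh session (values are illustrative).
const request = buildToolCall("gemini_generate_image", {
  prompt: "A lighthouse at dusk, watercolor style",
  newSession: true,
  fullSize: true,
});

console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client such as Claude Desktop constructs these requests itself; the sketch is mainly useful for understanding what crosses the wire or for hand-testing the server.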
Installation
Prerequisites
- Node.js ≥ 18
- Chrome/Edge/Chromium browser with Google account logged in
Steps
git clone https://github.com/WJZ-P/gemini-skill.git
cd gemini-skill
npm install
Configuration
Create a .env file in the project root with your configuration:
BROWSER_DEBUG_PORT=40821
BROWSER_HEADLESS=false
DAEMON_TTL_MS=1800000
OUTPUT_DIR=./gemini-image
Claude Desktop Configuration
Add to Claude Desktop's claude_desktop_config.json:
{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["<absolute_path_to_gemini-skill>/src/mcp-server.js"]
    }
  }
}
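If claude_desktop_config.json already lists other MCP servers, the "gemini" entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge as a pure function. addGeminiServer is a hypothetical helper, and the path and the "other" server in the usage example are placeholders, not values from the project.

```javascript
// Hypothetical helper: returns a copy of an existing Claude Desktop config
// with the "gemini" server entry added, leaving other servers untouched.
function addGeminiServer(config, serverScriptPath) {
  const base = config ?? {};
  const existingServers = base.mcpServers ?? {};
  return {
    ...base,
    mcpServers: {
      ...existingServers,
      gemini: {
        command: "node",
        args: [serverScriptPath],
      },
    },
  };
}

// Usage: the existing entry and the path below are placeholders.
const currentConfig = {
  mcpServers: {
    other: { command: "npx", args: ["some-other-server"] },
  },
};

const updatedConfig = addGeminiServer(
  currentConfig,
  "/path/to/gemini-skill/src/mcp-server.js" // replace with your absolute path
);

console.log(JSON.stringify(updatedConfig, null, 2));
```

The function does not mutate its input, so the original config can be kept as a backup before the result is written back to disk.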
Last updated · Auto-generated from public README + GitHub signals.