brightdata-web-mcp

Reliable web access for MCP agents: scrape data, bypass anti-bot measures, perform structured extraction, and automate browsers.

Introduction

The Bright Data Web MCP server gives AI agents a production-ready interface to the live web. It handles the common obstacles of web scraping (CAPTCHAs, dynamic JavaScript-rendered content, and anti-bot protections) so developers can focus on building agentic workflows. With this MCP server integrated, agents can extract data, search the web in real time, and automate a full browser without managing scraping infrastructure.

  • Advanced anti-bot bypassing: Automatically handles CAPTCHAs and sophisticated fingerprinting defenses.

  • Versatile scraping modes: Convert raw URLs into clean Markdown, retrieve full HTML, or batch-process up to 10 requests simultaneously.

  • AI-powered structured extraction: Use natural language prompts to extract specific data fields (e.g., price, description, stock status) into clean JSON.

  • Comprehensive browser automation: Full browser session control, including clicking, typing, scrolling, and network request monitoring, via ref-based element interaction.

  • Specialized data toolkits: Pre-built extractors for major platforms including Amazon, LinkedIn, Instagram, TikTok, YouTube, Google Maps, and various financial/business aggregators.

  • Scalability: Supports both Rapid (Free) mode for lightweight search tasks and Pro mode for advanced browser interaction and high-volume data scraping.

  • Ensure you have a valid Bright Data API token to enable the Pro/advanced_scraping toolsets.

  • For dynamic JS-heavy websites, prioritize the scraping_browser_* suite to ensure accurate rendering and element interaction.

  • Use batch tools like search_engine_batch and scrape_batch to optimize latency and cost when dealing with multiple data sources.

  • The SSE/HTTP endpoint setup allows for remote operation, making it ideal for distributed agent architectures.

  • Pay attention to group configurations (ecommerce, social, etc.) to optimize your token usage and tool access based on specific project requirements.

  • For local deployment, run the server with npx @brightdata/mcp; it can then be integrated into custom agent frameworks such as CrewAI or smolagents.
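
The setup tips above (API token, Pro mode, tool groups, local launch via npx, and batches of up to 10 requests) can be sketched as two small helpers. This is a minimal sketch under stated assumptions, not the project's documented configuration: the environment variable names API_TOKEN, PRO_MODE, and GROUPS, the "mcpServers" layout, and the helper names build_server_config and chunk_urls are illustrative and should be checked against the Bright Data README and your MCP client's documentation.

```python
import json
from typing import Iterator, List, Optional, Sequence

# The listing above cites "up to 10 requests" per batch call.
MAX_BATCH_SIZE = 10


def build_server_config(
    api_token: str,
    pro_mode: bool = False,
    groups: Optional[Sequence[str]] = None,
) -> dict:
    """Build a client-side entry that launches the server via npx.

    Assumption: the server reads API_TOKEN, PRO_MODE and GROUPS from its
    environment, and the client uses an "mcpServers" style config file.
    """
    if not api_token:
        raise ValueError("a Bright Data API token is required")
    env = {"API_TOKEN": api_token}
    if pro_mode:
        env["PRO_MODE"] = "true"
    if groups:
        # e.g. ["ecommerce", "social"] becomes "ecommerce,social"
        env["GROUPS"] = ",".join(groups)
    return {
        "mcpServers": {
            "brightdata": {
                "command": "npx",
                "args": ["@brightdata/mcp"],
                "env": env,
            }
        }
    }


def chunk_urls(
    urls: Sequence[str], size: int = MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` URLs, one per batch call."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


if __name__ == "__main__":
    config = build_server_config(
        "YOUR_API_TOKEN", pro_mode=True, groups=["ecommerce"]
    )
    print(json.dumps(config, indent=2))

    urls = [f"https://example.com/page/{i}" for i in range(23)]
    print([len(batch) for batch in chunk_urls(urls)])  # [10, 10, 3]
```

Each list yielded by chunk_urls could then be passed as the input to one scrape_batch call, so 23 URLs would need three calls rather than 23 single-page requests.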

Repository Stats

Stars: 34,466
Forks: 5,696
Open Issues: 127
Language: Jupyter Notebook
Default Branch: main
Sync Status: Idle
Last Synced: May 1, 2026, 08:45 AM