Search Gateway

Dec 2025

Type: api

Code: 19k lines

Files: 107

Active: Nov 2025 — Dec 2025

Overview

Search Gateway is a unified REST API that aggregates 27+ search and content extraction providers behind a single endpoint. It enables applications to query Brave, Tavily, DuckDuckGo, arXiv, GitHub, Reddit, YouTube, Wikipedia, and many more through consistent request/response schemas, with automatic fallback chains when providers fail.

The gateway handles the complexity of managing multiple API keys, rate limits, cost tracking, and caching so consumers can focus on their search logic rather than provider integration. It is designed for AI agents, developer tools, and any application requiring reliable, cost-controlled access to diverse web search and content sources.

Problem

Modern applications often need to search the web, fetch articles, query academic papers, or extract content from URLs. Each provider (Brave, Tavily, DuckDuckGo, etc.) has its own API, rate limits, pricing model, and capabilities. Managing 10+ provider integrations creates a significant maintenance burden:

Different authentication mechanisms and API schemas
Varying rate limits requiring per-provider throttling
No unified fallback when a provider fails or rate-limits
Difficulty tracking costs across pay-per-use services
Redundant caching logic in each integration

Search Gateway solves this by exposing a single API for all providers, with intelligent routing, automatic retries, and comprehensive observability.

Approach

Stack

FastAPI - High-performance async Python framework handling concurrent provider calls efficiently
SQLite - Lightweight persistence for response caching, usage tracking, and idempotency without external dependencies
Redis (optional) - Multi-replica rate limit coordination for horizontal scaling
Playwright - Headless browser extraction for JavaScript-heavy pages that block traditional crawlers
Prometheus + OpenTelemetry - Full observability with metrics export and distributed tracing

Challenges

Provider abstraction - Each of the 27+ providers has unique quirks. Created a BaseAdapter class with standardized retry logic, exponential backoff, and capability declarations. Adapters implement a consistent interface while handling provider-specific transformations internally.
Intelligent fallback routing - Not all providers support all operations. Built a ProviderSelector that maps operations (search:web, search:news, extract:web, search:academic) to capable providers, respecting priority order and circuit breaker states.
Cost tracking accuracy - Providers use different pricing models (per-request, per-credit, tiered plans). Implemented a cost estimation engine that reads provider catalog YAML and calculates real-time spend with tier awareness.
Stale-while-revalidate caching - For high availability, implemented cache modes that serve stale data immediately while refreshing in the background, reducing perceived latency for requests that do not need strictly fresh results.
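
The stale-while-revalidate mode described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the gateway's SQLite-backed cache: the SWRCache class, its ttl and stale_ttl parameters, and the fetch callback are all hypothetical names chosen for the example.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, Tuple


class SWRCache:
    """Hypothetical in-memory cache illustrating stale-while-revalidate."""

    def __init__(self, ttl: float, stale_ttl: float) -> None:
        self.ttl = ttl              # seconds an entry counts as fresh
        self.stale_ttl = stale_ttl  # extra seconds a stale entry may still be served
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._tasks: Dict[str, asyncio.Task] = {}  # in-flight background refreshes

    async def get(self, key: str, fetch: Callable[[], Awaitable[Any]]) -> Any:
        entry = self._store.get(key)
        if entry is not None:
            stored_at, value = entry
            age = time.monotonic() - stored_at
            if age < self.ttl:
                return value  # fresh hit
            if age < self.ttl + self.stale_ttl:
                # Stale hit: answer now, start at most one background refresh.
                if key not in self._tasks:
                    self._tasks[key] = asyncio.create_task(self._refresh(key, fetch))
                return value
        # Miss, or too old to serve: the caller waits for the provider.
        return await self._refresh(key, fetch)

    async def _refresh(self, key: str, fetch: Callable[[], Awaitable[Any]]) -> Any:
        try:
            value = await fetch()
            self._store[key] = (time.monotonic(), value)
            return value
        finally:
            self._tasks.pop(key, None)
```

In this sketch a failed background refresh simply leaves the stale entry in place until it ages out, which is one reasonable choice when availability matters more than freshness.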

Outcomes

The gateway successfully abstracts provider complexity, reducing integration effort from weeks to hours for new applications. Key achievements:

27+ providers integrated with consistent schemas
Sub-second average response times with aggressive caching
Zero-config fallbacks that automatically route around failures
Accurate cost tracking enabling budget controls per client

Learned the importance of defensive coding when dealing with third-party APIs - providers change schemas, rate limits, and behaviors without notice. The circuit breaker pattern proved essential for graceful degradation.
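
For readers unfamiliar with the circuit breaker pattern mentioned above, here is a minimal sketch of the general technique. It is not the gateway's actual implementation: the class name, the failure_threshold and reset_after parameters, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical per-provider breaker: closed, open, then half-open."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # seconds to wait before a trial call
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the cool-down has passed, then allow trial calls.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None  # close the circuit again

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial
```

A caller would check allow() before contacting a provider and report the result with record_success() or record_failure(), so a provider that keeps failing is skipped for a while instead of slowing every request.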

Implementation Notes

Provider Adapter Pattern

Each provider extends BaseAdapter with standardized retry logic:

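from typing import Any, Dict, List

# BaseAdapter, SearchRequestModel, and SearchResult come from the gateway's own modules.
class BraveAdapter(BaseAdapter):
    name = "brave"
    base_url = "https://api.search.brave.com/res/v1"

    def capabilities(self) -> Dict[str, Any]:
        # Declares which operations, filters, and options this provider supports.
        return {
            "ops": ["search:web", "search:news", "ai:grounding"],
            "filters_supported": ["include_domains", "freshness_days"],
            "options_supported": ["max_results", "safesearch"],
        }

    async def search(self, req: SearchRequestModel) -> List[SearchResult]:
        # The shared BaseAdapter helper applies the standardized retry logic.
        response = await self._request_with_retry(
            "GET", f"{self.base_url}/web/search",
            params={"q": req.query, "count": req.options.max_results}
        )
        # Convert the provider-specific payload into the gateway's common schema.
        return self._transform_results(response.json())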
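
The body of _request_with_retry is not shown here. As a rough idea of what retry with exponential backoff looks like, the sketch below wraps any async call. The function name request_with_retry, the RetryableError marker, the default delays, and the use of full jitter are assumptions for illustration rather than the gateway's actual code.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 429s, 5xx)."""


async def request_with_retry(
    call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RetryableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: sleep a random amount up to the ceiling.
            await asyncio.sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

Randomizing the sleep keeps many clients that failed at the same moment from retrying in lockstep against a provider that is already struggling.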

Operation-Based Routing

The selector routes by operation category, not just provider name:

selection = selector.select(
    operation="search:academic",  # Routes to arXiv, Semantic Scholar, OpenAlex
    client_id=x_client_id,
    fallback=True,
)
# Returns prioritized list: ["arxiv", "semantic_scholar", "openalex"]
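
Once the selector has returned a prioritized list, a caller can walk it until one provider answers. The helper below is a hypothetical sketch of that fallback loop. The search_with_fallback name and the adapters mapping of provider name to an async search callable are assumptions, not part of the gateway's documented API.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

SearchFn = Callable[[str], Awaitable[list]]


async def search_with_fallback(
    providers: List[str],
    adapters: Dict[str, SearchFn],
    query: str,
) -> Tuple[str, list]:
    """Try providers in priority order and return the first successful result."""
    errors: Dict[str, Exception] = {}
    for name in providers:
        try:
            return name, await adapters[name](query)
        except Exception as exc:  # any provider failure triggers the next fallback
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {errors}")
```

Returning the provider name alongside the results lets the caller record which provider actually served the request, which matters for usage and cost tracking.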

Rate Limiting with Token Bucket

Per-client, per-provider rate limiting with burst allowance:

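import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity  # tokens per second, burst size
        self.tokens, self.last = capacity, time.monotonic()  # start with a full bucket

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last  # seconds since the previous refill
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False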
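
To make the limit per-client and per-provider, one bucket can be kept for each (client_id, provider) pair. The sketch below is one hypothetical way to do that in a single process. The KeyedRateLimiter name and the injectable clock are assumptions, and it does not cover the optional Redis coordination across replicas.

```python
import time
from typing import Callable, Dict, Tuple


class KeyedRateLimiter:
    """Hypothetical limiter holding one token bucket per (client_id, provider)."""

    def __init__(
        self,
        rate: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock
        # Maps (client_id, provider) to (tokens, last_refill_time).
        self._state: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def allow(self, client_id: str, provider: str) -> bool:
        key = (client_id, provider)
        now = self._clock()
        # Unknown keys start with a full bucket.
        tokens, last = self._state.get(key, (self.capacity, now))
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1
        self._state[key] = (tokens, now)
        return allowed
```

With rate=1 and capacity=2, a client can burst two calls to a provider and then sustain one call per second, while other clients and other providers draw from their own buckets.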

Provider Catalog Configuration

All provider metadata lives in provider-catalog.yaml:

providers:
  brave:
    ops: ["search:web", "search:news", "ai:grounding"]
    limits: { rps: 1, monthly_cap: 2000 }
    pricing_usd:
      "search:web": 0.003
    plans:
      free:
        limits: { rps: 1, monthly_cap: 2000 }
      pro_ai:
        limits: { rps: 50 }
        pricing_usd: { "search:web": 0.009 }
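
A cost estimate can be derived from this catalog with a simple lookup. The sketch below mirrors the YAML above as a Python dict so it runs without a YAML parser. The estimate_cost function is hypothetical, and it assumes that a plan's pricing_usd overrides the provider-level price and that a plan without its own pricing inherits it. The real engine's tier rules may differ.

```python
from typing import Any, Dict, Optional

# Same structure as the provider-catalog.yaml excerpt above.
CATALOG: Dict[str, Any] = {
    "providers": {
        "brave": {
            "pricing_usd": {"search:web": 0.003},
            "plans": {
                "free": {"limits": {"rps": 1, "monthly_cap": 2000}},
                "pro_ai": {
                    "limits": {"rps": 50},
                    "pricing_usd": {"search:web": 0.009},
                },
            },
        }
    }
}


def estimate_cost(
    catalog: Dict[str, Any],
    provider: str,
    operation: str,
    requests: int,
    plan: Optional[str] = None,
) -> float:
    """Estimate spend in USD for a number of requests (assumed override rules)."""
    entry = catalog["providers"][provider]
    prices = dict(entry.get("pricing_usd", {}))
    if plan is not None:
        # Assumption: plan-level prices replace provider-level ones per operation.
        prices.update(entry["plans"][plan].get("pricing_usd", {}))
    # Operations without a listed price are treated as free in this sketch.
    return prices.get(operation, 0.0) * requests
```

Calling estimate_cost(CATALOG, "brave", "search:web", 1000, plan="pro_ai") multiplies the plan's 0.009 per-request price by the request count. Omitting plan uses the provider-level 0.003 instead.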

Adam Bandel