Build an AI-Powered Competitive Intelligence System on a $10 VPS: Ollama Embeddings + Qdrant + FastAPI

Why Competitive Intelligence Needs AI, Not Just RSS Feeds

When you’re building an international product, you track dozens of competitors: pricing changes, feature releases, blog posts, hiring moves, and user reviews. The traditional approach — RSS feeds, Twitter monitoring, Google Alerts — tells you what happened, but not what it means.

When tracking 50+ competitors simultaneously, manual analysis becomes impossible. Worse, the most valuable insights are buried in long-form technical blog posts, earnings call transcripts, or scattered forum discussions — content that keyword matching simply misses.

An AI competitive intelligence system provides semantic understanding: it can recognize that “Company A launched usage-based API pricing” and “Company B discontinued monthly subscriptions” represent the same competitive shift, even though the wording is completely different.

This guide shows how to build a complete AI competitive intelligence pipeline on a $10/month VPS (RackNerd, Hostinger, or Vultr): automated collection → Ollama local embeddings → Qdrant vector storage → semantic search API → weekly report generation.

FTC Disclosure: We may earn a commission when you buy through our links. This doesn’t affect our testing methodology or recommendations.

System Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Scraper     │────▶│  Ollama      │────▶│  Qdrant      │
│  (RSS/API)   │     │  Embeddings  │     │  Vector DB   │
└──────────────┘     └──────────────┘     └──────┬───────┘
                                                  │
┌──────────────┐     ┌──────────────┐             │
│  Cron Job    │◀────│  FastAPI     │◀────────────┘
│  (Report)    │     │  Query API   │
└──────────────┘     └──────────────┘

Four core modules:

Module	Purpose	Memory	CPU
Scraper	Scheduled competitor page/blog/job listing collection	50MB	0.1 core
Ollama	Local text embeddings (nomic-embed-text model)	256MB	0.3 core
Qdrant	Vector database for storage and similarity search	256MB	0.2 core
FastAPI	Query API + report generation endpoint	128MB	0.1 core

Minimum specs: 2 vCPU, 2GB RAM, 40GB SSD — exactly the standard configuration for most $10/month VPS plans.
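The per-module figures in the table can be summed to see how much of a 2GB machine they occupy. This is a rough sketch using only the estimates from the table; the function names are illustrative, and real usage will vary with load.

```python
# Per-module estimates copied from the table above: (RAM in MB, CPU cores).
MODULES = {
    "Scraper": (50, 0.1),
    "Ollama": (256, 0.3),
    "Qdrant": (256, 0.2),
    "FastAPI": (128, 0.1),
}


def total_budget(modules):
    """Sum the memory (MB) and CPU (cores) estimates across all modules."""
    mem = sum(m for m, _ in modules.values())
    cpu = sum(c for _, c in modules.values())
    return mem, round(cpu, 2)


def headroom_mb(modules, ram_mb):
    """RAM left over for the OS and page cache after the estimated usage."""
    mem, _ = total_budget(modules)
    return ram_mb - mem


mem_mb, cpu_cores = total_budget(MODULES)
print(f"Estimated usage: {mem_mb}MB RAM, {cpu_cores} cores")
print(f"Headroom on a 2GB VPS: {headroom_mb(MODULES, 2048)}MB")
```

The remaining headroom has to cover the operating system, Docker, and any larger model you load later.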

Step 1: Choose a VPS and Deploy the Base Environment

RackNerd (Best Price-to-Performance)

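RackNerd’s AMER-DC2 or EU-DC2 plans start at approximately $9.99/year during promotional periods, providing 1 vCPU, 1GB RAM, 20GB SSD, and 1TB bandwidth. That is enough for embeddings and vector search on a small collection, but it is below the 2 vCPU / 2GB minimum listed above, so choose a larger tier if you also plan to run the report-generation model.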

Purchase via our affiliate link (aff=19978):

RackNerd Yearly Promotional Plans — look for “Yearly Promotional” series

Hostinger (More Stable Performance)

Hostinger’s Business Shared Cloud plan costs $2.99/month (12-month prepay) and provides 4 vCPU, 2GB RAM, and 100GB NVMe. It offers more headroom if you plan to run multiple embedding models simultaneously.

Use our referral code JZ1ZL8465QCG for exclusive discounts.

Vultr (Flexible Scaling)

Vultr’s Cloud Compute 2GB plan costs $6/month and provides 1 vCPU, 2GB RAM, and 50GB SSD. The advantage is hourly billing: you can scale up or down on demand, which is useful for burst workloads such as bulk re-embedding runs.

Base Environment Setup

# System updates
sudo apt update && sudo apt upgrade -y

# Install Docker + Docker Compose
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in so the docker group membership takes effect

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify Ollama is installed and the server is running
ollama --version
curl http://localhost:11434/
# Should return: Ollama is running

Step 2: Deploy Qdrant Vector Database

Qdrant is a Rust-based vector similarity search engine with filtering, persistent storage, and REST/gRPC APIs. Its Docker image is only ~200MB with minimal memory footprint.

# docker-compose.qdrant.yml
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    restart: always
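    ports:
      # Bind to localhost only; clients reach the data through FastAPI
      - "127.0.0.1:6333:6333"
    volumes:
      - qdrant_data:/qdrant/storage
    environment:
      # Qdrant maps nested settings to env vars with double underscores.
      # Remove this line if you do not set QDRANT_API_KEY; if you do set it,
      # send the key in the "api-key" header of every request.
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY:-}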

volumes:
  qdrant_data:

docker compose -f docker-compose.qdrant.yml up -d

# Verify Qdrant health
curl http://localhost:6333/healthz
# Should return: healthz check passed

Create a collection for competitor data:

import requests

COLLECTION_NAME = "competitors"
VECTOR_SIZE = 768  # nomic-embed-text vector dimension

resp = requests.put(
    f"http://localhost:6333/collections/{COLLECTION_NAME}",
    json={
        "vectors": {
            "size": VECTOR_SIZE,
            "distance": "Cosine"
        },
        "hnsw_config": {
            "m": 16,
            "payload_m": 16
        }
    }
)
print(resp.json())

Step 3: Load the Ollama Embedding Model

nomic-embed-text is a lightweight embedding model (a 274MB download) that generates 768-dimensional vectors and performs well on semantic search tasks relative to its size.

# Pull the model (first time takes ~2-3 minutes)
ollama pull nomic-embed-text

# Verify model is available
ollama list

Test embedding generation:

curl http://localhost:11434/api/embed \
  -d '{
    "model": "nomic-embed-text",
    "input": ["Competitor X raised $50M Series B", "Competitor Y launched enterprise pricing tier"]
  }'

The returned embeddings are 768-element float arrays that can be stored directly in Qdrant.
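To see why two differently worded sentences can land close together, it helps to compute the cosine similarity between their vectors, which is the same metric the Qdrant collection above is configured with. The following is a minimal sketch in plain Python; cosine_similarity is an illustrative helper, and the short vectors in the usage note stand in for real 768-element embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity to anything
    return dot / (norm_a * norm_b)
```

With real data you would pass the two arrays from the "embeddings" field of the response, for example cosine_similarity(embeddings[0], embeddings[1]). Scores closer to 1.0 indicate closer meaning.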

Step 4: Build the Data Collection and Embedding Pipeline

This is the core of the system. A single Python script handles three tasks: scrape competitor data → generate embeddings → store in vector database.

# ingest_competitors.py
import uuid
from datetime import datetime, timezone

import requests

OLLAMA_URL = "http://localhost:11434"
QDRANT_URL = "http://localhost:6333"
COLLECTION_NAME = "competitors"
VECTOR_SIZE = 768

COMPETITORS = [
    {
        "name": "Competitor A",
        "category": "AI Coding Assistant",
        "url": "https://competitor-a.com/pricing",
        "source_type": "pricing_page",
    },
    {
        "name": "Competitor B",
        "category": "LLM Platform",
        "url": "https://blog.competitor-b.com",
        "source_type": "blog",
    },
    # Add more competitors...
]


def generate_embedding(text: str) -> list[float]:
    """Generate a local embedding via Ollama"""
    resp = requests.post(
        f"{OLLAMA_URL}/api/embed",
        json={"model": "nomic-embed-text", "input": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embeddings"][0]


def fetch_content(comp: dict) -> str:
    """Fetch competitor page content (simplified; use BeautifulSoup/Playwright in production)"""
    # For pricing pages, use scrapling or playwright
    # For blogs, parse RSS feeds
    return f"{comp['name']} {comp['category']} updated {comp['source_type']}"


def upsert_document(doc_id: str, text: str, metadata: dict):
    """Generate embedding and store in Qdrant"""
    embedding = generate_embedding(text)

    # Qdrant point IDs must be unsigned integers or UUIDs, so derive a
    # deterministic UUID from the readable doc_id and keep doc_id in the payload.
    point = {
        "id": str(uuid.uuid5(uuid.NAMESPACE_URL, doc_id)),
        "vector": embedding,
        "payload": {
            **metadata,
            "doc_id": doc_id,
            "indexed_at": datetime.now(timezone.utc).isoformat(),
            "text_preview": text[:200],
        },
    }

    # Upserts go to the /points endpoint as a list of points
    resp = requests.put(
        f"{QDRANT_URL}/collections/{COLLECTION_NAME}/points",
        params={"wait": "true"},
        json={"points": [point]},
        timeout=30,
    )
    resp.raise_for_status()
    print(f"Upserted {doc_id}: {resp.status_code}")


def main():
    for comp in COMPETITORS:
        content = fetch_content(comp)
        doc_id = f"{comp['name'].lower().replace(' ', '-')}-{datetime.now().strftime('%Y%m%d')}"
        upsert_document(doc_id, content, {
            "competitor_name": comp["name"],
            "category": comp["category"],
            "source_url": comp["url"],
            "source_type": comp["source_type"],
        })


if __name__ == "__main__":
    main()
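The script above embeds each document as a single string. Long-form posts and transcripts can exceed the embedding model's context window, so one common approach is to split them into overlapping chunks and embed each chunk as its own point. The sketch below is one way to do that; chunk_text is a hypothetical helper, and the word limits are assumptions to tune for your model rather than recommended values.

```python
def chunk_text(text, max_words=200, overlap=40):
    """Split text into word-based chunks, each sharing `overlap` words with the previous one."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []

    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this chunk has reached the end of the text
        if start + max_words >= len(words):
            break
    return chunks
```

Each chunk could then be passed to upsert_document with a doc_id suffix such as "-chunk-0", so that search results still point back to the original page.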

For the scraper component in production, recommended tools:

RSS feeds: Most competitor blogs have RSS — lowest parsing overhead
Scrapling: Our preferred crawler tool with stealthy-fetch and Cloudflare bypass
Playwright: For JS-heavy rendered pricing pages
Public APIs: Some competitors (Stripe, Vercel) offer public changelog APIs
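For the RSS option, a feed can be parsed with the Python standard library alone. This is a minimal sketch that handles plain RSS 2.0 only (not Atom); parse_rss_items and SAMPLE_FEED are illustrative names, and the sample feed is made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Extract title, link, date and summary from each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


# Made-up feed used only to demonstrate the parser
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<item>
  <title>New pricing tiers</title>
  <link>https://blog.example.com/pricing</link>
  <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
  <description>We now offer usage-based pricing.</description>
</item>
</channel></rss>"""

for entry in parse_rss_items(SAMPLE_FEED):
    print(entry["title"], entry["link"])
```

In the ingest script, fetch_content could download the feed with requests.get(comp["url"]).text and join the title and summary of each new item into the text that gets embedded.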

Step 5: Build the Query API

FastAPI provides semantic search and hybrid-filtered query capabilities:

# main.py
from fastapi import FastAPI
from pydantic import BaseModel
import requests

OLLAMA_URL = "http://localhost:11434"
QDRANT_URL = "http://localhost:6333"
COLLECTION_NAME = "competitors"

app = FastAPI(title="Competitive Intelligence API")


class SearchRequest(BaseModel):
    query: str
    filters: dict = {}
    top_k: int = 10


@app.post("/search")
def search(req: SearchRequest):
    """Semantic search for competitor intelligence"""
    # 1. Generate query embedding
    embed_resp = requests.post(
        f"{OLLAMA_URL}/api/embed",
        json={"model": "nomic-embed-text", "input": req.query},
        timeout=60,
    )
    embed_resp.raise_for_status()
    query_vector = embed_resp.json()["embeddings"][0]

    # 2. Search in Qdrant; payloads are only returned when with_payload is set
    body = {"vector": query_vector, "limit": req.top_k, "with_payload": True}
    if req.filters:
        body["filter"] = {
            "must": [
                {"key": k, "match": {"value": v}} for k, v in req.filters.items()
            ]
        }

    search_resp = requests.post(
        f"{QDRANT_URL}/collections/{COLLECTION_NAME}/points/search",
        json=body,
        timeout=30,
    )
    search_resp.raise_for_status()

    # Results are plain dicts parsed from JSON, so use key access
    results = search_resp.json()["result"]
    return [
        {
            "score": point["score"],
            "competitor": point["payload"].get("competitor_name"),
            "category": point["payload"].get("category"),
            "text_preview": point["payload"].get("text_preview"),
            "source_url": point["payload"].get("source_url"),
            "indexed_at": point["payload"].get("indexed_at"),
        }
        for point in results
    ]


@app.get("/competitors")
def list_competitors():
    """Return Qdrant collection info (status, point count, config)"""
    resp = requests.get(f"{QDRANT_URL}/collections/{COLLECTION_NAME}", timeout=30)
    resp.raise_for_status()
    return resp.json()

Start the service:

# Bind to localhost only; Cloudflare Tunnel (Step 6) handles external access
uvicorn main:app --host 127.0.0.1 --port 8000

Step 6: Expose Securely via Cloudflare Tunnel

Never expose your VPS API port directly to the internet. Use Cloudflare Tunnel for encrypted access:

# Install cloudflared
wget https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
sudo dpkg -i cloudflared-linux-amd64.deb

# Authenticate (first run guides you through)
cloudflared tunnel login

# Create tunnel
cloudflared tunnel create ci-tunnel

# Configure routing
cat > ~/.cloudflared/config.yml << EOF
tunnel: ci-tunnel
credentials-file: /home/$USER/.cloudflared/<tunnel-id>.json

ingress:
  - hostname: api.yourdomain.com
    service: http://localhost:8000
  - service: http_status:404
EOF

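# Point the hostname at the tunnel (creates a CNAME record in Cloudflare DNS)
cloudflared tunnel route dns ci-tunnel api.yourdomain.com

# Start tunnel
cloudflared tunnel run ci-tunnel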

Your query API is now accessible at https://api.yourdomain.com/search via HTTPS, with zero inbound ports opened on the VPS.

Step 7: Scheduled Collection and Report Generation

Automate with cron jobs:

# Edit crontab
crontab -e

# Daily data collection at 2 AM
0 2 * * * cd /opt/ci-system && python3 ingest_competitors.py >> /var/log/ci-ingest.log 2>&1

# Weekly report every Monday at 9 AM
0 9 * * 1 cd /opt/ci-system && python3 generate_report.py >> /var/log/ci-report.log 2>&1

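The report generator uses Ollama’s llama3.2 model (3B parameters, ~2GB memory) to summarize vector search results into natural language. Pull it first with ollama pull llama3.2. On a 2GB VPS this leaves little room for Qdrant and the API, so add swap or use the smaller llama3.2:1b variant if memory is tight: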

# generate_report.py
import requests

def summarize_findings(search_results: list[dict]) -> str:
    """Summarize vector search results into a natural language report"""
    text_context = "\n".join(
        f"- [{r['competitor']}] {r['text_preview']}"
        for r in search_results
    )

    prompt = f"""Based on the following competitor intelligence data collected this week,
generate a concise competitive intelligence report in English. Highlight pricing changes,
feature launches, funding news, and strategic shifts.

{text_context}

Format as a markdown report with executive summary, key changes, and actionable insights."""

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.2",
            "prompt": prompt,
            "stream": False,
        }
    )
    return resp.json()["response"]
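summarize_findings expects a list of results, but the script does not show how the week's items are selected. One possible approach, sketched below, is to filter search results by the indexed_at timestamp written at ingest time. recent_results is a hypothetical helper, and it assumes each result is a dict shaped like the /search response from Step 5.

```python
from datetime import datetime, timedelta, timezone


def recent_results(results, now, days=7):
    """Keep only results whose indexed_at timestamp falls within the last `days` days."""
    cutoff = now - timedelta(days=days)
    kept = []
    for r in results:
        raw = r.get("indexed_at")
        if not raw:
            continue  # skip entries without a timestamp
        ts = datetime.fromisoformat(raw)
        if ts.tzinfo is None:
            # Timestamps stored without an offset are assumed to be UTC
            ts = ts.replace(tzinfo=timezone.utc)
        if ts >= cutoff:
            kept.append(r)
    return kept
```

A weekly run could call the /search endpoint with a few standing queries (pricing, funding, launches), pass the combined results through recent_results(results, datetime.now(timezone.utc)), and hand the remainder to summarize_findings.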

Cost Analysis and Capacity Estimation

Monthly Cost Comparison

Component	RackNerd ($9.99/yr)	Hostinger ($2.99/mo)	Vultr ($6/mo)
VPS	$0.83	$2.99	$6.00
Cloudflare Tunnel	Free	Free	Free
Ollama (CPU embeddings)	Included	Included	Included
Qdrant (256MB RAM)	Included	Included	Included
FastAPI	Included	Included	Included
Total	$0.83/month	$2.99/month	$6.00/month

Capacity Estimation

For Qdrant, each competitor entry occupies approximately 3KB: a 768-dimensional float32 vector takes 768 × 4 = 3,072 bytes, plus a small payload. The table rounds this to 3KB per entry:

Competitors	Collection Frequency	Monthly Vectors	Storage
20	Daily	600	~2MB
50	Daily	1,500	~4.5MB
100	Daily	3,000	~9MB
100	Hourly	72,000	~216MB

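Even tracking 100 competitors hourly, storage grows by only about 216MB per month (roughly 2.6GB per year), well within a 40GB disk. The limit you are likely to hit first with Qdrant is RAM for the in-memory index, not disk space.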
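The table rows follow from a simple multiplication, which can be reproduced for other competitor counts and frequencies. This sketch uses the 3KB-per-entry estimate from above and decimal units (1MB = 1,000,000 bytes); the function names are illustrative.

```python
BYTES_PER_ENTRY = 3_000  # ~3KB per entry, the estimate used in the table


def monthly_vectors(competitors, collections_per_day, days=30):
    """Number of vectors written over `days` days."""
    return competitors * collections_per_day * days


def storage_mb(vectors, bytes_per_entry=BYTES_PER_ENTRY):
    """Approximate raw storage in MB for the given number of vectors."""
    return vectors * bytes_per_entry / 1_000_000


for competitors, per_day in [(20, 1), (50, 1), (100, 1), (100, 24)]:
    n = monthly_vectors(competitors, per_day)
    print(f"{competitors} competitors x {per_day}/day: {n} vectors, ~{storage_mb(n)}MB")
```

These figures cover raw vectors and payloads only; the HNSW index and Qdrant's own storage overhead add to them.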

Advanced: Multi-Source Fusion and Incremental Indexing

As your data grows, consider more refined strategies:

Incremental Indexing Strategy

Don’t re-embed everything every time. Store a hash of each page’s content and only embed pages whose hash has changed:

def incremental_ingest():
    """Only process competitor pages that changed since last collection"""
    # get_last_hash, compute_hash and save_hash are helpers you provide,
    # e.g. a SHA-256 digest persisted in a small JSON file or SQLite table
    for comp in COMPETITORS:
        content = fetch_content(comp)  # fetch once and reuse
        last_hash = get_last_hash(comp["name"])
        current_hash = compute_hash(content)

        if last_hash != current_hash:
            doc_id = f"{comp['name']}-current"
            upsert_document(doc_id, content, {
                "competitor_name": comp["name"],
                "category": comp["category"],
                "change_detected": True,
            })
            save_hash(comp["name"], current_hash)
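The snippet above calls compute_hash, get_last_hash and save_hash without defining them. One minimal way to implement them, assuming a small JSON file is an acceptable place to keep state, is sketched below. The STATE_FILE path is a made-up default, and a database table would work equally well.

```python
import hashlib
import json
from pathlib import Path

STATE_FILE = Path("ci_hashes.json")  # hypothetical location for the hash state


def compute_hash(content: str) -> str:
    """SHA-256 hex digest of the page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def _load_state(path: Path) -> dict:
    """Read the name -> hash mapping, or return an empty one on first run."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def get_last_hash(name: str, path: Path = STATE_FILE):
    """Return the hash stored for this competitor, or None if never seen."""
    return _load_state(path).get(name)


def save_hash(name: str, content_hash: str, path: Path = STATE_FILE) -> None:
    """Persist the latest hash for this competitor."""
    state = _load_state(path)
    state[name] = content_hash
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Hashing raw HTML tends to flag every fetch as changed because of timestamps and session tokens, so it usually works better to hash the extracted text.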

Multilingual Embeddings

If your competitor landscape includes non-English markets (Japan, Korea, Latin America), switch the embedding model:

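# Multilingual embedding model (supports 100+ languages)
ollama pull bge-m3

bge-m3 generates 1024-dimensional vectors and supports cross-lingual semantic search across 100+ languages, at the cost of a larger download (about 1.2GB) and higher memory use. mxbai-embed-large also produces 1024-dimensional vectors but is trained primarily on English text. Because the vector size changes from 768 to 1024, create a new Qdrant collection for the new model rather than reusing the existing one.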
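Switching models changes the vector dimension, and a Qdrant collection is fixed to the size it was created with. Rather than hardcoding the number, the collection body can be derived from a sample embedding. The sketch below shows that idea; collection_config and collection_name_for are hypothetical helpers, and naming one collection per model is a convention assumed here, not something Qdrant requires.

```python
import re


def collection_config(sample_embedding, distance="Cosine"):
    """Build a Qdrant collection body whose size matches the given embedding."""
    size = len(sample_embedding)
    if size == 0:
        raise ValueError("sample embedding is empty")
    return {"vectors": {"size": size, "distance": distance}}


def collection_name_for(base, model):
    """Derive a per-model collection name, replacing characters such as ':' or '/'."""
    safe_model = re.sub(r"[^A-Za-z0-9_-]", "_", model)
    return f"{base}__{safe_model}"
```

In practice you would embed one short string with the new model, pass the result to collection_config, and send that body in the same PUT /collections/{name} request used in Step 2, then re-ingest your documents into the new collection.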

Hybrid Search (Keywords + Vectors)

Pure vector search can miss exact matches (e.g., searching for version “v3.2.1”). Combine vector search with Qdrant’s payload filtering and a minimum similarity score to narrow the results:

def hybrid_search(query: str, category: str, min_score: float = 0.5):
    """Vector search + payload filtering"""
    query_vector = generate_embedding(query)

    resp = requests.post(
        f"{QDRANT_URL}/collections/{COLLECTION_NAME}/points/search",
        json={
            "vector": query_vector,
            "filter": {
                "must": [
                    {"key": "category", "match": {"value": category}},
                ]
            },
            # score_threshold is a top-level search parameter, not a payload filter
            "score_threshold": min_score,
            "with_payload": True,
            "limit": 10,
        }
    )
    resp.raise_for_status()
    return resp.json()["result"]
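The function above narrows results by category but does not itself match exact strings such as a version number. A simple client-side complement, sketched here, is to boost the score of any result whose stored text contains the literal keyword and then re-sort. This is a plain re-ranking step, not Qdrant's built-in full-text matching; keyword_boost and the boost value are assumptions for illustration.

```python
def keyword_boost(results, keyword, boost=0.2):
    """Re-rank Qdrant search results, adding `boost` to hits that contain the keyword."""
    needle = keyword.lower()
    rescored = []
    for r in results:
        text = (r.get("payload", {}).get("text_preview") or "").lower()
        bonus = boost if needle in text else 0.0
        # Copy the result so the caller's list is left unchanged
        rescored.append({**r, "score": r["score"] + bonus})
    return sorted(rescored, key=lambda r: r["score"], reverse=True)
```

For example, keyword_boost(hybrid_search("release notes", "LLM Platform"), "v3.2.1") would move entries that mention that exact version ahead of ones that are merely similar in meaning. Because text_preview holds only the first 200 characters, matches further into the document would need the full text stored in the payload.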

Who This Is For (And Who It Isn’t)

This setup is right for you if:

You track 20+ competitors and manual reading can’t keep pace
You care about semantic-level insights (“competitor is shifting to enterprise pricing”) rather than simple keyword matching
You want fully private data, without relying on third-party SaaS tools like Crayon, Kompyte, or Klue ($100-$500/month)
You have technical ability to maintain Docker containers and Python scripts
Your team has limited budget but needs professional competitive intelligence

This setup is NOT for you if:

You only track 1-2 competitors and manual monitoring suffices
You have zero DevOps capability and would prefer a managed SaaS solution
You need real-time (second-level) competitor monitoring — this system operates at hourly/daily granularity
Your competitor data comes from non-public sources (paid industry reports) that can’t be scraped

Summary

On a $10/month VPS, combined with Ollama local embeddings, Qdrant vector database, and FastAPI query endpoints, you can build a fully functional AI competitive intelligence system. Compared to commercial competitor monitoring SaaS like Crayon, Kompyte, or Klue ($100-$500/month), this solution costs less than 10% of the price — with fully private data and complete query logic control.

Key takeaways:

A 2 vCPU + 2GB RAM VPS is sufficient for daily intelligence collection across 50+ competitors
Ollama + nomic-embed-text delivers high-quality local embeddings with zero external API costs
Qdrant vector storage overhead is minimal: 100 competitors × daily updates × 1 year ≈ 110MB
Cloudflare Tunnel ensures secure API exposure with zero inbound ports opened
Cron jobs + LLM summarization enable fully automated weekly report delivery

Choose your VPS:

Best value: RackNerd (affiliate=19978) — from $9.99/year
Stable performance: Hostinger (referral code JZ1ZL8465QCG) — from $2.99/month
Flexible scaling: Vultr (ref=9706229) — from $6/month