Add project files:

- Add database initialization scripts
- Add configuration files
- Add documentation
- Add public assets
- Add source code structure
- Update README
eligrinfeld 2025-01-04 17:22:46 -07:00
parent 372943801d
commit fde5b5e318
39 changed files with 10099 additions and 187 deletions

.env.example Normal file

@@ -0,0 +1,11 @@
PORT=3000
NODE_ENV=development
SUPABASE_URL=your_supabase_url
SUPABASE_KEY=your_supabase_key
OLLAMA_URL=http://localhost:11434
OLLAMA_MODEL=llama2
SEARXNG_URL=http://localhost:4000
SEARXNG_INSTANCES=["http://localhost:4000"]
MAX_RESULTS_PER_QUERY=50
CACHE_DURATION_HOURS=24
CACHE_DURATION_DAYS=7

README.md

@@ -1,178 +1,120 @@
# BizSearch

A tool for finding and analyzing local businesses using AI-powered data extraction.

## Prerequisites

- Node.js 16+
- Ollama (for local LLM)
- SearxNG instance

## Installation

1. Install Ollama:

```bash
# On macOS
brew install ollama
```

2. Start Ollama:

```bash
# Start and enable on login
brew services start ollama

# Or run without auto-start
/usr/local/opt/ollama/bin/ollama serve
```

3. Pull the required model:

```bash
ollama pull mistral
```

4. Clone and set up the project:

```bash
git clone https://github.com/yourusername/bizsearch.git
cd bizsearch
npm install
```

5. Configure environment:

```bash
cp .env.example .env
# Edit .env with your settings
```

6. Start the application:

```bash
npm run dev
```

7. Open http://localhost:3000 in your browser
## Troubleshooting

If Ollama fails to start:

```bash
# Stop any existing instance
brew services stop ollama
# Wait a few seconds
sleep 5
# Start again
brew services start ollama
```

To verify Ollama is running:

```bash
curl http://localhost:11434/api/version
```

## Features

- Business search with location filtering
- Contact information extraction
- AI-powered data validation
- Clean, user-friendly interface
- Service health monitoring

## Configuration

Key environment variables:

- `SEARXNG_URL`: Your SearxNG instance URL
- `OLLAMA_URL`: Ollama API endpoint (default: http://localhost:11434)
- `SUPABASE_URL`: Your Supabase project URL
- `SUPABASE_ANON_KEY`: Your Supabase anonymous key
- `CACHE_DURATION_DAYS`: How long to cache results (default: 7)

## Supabase Setup

1. Create a new Supabase project
2. Run the SQL commands in `db/init.sql` to create the cache table
3. Copy your project URL and anon key to `.env`
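
If you prefer the command line to the Supabase dashboard's SQL editor, the same script can be applied with `psql` (a sketch; substitute your own project's connection string):

```bash
psql "postgresql://postgres:<password>@db.<project-ref>.supabase.co:5432/postgres" -f db/init.sql
```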
## License

MIT

## Cache Management

The application uses Supabase for caching search results. Cache entries expire after 7 days.

### Manual Cache Cleanup

If automatic cleanup is not available, you can manually clean up expired entries:

1. Using the API:

```bash
curl -X POST http://localhost:3000/api/cleanup
```

2. Using SQL:

```sql
select manual_cleanup();
```

### Cache Statistics

View cache statistics using:

```sql
select * from cache_stats;
```

db/init.sql Normal file

@@ -0,0 +1,171 @@
-- Enable required extensions
create extension if not exists "uuid-ossp"; -- For UUID generation
create extension if not exists pg_cron; -- For scheduled jobs
-- Create the search_cache table
create table public.search_cache (
id uuid default uuid_generate_v4() primary key,
query text not null,
results jsonb not null,
location text not null,
category text not null,
created_at timestamp with time zone default timezone('utc'::text, now()) not null,
updated_at timestamp with time zone default timezone('utc'::text, now()) not null,
expires_at timestamp with time zone default timezone('utc'::text, now() + interval '7 days') not null
);
-- Create indexes
create index search_cache_query_idx on public.search_cache (query);
create index search_cache_location_category_idx on public.search_cache (location, category);
create index search_cache_expires_at_idx on public.search_cache (expires_at);
-- Enable RLS
alter table public.search_cache enable row level security;
-- Create policies
create policy "Allow public read access"
on public.search_cache for select
using (true);
create policy "Allow service write access"
on public.search_cache for insert
with check (true);
create policy "Allow service update access"
on public.search_cache for update
using (true)
with check (true);
create policy "Allow delete expired records"
on public.search_cache for delete
using (expires_at < now());
-- Create function to clean up expired records
create or replace function cleanup_expired_cache()
returns void
language plpgsql
security definer
as $$
begin
delete from public.search_cache
where expires_at < now();
end;
$$;
-- Create a manual cleanup function since pg_cron might not be available
create or replace function manual_cleanup()
returns void
language plpgsql
security definer
as $$
begin
delete from public.search_cache
where expires_at < now();
end;
$$;
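-- Illustrative only: where the pg_cron extension enabled above is actually
-- available, the cleanup could be scheduled instead of run manually, e.g. hourly:
-- select cron.schedule('0 * * * *', $$select cleanup_expired_cache()$$);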
-- Create a view for cache statistics
create or replace view cache_stats as
select
count(*) as total_entries,
count(*) filter (where expires_at < now()) as expired_entries,
count(*) filter (where expires_at >= now()) as valid_entries,
min(created_at) as oldest_entry,
max(created_at) as newest_entry,
count(distinct category) as unique_categories,
count(distinct location) as unique_locations
from public.search_cache;
-- Grant permissions to access the view
grant select on cache_stats to postgres;
-- Create table if not exists businesses
create table if not exists businesses (
id text primary key,
name text not null,
phone text,
email text,
address text,
rating numeric,
website text,
logo text,
source text,
description text,
latitude numeric,
longitude numeric,
last_updated timestamp with time zone default timezone('utc'::text, now()),
search_count integer default 1,
created_at timestamp with time zone default timezone('utc'::text, now())
);
-- Create indexes for common queries
create index if not exists businesses_name_idx on businesses (name);
create index if not exists businesses_rating_idx on businesses (rating desc);
create index if not exists businesses_search_count_idx on businesses (search_count desc);
create index if not exists businesses_last_updated_idx on businesses (last_updated desc);
-- Create tables if they don't exist
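-- NOTE: `businesses` already exists from the definition above, so on a fresh
-- database the CREATE TABLE IF NOT EXISTS below is a no-op; of its extra
-- columns, only place_id is added later (see the ALTER TABLE at the end).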
CREATE TABLE IF NOT EXISTS businesses (
id TEXT PRIMARY KEY,
name TEXT NOT NULL,
phone TEXT,
email TEXT,
address TEXT,
rating INTEGER,
website TEXT,
logo TEXT,
source TEXT,
description TEXT,
location JSONB,
place_id TEXT,
photos TEXT[],
opening_hours TEXT[],
distance JSONB,
last_updated TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
search_count INTEGER DEFAULT 0
);
CREATE TABLE IF NOT EXISTS searches (
id SERIAL PRIMARY KEY,
query TEXT NOT NULL,
location TEXT NOT NULL,
timestamp TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
results_count INTEGER
);
CREATE TABLE IF NOT EXISTS cache (
key TEXT PRIMARY KEY,
value JSONB NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP WITH TIME ZONE NOT NULL
);
-- Create indexes
CREATE INDEX IF NOT EXISTS idx_businesses_location ON businesses USING GIN (location);
CREATE INDEX IF NOT EXISTS idx_businesses_search ON businesses USING GIN (to_tsvector('english', name || ' ' || COALESCE(description, '')));
CREATE INDEX IF NOT EXISTS idx_cache_expires ON cache (expires_at);
-- Set up RLS (Row Level Security)
ALTER TABLE businesses ENABLE ROW LEVEL SECURITY;
ALTER TABLE searches ENABLE ROW LEVEL SECURITY;
ALTER TABLE cache ENABLE ROW LEVEL SECURITY;
-- Create policies
CREATE POLICY "Allow anonymous select" ON businesses FOR SELECT USING (true);
CREATE POLICY "Allow service role insert" ON businesses FOR INSERT WITH CHECK (true);
CREATE POLICY "Allow service role update" ON businesses FOR UPDATE USING (true);
CREATE POLICY "Allow anonymous select" ON searches FOR SELECT USING (true);
CREATE POLICY "Allow service role insert" ON searches FOR INSERT WITH CHECK (true);
CREATE POLICY "Allow anonymous select" ON cache FOR SELECT USING (true);
CREATE POLICY "Allow service role all" ON cache USING (true);
-- Add place_id column to businesses table if it doesn't exist
ALTER TABLE businesses ADD COLUMN IF NOT EXISTS place_id TEXT;
CREATE INDEX IF NOT EXISTS idx_businesses_place_id ON businesses(place_id);
-- Create a unique constraint on place_id (excluding nulls)
CREATE UNIQUE INDEX IF NOT EXISTS idx_businesses_place_id_unique
ON businesses(place_id)
WHERE place_id IS NOT NULL;

db/schema.sql Normal file

@@ -0,0 +1,44 @@
-- Enable trigram matching, required by the gin_trgm_ops indexes below
create extension if not exists pg_trgm;
-- Create the businesses table
create table businesses (
id uuid primary key,
name text not null,
phone text,
address text,
city text,
state text,
zip text,
category text[],
rating numeric,
review_count integer,
license text,
services text[],
hours jsonb,
website text,
email text,
verified boolean default false,
last_updated timestamp with time zone,
search_query text,
search_location text,
search_timestamp timestamp with time zone,
reliability_score integer,
-- Create a composite index for deduplication
constraint unique_business unique (phone, address)
);
-- Create indexes for common queries
create index idx_business_location on businesses (city, state);
create index idx_business_category on businesses using gin (category);
create index idx_search_query on businesses using gin (search_query gin_trgm_ops);
create index idx_search_location on businesses using gin (search_location gin_trgm_ops);
create index idx_reliability on businesses (reliability_score);
-- Enable full text search
alter table businesses add column search_vector tsvector
generated always as (
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(search_query, '')), 'B') ||
setweight(to_tsvector('english', coalesce(search_location, '')), 'C')
) stored;
create index idx_business_search on businesses using gin(search_vector);
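
For reference, a lookup against the generated `search_vector` column could look like this (the query terms are only an example):

```sql
select name, ts_rank(search_vector, q) as rank
from businesses, websearch_to_tsquery('english', 'roofing contractor denver') q
where search_vector @@ q
order by rank desc
limit 10;
```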

db/verify.sql Normal file

@@ -0,0 +1,15 @@
-- Check if table exists
SELECT EXISTS (
SELECT FROM information_schema.tables
WHERE table_schema = 'public'
AND table_name = 'businesses'
);
-- Check table structure
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'businesses';
-- Check row count
SELECT count(*) FROM businesses;

docker-compose.yml Normal file

@@ -0,0 +1,26 @@
version: '3'
services:
searxng:
image: searxng/searxng
ports:
- "4000:8080"
volumes:
- ./searxng:/etc/searxng
environment:
- INSTANCE_NAME=perplexica-searxng
- BASE_URL=http://localhost:4000/
- SEARXNG_SECRET=your_secret_key_here
restart: unless-stopped
app:
build:
context: .
dockerfile: backend.dockerfile
ports:
- "3000:3000"
environment:
- SEARXNG_URL=http://searxng:8080
volumes:
- ./config.toml:/home/perplexica/config.toml
depends_on:
- searxng

docs/ETHICAL_SCRAPING.md Normal file

@@ -0,0 +1,108 @@
# Ethical Web Scraping Guidelines
## Core Principles
1. **Respect Robots.txt**
- Always check and honor robots.txt directives
- Cache robots.txt to reduce server load
- Default to conservative behavior when uncertain
2. **Proper Identification**
- Use clear, identifiable User-Agent strings
- Provide contact information
- Be transparent about your purpose
3. **Rate Limiting**
- Implement conservative rate limits
- Use exponential backoff for errors
- Distribute requests over time
4. **Data Usage**
- Only collect publicly available business information
- Respect privacy and data protection laws
- Provide clear opt-out mechanisms
- Keep data accurate and up-to-date
5. **Technical Considerations**
- Cache results to minimize requests
- Handle errors gracefully
- Monitor and log access patterns
- Use structured data when available
## Implementation
1. **Request Headers**
```typescript
const headers = {
'User-Agent': 'BizSearch/1.0 (+https://bizsearch.com/about)',
'Accept': 'text/html,application/xhtml+xml',
'From': 'contact@bizsearch.com'
};
```
2. **Rate Limiting**
```typescript
const rateLimits = {
requestsPerMinute: 10,
requestsPerHour: 100,
requestsPerDomain: 20
};
```
3. **Caching**
```typescript
const cacheSettings = {
ttl: 24 * 60 * 60, // 24 hours
maxSize: 1000 // entries
};
```
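4. **Robots.txt Check** (a sketch using the `robots-parser` package added in this commit; the per-host caching shown is illustrative)
```typescript
import axios from 'axios';
import robotsParser from 'robots-parser';

const robotsCache = new Map<string, ReturnType<typeof robotsParser>>();

async function isAllowed(url: string, userAgent: string): Promise<boolean> {
  const { protocol, host } = new URL(url);
  const robotsUrl = `${protocol}//${host}/robots.txt`;
  // Cache robots.txt per host to reduce server load (principle 1)
  let parser = robotsCache.get(host);
  if (!parser) {
    const res = await axios.get(robotsUrl);
    parser = robotsParser(robotsUrl, res.data);
    robotsCache.set(host, parser);
  }
  // Default to conservative behavior when the answer is undefined
  return parser.isAllowed(url, userAgent) ?? false;
}
```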
## Opt-Out Process
1. Business owners can opt-out by:
- Submitting a form on our website
- Emailing opt-out@bizsearch.com
- Adding a meta tag: `<meta name="bizsearch" content="noindex">`
2. We honor opt-outs within:
- 24 hours for direct requests
- 72 hours for cached data
## Legal Compliance
1. **Data Protection**
- GDPR compliance for EU businesses
- CCPA compliance for California businesses
- Regular data audits and cleanup
2. **Attribution**
- Clear source attribution
- Last-updated timestamps
- Data accuracy disclaimers
## Best Practices
1. **Before Scraping**
- Check robots.txt
- Verify site status
- Review terms of service
- Look for API alternatives
2. **During Scraping**
- Monitor response codes
- Respect server hints
- Implement backoff strategies
- Log access patterns
3. **After Scraping**
- Verify data accuracy
- Update cache entries
- Clean up old data
- Monitor opt-out requests
## Contact
For questions or concerns about our scraping practices:
- Email: ethics@bizsearch.com
- Phone: (555) 123-4567
- Web: https://bizsearch.com/ethics

package-lock.json generated Normal file

File diff suppressed because it is too large

package.json

@@ -9,7 +9,9 @@
     "dev": "nodemon --ignore uploads/ src/app.ts ",
     "db:push": "drizzle-kit push sqlite",
     "format": "prettier . --check",
-    "format:write": "prettier . --write"
+    "format:write": "prettier . --write",
+    "test:search": "ts-node src/tests/testSearch.ts",
+    "test:supabase": "ts-node src/tests/supabaseTest.ts"
   },
   "devDependencies": {
     "@types/better-sqlite3": "^7.6.10",
@@ -30,15 +32,17 @@
     "@iarna/toml": "^2.2.5",
     "@langchain/anthropic": "^0.2.3",
     "@langchain/community": "^0.2.16",
-    "@langchain/openai": "^0.0.25",
+    "@langchain/google-genai": "^0.0.23",
+    "@langchain/openai": "^0.0.25",
+    "@supabase/supabase-js": "^2.47.10",
     "@xenova/transformers": "^2.17.1",
     "axios": "^1.6.8",
-    "better-sqlite3": "^11.0.0",
+    "better-sqlite3": "^11.7.0",
     "cheerio": "^1.0.0",
     "compute-cosine-similarity": "^1.1.0",
     "compute-dot": "^1.1.0",
     "cors": "^2.8.5",
-    "dotenv": "^16.4.5",
+    "dotenv": "^16.4.7",
     "drizzle-orm": "^0.31.2",
     "express": "^4.19.2",
     "html-to-text": "^9.0.5",
@@ -46,6 +50,8 @@
     "mammoth": "^1.8.0",
     "multer": "^1.4.5-lts.1",
     "pdf-parse": "^1.1.1",
+    "robots-parser": "^3.0.1",
+    "tesseract.js": "^4.1.4",
     "winston": "^3.13.0",
     "ws": "^8.17.1",
     "zod": "^3.22.4"

public/index.html Normal file

@@ -0,0 +1,558 @@
@ -0,0 +1,558 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>OffMarket Pro - Find Off-Market Property Services</title>
<style>
:root {
--primary-color: #2c3e50;
--secondary-color: #3498db;
--accent-color: #e74c3c;
--background-color: #f8f9fa;
--text-color: #2c3e50;
--border-radius: 8px;
--card-shadow: 0 2px 4px rgba(0,0,0,0.1);
}
body {
font-family: 'Segoe UI', system-ui, -apple-system, sans-serif;
margin: 0;
padding: 0;
background: var(--background-color);
color: var(--text-color);
}
.header {
background: white;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
padding: 1rem;
}
.logo {
font-size: 1.8rem;
font-weight: bold;
color: var(--primary-color);
text-decoration: none;
}
.search-container {
max-width: 1200px;
margin: 3rem auto;
padding: 2rem;
text-align: center;
}
.search-box {
display: flex;
gap: 1rem;
max-width: 800px;
margin: 2rem auto;
}
.search-input {
flex: 1;
padding: 1rem;
border: 2px solid #ddd;
border-radius: var(--border-radius);
font-size: 1rem;
}
.search-button {
padding: 1rem 2rem;
background: var(--secondary-color);
color: white;
border: none;
border-radius: var(--border-radius);
cursor: pointer;
font-size: 1rem;
transition: background 0.2s;
}
.search-button:hover {
background: #2980b9;
}
.categories-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
gap: 1.5rem;
margin: 2rem auto;
max-width: 1200px;
padding: 0 1rem;
}
.category-card {
background: white;
border-radius: var(--border-radius);
padding: 1.5rem;
box-shadow: var(--card-shadow);
transition: transform 0.2s;
cursor: pointer;
}
.category-card:hover {
transform: translateY(-2px);
}
.category-icon {
font-size: 2rem;
margin-bottom: 1rem;
}
.subcategories {
margin-top: 1rem;
font-size: 0.9rem;
}
.subcategory {
color: var(--secondary-color);
cursor: pointer;
margin: 0.25rem 0;
}
.subcategory:hover {
text-decoration: underline;
}
.results-container {
max-width: 1200px;
margin: 2rem auto;
padding: 0 1rem;
}
.business-card {
background: white;
padding: 1.5rem;
margin-bottom: 1rem;
border-radius: var(--border-radius);
box-shadow: 0 2px 4px rgba(0,0,0,0.05);
display: grid;
grid-template-columns: auto 1fr auto;
gap: 1.5rem;
align-items: start;
}
.business-logo {
width: 80px;
height: 80px;
object-fit: cover;
border-radius: var(--border-radius);
}
.business-info h3 {
margin: 0 0 0.5rem 0;
color: var(--primary-color);
}
.business-contact {
text-align: right;
}
.rating-stars {
color: #f1c40f;
margin-bottom: 0.5rem;
}
@media (max-width: 768px) {
.search-box {
flex-direction: column;
}
.business-card {
grid-template-columns: 1fr;
text-align: center;
}
.business-contact {
text-align: center;
}
.business-logo {
margin: 0 auto;
}
}
.loading {
text-align: center;
padding: 2rem;
color: var(--text-color);
}
.error-message {
background: #fee;
border: 1px solid #fcc;
padding: 1rem;
border-radius: var(--border-radius);
text-align: center;
}
.no-results {
background: #f8f9fa;
padding: 2rem;
border-radius: var(--border-radius);
text-align: center;
}
.no-results ul {
text-align: left;
display: inline-block;
margin: 1rem auto;
}
.results-table {
width: 100%;
border-collapse: collapse;
margin-top: 2rem;
background: white;
box-shadow: var(--card-shadow);
border-radius: var(--border-radius);
overflow: hidden;
}
.results-table th {
background: #f8f9fa;
padding: 1rem;
text-align: left;
font-weight: 600;
color: var(--primary-color);
border-bottom: 2px solid #eee;
}
.results-table td {
padding: 1rem;
border-bottom: 1px solid #eee;
vertical-align: top;
}
.business-icon {
width: 50px;
height: 50px;
display: flex;
align-items: center;
justify-content: center;
background: #f0f0f0;
border-radius: var(--border-radius);
font-size: 1.5rem;
color: var(--primary-color);
}
.business-info {
display: flex;
gap: 1rem;
align-items: start;
}
.business-details h3 {
margin: 0 0 0.5rem 0;
color: var(--primary-color);
}
.business-meta {
font-size: 0.9rem;
color: #666;
}
.rating {
display: flex;
align-items: center;
gap: 0.25rem;
color: #f39c12;
}
.contact-info {
text-align: right;
white-space: nowrap;
}
.phone {
font-weight: 600;
color: var(--primary-color);
margin-bottom: 0.25rem;
}
.address {
color: #666;
font-size: 0.9rem;
}
.action-buttons {
display: flex;
gap: 0.5rem;
justify-content: flex-end;
}
.action-button {
padding: 0.5rem 1rem;
border: none;
border-radius: var(--border-radius);
cursor: pointer;
font-size: 0.9rem;
}
.primary-button {
background: var(--secondary-color);
color: white;
}
.secondary-button {
background: #eee;
color: var(--text-color);
}
</style>
</head>
<body>
<header class="header">
<a href="/" class="logo">OffMarket Pro</a>
</header>
<div class="search-container">
<h1>Find Off-Market Property Services</h1>
<div class="search-box">
<input type="text" id="searchQuery" class="search-input" placeholder="What service are you looking for?">
<input type="text" id="searchLocation" class="search-input" placeholder="Location">
<button class="search-button" onclick="performSearch()">Search</button>
</div>
</div>
<div class="categories-grid">
<!-- Categories will be dynamically inserted here -->
</div>
<div class="container">
<table class="results-table">
<thead>
<tr>
<th style="width: 50%">Business</th>
<th style="width: 30%">Contact</th>
<th style="width: 20%">Actions</th>
</tr>
</thead>
<tbody id="resultsBody">
<!-- Results will be populated here -->
</tbody>
</table>
<div id="searchProgress" class="search-progress"></div>
</div>
<script>
// Load categories
fetch('/api/categories')
.then(response => response.json())
.then(categories => {
const grid = document.querySelector('.categories-grid');
grid.innerHTML = categories.map(category => `
<div class="category-card" onclick="searchCategory('${category.name}')">
<div class="category-icon">${category.icon}</div>
<h3>${category.name}</h3>
<div class="subcategories">
${category.subcategories.map(sub =>
`<div class="subcategory" onclick="event.stopPropagation(); searchSubcategory('${sub.name}')">${sub.name}</div>`
).join('')}
</div>
</div>
`).join('');
});
async function performSearch() {
const query = document.getElementById('searchQuery').value;
const location = document.getElementById('searchLocation').value;
if (!query || !location) {
alert('Please enter both search query and location');
return;
}
await doSearch(query, location);
}
function searchCategory(category) {
const location = document.getElementById('searchLocation').value;
if (!location) {
alert('Please enter a location first');
return;
}
document.getElementById('searchQuery').value = category;
performSearch();
}
// Add searchSubcategory function
function searchSubcategory(subcategory) {
const location = document.getElementById('searchLocation').value;
if (!location) {
alert('Please enter a location first');
return;
}
document.getElementById('searchQuery').value = subcategory;
performSearch();
}
// Update doSearch function
async function doSearch(query, location) {
const searchTerm = `${query} in ${location}`;
const resultsBody = document.getElementById('resultsBody');
const progressDiv = document.getElementById('searchProgress');
try {
resultsBody.innerHTML = `
<tr>
<td colspan="3" class="loading">
<p>Searching for ${query} in ${location}...</p>
</td>
</tr>
`;
const response = await fetch(`/api/search?q=${encodeURIComponent(searchTerm)}`);
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let allResults = new Set(); // Use Set to avoid duplicates
while (true) {
const { value, done } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
// Process complete chunks
const chunks = buffer.split('\n');
buffer = chunks.pop() || ''; // Keep the incomplete chunk
for (const chunk of chunks) {
if (!chunk.trim()) continue;
try {
const data = JSON.parse(chunk);
console.log('Received chunk:', data);
if (data.source === 'database' || (data.source === 'search' && data.results)) {
// Add new results to our set
data.results.forEach(result => {
allResults.add(JSON.stringify(result)); // Convert to string for Set storage
});
// Display all current results
displayResults(Array.from(allResults).map(str => JSON.parse(str)));
} else if (data.status && data.progress) {
// Update progress
progressDiv.innerHTML = `
<p>${data.status} (${data.progress}% complete)</p>
`;
}
} catch (e) {
console.error('Error parsing chunk:', e);
}
}
}
// Clear progress when done
progressDiv.innerHTML = '';
} catch (error) {
console.error('Search error:', error);
resultsBody.innerHTML = `
<tr>
<td colspan="3" class="error-message">
<h3>Search Error</h3>
<p>Sorry, we encountered an error while searching. Please try again.</p>
<p>Error details: ${error.message}</p>
<button onclick="performSearch()" class="search-button">Try Again</button>
</td>
</tr>
`;
}
}
function displayResults(businesses) {
const resultsBody = document.getElementById('resultsBody');
if (!businesses || businesses.length === 0) {
resultsBody.innerHTML = `
<tr>
<td colspan="3" style="text-align: center; padding: 2rem;">
<h3>No Results Found</h3>
<p>We couldn't find any businesses matching your search.</p>
</td>
</tr>
`;
return;
}
resultsBody.innerHTML = businesses.map(business => {
const icon = getBusinessIcon(business.name);
const rating = business.rating ? (business.rating / 20).toFixed(1) : 0; // Convert to 5-star scale
return `
<tr>
<td>
<div class="business-info">
<div class="business-icon">${icon}</div>
<div class="business-details">
<h3>${business.name}</h3>
<div class="business-meta">
<div class="rating">
${getRatingStars(rating)}
<span>(${rating})</span>
</div>
<div class="description">${business.description || ''}</div>
</div>
</div>
</div>
</td>
<td>
<div class="contact-info">
<div class="phone">${business.phone || 'No phone available'}</div>
<div class="address">${business.address || 'Address not available'}</div>
</div>
</td>
<td>
<div class="action-buttons">
${business.website ?
`<a href="${business.website}" target="_blank" class="action-button primary-button">Visit Website</a>` :
'<button class="action-button secondary-button" disabled>No Website</button>'
}
<button onclick="contactBusiness('${business.id}')" class="action-button secondary-button">Contact</button>
</div>
</td>
</tr>
`;
}).join('');
}
// Helper function to get business icon
function getBusinessIcon(businessName) {
// Map of business types to icons
const icons = {
'real estate': '🏢',
'legal': '⚖️',
'financial': '💰',
'contractor': '🔨',
'property': '🏠',
'marketing': '📢',
'tech': '💻',
'default': '🏢'
};
// Determine business type from name
const businessType = Object.keys(icons).find(type =>
businessName.toLowerCase().includes(type)
) || 'default';
return icons[businessType];
}
// Helper function to generate rating stars
function getRatingStars(rating) {
const fullStars = Math.floor(rating);
const hasHalfStar = rating % 1 >= 0.5;
const emptyStars = 5 - fullStars - (hasHalfStar ? 1 : 0);
return `
${'★'.repeat(fullStars)}
${hasHalfStar ? '½' : ''}
${'☆'.repeat(emptyStars)}
`;
}
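// The Contact buttons above call contactBusiness(), which is not defined
// anywhere on this page; a minimal placeholder keeps the click from throwing.
function contactBusiness(businessId) {
alert('Contact details coming soon for business ' + businessId);
}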
</script>
</body>
</html>

src/app.ts

@@ -1,38 +1,48 @@
import express from 'express';
import cors from 'cors';
import path from 'path';
import './config/env'; // Load environment variables first
import apiRoutes from './routes/api';
import { HealthCheckService } from './lib/services/healthCheck';

const app = express();
const port = process.env.PORT || 3000;

// Middleware
app.use(cors());
app.use(express.json());

// API routes first
app.use('/api', apiRoutes);

// Then static files
app.use(express.static(path.join(__dirname, '../public')));

// Finally, catch-all route for SPA
app.get('*', (req, res) => {
res.sendFile(path.join(__dirname, '../public/index.html'));
});

// Start server with health checks
async function startServer() {
console.log('\n🔍 Checking required services...');
const ollamaStatus = await HealthCheckService.checkOllama();
const searxngStatus = await HealthCheckService.checkSearxNG();
const supabaseStatus = await HealthCheckService.checkSupabase();

console.log('\n📊 Service Status:');
console.log('- Ollama:', ollamaStatus ? '✅ Running' : '❌ Not Running');
console.log('- SearxNG:', searxngStatus ? '✅ Running' : '❌ Not Running');
console.log('- Supabase:', supabaseStatus ? '✅ Connected' : '❌ Not Connected');

app.listen(port, () => {
console.log(`\n🚀 Server running at http://localhost:${port}`);
console.log('-------------------------------------------');
});
}

startServer().catch(error => {
console.error('Failed to start server:', error);
process.exit(1);
});

src/config.ts

@@ -77,3 +77,16 @@ export const updateConfig = (config: RecursivePartial<Config>) => {
     toml.stringify(config),
   );
 };
+
+export const config = {
+  ollama: {
+    url: process.env.OLLAMA_URL || 'http://localhost:11434',
+    model: process.env.OLLAMA_MODEL || 'mistral',
+    options: {
+      temperature: 0.1,
+      top_p: 0.9,
+      timeout: 30000 // 30 seconds timeout
+    }
+  },
+  // ... other config
+};

src/config/env.ts Normal file

@@ -0,0 +1,68 @@
import { config } from 'dotenv';
import { z } from 'zod';
config();
// Define the environment schema
const envSchema = z.object({
PORT: z.string().default('3000'),
NODE_ENV: z.string().default('development'),
SUPABASE_URL: z.string(),
SUPABASE_KEY: z.string(),
OLLAMA_URL: z.string().default('http://localhost:11434'),
OLLAMA_MODEL: z.string().default('llama2'),
SEARXNG_URL: z.string().default('http://localhost:4000'),
SEARXNG_INSTANCES: z.string().default('["http://localhost:4000"]'),
MAX_RESULTS_PER_QUERY: z.string().default('50'),
CACHE_DURATION_HOURS: z.string().default('24'),
CACHE_DURATION_DAYS: z.string().default('7')
});
// Define the final environment type
export interface EnvConfig {
PORT: string;
NODE_ENV: string;
searxng: {
currentUrl: string;
instances: string[];
};
ollama: {
url: string;
model: string;
};
supabase: {
url: string;
anonKey: string;
};
cache: {
maxResultsPerQuery: number;
durationHours: number;
durationDays: number;
};
}
// Parse and transform the environment variables
const rawEnv = envSchema.parse(process.env);
// Create the final environment object with parsed configurations
export const env: EnvConfig = {
PORT: rawEnv.PORT,
NODE_ENV: rawEnv.NODE_ENV,
searxng: {
currentUrl: rawEnv.SEARXNG_URL,
instances: JSON.parse(rawEnv.SEARXNG_INSTANCES)
},
ollama: {
url: rawEnv.OLLAMA_URL,
model: rawEnv.OLLAMA_MODEL
},
supabase: {
url: rawEnv.SUPABASE_URL,
anonKey: rawEnv.SUPABASE_KEY
},
cache: {
maxResultsPerQuery: parseInt(rawEnv.MAX_RESULTS_PER_QUERY),
durationHours: parseInt(rawEnv.CACHE_DURATION_HOURS),
durationDays: parseInt(rawEnv.CACHE_DURATION_DAYS)
}
};
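
Because the schema is parsed at import time, a missing required variable fails fast at startup; a sketch of what consumers of this module see:

```typescript
import { env } from './config/env';

// Throws a ZodError on import if SUPABASE_URL or SUPABASE_KEY is unset;
// otherwise the parsed, typed configuration is available everywhere.
console.log(`SearxNG instances: ${env.searxng.instances.join(', ')}`);
console.log(`Cache window: ${env.cache.durationDays} days`);
```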

src/config/index.ts Normal file

@@ -0,0 +1,77 @@
import dotenv from 'dotenv';
import path from 'path';
// Load .env file
dotenv.config({ path: path.resolve(__dirname, '../../.env') });
export interface Config {
supabase: {
url: string;
anonKey: string;
};
server: {
port: number;
nodeEnv: string;
};
search: {
maxResultsPerQuery: number;
cacheDurationHours: number;
searxngUrl?: string;
};
rateLimit: {
windowMs: number;
maxRequests: number;
};
security: {
corsOrigin: string;
jwtSecret: string;
};
proxy?: {
http?: string;
https?: string;
};
logging: {
level: string;
};
}
const config: Config = {
supabase: {
url: process.env.SUPABASE_URL || '',
anonKey: process.env.SUPABASE_ANON_KEY || '',
},
server: {
port: parseInt(process.env.PORT || '3000', 10),
nodeEnv: process.env.NODE_ENV || 'development',
},
search: {
maxResultsPerQuery: parseInt(process.env.MAX_RESULTS_PER_QUERY || '20', 10),
cacheDurationHours: parseInt(process.env.CACHE_DURATION_HOURS || '24', 10),
searxngUrl: process.env.SEARXNG_URL
},
rateLimit: {
windowMs: parseInt(process.env.RATE_LIMIT_WINDOW_MS || '900000', 10),
maxRequests: parseInt(process.env.RATE_LIMIT_MAX_REQUESTS || '100', 10),
},
security: {
corsOrigin: process.env.CORS_ORIGIN || 'http://localhost:3000',
jwtSecret: process.env.JWT_SECRET || 'your_jwt_secret_key',
},
logging: {
level: process.env.LOG_LEVEL || 'info',
},
};
// Validate required configuration
const validateConfig = () => {
if (!config.supabase.url) {
throw new Error('SUPABASE_URL is required');
}
if (!config.supabase.anonKey) {
throw new Error('SUPABASE_ANON_KEY is required');
}
};
validateConfig();
export { config };

src/lib/categories.ts Normal file

@@ -0,0 +1,116 @@
export interface Category {
id: string;
name: string;
icon: string;
subcategories: SubCategory[];
}
export interface SubCategory {
id: string;
name: string;
}
export const categories: Category[] = [
{
id: 'real-estate-pros',
name: 'Real Estate Professionals',
icon: '🏢',
subcategories: [
{ id: 'wholesalers', name: 'Real Estate Wholesalers' },
{ id: 'agents', name: 'Real Estate Agents' },
{ id: 'attorneys', name: 'Real Estate Attorneys' },
{ id: 'scouts', name: 'Property Scouts' },
{ id: 'brokers', name: 'Real Estate Brokers' },
{ id: 'consultants', name: 'Real Estate Consultants' }
]
},
{
id: 'legal-title',
name: 'Legal & Title Services',
icon: '⚖️',
subcategories: [
{ id: 'title-companies', name: 'Title Companies' },
{ id: 'closing-attorneys', name: 'Closing Attorneys' },
{ id: 'zoning-consultants', name: 'Zoning Consultants' },
{ id: 'probate-specialists', name: 'Probate Specialists' },
{ id: 'eviction-specialists', name: 'Eviction Specialists' }
]
},
{
id: 'financial',
name: 'Financial Services',
icon: '💰',
subcategories: [
{ id: 'hard-money', name: 'Hard Money Lenders' },
{ id: 'private-equity', name: 'Private Equity Investors' },
{ id: 'mortgage-brokers', name: 'Mortgage Brokers' },
{ id: 'tax-advisors', name: 'Tax Advisors' },
{ id: 'appraisers', name: 'Appraisers' }
]
},
{
id: 'contractors',
name: 'Specialist Contractors',
icon: '🔨',
subcategories: [
{ id: 'general', name: 'General Contractors' },
{ id: 'plumbers', name: 'Plumbers' },
{ id: 'electricians', name: 'Electricians' },
{ id: 'hvac', name: 'HVAC Technicians' },
{ id: 'roofers', name: 'Roofers' },
{ id: 'foundation', name: 'Foundation Specialists' },
{ id: 'asbestos', name: 'Asbestos Removal' },
{ id: 'mold', name: 'Mold Remediation' }
]
},
{
id: 'property-services',
name: 'Property Services',
icon: '🏠',
subcategories: [
{ id: 'surveyors', name: 'Surveyors' },
{ id: 'inspectors', name: 'Inspectors' },
{ id: 'property-managers', name: 'Property Managers' },
{ id: 'environmental', name: 'Environmental Consultants' },
{ id: 'junk-removal', name: 'Junk Removal Services' },
{ id: 'cleaning', name: 'Property Cleaning' }
]
},
{
id: 'marketing',
name: 'Marketing & Lead Gen',
icon: '📢',
subcategories: [
{ id: 'direct-mail', name: 'Direct Mail Services' },
{ id: 'social-media', name: 'Social Media Marketing' },
{ id: 'seo', name: 'SEO Specialists' },
{ id: 'ppc', name: 'PPC Advertising' },
{ id: 'lead-gen', name: 'Lead Generation' },
{ id: 'skip-tracing', name: 'Skip Tracing Services' }
]
},
{
id: 'data-tech',
name: 'Data & Technology',
icon: '💻',
subcategories: [
{ id: 'data-providers', name: 'Property Data Providers' },
{ id: 'crm', name: 'CRM Systems' },
{ id: 'valuation', name: 'Valuation Tools' },
{ id: 'virtual-tours', name: 'Virtual Tour Services' },
{ id: 'automation', name: 'Automation Tools' }
]
},
{
id: 'specialty',
name: 'Specialty Services',
icon: '🎯',
subcategories: [
{ id: 'auction', name: 'Auction Companies' },
{ id: 'relocation', name: 'Relocation Services' },
{ id: 'staging', name: 'Home Staging' },
{ id: 'photography', name: 'Real Estate Photography' },
{ id: 'virtual-assistant', name: 'Virtual Assistants' }
]
}
];
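
The front end requests `/api/categories`; a route along these lines could serve the list (the router file itself is not part of this diff, so treat the path and wiring as assumptions):

```typescript
import { Router } from 'express';
import { categories } from '../lib/categories';

const router = Router();

// Serves the static category tree rendered by public/index.html
router.get('/categories', (_req, res) => {
  res.json(categories);
});

export default router;
```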

src/lib/db/optOutDb.ts Normal file

@@ -0,0 +1,51 @@
import Database from 'better-sqlite3';
import path from 'path';
interface OptOutEntry {
domain: string;
email: string;
reason?: string;
timestamp: Date;
}
export class OptOutDatabase {
private db: Database.Database;
constructor() {
this.db = new Database(path.join(__dirname, '../../../data/optout.db'));
this.initializeDatabase();
}
private initializeDatabase() {
this.db.exec(`
CREATE TABLE IF NOT EXISTS opt_outs (
domain TEXT PRIMARY KEY,
email TEXT NOT NULL,
reason TEXT,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_domain ON opt_outs(domain);
`);
}
async addOptOut(entry: OptOutEntry): Promise<void> {
const stmt = this.db.prepare(
'INSERT OR REPLACE INTO opt_outs (domain, email, reason, timestamp) VALUES (?, ?, ?, ?)'
);
stmt.run(entry.domain, entry.email, entry.reason, entry.timestamp.toISOString());
}
isOptedOut(domain: string): boolean {
const stmt = this.db.prepare('SELECT 1 FROM opt_outs WHERE domain = ?');
return stmt.get(domain) !== undefined;
}
removeOptOut(domain: string): void {
const stmt = this.db.prepare('DELETE FROM opt_outs WHERE domain = ?');
stmt.run(domain);
}
getOptOutList(): OptOutEntry[] {
return this.db.prepare('SELECT * FROM opt_outs').all() as OptOutEntry[];
}
}

src/lib/db/supabase.ts Normal file

@@ -0,0 +1,74 @@
import { createClient } from '@supabase/supabase-js';
import { BusinessData } from '../searxng';
import { env } from '../../config/env';
// Create the Supabase client with validated environment variables
export const supabase = createClient(
env.supabase.url,
env.supabase.anonKey,
{
auth: {
persistSession: false // Since this is a server environment
}
}
);
// Define the cache record type
export interface CacheRecord {
id: string;
query: string;
results: BusinessData[];
location: string;
category: string;
created_at: string;
updated_at: string;
expires_at: string;
}
// Export database helper functions
export async function getCacheEntry(
category: string,
location: string
): Promise<CacheRecord | null> {
const { data, error } = await supabase
.from('search_cache')
.select('*')
.eq('category', category.toLowerCase())
.eq('location', location.toLowerCase())
.gt('expires_at', new Date().toISOString())
.order('created_at', { ascending: false })
.limit(1)
.single();
if (error) {
console.error('Cache lookup failed:', error);
return null;
}
return data;
}
export async function saveCacheEntry(
category: string,
location: string,
results: BusinessData[],
expiresInDays: number = 7
): Promise<void> {
const expiresAt = new Date();
expiresAt.setDate(expiresAt.getDate() + expiresInDays);
const { error } = await supabase
.from('search_cache')
.insert({
query: `${category} in ${location}`,
category: category.toLowerCase(),
location: location.toLowerCase(),
results,
expires_at: expiresAt.toISOString()
});
if (error) {
console.error('Failed to save cache entry:', error);
throw error;
}
}
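
A typical read-through-cache flow using these helpers might look like the following (the upstream `fetchBusinesses` is hypothetical, declared only so the sketch type-checks; import paths assume a caller in `src/`):

```typescript
import { getCacheEntry, saveCacheEntry } from './lib/db/supabase';
import { BusinessData } from './lib/searxng';

// Hypothetical upstream fetcher (e.g. the SearxNG-backed search pipeline)
declare function fetchBusinesses(category: string, location: string): Promise<BusinessData[]>;

async function cachedSearch(category: string, location: string): Promise<BusinessData[]> {
  const hit = await getCacheEntry(category, location);
  if (hit) return hit.results; // served from the Supabase cache

  const results = await fetchBusinesses(category, location);
  await saveCacheEntry(category, location, results); // default 7-day expiry
  return results;
}
```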

src/lib/emailScraper.ts Normal file

@@ -0,0 +1,195 @@
import axios from 'axios';
import * as cheerio from 'cheerio';
import { Cache } from './utils/cache';
import { RateLimiter } from './utils/rateLimiter';
import robotsParser from 'robots-parser';
interface ScrapingResult {
emails: string[];
phones: string[];
addresses: string[];
socialLinks: string[];
source: string;
timestamp: Date;
attribution: string;
}
export class EmailScraper {
private cache: Cache<ScrapingResult>;
private rateLimiter: RateLimiter;
private robotsCache = new Map<string, any>();
constructor(private options = {
timeout: 5000,
cacheTTL: 60,
rateLimit: { windowMs: 60000, maxRequests: 10 }, // More conservative rate limiting
userAgent: 'BizSearch/1.0 (+https://your-domain.com/about) - Business Directory Service'
}) {
this.cache = new Cache<ScrapingResult>(options.cacheTTL);
this.rateLimiter = new RateLimiter(options.rateLimit.windowMs, options.rateLimit.maxRequests);
}
private async checkRobotsPermission(url: string): Promise<boolean> {
try {
const { protocol, host } = new URL(url);
const robotsUrl = `${protocol}//${host}/robots.txt`;
let parser = this.robotsCache.get(host);
if (!parser) {
const response = await axios.get(robotsUrl);
parser = robotsParser(robotsUrl, response.data);
this.robotsCache.set(host, parser);
}
return parser.isAllowed(url, this.options.userAgent);
} catch (error) {
console.warn(`Could not check robots.txt for ${url}:`, error);
return true; // Assume allowed if robots.txt is unavailable
}
}
async scrapeEmails(url: string): Promise<ScrapingResult> {
// Check cache first
const cached = this.cache.get(url);
if (cached) return cached;
// Check robots.txt
const allowed = await this.checkRobotsPermission(url);
if (!allowed) {
console.log(`Respecting robots.txt disallow for ${url}`);
return {
emails: [],
phones: [],
addresses: [],
socialLinks: [],
source: url,
timestamp: new Date(),
attribution: 'Restricted by robots.txt'
};
}
// Wait for rate limiting slot
await this.rateLimiter.waitForSlot();
try {
const response = await axios.get(url, {
timeout: this.options.timeout,
headers: {
'User-Agent': this.options.userAgent,
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
});
// Check for noindex meta tag
const $ = cheerio.load(response.data);
if ($('meta[name="robots"][content*="noindex"]').length > 0) {
return {
emails: [],
phones: [],
addresses: [],
socialLinks: [],
source: url,
timestamp: new Date(),
attribution: 'Respecting noindex directive'
};
}
// Only extract contact information from public contact pages or structured data
const isContactPage = /contact|about/i.test(url) ||
$('h1, h2').text().toLowerCase().includes('contact');
const result = {
emails: new Set<string>(),
phones: new Set<string>(),
addresses: new Set<string>(),
socialLinks: new Set<string>(),
source: url,
timestamp: new Date(),
attribution: `Data from public business listing at ${new URL(url).hostname}`
};
// Extract from structured data (Schema.org)
$('script[type="application/ld+json"]').each((_, element) => {
try {
const data = JSON.parse($(element).html() || '{}');
if (data['@type'] === 'LocalBusiness' || data['@type'] === 'Organization') {
if (data.email) result.emails.add(data.email.toLowerCase());
if (data.telephone) result.phones.add(this.formatPhoneNumber(data.telephone));
if (data.address) {
const fullAddress = this.formatAddress(data.address);
if (fullAddress) result.addresses.add(fullAddress);
}
}
} catch (e) {
console.error('Error parsing JSON-LD:', e);
}
});
// Only scrape additional info if it's a contact page
if (isContactPage) {
// Extract clearly marked contact information
$('[itemprop="email"], .contact-email, .email').each((_, element) => {
const email = $(element).text().trim();
if (this.isValidEmail(email)) {
result.emails.add(email.toLowerCase());
}
});
$('[itemprop="telephone"], .phone, .contact-phone').each((_, element) => {
const phone = $(element).text().trim();
const formatted = this.formatPhoneNumber(phone);
if (formatted) result.phones.add(formatted);
});
}
const finalResult = {
...result,
emails: Array.from(result.emails),
phones: Array.from(result.phones),
addresses: Array.from(result.addresses),
socialLinks: Array.from(result.socialLinks)
};
this.cache.set(url, finalResult);
return finalResult;
} catch (error) {
console.error(`Failed to scrape ${url}:`, error);
return {
emails: [],
phones: [],
addresses: [],
socialLinks: [],
source: url,
timestamp: new Date(),
attribution: 'Error accessing page'
};
}
}
private isValidEmail(email: string): boolean {
return /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(email);
}
private formatPhoneNumber(phone: string): string {
const digits = phone.replace(/\D/g, '');
if (digits.length === 10) {
return `(${digits.slice(0,3)}) ${digits.slice(3,6)}-${digits.slice(6)}`;
}
return phone;
}
private formatAddress(address: any): string | null {
if (typeof address === 'string') return address;
if (typeof address === 'object') {
const parts = [
address.streetAddress,
address.addressLocality,
address.addressRegion,
address.postalCode
].filter(Boolean);
if (parts.length > 0) return parts.join(', ');
}
return null;
}
}
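
Example use, assuming the defaults above (cache, rate limit, robots.txt check) are acceptable:

```typescript
import { EmailScraper } from './lib/emailScraper';

const scraper = new EmailScraper();

scraper.scrapeEmails('https://example.com/contact').then(result => {
  // Empty arrays plus an explanatory attribution mean the page was skipped
  // (robots.txt disallow, noindex, or a fetch error) rather than scraped.
  console.log(result.emails, result.phones, result.attribution);
});
```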

@@ -0,0 +1,19 @@
import { Business, SearchParams } from '../../../types/business';
import { WebScraperProvider } from './webScraper';
export class BusinessProvider {
private scraper: WebScraperProvider;
constructor() {
this.scraper = new WebScraperProvider();
}
async search(params: SearchParams): Promise<Business[]> {
return this.scraper.search(params);
}
async getDetails(businessId: string): Promise<Business | null> {
// Implement detailed business lookup using stored data or additional scraping
return null;
}
}

@@ -0,0 +1,111 @@
import { Business, SearchParams } from '../../../types/business';
import { searchWeb } from '../search'; // This is Perplexica's existing search function
import { parseHTML } from '../utils/parser';
export class WebScraperProvider {
async search(params: SearchParams): Promise<Business[]> {
const searchQueries = this.generateQueries(params);
const businesses: Business[] = [];
for (const query of searchQueries) {
// Use Perplexica's existing search functionality
const results = await searchWeb(query, {
maxResults: 20,
type: 'general' // or 'news' depending on what we want
});
for (const result of results) {
try {
const html = await fetch(result.url).then(res => res.text());
const businessData = await this.extractBusinessData(html, result.url);
if (businessData) {
businesses.push(businessData);
}
} catch (error) {
console.error(`Failed to extract data from ${result.url}:`, error);
}
}
}
return this.deduplicateBusinesses(businesses);
}
private generateQueries(params: SearchParams): string[] {
const { location, category } = params;
return [
`${category} in ${location}`,
`${category} business ${location}`,
`best ${category} near ${location}`,
`${category} services ${location} reviews`
];
}
private async extractBusinessData(html: string, sourceUrl: string): Promise<Business | null> {
const $ = parseHTML(html);
// Different extraction logic based on source
if (sourceUrl.includes('yelp.com')) {
return this.extractYelpData($);
} else if (sourceUrl.includes('yellowpages.com')) {
return this.extractYellowPagesData($);
}
// ... other source-specific extractors
return null;
}
private extractYelpData($: any): Business | null {
try {
return {
id: crypto.randomUUID(),
name: $('.business-name').text().trim(),
phone: $('.phone-number').text().trim(),
address: $('.address').text().trim(),
city: $('.city').text().trim(),
state: $('.state').text().trim(),
zip: $('.zip').text().trim(),
category: $('.category-str-list').text().split(',').map(s => s.trim()),
rating: parseFloat($('.rating').text()),
reviewCount: parseInt($('.review-count').text()),
services: $('.services-list').text().split(',').map(s => s.trim()),
hours: this.extractHours($),
website: $('.website-link').attr('href'),
verified: false,
lastUpdated: new Date()
};
} catch (error) {
return null;
}
}
private deduplicateBusinesses(businesses: Business[]): Business[] {
// Group by phone number and address to identify duplicates
const uniqueBusinesses = new Map<string, Business>();
for (const business of businesses) {
const key = `${business.phone}-${business.address}`.toLowerCase();
if (!uniqueBusinesses.has(key)) {
uniqueBusinesses.set(key, business);
} else {
// Merge data if we have additional information
const existing = uniqueBusinesses.get(key)!;
uniqueBusinesses.set(key, this.mergeBusinessData(existing, business));
}
}
return Array.from(uniqueBusinesses.values());
}
private mergeBusinessData(existing: Business, newData: Business): Business {
return {
...existing,
services: [...new Set([...existing.services, ...newData.services])],
rating: (existing.rating + newData.rating) / 2,
reviewCount: existing.reviewCount + newData.reviewCount,
// Keep the most complete data for other fields
website: existing.website || newData.website,
email: existing.email || newData.email,
hours: existing.hours || newData.hours
};
}
}

src/lib/search.ts Normal file

@@ -0,0 +1,54 @@
import axios from 'axios';
import { config } from '../config';
interface SearchOptions {
maxResults?: number;
type?: 'general' | 'news';
engines?: string[];
}
interface SearchResult {
url: string;
title: string;
content: string;
score?: number;
}
export async function searchWeb(
query: string,
options: SearchOptions = {}
): Promise<SearchResult[]> {
const {
maxResults = 20,
type = 'general',
engines = ['google', 'bing', 'duckduckgo']
} = options;
try {
const response = await axios.get(`${config.search.searxngUrl || process.env.SEARXNG_URL}/search`, {
params: {
q: query,
format: 'json',
categories: type,
engines: engines.join(','),
limit: maxResults
}
});
if (!response.data || !response.data.results) {
console.error('Invalid response from SearxNG:', response.data);
return [];
}
return response.data.results.map((result: any) => ({
url: result.url,
title: result.title,
content: result.content || result.snippet || '',
score: result.score
}));
} catch (error) {
console.error('Search failed:', error);
throw error;
}
}

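A quick usage sketch for `searchWeb`, assuming a SearxNG instance is reachable at the configured `SEARXNG_URL` (the query string here is illustrative):

```typescript
import { searchWeb } from './lib/search';

async function demo() {
  const results = await searchWeb('coffee roasters in Portland', {
    maxResults: 10,
    type: 'general',
  });
  for (const r of results) {
    console.log(`${r.title} -> ${r.url}`);
  }
}

demo().catch(console.error);
```
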
View file

@ -0,0 +1,111 @@
import axios from 'axios';
import * as cheerio from 'cheerio';
import { Cache } from '../utils/cache';
import { RateLimiter } from '../utils/rateLimiter';
interface CrawlResult {
mainContent: string;
contactInfo: string;
aboutInfo: string;
structuredData: any;
}
export class BusinessCrawler {
private cache: Cache<CrawlResult>;
private rateLimiter: RateLimiter;
constructor() {
this.cache = new Cache<CrawlResult>(60); // 1 hour cache
this.rateLimiter = new RateLimiter();
}
async crawlBusinessSite(url: string): Promise<CrawlResult> {
// Check cache first
const cached = this.cache.get(url);
if (cached) return cached;
await this.rateLimiter.waitForSlot();
try {
const mainPage = await this.fetchPage(url);
const $ = cheerio.load(mainPage);
// Get all important URLs
const contactUrl = this.findContactPage($, url);
const aboutUrl = this.findAboutPage($, url);
// Crawl additional pages
const [contactPage, aboutPage] = await Promise.all([
contactUrl ? this.fetchPage(contactUrl) : '',
aboutUrl ? this.fetchPage(aboutUrl) : ''
]);
// Extract structured data
const structuredData = this.extractStructuredData($);
const result = {
mainContent: $('body').text(),
contactInfo: contactPage,
aboutInfo: aboutPage,
structuredData
};
this.cache.set(url, result);
return result;
} catch (error) {
console.error(`Failed to crawl ${url}:`, error);
return {
mainContent: '',
contactInfo: '',
aboutInfo: '',
structuredData: {}
};
}
}
private async fetchPage(url: string): Promise<string> {
try {
const response = await axios.get(url, {
timeout: 10000,
headers: {
'User-Agent': 'Mozilla/5.0 (compatible; BizSearch/1.0; +http://localhost:3000/about)',
}
});
return response.data;
} catch (error) {
console.error(`Failed to fetch ${url}:`, error);
return '';
}
}
private findContactPage($: cheerio.CheerioAPI, baseUrl: string): string | null {
const contactLinks = $('a[href*="contact"], a:contains("Contact")');
if (contactLinks.length > 0) {
const href = contactLinks.first().attr('href');
return href ? new URL(href, baseUrl).toString() : null;
}
return null;
}
private findAboutPage($: cheerio.CheerioAPI, baseUrl: string): string | null {
const aboutLinks = $('a[href*="about"], a:contains("About")');
if (aboutLinks.length > 0) {
const href = aboutLinks.first().attr('href');
return href ? new URL(href, baseUrl).toString() : null;
}
return null;
}
private extractStructuredData($: cheerio.CheerioAPI): any {
const structuredData: any[] = [];
$('script[type="application/ld+json"]').each((_, element) => {
try {
const data = JSON.parse($(element).html() || '{}');
structuredData.push(data);
} catch (error) {
console.error('Failed to parse structured data:', error);
}
});
return structuredData;
}
}

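A minimal sketch of driving the crawler; the import path and target URL are assumptions:

```typescript
import { BusinessCrawler } from './lib/businessCrawler'; // path assumed

const crawler = new BusinessCrawler();

async function inspect(url: string) {
  // Calls are rate limited and cached for an hour, so retries are cheap
  const result = await crawler.crawlBusinessSite(url);
  console.log('JSON-LD blocks:', result.structuredData.length);
  console.log('Found a contact page:', result.contactInfo.length > 0);
}

inspect('https://example.com').catch(console.error);
```
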
View file

@ -0,0 +1,71 @@
import { supabase } from '../supabase';
import { BusinessData } from '../searxng';
export class CacheService {
static async getCachedResults(category: string, location: string): Promise<BusinessData[] | null> {
try {
const { data, error } = await supabase
.from('search_cache')
.select('results')
.eq('category', category.toLowerCase())
.eq('location', location.toLowerCase())
.gt('expires_at', new Date().toISOString())
.order('created_at', { ascending: false })
.limit(1)
.single();
if (error) throw error;
return data ? data.results : null;
} catch (error) {
console.error('Cache lookup failed:', error);
return null;
}
}
static async cacheResults(
category: string,
location: string,
results: BusinessData[],
expiresInDays: number = 7
): Promise<void> {
try {
const expiresAt = new Date();
expiresAt.setDate(expiresAt.getDate() + expiresInDays);
const { error } = await supabase
.from('search_cache')
.insert({
query: `${category} in ${location}`,
category: category.toLowerCase(),
location: location.toLowerCase(),
results,
expires_at: expiresAt.toISOString()
});
if (error) throw error;
} catch (error) {
console.error('Failed to cache results:', error);
}
}
static async updateCache(
category: string,
location: string,
newResults: BusinessData[]
): Promise<void> {
try {
const { error } = await supabase
.from('search_cache')
.update({
results: newResults,
updated_at: new Date().toISOString()
})
.eq('category', category.toLowerCase())
.eq('location', location.toLowerCase());
if (error) throw error;
} catch (error) {
console.error('Failed to update cache:', error);
}
}
}

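The intended cache-aside flow, sketched under assumptions: the `BusinessData` import path and the `runLiveSearch` stand-in are hypothetical, only there to make the sketch self-contained:

```typescript
import { CacheService } from './lib/services/cacheService'; // path assumed
import type { BusinessData } from './lib/types';

// Stand-in for the real SearxNG-backed search, declared only for the sketch
declare function runLiveSearch(category: string, location: string): Promise<BusinessData[]>;

async function getBusinesses(category: string, location: string): Promise<BusinessData[]> {
  // Serve fresh cached rows when present...
  const cached = await CacheService.getCachedResults(category, location);
  if (cached) return cached;

  // ...otherwise search live and backfill the cache for a week
  const fresh = await runLiveSearch(category, location);
  await CacheService.cacheResults(category, location, fresh, 7);
  return fresh;
}
```
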
View file

@ -0,0 +1,107 @@
import { OllamaService } from './ollamaService';
interface ValidatedBusinessData {
name: string;
phone: string;
email: string;
address: string;
description: string;
hours?: string;
isValid: boolean;
}
export class DataValidationService {
private ollama: OllamaService;
constructor() {
this.ollama = new OllamaService();
}
async validateAndCleanData(rawText: string): Promise<ValidatedBusinessData> {
try {
const prompt = `
You are a business data validation expert. Extract and validate business information from the following text.
Return ONLY a JSON object with the following format, nothing else:
{
"name": "verified business name",
"phone": "formatted phone number or N/A",
"email": "verified email address or N/A",
"address": "verified physical address or N/A",
"description": "short business description",
"hours": "business hours if available",
"isValid": boolean
}
Rules:
1. Phone numbers should be in (XXX) XXX-XXXX format
2. Addresses should be properly formatted with street, city, state, zip
3. Remove any irrelevant text from descriptions
4. Set isValid to true only if name and at least one contact method is found
5. Clean up any obvious formatting issues
6. Validate email addresses for proper format
Text to analyze:
${rawText}
`;
const response = await this.ollama.complete(prompt);
try {
// Find the JSON object in the response
const jsonMatch = response.match(/\{[\s\S]*\}/);
if (!jsonMatch) {
throw new Error('No JSON found in response');
}
const result = JSON.parse(jsonMatch[0]);
return this.validateResult(result);
} catch (parseError) {
console.error('Failed to parse Ollama response:', parseError);
throw parseError;
}
} catch (error) {
console.error('Data validation failed:', error);
return {
name: 'Unknown',
phone: 'N/A',
email: 'N/A',
address: 'N/A',
description: '',
hours: '',
isValid: false
};
}
}
private validateResult(result: any): ValidatedBusinessData {
// Ensure all required fields are present
const validated: ValidatedBusinessData = {
name: this.cleanField(result.name) || 'Unknown',
phone: this.formatPhone(result.phone) || 'N/A',
email: this.cleanField(result.email) || 'N/A',
address: this.cleanField(result.address) || 'N/A',
description: this.cleanField(result.description) || '',
hours: this.cleanField(result.hours),
isValid: Boolean(result.isValid)
};
return validated;
}
private cleanField(value: any): string {
if (!value || typeof value !== 'string') return '';
return value.trim().replace(/\s+/g, ' ');
}
private formatPhone(phone: string): string {
if (!phone || phone === 'N/A') return 'N/A';
// Extract digits
const digits = phone.replace(/\D/g, '');
if (digits.length === 10) {
return `(${digits.slice(0,3)}) ${digits.slice(3,6)}-${digits.slice(6)}`;
}
return phone;
}
}

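A usage sketch, assuming the file lives at the path below and Ollama is running locally; the raw text is an invented example:

```typescript
import { DataValidationService } from './lib/services/dataValidationService'; // path assumed

async function demo() {
  const validator = new DataValidationService();
  const raw = `Joe's Plumbing, Denver's best! Call 3035551234
or email joe@joesplumbing.com. 1234 Main St, Denver, CO 80202.`;

  const clean = await validator.validateAndCleanData(raw);
  // Expect phone normalized to (303) 555-1234 and isValid true,
  // since a name and at least one contact method are present
  console.log(clean);
}

demo().catch(console.error);
```
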
View file

@ -0,0 +1,53 @@
import axios from 'axios';
import { env } from '../../config/env';
import { supabase } from '../supabase';
export class HealthCheckService {
static async checkOllama(): Promise<boolean> {
try {
const response = await axios.get(`${env.ollama.url}/api/tags`);
return response.status === 200;
} catch (error) {
console.error('Ollama health check failed:', error);
return false;
}
}
static async checkSearxNG(): Promise<boolean> {
try {
const response = await axios.get(`${env.searxng.currentUrl}/config`);
return response.status === 200;
} catch (error) {
try {
const response = await axios.get(`${env.searxng.instances[0]}/config`);
return response.status === 200;
} catch (fallbackError) {
console.error('SearxNG health check failed on primary and fallback:', error, fallbackError);
return false;
}
}
}
static async checkSupabase(): Promise<boolean> {
try {
console.log('Checking Supabase connection...');
console.log('URL:', env.supabase.url);
// Just check if we can connect and query, don't care about results
const { error } = await supabase
.from('businesses')
.select('count', { count: 'planned', head: true });
if (error) {
console.error('Supabase query error:', error);
return false;
}
console.log('Supabase connection successful');
return true;
} catch (error) {
console.error('Supabase connection failed:', error);
return false;
}
}
}

View file

@ -0,0 +1,36 @@
import axios from 'axios';
import { env } from '../../config/env';
interface OllamaResponse {
response: string;
context?: number[];
}
export class OllamaService {
private url: string;
private model: string;
constructor() {
this.url = env.ollama.url;
this.model = env.ollama.model;
}
async complete(prompt: string): Promise<string> {
try {
const response = await axios.post(`${this.url}/api/generate`, {
model: this.model,
prompt: prompt,
stream: false,
options: {
temperature: 0.7,
top_p: 0.9
}
});
return response.data.response;
} catch (error) {
console.error('Ollama completion failed:', error);
throw error;
}
}
}

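A one-off completion call, sketched for reference (the prompt is illustrative; assumes Ollama is serving the configured model at `OLLAMA_URL`):

```typescript
import { OllamaService } from './lib/services/ollamaService';

async function demo() {
  const ollama = new OllamaService();
  const summary = await ollama.complete(
    'Describe this business in one sentence: 24/7 emergency plumbing, licensed and insured.'
  );
  console.log(summary);
}

demo().catch(console.error);
```
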
View file

@ -0,0 +1,93 @@
import { createClient } from '@supabase/supabase-js';
import { env } from '../../config/env';
import { BusinessData } from '../searxng';
export class SupabaseService {
private supabase;
constructor() {
this.supabase = createClient(env.supabase.url, env.supabase.anonKey);
}
async upsertBusinesses(businesses: BusinessData[]): Promise<void> {
try {
console.log('Upserting businesses to Supabase:', businesses.length);
for (const business of businesses) {
try {
// Create a unique identifier based on multiple properties
const identifier = [
business.name.toLowerCase(),
business.phone?.replace(/\D/g, ''),
business.address?.toLowerCase(),
business.website?.toLowerCase()
]
.filter(Boolean) // Remove empty values
.join('_') // Join with underscore
.replace(/[^a-z0-9]/g, '_'); // Replace non-alphanumeric chars
// Log the data being inserted
console.log('Upserting business:', {
id: identifier,
name: business.name,
phone: business.phone,
email: business.email,
address: business.address,
rating: business.rating,
website: business.website,
location: business.location
});
// Check if business exists
const { data: existing, error: selectError } = await this.supabase
.from('businesses')
.select('rating, search_count')
.eq('id', identifier)
.single();
if (selectError && selectError.code !== 'PGRST116') {
console.error('Error checking existing business:', selectError);
}
// Prepare upsert data
const upsertData = {
id: identifier,
name: business.name,
phone: business.phone || null,
email: business.email || null,
address: business.address || null,
rating: existing ? Math.max(business.rating, existing.rating) : business.rating,
website: business.website || null,
logo: business.logo || null,
source: business.source || null,
description: business.description || null,
latitude: business.location?.lat || null,
longitude: business.location?.lng || null,
last_updated: new Date().toISOString(),
search_count: existing ? existing.search_count + 1 : 1
};
console.log('Upserting with data:', upsertData);
const { error: upsertError } = await this.supabase
.from('businesses')
.upsert(upsertData, {
onConflict: 'id'
});
if (upsertError) {
console.error('Error upserting business:', upsertError);
console.error('Failed business data:', upsertData);
} else {
console.log(`Successfully upserted business: ${business.name}`);
}
} catch (businessError) {
console.error('Error processing business:', business.name, businessError);
}
}
} catch (error) {
console.error('Error saving businesses to Supabase:', error);
throw error;
}
}
}

42
src/lib/supabase.ts Normal file
View file

@ -0,0 +1,42 @@
import { createClient } from '@supabase/supabase-js';
import { env } from '../config/env';
// Validate Supabase configuration
if (!env.supabase.url || !env.supabase.anonKey) {
throw new Error('Missing Supabase configuration');
}
// Create Supabase client
export const supabase = createClient(
env.supabase.url,
env.supabase.anonKey,
{
auth: {
autoRefreshToken: true,
persistSession: true
}
}
);
// Test the connection on startup
async function testConnection() {
try {
console.log('Checking Supabase connection...');
console.log('URL:', env.supabase.url);
const { error } = await supabase
.from('businesses')
.select('count', { count: 'planned', head: true });
if (error) {
console.error('❌ Supabase initialization error:', error);
} else {
console.log('✅ Supabase connection initialized successfully');
}
} catch (error) {
console.error('❌ Failed to initialize Supabase:', error);
}
}
// Run the test
testConnection().catch(console.error);

28
src/lib/types.ts Normal file
View file

@ -0,0 +1,28 @@
export interface BusinessData {
id?: string;
name: string;
phone?: string;
email?: string;
address?: string;
rating?: number;
website?: string;
logo?: string;
source?: string;
description?: string;
location?: {
lat: number;
lng: number;
};
latitude?: number;
longitude?: number;
place_id?: string;
photos?: string[];
openingHours?: string[];
distance?: {
value: number;
unit: string;
};
last_updated?: string;
search_count?: number;
created_at?: string;
}

36
src/lib/utils/cache.ts Normal file
View file

@ -0,0 +1,36 @@
interface CacheItem<T> {
data: T;
timestamp: number;
}
export class Cache<T> {
private store = new Map<string, CacheItem<T>>();
private ttl: number;
constructor(ttlMinutes: number = 60) {
this.ttl = ttlMinutes * 60 * 1000;
}
set(key: string, value: T): void {
this.store.set(key, {
data: value,
timestamp: Date.now()
});
}
get(key: string): T | null {
const item = this.store.get(key);
if (!item) return null;
if (Date.now() - item.timestamp > this.ttl) {
this.store.delete(key);
return null;
}
return item.data;
}
clear(): void {
this.store.clear();
}
}

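A quick demonstration of the TTL behavior, assuming the import path below:

```typescript
import { Cache } from './lib/utils/cache';

// A 5-minute cache: stale entries are evicted lazily on read
const pageCache = new Cache<string>(5);
pageCache.set('https://example.com', '<html>...</html>');

console.log(pageCache.get('https://example.com')); // hit within 5 minutes
// A get() after the TTL elapses returns null and deletes the stale entry
```
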
View file

@ -0,0 +1,30 @@
import { Business } from '../../types/business'; // needed by calculateReliabilityScore below

export function normalizePhoneNumber(phone: string): string {
return phone.replace(/[^\d]/g, '');
}
export function normalizeAddress(address: string): string {
// Remove common suffixes and standardize format
return address
.toLowerCase()
.replace(/(street|st\.?|avenue|ave\.?|road|rd\.?)/g, '')
.trim();
}
export function extractZipCode(text: string): string | null {
const match = text.match(/\b\d{5}(?:-\d{4})?\b/);
return match ? match[0] : null;
}
export function calculateReliabilityScore(business: Business): number {
let score = 0;
// More complete data = higher score
if (business.phone) score += 2;
if (business.website) score += 1;
if (business.email) score += 1;
if (business.hours) score += 2;
if (business.services.length > 0) score += 1;
if (business.reviewCount > 10) score += 2;
return score;
}

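To see how the normalizers support deduplication, a small sketch (the module path is assumed):

```typescript
import { normalizePhoneNumber, normalizeAddress, extractZipCode } from './lib/utils/normalizers'; // path assumed

// Two listings for the same shop collapse to one key once normalized
const keyA = `${normalizePhoneNumber('(303) 555-1234')}-${normalizeAddress('1234 Main Street')}`;
const keyB = `${normalizePhoneNumber('303.555.1234')}-${normalizeAddress('1234 Main St.')}`;
console.log(keyA === keyB); // true: both reduce to "3035551234-1234 main"

console.log(extractZipCode('Denver, CO 80202-1234')); // "80202-1234"
```
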
View file

@ -0,0 +1,23 @@
export class RateLimiter {
private timestamps: number[] = [];
private readonly windowMs: number;
private readonly maxRequests: number;
constructor(windowMs: number = 60000, maxRequests: number = 30) {
this.windowMs = windowMs;
this.maxRequests = maxRequests;
}
async waitForSlot(): Promise<void> {
const now = Date.now();
this.timestamps = this.timestamps.filter(time => now - time < this.windowMs);
if (this.timestamps.length >= this.maxRequests) {
const oldestRequest = this.timestamps[0];
const waitTime = this.windowMs - (now - oldestRequest);
await new Promise(resolve => setTimeout(resolve, waitTime));
}
this.timestamps.push(now);
}
}

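A usage sketch of the sliding-window limiter, with illustrative numbers (the defaults are 30 requests per minute):

```typescript
import { RateLimiter } from './lib/utils/rateLimiter';

// At most 5 requests per 10-second sliding window
const limiter = new RateLimiter(10_000, 5);

async function politeFetchAll(urls: string[]): Promise<void> {
  for (const url of urls) {
    await limiter.waitForSlot(); // sleeps until the oldest timestamp ages out
    await fetch(url);
  }
}
```
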
View file

@ -0,0 +1,119 @@
import * as cheerio from 'cheerio';
interface StructuredData {
name?: string;
email?: string;
phone?: string;
address?: string;
socialProfiles?: string[];
openingHours?: Record<string, string>;
description?: string;
}
export class StructuredDataParser {
static parse($: cheerio.CheerioAPI): StructuredData[] {
const results: StructuredData[] = [];
// Parse JSON-LD
$('script[type="application/ld+json"]').each((_, element) => {
try {
const data = JSON.parse($(element).html() || '{}');
if (Array.isArray(data)) {
data.forEach(item => this.parseStructuredItem(item, results));
} else {
this.parseStructuredItem(data, results);
}
} catch (e) {
console.error('Error parsing JSON-LD:', e);
}
});
// Parse microdata
$('[itemtype]').each((_, element) => {
const type = $(element).attr('itemtype');
if (type?.includes('Organization') || type?.includes('LocalBusiness')) {
const data: StructuredData = {
name: $('[itemprop="name"]', element).text(),
email: $('[itemprop="email"]', element).text(),
phone: $('[itemprop="telephone"]', element).text(),
address: this.extractMicrodataAddress($, element),
socialProfiles: this.extractSocialProfiles($, element)
};
results.push(data);
}
});
// Parse RDFa
$('[typeof="Organization"], [typeof="LocalBusiness"]').each((_, element) => {
const data: StructuredData = {
name: $('[property="name"]', element).text(),
email: $('[property="email"]', element).text(),
phone: $('[property="telephone"]', element).text(),
address: this.extractRdfaAddress($, element),
socialProfiles: this.extractSocialProfiles($, element)
};
results.push(data);
});
return results;
}
private static parseStructuredItem(data: any, results: StructuredData[]): void {
if (data['@type'] === 'Organization' || data['@type'] === 'LocalBusiness') {
results.push({
name: data.name,
email: data.email,
phone: data.telephone,
address: this.formatAddress(data.address),
socialProfiles: this.extractSocialUrls(data),
openingHours: this.parseOpeningHours(data.openingHours),
description: data.description
});
}
}
private static formatAddress(address: any): string | undefined {
if (typeof address === 'string') return address;
if (typeof address === 'object') {
const parts = [
address.streetAddress,
address.addressLocality,
address.addressRegion,
address.postalCode,
address.addressCountry
].filter(Boolean);
return parts.join(', ');
}
return undefined;
}
private static extractSocialUrls(data: any): string[] {
const urls: string[] = [];
if (data.sameAs) {
if (Array.isArray(data.sameAs)) {
urls.push(...data.sameAs);
} else if (typeof data.sameAs === 'string') {
urls.push(data.sameAs);
}
}
return urls;
}
private static parseOpeningHours(hours: any): Record<string, string> | undefined {
if (!hours) return undefined;
if (Array.isArray(hours)) {
const schedule: Record<string, string> = {};
hours.forEach(spec => {
const match = spec.match(/^(\w+)(-\w+)?\s+(\d\d:\d\d)-(\d\d:\d\d)$/);
if (match) {
schedule[match[1]] = `${match[3]}-${match[4]}`;
}
});
return schedule;
}
return undefined;
}
// ... helper methods for microdata and RDFa parsing ...
}

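A minimal sketch of parsing a JSON-LD block (the module path and HTML snippet are invented for illustration):

```typescript
import * as cheerio from 'cheerio';
import { StructuredDataParser } from './lib/utils/structuredDataParser'; // path assumed

const html = `
  <script type="application/ld+json">
    {"@type": "LocalBusiness", "name": "Joe's Plumbing",
     "telephone": "(303) 555-1234", "sameAs": ["https://facebook.com/joesplumbing"]}
  </script>`;

const $ = cheerio.load(html);
const [business] = StructuredDataParser.parse($);
console.log(business.name, business.phone, business.socialProfiles);
```
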
88
src/routes/api.ts Normal file
View file

@ -0,0 +1,88 @@
import { Router } from 'express';
import { searchBusinesses } from '../lib/searxng';
import { categories } from '../lib/categories';
import { supabase } from '../lib/supabase';
import { BusinessData } from '../lib/types';
const router = Router();
// Categories endpoint
router.get('/categories', (req, res) => {
res.json(categories);
});
// Search endpoint
router.get('/search', async (req, res) => {
try {
const query = req.query.q as string;
if (!query) {
return res.status(400).json({ error: 'Search query is required' });
}
// Queries are expected as "<term> in <location>"; guard before splitting so a missing q can't throw
const [searchTerm, location = ''] = query.split(' in ');
// Set headers for streaming response
res.setHeader('Content-Type', 'application/json');
res.setHeader('Transfer-Encoding', 'chunked');
// First, search in Supabase
const { data: existingResults, error: dbError } = await supabase
.from('businesses')
.select('*')
.or(`name.ilike.%${searchTerm}%, description.ilike.%${searchTerm}%`)
.ilike('address', `%${location}%`);
if (dbError) {
console.error('Supabase search error:', dbError);
}
// Send existing results immediately if there are any
if (existingResults && existingResults.length > 0) {
const chunk = JSON.stringify({
source: 'database',
results: existingResults
}) + '\n';
res.write(chunk);
}
// Start background search
const searchPromise = searchBusinesses(query, {
onProgress: (status, progress) => {
const chunk = JSON.stringify({
source: 'search',
status,
progress,
}) + '\n';
res.write(chunk);
}
});
const results = await searchPromise;
// Send final results
const finalChunk = JSON.stringify({
source: 'search',
results,
complete: true
}) + '\n';
res.write(finalChunk);
res.end();
} catch (error: unknown) {
console.error('Search error:', error);
const errorResponse = {
error: 'An error occurred while searching',
details: error instanceof Error ? error.message : 'Unknown error'
};
// Only send error response if headers haven't been sent
if (!res.headersSent) {
res.status(500).json(errorResponse);
} else {
res.write(JSON.stringify(errorResponse));
res.end();
}
}
});
export default router;

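Since the endpoint streams newline-delimited JSON chunks, a client has to read the body incrementally. A hypothetical consumer, assuming the server from this commit on port 3000:

```typescript
// Each line is a JSON object tagged with its source:
// "database" (instant cached rows) or "search" (progress updates, then final results)
async function streamSearch(query: string): Promise<void> {
  const res = await fetch(`http://localhost:3000/api/search?q=${encodeURIComponent(query)}`);
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    let nl;
    while ((nl = buffer.indexOf('\n')) >= 0) {
      const line = buffer.slice(0, nl).trim();
      buffer = buffer.slice(nl + 1);
      if (line) console.log(JSON.parse(line));
    }
  }
}

streamSearch('plumbers in Denver').catch(console.error);
```
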
102
src/test-supabase.ts Normal file
View file

@ -0,0 +1,102 @@
import { createClient } from '@supabase/supabase-js';
import dotenv from 'dotenv';
// Load environment variables
dotenv.config();
async function testSupabaseConnection() {
console.log('Testing Supabase connection...');
console.log('URL:', process.env.SUPABASE_URL);
console.log('Key length:', process.env.SUPABASE_KEY?.length || 0);
try {
const supabase = createClient(
process.env.SUPABASE_URL!,
process.env.SUPABASE_KEY!,
{
auth: {
autoRefreshToken: true,
persistSession: true
}
}
);
// Test businesses table
console.log('\nTesting businesses table:');
const testBusiness = {
id: 'test_' + Date.now(),
name: 'Test Business',
phone: '123-456-7890',
email: 'test@example.com',
address: '123 Test St',
rating: 5,
website: 'https://test.com',
source: 'test',
description: 'Test description',
latitude: 39.7392,
longitude: -104.9903,
search_count: 1,
created_at: new Date().toISOString()
};
const { error: insertBusinessError } = await supabase
.from('businesses')
.insert([testBusiness])
.select();
if (insertBusinessError) {
console.error('❌ INSERT business error:', insertBusinessError);
} else {
console.log('✅ INSERT business OK');
// Clean up
await supabase.from('businesses').delete().eq('id', testBusiness.id);
}
// Test searches table
console.log('\nTesting searches table:');
const testSearch = {
query: 'test query',
location: 'test location',
results_count: 0,
timestamp: new Date().toISOString()
};
const { error: insertSearchError } = await supabase
.from('searches')
.insert([testSearch])
.select();
if (insertSearchError) {
console.error('❌ INSERT search error:', insertSearchError);
} else {
console.log('✅ INSERT search OK');
}
// Test cache table
console.log('\nTesting cache table:');
const testCache = {
key: 'test_key_' + Date.now(),
value: { test: true },
created_at: new Date().toISOString(),
expires_at: new Date(Date.now() + 3600000).toISOString()
};
const { error: insertCacheError } = await supabase
.from('cache')
.insert([testCache])
.select();
if (insertCacheError) {
console.error('❌ INSERT cache error:', insertCacheError);
} else {
console.log('✅ INSERT cache OK');
// Clean up
await supabase.from('cache').delete().eq('key', testCache.key);
}
} catch (error: any) {
console.error('❌ Unexpected error:', error);
}
}
testSupabaseConnection().catch(console.error);

94
src/tests/supabaseTest.ts Normal file
View file

@ -0,0 +1,94 @@
import '../config/env'; // Load env vars first
import { CacheService } from '../lib/services/cacheService';
import type { PostgrestError } from '@supabase/supabase-js';
import { env } from '../config/env';
async function testSupabaseConnection() {
console.log('\n🔍 Testing Supabase Connection...');
console.log('Using Supabase URL:', env.supabase.url);
try {
// Test data
const testData = {
category: 'test_category',
location: 'test_location',
results: [{
name: 'Test Business',
phone: '123-456-7890',
email: 'test@example.com',
address: '123 Test St, Test City, TS 12345',
rating: 95,
website: 'https://test.com',
logo: '',
source: 'test',
description: 'Test business description'
}]
};
console.log('\n1️⃣ Testing write operation...');
await CacheService.cacheResults(
testData.category,
testData.location,
testData.results,
env.cache.durationDays
);
console.log('✅ Write successful');
console.log('\n2️⃣ Testing read operation...');
const cachedResults = await CacheService.getCachedResults(
testData.category,
testData.location
);
if (cachedResults && cachedResults.length > 0) {
console.log('✅ Read successful');
console.log('\nCached data:', JSON.stringify(cachedResults[0], null, 2));
} else {
throw new Error('No results found in cache');
}
console.log('\n3️⃣ Testing update operation...');
const updatedResults = [...testData.results];
updatedResults[0].rating = 98;
await CacheService.updateCache(
testData.category,
testData.location,
updatedResults
);
console.log('✅ Update successful');
console.log('\n✨ All tests passed! Supabase connection is working properly.\n');
} catch (error: unknown) {
console.error('\n❌ Test failed:');
if (error instanceof Error) {
console.error('Error message:', error.message);
// Check if it's a Supabase error by looking at the shape of the error object
const isSupabaseError = (err: any): err is PostgrestError =>
'code' in err && 'details' in err && 'hint' in err && 'message' in err;
if (error.message.includes('connection') || isSupabaseError(error)) {
console.log('\n📋 Troubleshooting steps:');
console.log('1. Check if your SUPABASE_URL and SUPABASE_ANON_KEY are correct in .env');
console.log('2. Verify that the search_cache table exists in your Supabase project');
console.log('3. Check if RLS policies are properly configured');
if (isSupabaseError(error)) {
console.log('\nSupabase error details:');
console.log('Code:', error.code);
console.log('Details:', error.details);
console.log('Hint:', error.hint);
}
}
} else {
console.error('Unknown error:', error);
}
process.exit(1);
}
}
// Run the test
testSupabaseConnection();

26
src/tests/testSearch.ts Normal file
View file

@ -0,0 +1,26 @@
import { searchSearxng } from '../lib/searxng';
async function testSearchEngine() {
try {
console.log('Testing SearxNG connection...');
const results = await searchSearxng('plumbers in Denver', {
engines: ['google', 'bing', 'duckduckgo'],
pageno: 1
});
if (results && results.results && results.results.length > 0) {
console.log('✅ Search successful!');
console.log('Number of results:', results.results.length);
console.log('First result:', results.results[0]);
} else {
console.log('❌ No results found');
}
} catch (error) {
console.error('❌ Search test failed:', error);
console.error('Make sure SearxNG is running on http://localhost:4000');
}
}
testSearchEngine();

28
src/types/business.ts Normal file
View file

@ -0,0 +1,28 @@
export interface Business {
id: string;
name: string;
phone: string;
address: string;
city: string;
state: string;
zip: string;
category: string[];
rating: number;
reviewCount: number;
license?: string;
services: string[];
hours: Record<string, string>;
website?: string;
email?: string;
verified: boolean;
lastUpdated: Date;
}
export interface SearchParams {
location: string;
category?: string;
radius?: number;
minRating?: number;
sortBy?: 'rating' | 'distance' | 'reviewCount';
verified?: boolean;
}

View file

@ -1,18 +1,17 @@
 {
   "compilerOptions": {
-    "lib": ["ESNext"],
-    "module": "Node16",
-    "moduleResolution": "Node16",
-    "target": "ESNext",
-    "outDir": "dist",
-    "sourceMap": false,
+    "target": "ES2020",
+    "module": "commonjs",
+    "lib": ["es2020", "DOM"],
+    "outDir": "./dist",
+    "rootDir": "./src",
+    "strict": true,
     "esModuleInterop": true,
-    "experimentalDecorators": true,
-    "emitDecoratorMetadata": true,
-    "allowSyntheticDefaultImports": true,
     "skipLibCheck": true,
-    "skipDefaultLibCheck": true
+    "forceConsistentCasingInFileNames": true,
+    "moduleResolution": "node",
+    "resolveJsonModule": true
   },
-  "include": ["src"],
-  "exclude": ["node_modules", "**/*.spec.ts"]
+  "include": ["src/**/*"],
+  "exclude": ["node_modules", "dist"]
 }