Files
Telegram_AI_bot/PGVECTOR_SETUP.md
2026-04-13 22:46:12 +00:00

2.8 KiB

pgvector Setup Guide for PostgreSQL 16

1. Install pgvector on your server

Debian / Ubuntu

sudo apt install postgresql-16-pgvector

From source (if package not available)

sudo apt install postgresql-server-dev-16 build-essential git
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install

Arch Linux

yay -S pgvector
# or
paru -S pgvector

Docker (if running PostgreSQL in a container)

Use the official pgvector image instead:

FROM pgvector/pgvector:pg16

2. Enable the extension in your database

Connect to your database and run:

\c manel_bot
CREATE EXTENSION IF NOT EXISTS vector;

The bot also does this automatically on startup via init_db().


3. Create the vector index (do this after ~1000 messages)

Running this on an empty table does nothing useful. Wait until you have a good amount of history, then run:

\c manel_bot
CREATE INDEX idx_history_embedding
    ON history USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

Why wait? IVFFlat needs enough data to build meaningful clusters. Until then the bot works fine via sequential scan — just slightly slower for very large histories.

For smaller deployments (<100k rows) you can also use HNSW which doesn't need pre-training:

CREATE INDEX idx_history_embedding
    ON history USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

4. Check the dimension matches your model

nomic-embed-text-v2-moe outputs 768-dimensional vectors. This is set in .env_credentials as EMBED_DIM=768.

If you ever switch embedding models, update EMBED_DIM to match and recreate the history table (the vector column dimension is fixed):

-- Only if switching models — this drops all history!
ALTER TABLE history DROP COLUMN embedding;
ALTER TABLE history ADD COLUMN embedding vector(NEW_DIM);

5. Verify everything is working

\c manel_bot
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
-- Should return: vector | 0.8.x

SELECT COUNT(*), COUNT(embedding) FROM history;
-- After some messages: total count and how many have embeddings

Ollama model setup

Make sure the model is pulled on your Ollama machine:

ollama pull nomic-embed-text-v2-moe

Test it's reachable from the bot server:

curl http://YOUR_OLLAMA_IP:11434/api/embed \
  -d '{"model":"nomic-embed-text-v2-moe","input":"test"}'

Should return a JSON with an embeddings array.

sudo -u postgres psql sqlCREATE USER manel WITH PASSWORD 'choose_a_password'; CREATE DATABASE manel_bot OWNER manel; GRANT ALL PRIVILEGES ON DATABASE manel_bot TO manel; get in the database and CREATE EXTENSION IF NOT EXISTS vector;

python bot.py 2>&1 | tee manel.log