First commit from first beta.
This commit is contained in:
124
PGVECTOR_SETUP.md
Normal file
124
PGVECTOR_SETUP.md
Normal file
@@ -0,0 +1,124 @@
|
||||
# pgvector Setup Guide for PostgreSQL 16
|
||||
|
||||
## 1. Install pgvector on your server
|
||||
|
||||
### Debian / Ubuntu
|
||||
```bash
|
||||
sudo apt install postgresql-16-pgvector
|
||||
```
|
||||
|
||||
### From source (if package not available)
|
||||
```bash
|
||||
sudo apt install postgresql-server-dev-16 build-essential git
|
||||
git clone https://github.com/pgvector/pgvector.git
|
||||
cd pgvector
|
||||
make
|
||||
sudo make install
|
||||
```
|
||||
|
||||
### Arch Linux
|
||||
```bash
|
||||
yay -S pgvector
|
||||
# or
|
||||
paru -S pgvector
|
||||
```
|
||||
|
||||
### Docker (if running PostgreSQL in a container)
|
||||
Use the official pgvector image instead:
|
||||
```dockerfile
|
||||
FROM pgvector/pgvector:pg16
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 2. Enable the extension in your database
|
||||
|
||||
Connect to your database and run:
|
||||
```sql
|
||||
\c manel_bot
|
||||
CREATE EXTENSION IF NOT EXISTS vector;
|
||||
```
|
||||
|
||||
The bot also does this automatically on startup via `init_db()`.
|
||||
|
||||
---
|
||||
|
||||
## 3. Create the vector index (do this after ~1000 messages)
|
||||
|
||||
Running this on an empty table does nothing useful. Wait until you have
|
||||
a good amount of history, then run:
|
||||
|
||||
```sql
|
||||
\c manel_bot
|
||||
CREATE INDEX idx_history_embedding
|
||||
ON history USING ivfflat (embedding vector_cosine_ops)
|
||||
WITH (lists = 100);
|
||||
```
|
||||
|
||||
**Why wait?** IVFFlat needs enough data to build meaningful clusters.
|
||||
Until then the bot works fine via sequential scan — just slightly slower
|
||||
for very large histories.
|
||||
|
||||
For smaller deployments (<100k rows) you can also use HNSW which
|
||||
doesn't need pre-training:
|
||||
```sql
|
||||
CREATE INDEX idx_history_embedding
|
||||
ON history USING hnsw (embedding vector_cosine_ops)
|
||||
WITH (m = 16, ef_construction = 64);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 4. Check the dimension matches your model
|
||||
|
||||
nomic-embed-text-v2-moe outputs **768-dimensional** vectors.
|
||||
This is set in `.env_credentials` as `EMBED_DIM=768`.
|
||||
|
||||
If you ever switch embedding models, update `EMBED_DIM` to match
|
||||
and recreate the history table (the vector column dimension is fixed):
|
||||
```sql
|
||||
-- Only if switching models — this drops all history!
|
||||
ALTER TABLE history DROP COLUMN embedding;
|
||||
ALTER TABLE history ADD COLUMN embedding vector(NEW_DIM);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Verify everything is working
|
||||
|
||||
```sql
|
||||
\c manel_bot
|
||||
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
|
||||
-- Should return: vector | 0.8.x
|
||||
|
||||
SELECT COUNT(*), COUNT(embedding) FROM history;
|
||||
-- After some messages: total count and how many have embeddings
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Ollama model setup
|
||||
|
||||
Make sure the model is pulled on your Ollama machine:
|
||||
```bash
|
||||
ollama pull nomic-embed-text-v2-moe
|
||||
```
|
||||
|
||||
Test it's reachable from the bot server:
|
||||
```bash
|
||||
curl http://YOUR_OLLAMA_IP:11434/api/embed \
|
||||
-d '{"model":"nomic-embed-text-v2-moe","input":"test"}'
|
||||
```
|
||||
|
||||
Should return a JSON with an `embeddings` array.
|
||||
|
||||
|
||||
|
||||
sudo -u postgres psql
|
||||
sqlCREATE USER manel WITH PASSWORD 'choose_a_password';
|
||||
CREATE DATABASE manel_bot OWNER manel;
|
||||
GRANT ALL PRIVILEGES ON DATABASE manel_bot TO manel;
|
||||
get in the database and
|
||||
CREATE EXTENSION IF NOT EXISTS vector;
|
||||
|
||||
python bot.py 2>&1 | tee manel.log
|
||||
Reference in New Issue
Block a user