Update botserver: KB indexing fixes (kb_collections upsert, collection names, indexed flag)

Rodrigo Rodriguez (Pragmatismo) 2026-04-11 21:26:24 -03:00
parent 5b5e3202e5
commit 6b857e8d17
2 changed files with 45 additions and 1 deletions

@@ -1 +1 @@
-Subproject commit e81aee62211143e453d14bbf11b647576e1f2e23
+Subproject commit 7a1ec157f10f6aba6e7fa329fad30219da4049d4

prompts/v6.2.md Normal file

@@ -0,0 +1,44 @@
# v6.2 — Make KB "cartas" work end-to-end
## What we want
User clicks "Cartas" → `cartas.bas` runs → `USE KB "cartas"` → searches Qdrant → bot answers with KB content. No restarts.
## Three bugs we found
### Bug 1: KB files re-indexed every 10s (wasteful) ✅ FIXED
On every cycle, `check_gbkb_changes` replaces `file_states` with entries marked `indexed: false`, so DriveMonitor re-downloads and re-indexes all PDFs every 10 seconds, even when nothing has changed.
**Fix:** Preserve `indexed: true` when a file's etag hasn't changed.
**File:** `botserver/src/drive/drive_monitor/mod.rs:1376`
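
The Bug 1 fix can be sketched as a merge step over the previous file states. This is a minimal illustration, not the code in `drive_monitor/mod.rs`: the `FileState` struct, the `merge_file_states` function, and the `(path, etag)` listing shape are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical per-file state; the real struct in `drive_monitor` may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct FileState {
    pub etag: String,
    pub indexed: bool,
}

/// Build the new state map from a fresh listing of `(path, etag)` pairs.
/// A file keeps `indexed: true` only if it was already indexed AND its etag
/// is unchanged; new or modified files start as `indexed: false`.
pub fn merge_file_states(
    previous: &HashMap<String, FileState>,
    listing: &[(String, String)],
) -> HashMap<String, FileState> {
    listing
        .iter()
        .map(|(path, etag)| {
            let indexed = previous
                .get(path)
                .map(|old| old.indexed && old.etag == *etag)
                .unwrap_or(false);
            (
                path.clone(),
                FileState {
                    etag: etag.clone(),
                    indexed,
                },
            )
        })
        .collect()
}
```

With this shape, an unchanged PDF is skipped on the next cycle, while a file whose etag changed (or a file seen for the first time) is still picked up for indexing.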
### Bug 2: USE KB looks for wrong collection name ✅ FIXED
When `kb_collections` has no entry for "cartas", `USE KB` builds a collection name from a random UUID (`salesianos_<random>_cartas`). The collection that actually exists in Qdrant is `salesianos_6deedba8_cartas`, so the two names never match and the search returns nothing.
**Fix:** Use `bot_id_short` (first 8 chars of bot UUID) consistently. Also changed `ON CONFLICT DO NOTHING` to `DO UPDATE` so stale entries get corrected.
**File:** `botserver/src/basic/keywords/use_kb.rs:221-244`
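
A minimal sketch of the deterministic naming described in Bug 2. The `<prefix>_<bot_id_short>_<kb_name>` layout is inferred from the example name `salesianos_6deedba8_cartas`; the function names, the assumption that the prefix is the bot name, and the UUID used in the usage note are hypothetical.

```rust
/// First 8 characters of the bot UUID's text form.
/// Falls back to the whole string if it is shorter than 8 bytes.
pub fn bot_id_short(bot_id: &str) -> &str {
    bot_id.get(..8).unwrap_or(bot_id)
}

/// Derive the Qdrant collection name from stable inputs only, so the indexer
/// and `USE KB` compute the same name (assumed layout: prefix, short id, KB name).
pub fn kb_collection_name(bot_name: &str, bot_id: &str, kb_name: &str) -> String {
    format!("{}_{}_{}", bot_name, bot_id_short(bot_id), kb_name)
}
```

For a made-up bot UUID such as `6deedba8-0000-0000-0000-000000000000`, `kb_collection_name("salesianos", id, "cartas")` yields `salesianos_6deedba8_cartas`. Because no random component is involved, the lookup path and the indexing path cannot drift apart.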
### Bug 3: KB indexing never writes to kb_collections table ✅ FIXED
`index_kb_folder` creates a Qdrant collection and indexes documents, but never writes a row to `kb_collections`. So when `USE KB "cartas"` runs, it queries `kb_collections` → empty → hits Bug 2's fallback path.
**Fix:** After indexing, upsert into `kb_collections` with correct collection name.
**File:** `botserver/src/core/kb/mod.rs:167-220`
Also changed `process_gbkb_folder` return type from `Result<()>` to `Result<IndexingResult>` so `handle_gbkb_change` can use `collection_name` and `documents_processed`.
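
The upsert in Bug 3 can be illustrated with an in-memory stand-in for the `kb_collections` table. This is only a sketch of the insert-or-overwrite semantics: the `KbCollections` type, its `(bot_id, kb_name)` key, and the `usize` type of `documents_processed` are assumptions, and the real code writes to the database rather than to a `HashMap`.

```rust
use std::collections::HashMap;

/// Carries the two fields named in the notes; the real struct may have more.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexingResult {
    pub collection_name: String,
    pub documents_processed: usize,
}

/// In-memory stand-in for the `kb_collections` table,
/// keyed by a hypothetical `(bot_id, kb_name)` pair.
#[derive(Default)]
pub struct KbCollections {
    rows: HashMap<(String, String), IndexingResult>,
}

impl KbCollections {
    /// Upsert: insert a new row or overwrite an existing one, which mirrors
    /// `DO UPDATE` rather than `DO NOTHING`, so stale names get corrected.
    pub fn upsert(&mut self, bot_id: &str, kb_name: &str, result: &IndexingResult) {
        self.rows
            .insert((bot_id.to_string(), kb_name.to_string()), result.clone());
    }

    /// What `USE KB` would read back: the collection name, if a row exists.
    pub fn lookup(&self, bot_id: &str, kb_name: &str) -> Option<&str> {
        self.rows
            .get(&(bot_id.to_string(), kb_name.to_string()))
            .map(|row| row.collection_name.as_str())
    }
}
```

Calling `upsert` with the `IndexingResult` returned from indexing means a later `lookup` finds the correct name, and a second `upsert` for the same key replaces whatever stale value was stored before.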
## Checklist
- [x] Bug 1 code fix (file_states indexed flag)
- [x] Bug 2 code fix (USE KB collection name)
- [x] Bug 3 code fix (kb_collections upsert after indexing)
- [x] `cargo check -p botserver` passes
- [ ] Push botserver → origin + ALM
- [ ] Push main repo → origin + ALM
- [ ] Deploy to production (ask user first)
- [ ] Restart botserver (one-time for new binary)
- [ ] Test: click "Cartas" → verify KB search works
- [ ] Test: click "Procedimentos" → verify KB search works
- [ ] Verify PROMPT.md injection