Update botserver: KB indexing fixes (kb_collections upsert, collection names, indexed flag)
parent 5b5e3202e5
commit 6b857e8d17
2 changed files with 45 additions and 1 deletions
@@ -1 +1 @@
-Subproject commit e81aee62211143e453d14bbf11b647576e1f2e23
+Subproject commit 7a1ec157f10f6aba6e7fa329fad30219da4049d4
44
prompts/v6.2.md
Normal file
@@ -0,0 +1,44 @@
# v6.2: Make KB "cartas" work end-to-end

## What we want

User clicks "Cartas" → `cartas.bas` runs → `USE KB "cartas"` → searches Qdrant → bot answers with KB content. No restarts.

## 3 Bugs we found
### Bug 1: KB files re-indexed every 10s (wasteful) ✅ FIXED

On every 10-second cycle, `check_gbkb_changes` replaces `file_states` with entries marked `indexed: false`, so DriveMonitor re-downloads and re-indexes all PDFs each cycle, even when nothing has changed.

**Fix:** Preserve `indexed: true` when the etag hasn't changed.
**File:** `botserver/src/drive/drive_monitor/mod.rs:1376`
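The idea behind this fix can be sketched as a merge of the previous state map with the fresh listing. This is a minimal illustration, not the code in `drive_monitor/mod.rs`: the `FileState` struct, its fields, and the `merge_file_states` function are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

/// Hypothetical per-file state; the real struct in `drive_monitor` may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct FileState {
    pub etag: String,
    pub indexed: bool,
}

/// Build the next state map from a fresh listing of (path, etag) pairs.
/// A file keeps `indexed: true` only if it was already indexed and its
/// etag is unchanged; new or modified files start as `indexed: false`.
pub fn merge_file_states(
    previous: &HashMap<String, FileState>,
    listing: &[(String, String)],
) -> HashMap<String, FileState> {
    listing
        .iter()
        .map(|(path, etag)| {
            let indexed = previous
                .get(path)
                .map_or(false, |old| old.indexed && old.etag == *etag);
            (
                path.clone(),
                FileState {
                    etag: etag.clone(),
                    indexed,
                },
            )
        })
        .collect()
}
```

With this shape, an unchanged PDF is skipped on the next cycle, while a file whose etag changed is picked up for re-indexing exactly once.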
### Bug 2: USE KB looks for wrong collection name ✅ FIXED

When `kb_collections` has no entry for "cartas", `USE KB` builds a collection name from a random UUID (`salesianos_<random>_cartas`), but the collection in Qdrant is named `salesianos_6deedba8_cartas`. The two names never match, so the search returns nothing.

**Fix:** Use `bot_id_short` (first 8 chars of the bot UUID) consistently. Also changed `ON CONFLICT DO NOTHING` to `DO UPDATE` so stale entries get corrected.
**File:** `botserver/src/basic/keywords/use_kb.rs:221-244`
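A deterministic name derivation along these lines would look roughly as follows. The `<bot>_<short id>_<kb>` pattern is inferred from the example name `salesianos_6deedba8_cartas`; the function signatures are hypothetical, and the full UUID in the usage note is made up for illustration.

```rust
/// Hypothetical helper: first 8 characters of the bot UUID's string form.
pub fn bot_id_short(bot_id: &str) -> String {
    bot_id.chars().take(8).collect()
}

/// Derive the Qdrant collection name from stable inputs only, so that
/// indexing and `USE KB` always arrive at the same string.
/// The `<bot>_<short id>_<kb>` layout is an assumption based on the
/// example name quoted in these notes.
pub fn collection_name(bot_name: &str, bot_id: &str, kb_name: &str) -> String {
    format!("{}_{}_{}", bot_name, bot_id_short(bot_id), kb_name)
}
```

For example, `collection_name("salesianos", "6deedba8-1111-2222-3333-444444444444", "cartas")` yields `salesianos_6deedba8_cartas` on every call, whereas a name built from a freshly generated UUID changes each time.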
### Bug 3: KB indexing never writes to kb_collections table ✅ FIXED

`index_kb_folder` creates a Qdrant collection and indexes documents, but never writes a row to `kb_collections`. So when `USE KB "cartas"` runs, it queries `kb_collections`, finds nothing, and hits Bug 2's fallback path.

**Fix:** After indexing, upsert into `kb_collections` with the correct collection name.
**File:** `botserver/src/core/kb/mod.rs:167-220`

Also changed the `process_gbkb_folder` return type from `Result<()>` to `Result<IndexingResult>` so `handle_gbkb_change` can use `collection_name` and `documents_processed`.
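The upsert behaviour can be modelled in memory as below. This is only a sketch of the semantics, not the database code: `KbCollections` is a hypothetical stand-in for the `kb_collections` table, the `(bot_id, kb_name)` key and the `usize` type of `documents_processed` are assumptions, and the real `IndexingResult` may carry more fields than the two named above.

```rust
use std::collections::HashMap;

/// The two fields named in these notes; the real struct may have more.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexingResult {
    pub collection_name: String,
    pub documents_processed: usize,
}

/// In-memory stand-in for the `kb_collections` table.
/// Keying rows by (bot_id, kb_name) is an assumption for this sketch.
#[derive(Default)]
pub struct KbCollections {
    rows: HashMap<(String, String), IndexingResult>,
}

impl KbCollections {
    /// Upsert: insert a new row, or overwrite a stale one.
    /// This mirrors `DO UPDATE` rather than `DO NOTHING`.
    pub fn upsert(&mut self, bot_id: &str, kb_name: &str, result: IndexingResult) {
        self.rows
            .insert((bot_id.to_string(), kb_name.to_string()), result);
    }

    /// What `USE KB` would read back: the stored collection name, if any.
    pub fn lookup(&self, bot_id: &str, kb_name: &str) -> Option<&str> {
        self.rows
            .get(&(bot_id.to_string(), kb_name.to_string()))
            .map(|row| row.collection_name.as_str())
    }
}
```

Calling `upsert` after each indexing run means a later `lookup` returns the name that indexing actually used, and a stale row left over from an earlier run is replaced instead of being kept.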
## Checklist

- [x] Bug 1 code fix (file_states indexed flag)
- [x] Bug 2 code fix (USE KB collection name)
- [x] Bug 3 code fix (kb_collections upsert after indexing)
- [x] `cargo check -p botserver` passes
- [ ] Push botserver → origin + ALM
- [ ] Push main repo → origin + ALM
- [ ] Deploy to production (ask user first)
- [ ] Restart botserver (one-time for new binary)
- [ ] Test: click "Cartas" → verify KB search works
- [ ] Test: click "Procedimentos" → verify KB search works
- [ ] Verify PROMPT.md injection