botserver

Author	SHA1	Message	Date
Rodrigo Rodriguez (Pragmatismo)	86939c17d8	fix: stop KB re-indexing every cycle, add kb_indexed_folders tracking All checks were successful BotServer CI/CD / build (push) Successful in 6m13s Details - Add kb_indexed_folders set to track successfully indexed KB folders - Skip re-queuing KB for indexing if already indexed and files unchanged - Remove kb_key from indexed set when files change (forces re-index) - Clear indexed set on KB folder deletion - Fix hardcoded salesianos in drive_monitor prompt key (from previous commit)	2026-04-13 09:37:15 -03:00
Rodrigo Rodriguez (Pragmatismo)	dd68cdbe6c	fix: remove hardcoded salesianos, strip think tags globally, block reasoning_content leak All checks were successful BotServer CI/CD / build (push) Successful in 6m38s Details - drive_monitor: replace hardcoded salesianos.gbot with dynamic bot_name - llm/mod.rs: stop falling back to reasoning_content as content - llm/claude.rs: same fix for Claude handler - deepseek_r3: export strip_think_tags for reuse - gpt_oss_20b: use strip_think_tags so all models strip tags - gpt_oss_120b: use strip_think_tags so all models strip tags	2026-04-13 09:04:22 -03:00
Rodrigo Rodriguez (Pragmatismo)	dbec0df923	fix: DriveMonitor config.csv sync uses Last-Modified in addition to ETag All checks were successful BotServer CI/CD / build (push) Successful in 5m46s Details ETag in MinIO is an MD5 content hash, so re-uploading the same content preserves the ETag. Add last_modified comparison so config.csv changes that don't alter content hash still get synced. Also fixes EmbeddingConfig fallback from previous commit.	2026-04-13 08:33:37 -03:00
Rodrigo Rodriguez (Pragmatismo)	666acb9360	fix: DEADLOCK in check_gbkb_changes - removed nested file_states read lock All checks were successful BotServer CI/CD / build (push) Successful in 3m44s Details Root cause: file_states.write().await was held while trying to acquire file_states.read().await for KB backoff check. Tokio RwLock is not reentrant - this caused permanent deadlock. Fix: Removed the file_states.read() backoff check. KB processor now just checks files_being_indexed set and queues to pending_kb_index. Backoff is handled by the KB processor itself based on fail_count. This fixes salesianos DriveMonitor hanging for 5+ minutes every cycle. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 22:28:02 -03:00
Rodrigo Rodriguez (Pragmatismo)	3322234712	debug: add logging to track check_gbkb_changes hang All checks were successful BotServer CI/CD / build (push) Successful in 3m40s Details Added debug logging at key points in check_gbkb_changes: - ENTER with bot ID and prefix - Object listing results - File states lock acquisition - New/modified file detection - PDF detection - File download batches - Final remaining files download - EXIT confirmation This will help identify exactly where the 5-minute timeout occurs. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 22:09:38 -03:00
Rodrigo Rodriguez (Pragmatismo)	8e539206d4	fix: KB processor works with and without llm/research features All checks were successful BotServer CI/CD / build (push) Successful in 3m55s Details - Added stub start_kb_processor() for non-llm builds - Added _pending_kb_index field for non-llm builds - Extracted KB processor logic into start_kb_processor_inner() - Removed unused is_embedding_server_ready import This ensures DriveMonitor compiles and runs correctly in production where CI builds without --features llm. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 21:40:06 -03:00
Rodrigo Rodriguez (Pragmatismo)	112ac51da3	fix: KB processor runs as background task, no longer blocks check_for_changes All checks were successful BotServer CI/CD / build (push) Successful in 3m50s Details - Added start_kb_processor() method: long-running background task per bot - check_gbkb_changes now queues KB folders to pending_kb_index (non-blocking) - KB processor polls pending_kb_index and processes one at a time per bot - Removed inline tokio::spawn from check_gbkb_changes that was causing 5min timeouts - Added pending_kb_index field to DriveMonitor struct This fixes salesianos DriveMonitor timeout - check_for_changes now completes in seconds instead of hanging on KB embedding/indexing. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 21:28:03 -03:00
Rodrigo Rodriguez (Pragmatismo)	ad998b52d4	fix: check_gbot only scans .gbot/ folder, not entire bucket All checks were successful BotServer CI/CD / build (push) Successful in 4m21s Details - Added prefix filter to list_objects_v2 call: only scans {bot}.gbot/ - Removed scanning of .gbkb and .gbdialog paths which caused 5min timeouts - This fixes salesianos DriveMonitor timeout and embed/index failure Also fixed header detection for name,value CSV format. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 21:02:01 -03:00
Rodrigo Rodriguez (Pragmatismo)	36fdf52780	fix: sync_gbot_config now handles CSV with or without header row All checks were successful BotServer CI/CD / build (push) Successful in 3m32s Details - Removed unconditional .skip(1) that was skipping first config line - Added header detection: skips first line only if it looks like 'key,value' header - Added validation to skip empty keys - Also fixed indentation in drive_monitor gbkb file processing This fixes the issue where config.csv changes on Drive weren't being synced to bot_configuration database table for salesianos bot. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>	2026-04-12 20:32:30 -03:00
Rodrigo Rodriguez (Pragmatismo)	4cd469afc3	fix: track config.csv ETag to avoid unnecessary syncs All checks were successful BotServer CI/CD / build (push) Successful in 5m2s Details - Add ETag tracking for config.csv files in DriveMonitor - Only download and sync config.csv when ETag changes - Prevents unnecessary database updates on every check - Uses __config__ prefix for config.csv state keys	2026-04-12 19:49:28 -03:00
Rodrigo Rodriguez (Pragmatismo)	af85426ed4	fix: delete orphaned .gbkb files when removed from MinIO All checks were successful BotServer CI/CD / build (push) Successful in 3m6s Details When a .gbkb file is deleted from the bucket, DriveMonitor now: - Deletes the downloaded file from work directory - When entire KB folder is empty, removes the folder too - Prevents disk accumulation of orphaned knowledge base files	2026-04-12 16:49:05 -03:00
Rodrigo Rodriguez (Pragmatismo)	135dfb06d5	fix: delete orphaned .ast files when .bas is removed from MinIO All checks were successful BotServer CI/CD / build (push) Successful in 3m4s Details When a .bas file is deleted from the bucket, DriveMonitor now: - Deletes the corresponding .ast compiled file - Deletes .bas, .mcp.json, .tool.json files from work directory - Removes the path from file_states tracking This prevents stale compiled files from accumulating in production.	2026-04-12 16:43:29 -03:00
Rodrigo Rodriguez (Pragmatismo)	9cf176008d	fix: preserve indexed status after .bas compilation All checks were successful BotServer CI/CD / build (push) Successful in 3m20s Details Fixed bug where DriveMonitor would overwrite indexed=true status after successful compilation, causing files to be recompiled on every cycle. Changes: - Track successful compilations in HashSet before acquiring write lock - Set indexed=true for successfully compiled files in merge loop - Preserve indexed status for unchanged files - Handle compilation failures with proper fail_count tracking This ensures new .bas files are compiled to .ast once and the indexed status is preserved, preventing unnecessary recompilation.	2026-04-12 16:36:03 -03:00
Rodrigo Rodriguez (Pragmatismo)	7c4ec37700	fix: properly track compilation status in DriveMonitor All checks were successful BotServer CI/CD / build (push) Successful in 3m15s Details - Do not mark .bas files as indexed unconditionally - Only set indexed=true when compile_tool() completes successfully - Reset fail_count and last_failed_at on successful compilation - Retry failed compilations automatically on next cycle - Fixes permanent compilation failure state for salesianos start.bas	2026-04-12 16:06:23 -03:00
Rodrigo Rodriguez (Pragmatismo)	73f1898b62	Add fail_count and last_failed_at to kb_documents All checks were successful BotServer CI/CD / build (push) Successful in 3m7s Details Simplified KB indexing state tracking - added columns directly to kb_documents instead of separate table. This enables per-file backoff retry logic.	2026-04-12 09:36:39 -03:00
Rodrigo Rodriguez (Pragmatismo)	256d55fc93	Add smart sleep based on fail_count to prevent excessive monitoring cycles All checks were successful BotServer CI/CD / build (push) Successful in 3m9s Details - fail_count >= 3: sleep 1 hour - fail_count >= 2: sleep 15 min - fail_count >= 1: sleep 5 min - fail_count = 0: sleep 10 sec (default)	2026-04-12 09:20:17 -03:00
Rodrigo Rodriguez (Pragmatismo)	789789e313	Fix backoff logic to be per KB folder instead of global Some checks failed BotServer CI/CD / build (push) Has been cancelled Details - Filter states by kb_folder_pattern (e.g. 'cartas/', 'proc/') - Only apply backoff based on files in that specific KB folder - Each KB folder has independent retry timing	2026-04-12 09:15:32 -03:00
Rodrigo Rodriguez (Pragmatismo)	ee273256fb	Add backoff logic to KB indexing to prevent excessive retries Some checks failed BotServer CI/CD / build (push) Has been cancelled Details - fail_count 1: wait 5 minutes before retry - fail_count 2: wait 15 minutes before retry - fail_count 3+: wait 1 hour before retry This prevents the 'already being indexed, skipping duplicate task' loop.	2026-04-12 09:13:33 -03:00
Rodrigo Rodriguez (Pragmatismo)	f48fa6d5f0	Add fail_count/last_failed_at to FileState for indexing retries All checks were successful BotServer CI/CD / build (push) Successful in 3m21s Details - Skip re-indexing files that failed 3+ times within 1 hour - Update file_states on indexing success (indexed=true, fail_count=0) - Update file_states on indexing failure (fail_count++, last_failed_at=now) - Don't skip KB indexing when embedding server not marked ready yet - Embedding server health will be detected via wait_for_server() in kb_indexer - Remove drive_monitor bypass of embedding check - let kb_indexer handle it	2026-04-12 07:47:13 -03:00
Rodrigo Rodriguez (Pragmatismo)	cdab04e999	Fix embedding health check: behavior-based instead of URL whitelist All checks were successful BotServer CI/CD / build (push) Successful in 3m32s Details - Remove hardcoded URL list for remote API detection - Try /health first, then probe with HEAD if 404/405 - Re-enable embedding server ready check in drive_monitor - No more embedding_key hack that skipped health checks entirely	2026-04-12 07:15:54 -03:00
Rodrigo Rodriguez (Pragmatismo)	2bafd57046	Temp fix: Skip embedding server ready check in DriveMonitor KB indexing All checks were successful BotServer CI/CD / build (push) Successful in 3m19s Details	2026-04-12 06:58:55 -03:00
Rodrigo Rodriguez (Pragmatismo)	7a1ec157f1	Fix KB indexing: upsert kb_collections, consistent collection names, preserve indexed flag All checks were successful BotServer CI/CD / build (push) Successful in 3m23s Details - Bug 1: check_gbkb_changes now preserves indexed=true from previous state when etag matches, preventing redundant re-indexing every cycle - Bug 2: USE KB fallback uses bot_id_short (8 chars) instead of random UUID, matching the collection name convention used by DriveMonitor - Bug 3: handle_gbkb_change now upserts into kb_collections table after successful indexing, so USE KB can find the collection at runtime - Changed ON CONFLICT DO NOTHING to DO UPDATE for kb_collections inserts - Changed process_gbkb_folder return type to Result<IndexingResult>	2026-04-11 21:26:02 -03:00
Rodrigo Rodriguez (Pragmatismo)	e81aee6221	fix: use bucket_name instead of bot_id (UUID) for file_states.json path All checks were successful BotServer CI/CD / build (push) Successful in 3m22s Details File states were stored under /opt/gbo/work/{UUID}/file_states.json but should be under /opt/gbo/work/{bucket_name}/file_states.json like other bot data (e.g. /opt/gbo/work/salesianos.gbai/) Also fixed file_states_static signature to use bucket_name consistently.	2026-04-11 20:40:23 -03:00
Rodrigo Rodriguez (Pragmatismo)	cf4a00e16e	fix: work path uses production /opt/gbo when env exists or path exists; mark .bas files indexed=true after compilation All checks were successful BotServer CI/CD / build (push) Successful in 3m20s Details - get_work_path_default/get_stack_path no longer rely on CWD-relative botserver-stack check which caused wrong output path in production when CI left that directory - DriveMonitor now marks .bas file states as indexed=true after list+compile cycle - Added compile_tool logging for work_dir path	2026-04-11 20:16:22 -03:00
Rodrigo Rodriguez (Pragmatismo)	5fdb3be5b4	fix: save file_states after prompt etag update to stop PROMPT.md download loop All checks were successful BotServer CI/CD / build (push) Successful in 3m41s Details	2026-04-11 19:21:26 -03:00
Rodrigo Rodriguez (Pragmatismo)	f4c99030aa	fix: use get_work_path() instead of get_stack_path()+data/system for work dir, add etag check for PROMPT.md downloads All checks were successful BotServer CI/CD / build (push) Successful in 3m37s Details	2026-04-11 18:42:09 -03:00
Rodrigo Rodriguez (Pragmatismo)	a131120638	Fix KB indexing: bot-specific embedding config, PROMPT.md sync, single-file streaming All checks were successful BotServer CI/CD / build (push) Successful in 4m1s Details	2026-04-11 13:27:48 -03:00
Rodrigo Rodriguez (Pragmatismo)	12988b637d	Fix KB indexing: single file streaming, dedup tracking, .ast cache All checks were successful BotServer CI/CD / build (push) Successful in 12m31s Details	2026-04-11 13:10:09 -03:00
Rodrigo Rodriguez (Pragmatismo)	821dd1d7ab	fix: Use bot-specific embedding config in DriveMonitor KB manager All checks were successful BotServer CI/CD / build (push) Successful in 3m47s Details	2026-04-11 08:55:41 -03:00
Rodrigo Rodriguez (Pragmatismo)	db2dc3fb34	Fix warnings: remove unused variables in drive_monitor All checks were successful BotServer CI/CD / build (push) Successful in 11m32s Details	2026-04-10 12:58:20 -03:00
Rodrigo Rodriguez (Pragmatismo)	26b009d4e6	Fix: Remove duplicate method definitions in DriveMonitor All checks were successful BotServer CI/CD / build (push) Successful in 4m52s Details - Removed duplicate file_state_path() and load_file_states() methods - Kept only new save_file_states_static() helper - Original methods still exist at lines 79-84 and 87-128 - Fixes compilation errors from previous commit	2026-04-10 11:31:17 -03:00
Rodrigo Rodriguez (Pragmatismo)	816d416eee	Fix DriveMonitor dispatch failure in main repo Some checks failed BotServer CI/CD / build (push) Failing after 1m31s Details - Added static save_file_states_static() helper method - Changed tokio::spawn calls to use Arc::clone instead of Arc::new(self.clone()) - This prevents double Arc wrapping which causes 'dispatch failure' errors - Fixes config.csv not syncing from bucket to database for salesianos/default bots	2026-04-10 11:24:56 -03:00
Rodrigo Rodriguez (Pragmatismo)	b5be26591e	feat: add LOAD_ONLY env filter for bots discovery and monitoring All checks were successful BotServer CI/CD / build (push) Successful in 10m39s Details	2026-04-09 23:15:54 -03:00
Rodrigo Rodriguez (Pragmatismo)	8dddc916ff	fix: use Vault config for Qdrant in KB indexer - website_crawler_service: use QdrantConfig::from_config instead of default - local_file_monitor: use QdrantConfig::from_config with DbPool - kb_indexer: KbFolderMonitor now uses SecretsManager for Qdrant config This fixes the issue where Qdrant URL was hardcoded to localhost:6333 instead of reading from Vault (gbo/vectordb).	2026-04-09 18:27:10 -03:00
Rodrigo Rodriguez (Pragmatismo)	f526fa1daa	Fix hardcoded paths for production environment - Update get_work_path_default() to check for .env in /opt/gbo/bin/.env - Update get_stack_path() to check for .env in /opt/gbo/bin/.env - Update DriveMonitor::new() to use get_work_path() instead of hardcoded path - Update start_config_watcher() to use get_work_path() instead of hardcoded path This fixes the issue where botserver was using development paths (/home/rodriguez/src/gb/botserver-stack/data/system/work) in production instead of production paths (/opt/gbo/work).	2026-04-09 18:21:17 -03:00
Rodrigo Rodriguez (Pragmatismo)	5371047fa1	Drive monitor: download PROMPT.md from MinIO to work directory Some checks failed BotServer CI/CD / build (push) Failing after 6m18s Details - When system-prompt-file is configured in config.csv, download the file from MinIO - Save to {bot}.gbai/{bot}.gbot/ folder in work directory - Config loaded from MinIO (gbo-* buckets)	2026-04-08 20:09:39 -03:00
Rodrigo Rodriguez (Pragmatismo)	c5a44f7889	Clean up local-files feature comments Some checks failed BotServer CI/CD / build (push) Failing after 2m48s Details - Keep local-files feature flag for conditional local file monitoring - Keep gbo- bucket filtering in drive - Remove verbose comments	2026-04-08 18:33:39 -03:00
Rodrigo Rodriguez (Pragmatismo)	62d0da3923	Add local-files feature to disable local storage scanning Some checks failed BotServer CI/CD / build (push) Has been cancelled Details - Without local-files feature: only MinIO/Drive is used as bot source - With local-files feature: scans /opt/gbo/data for bots (default behavior) - Bucket filtering (gbo-*) only active when local-files is NOT enabled - LocalFileMonitor and ConfigWatcher only start with local-files feature	2026-04-08 18:29:48 -03:00
Rodrigo Rodriguez (Pragmatismo)	b4a82b6c06	Disable local file monitoring, use drive (MinIO) as sole bot source Some checks failed BotServer CI/CD / build (push) Failing after 13m5s Details - Disable LocalFileMonitor and ConfigWatcher - use S3/MinIO only - Filter S3 buckets to gbo-*.gbai prefix - Auto-create bots in database when new S3 buckets discovered - Change file paths to use work directory instead of /opt/gbo/data - Add RunQueryDsl import for Diesel queries	2026-04-08 17:47:44 -03:00
Rodrigo Rodriguez (Pragmatismo)	9e799dd6b1	Disable /opt/gbo/data loading, use drive (MinIO) only for bot sources Some checks failed BotServer CI/CD / build (push) Failing after 8m28s Details - Remove LocalFileMonitor and ConfigWatcher for /opt/gbo/data - Remove /opt/gbo/data from mount_all_bots() scanning - Change start.bas, tables.bas, and tool paths to use work directory - Filter drive buckets to only gbo-* prefix - Remove unused create_bot_simple method - Fix all warnings (unused imports, variables, dead code)	2026-04-08 16:55:50 -03:00
Rodrigo Rodriguez (Pragmatismo)	9b04af9e7b	Fix USE KB and USE WEBSITE default features compilation Some checks failed BotServer CI/CD / build (push) Failing after 10m2s Details	2026-04-07 20:14:12 -03:00
Rodrigo Rodriguez (Pragmatismo)	73002b36cc	Update botserver: various fixes and improvements All checks were successful BotServer CI/CD / build (push) Successful in 9m59s Details	2026-04-07 13:33:50 -03:00
Rodrigo Rodriguez (Pragmatismo)	3684c862c6	fix drive: add missing diesel imports (QueryableByName, RunQueryDsl) Some checks failed BotServer CI/CD / build (push) Failing after 2m29s Details	2026-04-05 13:19:22 -03:00
Rodrigo Rodriguez (Pragmatismo)	b5d5c576a4	Fix unused imports Some checks failed BotServer CI/CD / build (push) Failing after 5m42s Details	2026-04-05 12:34:33 -03:00
Rodrigo Rodriguez (Pragmatismo)	f6869e6b5c	Fix diesel join queries across schemas and FileItem missing fields Some checks failed BotServer CI/CD / build (push) Failing after 10m1s Details	2026-04-05 12:06:35 -03:00
Rodrigo Rodriguez (Pragmatismo)	155d465b14	Update botserver: Refactor groups module, add Knowledge Base group association logic, and implement Drive tags for KB access. Some checks failed BotServer CI/CD / build (push) Failing after 5m53s Details	2026-04-05 09:11:54 -03:00
Rodrigo Rodriguez (Pragmatismo)	7d8f141fc2	refactor: Replace all hardcoded ./botserver-stack paths with get_stack_path()/get_work_path() Some checks failed BotServer CI/CD / build (push) Failing after 1m28s Details - Adds get_stack_path() helper: returns /opt/gbo in production (.env without botserver-stack), ./botserver-stack in dev - Adds get_work_path() helper: returns /opt/gbo/work in production, ./botserver-stack/data/system/work in dev - Updated 35+ files to use dynamic path resolution - Production system container no longer needs botserver-stack directory - Work files go to /opt/gbo/work instead of /opt/gbo/bin/botserver-stack	2026-04-04 09:24:44 -03:00
Rodrigo Rodriguez (Pragmatismo)	4d7297243e	Fix clippy warnings: reduce 17 warnings to 0 All checks were successful BotServer CI/CD / build (push) Successful in 6m58s Details - Fix double_ended_iterator_last: use next_back() instead of last() - Fix manual_clamp: use .clamp() instead of min().max() - Fix too_many_arguments: create KbInjectionContext struct - Fix needless_borrow: remove unnecessary & reference - Fix let_and_return: return value directly - Fix await_holding_lock: drop guard before await - Fix collapsible_else_if: collapse nested if-else All changes verified with cargo clippy (0 warnings, 0 errors) Note: Local botserver crashes with existing panic during LocalFileMonitor initialization This panic exists in original code too, not caused by these changes	2026-04-03 22:34:43 -03:00
Rodrigo Rodriguez (Pragmatismo)	e992ed3b39	Enforce Vault-only secrets: remove env var fallbacks, all secrets from Vault Some checks are pending BotServer CI/CD / build (push) Waiting to run Details - Remove all std::env::var calls except VAULT_* and PORT - get_from_env returns hardcoded defaults only (no env var reading) - Auth config, rate limits, email, analytics, calendar all use Vault - WORK_PATH replaced with get_work_path() helper reading from Vault - .env on production cleaned to only VAULT_ADDR, VAULT_TOKEN, VAULT_CACERT, PORT - All service IPs/credentials stored in Vault secret/gbo/*	2026-04-03 07:11:40 -03:00
Rodrigo Rodriguez (Pragmatismo)	fb2e5242da	fix: Vault seeding, service health checks, and restart idempotency All checks were successful BotServer CI/CD / build (push) Successful in 55m52s Details - Replace hardcoded passwords with generate_random_string() for all Vault-seeded services - Add valkey-cli, nc to SafeCommand allowlist; fix PATH in all 4 execution methods - Fix empty Vault KV values ('none' placeholder) preventing 'Failed to parse K=V' errors - Fix special chars in generated passwords triggering shell injection false positives - Add ALM app.ini creation with absolute paths for Forgejo CLI - Increase Qdrant timeout 15s→45s, ALM wait 5s→20s - Persist file_states and kb_states to disk for .bas/KB idempotency across restarts - Add duplicate check to use_website registration (debug log for existing) - Remove dead code (SERVER_START_EPOCH, server_epoch) - Add generate_random_string() to shared mod.rs, remove duplicates	2026-04-01 12:22:57 -03:00
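
Note on the backoff commits (256d55fc93, ee273256fb): both list the same schedule, keyed on fail_count. The sketch below is only an illustration of that schedule as a pure function, not the code in the repository. The names sleep_for_fail_count and may_retry are hypothetical; only the thresholds (10 sec, 5 min, 15 min, 1 hour) come from the commit messages.

```rust
use std::time::Duration;

/// Sleep interval before the next monitoring cycle, chosen from the number of
/// recorded failures. The thresholds are copied from the commit messages; the
/// function name is an assumption made for this sketch.
fn sleep_for_fail_count(fail_count: u32) -> Duration {
    match fail_count {
        0 => Duration::from_secs(10),      // default: 10 sec
        1 => Duration::from_secs(5 * 60),  // 5 min
        2 => Duration::from_secs(15 * 60), // 15 min
        _ => Duration::from_secs(60 * 60), // 3 or more: 1 hour
    }
}

/// Hypothetical retry gate: a file or KB folder that has never failed may be
/// processed immediately; otherwise the backoff window must have elapsed
/// since the last recorded failure.
fn may_retry(fail_count: u32, since_last_failure: Duration) -> bool {
    fail_count == 0 || since_last_failure >= sleep_for_fail_count(fail_count)
}
```

Per 789789e313, a gate like this would be evaluated per KB folder (for example 'cartas/' or 'proc/') rather than globally, so that one failing folder does not delay the others.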

120 commits
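
Note on the config.csv commits (36fdf52780, ad998b52d4): the messages describe skipping the first line only when it looks like a 'key,value' or 'name,value' header, and skipping rows with empty keys. A minimal sketch of that rule follows, assuming a simple two-column file split at the first comma. The function names parse_config_csv and is_header are hypothetical and this is not the sync_gbot_config implementation.

```rust
/// Hypothetical header test: true for a `key,value` or `name,value` row,
/// compared case-insensitively.
fn is_header(key: &str, value: &str) -> bool {
    (key.eq_ignore_ascii_case("key") || key.eq_ignore_ascii_case("name"))
        && value.eq_ignore_ascii_case("value")
}

/// Parse config.csv text into (key, value) pairs.
/// The first non-empty line is dropped only if it looks like a header,
/// instead of being skipped unconditionally.
fn parse_config_csv(text: &str) -> Vec<(String, String)> {
    let mut pairs = Vec::new();
    let mut first = true;
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // Split at the first comma only, so values may contain commas.
        let (key, value) = match line.split_once(',') {
            Some((k, v)) => (k.trim(), v.trim()),
            None => (line, ""),
        };
        let is_first = std::mem::replace(&mut first, false);
        if is_first && is_header(key, value) {
            continue;
        }
        // Rows with an empty key carry no usable setting.
        if key.is_empty() {
            continue;
        }
        pairs.push((key.to_string(), value.to_string()));
    }
    pairs
}
```

With this rule, a file that starts directly with a setting such as system-prompt-file,PROMPT.md keeps its first row, which is the behavior the unconditional .skip(1) removed in 36fdf52780 did not provide.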