botserver

Author	SHA1	Message	Date
Rodrigo Rodriguez (Pragmatismo)	f097f000d8	Fix: nested runtime panic in AuthConfig::from_env() Some checks failed BotServer CI/CD / build (push) Failing after 1s Details Root cause: AuthConfig::from_env() was creating a new tokio runtime with Runtime::new() inside an existing runtime during initialization. Impact: Botserver crashed with "Cannot start a runtime from within a runtime" panic right after CORS layer initialization. Fix: Use new_current_thread() + std::thread::spawn pattern (same as get_database_url_sync fix) to create an isolated thread for async operations. Files: src/security/auth_api/config.rs 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2026-04-03 14:02:08 -03:00
Rodrigo Rodriguez (Pragmatismo)	61642343a8	fix: replace all block_in_place with std::thread::spawn to fix nested runtime panic Some checks are pending BotServer CI/CD / build (push) Waiting to run Details Root cause: block_in_place + new_current_thread().block_on() panics when called from within tokio runtime (including spawn_blocking). Tokio doesn't allow nested block_on() calls. Fix: Replace ALL block_in_place patterns with std::thread::spawn + mpsc channel. This creates a completely separate OS thread with its own runtime, avoiding any nesting issues. Works from any context: async, spawn_blocking, or sync. Files: 14 files across secrets, utils, state, calendar, analytics, email, and all keyword handlers (universal_messaging, search, book, create_draft, create_site, hearing/syntax, use_tool, find, admin_email, goals)	2026-04-03 12:54:36 -03:00
Rodrigo Rodriguez (Pragmatismo)	4bdf46bdfc	fix: use Result instead Option for runtime builder in get_work_path Some checks are pending BotServer CI/CD / build (push) Waiting to run Details	2026-04-03 12:16:08 -03:00
Rodrigo Rodriguez (Pragmatismo)	b2a9c8213d	fix: use std::thread::spawn for sync-to-async bridges to avoid nested block_on panic Some checks are pending BotServer CI/CD / build (push) Waiting to run Details Root cause: new_current_thread().block_on() panics when called from within an existing tokio runtime (including from spawn_blocking). Tokio doesn't allow nested block_on() calls. Fix: Use std::thread::spawn to create a completely separate OS thread with its own runtime, communicating via mpsc channel. This works from any context: async, spawn_blocking, or sync.	2026-04-03 12:12:59 -03:00
Rodrigo Rodriguez (Pragmatismo)	21170faea9	fix: remove block_in_place wrappers that panic inside spawn_blocking Some checks are pending BotServer CI/CD / build (push) Waiting to run Details Root cause: block_in_place + new_current_thread().block_on() panics when called from within tokio::task::spawn_blocking because block_in_place is designed for async worker threads, not blocking threads. Fix: Remove all block_in_place wrappers and use new_current_thread().build().block_on() directly. This works from both async contexts and spawn_blocking contexts. Affected: utils.rs (get_database_url_sync, get_work_path)	2026-04-03 12:05:18 -03:00
Rodrigo Rodriguez (Pragmatismo)	c2982f2a33	fix: remove unused handle variable warning in get_database_url_sync Some checks are pending BotServer CI/CD / build (push) Waiting to run Details	2026-04-03 11:31:50 -03:00
Rodrigo Rodriguez (Pragmatismo)	263ca4ed11	fix: use new_current_thread runtime in get_database_url_sync to prevent nested block_on panic All checks were successful BotServer CI/CD / build (push) Successful in 6m30s Details	2026-04-03 09:26:23 -03:00
Rodrigo Rodriguez (Pragmatismo)	f6a864aa67	fix: replace nested runtime block_on with new_current_thread to prevent panic All checks were successful BotServer CI/CD / build (push) Successful in 5m32s Details Root cause: Handle::current().block_on() panics inside multi_thread runtime with 'Cannot start a runtime from within a runtime' error. Fix: All sync-to-async bridges now use tokio::runtime::Builder::new_current_thread() instead of Handle::current().block_on(). Also changed SECRETS_MANAGER from tokio::sync::RwLock to std::sync::RwLock to eliminate unnecessary async overhead. Files: 14 files across keywords, secrets, utils, state, calendar, analytics, email Impact: Fixes production crash during bot loading phase	2026-04-03 09:17:23 -03:00
Rodrigo Rodriguez (Pragmatismo)	eece6831b4	Fix: initialize secrets manager when remote Vault detected, even without init.json All checks were successful BotServer CI/CD / build (push) Successful in 5m9s Details - main.rs: Skip init.json check when VAULT_ADDR points to remote server - This allows botserver to read database credentials from Vault in production - Without this fix, database URL falls back to localhost and connection fails	2026-04-03 08:22:06 -03:00
Rodrigo Rodriguez (Pragmatismo)	65e7db5acd	Skip local service install/start when remote Vault detected All checks were successful BotServer CI/CD / build (push) Successful in 5m48s Details - install_all() returns early if VAULT_ADDR is remote - start_all() returns early if VAULT_ADDR is remote - bootstrap.rs treats remote VAULT_ADDR as bootstrap_completed=true - Prevents botserver from trying to install/start local services when all services are running in separate containers	2026-04-03 07:36:15 -03:00
Rodrigo Rodriguez (Pragmatismo)	e992ed3b39	Enforce Vault-only secrets: remove env var fallbacks, all secrets from Vault Some checks are pending BotServer CI/CD / build (push) Waiting to run Details - Remove all std::env::var calls except VAULT_* and PORT - get_from_env returns hardcoded defaults only (no env var reading) - Auth config, rate limits, email, analytics, calendar all use Vault - WORK_PATH replaced with get_work_path() helper reading from Vault - .env on production cleaned to only VAULT_ADDR, VAULT_TOKEN, VAULT_CACERT, PORT - All service IPs/credentials stored in Vault secret/gbo/*	2026-04-03 07:11:40 -03:00
Rodrigo Rodriguez (Pragmatismo)	5d88013ee3	Fix get_from_env: read actual env vars instead of hardcoded localhost values All checks were successful BotServer CI/CD / build (push) Successful in 4m3s Details	2026-04-02 21:17:19 -03:00
Rodrigo Rodriguez (Pragmatismo)	98b204b12e	Fix health checks: replace nc with ss -tln for non-root environments Some checks failed BotServer CI/CD / build (push) Has been cancelled Details	2026-04-02 18:15:07 -03:00
Rodrigo Rodriguez (Pragmatismo)	7b4753af0d	fix: init_redis tries both no-password and password URLs for Valkey All checks were successful BotServer CI/CD / build (push) Successful in 27s Details - Root cause: Valkey in prod runs without password but Vault stores one - Previous code only tried password URL, got AUTH failed - Fix: try no-password URL first, then password URL as fallback - Also removed unused cache_url variable and cleaned up retry logic	2026-04-02 07:36:16 -03:00
Rodrigo Rodriguez (Pragmatismo)	dae0feb6a5	fix: SecretPaths match Vault seeding paths (gbo/cache not gbo/system/cache) All checks were successful BotServer CI/CD / build (push) Successful in 3m49s Details - Root cause: Vault seeding writes to secret/gbo/cache but code reads gbo/system/cache - kv2::read prepends secret/ so it looks for secret/gbo/system/cache (wrong) - Fix: update SecretPaths to match seeding paths (gbo/cache, gbo/drive, etc.) - Testing: compiles clean, paths now match vault kv list output	2026-04-02 07:16:32 -03:00
Rodrigo Rodriguez (Pragmatismo)	f118c74cf1	fix: init_redis uses async Vault call instead of sync block_on (fixes panic) All checks were successful BotServer CI/CD / build (push) Successful in 5m40s Details - Root cause: get_cache_config() uses runtime.block_on() which panics when called from within an async runtime - Fix: call SecretsManager::get_secret() directly with .await - Testing: compiles clean, no runtime nesting issues	2026-04-02 06:59:21 -03:00
Rodrigo Rodriguez (Pragmatismo)	b3edf21d21	fix: init_redis fetches cache password from Vault (fixes connection timeout) All checks were successful BotServer CI/CD / build (push) Successful in 4m59s Details - Root cause: init_redis() used redis://localhost:6379 without password - Valkey requires authentication, causing connection timeouts - Fix: use get_cache_config() from SecretsManager to build URL with password - Falls back to env vars (CACHE_URL/REDIS_URL/VALKEY_URL) if set	2026-04-01 20:17:37 -03:00
Rodrigo Rodriguez (Pragmatismo)	3c9e4ba6e7	fix: cache_health_check uses ss instead of nc (nc missing in prod container) All checks were successful BotServer CI/CD / build (push) Successful in 4m42s Details - Root cause: prod container lacks nc (netcat), causing fallback to valkey-cli ping - valkey-cli ping hangs indefinitely when Valkey requires password auth - Fix: use ss -tlnp as primary check (always available), nc as fallback - Testing: verified ss is available in prod, nc is not	2026-04-01 20:06:13 -03:00
Rodrigo Rodriguez (Pragmatismo)	d098961142	fix: Bootstrap checks stack/.env path in addition to ./.env All checks were successful BotServer CI/CD / build (push) Successful in 4m39s Details - Production has .env in botserver-stack/.env not ./.env - Checks both locations to detect completed bootstrap - Fixes E0716: use let bindings for Path borrows	2026-04-01 19:30:08 -03:00
Rodrigo Rodriguez (Pragmatismo)	8fd3254334	fix: Bootstrap checks stack/.env path in addition to ./.env Some checks failed BotServer CI/CD / build (push) Failing after 1m25s Details - Production has .env in botserver-stack/.env not ./.env - Checks both locations to detect completed bootstrap - Prevents full re-bootstrap on restart in production	2026-04-01 19:26:32 -03:00
Rodrigo Rodriguez (Pragmatismo)	318367d439	fix: Valkey health check uses nc first (avoids password hang) All checks were successful BotServer CI/CD / build (push) Successful in 3m58s Details - nc -z checks port connectivity instantly (no auth needed) - valkey-cli ping as fallback (hangs when password required) - Fixes bootstrap hang on production where Valkey has Vault password	2026-04-01 18:52:04 -03:00
Rodrigo Rodriguez (Pragmatismo)	c26e483cc9	fix: All services check health before starting (idempotent bootstrap) All checks were successful BotServer CI/CD / build (push) Successful in 4m9s Details - Tables (PostgreSQL): pg_isready health check before start - Drive (MinIO): /minio/health/live check before start - ALM (Forgejo): HTTP health check before start - ALM CI (Forgejo Runner): pgrep check before start - Valkey: health check uses absolute path to valkey-cli - Vault, Qdrant, Zitadel: already had health checks - Result: no duplicate starts, no hangs on restart	2026-04-01 18:28:54 -03:00
Rodrigo Rodriguez (Pragmatismo)	ba7f1ba5eb	fix: Valkey health check uses absolute path to valkey-cli Some checks failed BotServer CI/CD / build (push) Has been cancelled Details - Use BOTSERVER_STACK_PATH/bin/cache/bin/valkey-cli instead of relying on PATH - Remove bash /dev/tcp fallback (unreliable in restricted environments) - Falls back to redis-cli and nc if valkey-cli unavailable	2026-04-01 18:11:26 -03:00
Rodrigo Rodriguez (Pragmatismo)	68ef554132	fix: Vault as single source of truth - credentials + location for all services All checks were successful BotServer CI/CD / build (push) Successful in 4m53s Details - Qdrant health check: recognize 'healthz check passed' response (fixes 45s timeout) - seed_vault_defaults: add host/port/url/grpc_port for ALL 10 services - fetch_vault_credentials: fetch ALL services via generic loop (drive, cache, tables, vectordb, directory, llm, meet, alm, encryption) - vectordb URL: fix https://localhost:6334 -> http://localhost:6333 in all config getters - get_from_env: add host/port/grpc_port for vectordb fallback - Tested: .reset (fresh install) + .restart (idempotent) - zero errors	2026-04-01 16:46:16 -03:00
Rodrigo Rodriguez (Pragmatismo)	fb2e5242da	fix: Vault seeding, service health checks, and restart idempotency All checks were successful BotServer CI/CD / build (push) Successful in 55m52s Details - Replace hardcoded passwords with generate_random_string() for all Vault-seeded services - Add valkey-cli, nc to SafeCommand allowlist; fix PATH in all 4 execution methods - Fix empty Vault KV values ('none' placeholder) preventing 'Failed to parse K=V' errors - Fix special chars in generated passwords triggering shell injection false positives - Add ALM app.ini creation with absolute paths for Forgejo CLI - Increase Qdrant timeout 15s→45s, ALM wait 5s→20s - Persist file_states and kb_states to disk for .bas/KB idempotency across restarts - Add duplicate check to use_website registration (debug log for existing) - Remove dead code (SERVER_START_EPOCH, server_epoch) - Add generate_random_string() to shared mod.rs, remove duplicates	2026-04-01 12:22:57 -03:00
Rodrigo Rodriguez (Pragmatismo)	3e46a16469	fix: Seed default credentials into Vault after initialization Some checks failed BotServer CI/CD / build (push) Failing after 3h13m28s Details - Add seed_vault_defaults() to write default creds for all components (drive, cache, tables, directory, email, llm, encryption, meet, vectordb, alm) - Call seed_vault_defaults() after KV2 enable in initialize_vault_local() - Call seed_vault_defaults() in recover_existing_vault() for recovery path - Rewrite fetch_vault_credentials() to use SafeCommand directly instead of safe_sh_command, avoiding '//' shell injection false positive on URLs - Components like Drive now get credentials from Vault instead of 403 errors	2026-03-31 22:19:09 -03:00
Rodrigo Rodriguez (Pragmatismo)	9919a8321c	fix: Use SafeCommand directly for vault health check to avoid shell injection false positive All checks were successful BotServer CI/CD / build (push) Successful in 6m46s Details - Replace safe_sh_command with SafeCommand::new("curl").args() in vault_health_check() - The URL contains https:// which triggered '//' pattern detection in shell command - Direct SafeCommand bypasses shell parsing, URL passed as single argument - Add vault data directory existence check before recovery attempt - Prevents 'Dangerous pattern // detected' errors during bootstrap	2026-03-31 21:34:04 -03:00
Rodrigo Rodriguez (Pragmatismo)	07a6c1edb3	Merge commit '582ea634' All checks were successful BotServer CI/CD / build (push) Successful in 7m38s Details	2026-03-31 21:10:25 -03:00
Rodrigo Rodriguez (Pragmatismo)	582ea634e7	fix: Vault bootstrap recovery for sealed but initialized instances - Fix vault_health_check() stub that always returned false - Add recover_existing_vault() to handle Vault with existing data but no init.json - Add unseal_vault() helper to unseal with existing vault-unseal-keys - Detect initialized Vault via health endpoint or data directory presence - Prevents bootstrap failure when reset.sh deletes init.json but Vault data persists Root cause: vault_health_check() was a stub returning false, causing bootstrap to always try vault operator init on already-initialized (but sealed) Vault, which failed with connection refused. This cascaded to all services failing to fetch credentials from Vault.	2026-03-31 20:49:29 -03:00
Rodrigo Rodriguez (Pragmatismo)	4ae16017ff	Merge commit '644dfe2d' Some checks failed BotServer CI/CD / build (push) Has been cancelled Details	2026-03-31 19:57:57 -03:00
Rodrigo Rodriguez (Pragmatismo)	644dfe2d19	fix: Improve .gbdialog file detection for nested paths	2026-03-31 19:57:33 -03:00
Rodrigo Rodriguez (Pragmatismo)	2fa59057fa	fix: Resolve migration error, Vault 403, cache timeout, and shell injection false positives Some checks failed BotServer CI/CD / build (push) Has been cancelled Details - Fix migration 6.2.5: Create lost_reason column before VIEW that references it - Fix Vault 403: Enable KV2 secrets engine after initialization - Fix cache timeout: Increase Valkey readiness wait from 12s to 30s - Fix command_guard: Remove () from forbidden chars (safe in std::process::Command)	2026-03-31 19:55:16 -03:00
Rodrigo Rodriguez (Pragmatismo)	b83b4ffc4d	fix: Remove server_epoch() from start_bas_executed Redis key The epoch caused a new key to be created every second, bypassing the 'already executed' check and running start.bas multiple times, resulting in triplicated suggestions.	2026-03-21 20:40:25 -03:00
Rodrigo Rodriguez (Pragmatismo)	1132983064	feat(kb): add with_bot_config to load embedding from bot config - Adds KnowledgeBaseManager::with_default_config() as alias to new() - Adds KnowledgeBaseManager::with_bot_config() to load embedding_url, embedding_model, and qdrant config from bot's config.csv - Updates bootstrap to use with_bot_config with default_bot_id - Enables per-bot embedding configuration instead of global env vars	2026-03-21 18:55:36 -03:00
Rodrigo Rodriguez (Pragmatismo)	622f1222dc	fix(websocket): force start.bas execution on connection to restore chat on page reload while preventing duplicate execution	2026-03-21 16:38:03 -03:00
Rodrigo Rodriguez (Pragmatismo)	363c056bab	fix(bootstrap): add strict timeout to Redis connection initialization to prevent hanging on dropped tcp packets	2026-03-21 14:37:04 -03:00
Rodrigo Rodriguez (Pragmatismo)	adb26330d2	fix: Simple 50ms timeout for Redis connection	2026-03-21 10:48:47 -03:00
Rodrigo Rodriguez (Pragmatismo)	9d6c2686f1	fix: Remove connection caching (no Clone)	2026-03-21 10:37:49 -03:00
Rodrigo Rodriguez (Pragmatismo)	b3ce293487	fix: Clean up duplicate Redis code and fix WebSocket log level	2026-03-21 10:30:19 -03:00
Rodrigo Rodriguez (Pragmatismo)	cfe6453d1e	perf: Add shared Redis connection pool with 50ms timeout	2026-03-21 10:14:10 -03:00
Rodrigo Rodriguez (Pragmatismo)	43fd40aed9	fix: Add timeout to Redis get_connection to prevent blocking - Added get_redis_connection() helper with 2s timeout - All cache operations now fail fast if Valkey is not ready - Prevents start.bas from blocking for minutes waiting for cache - Changes: add_suggestion.rs	2026-03-21 09:34:41 -03:00
Rodrigo Rodriguez (Pragmatismo)	e5f3380469	perf: Fix USE TOOL thread contention by removing runtime creation - Replace thread spawn + tokio runtime creation with block_in_place - Eliminates 10+ runtime creations per start.bas execution - Reduces USE TOOL execution from ~2min to milliseconds - Fixes suggestions not appearing due to start.bas timeout	2026-03-20 22:54:19 -03:00
Rodrigo Rodriguez (Pragmatismo)	705d925947	fix: Allow anonymous access to /api/suggestions for bot chat	2026-03-20 18:44:08 -03:00
Rodrigo Rodriguez (Pragmatismo)	d19984fa07	feat: Improve KB keywords and package manager installer	2026-03-20 17:38:47 -03:00
Rodrigo Rodriguez (Pragmatismo)	57a8b7f8f0	Fix: use pgrep to check valkey/qdrant running state - valkey check_cmd: replaced valkey-cli ping (network) with pgrep -x valkey-server - qdrant check_cmd: replaced curl https check (TLS error 35) with pgrep -x qdrant - Prevents duplicate instances on each botserver restart	2026-03-20 15:40:22 -03:00
Rodrigo Rodriguez (Pragmatismo)	3bb115266b	feat: Add GUID prefix to Qdrant collection names for KB security isolation	2026-03-19 19:51:28 -03:00
Rodrigo Rodriguez (Pragmatismo)	d6ebd0cf6e	fix: send suggestions separately from TALK, clear Redis keys for refresh - Remove suggestions fetching from TALK function - WebSocket handler now fetches and sends suggestions after start.bas executes - Clear suggestions and start_bas_executed keys to allow re-run on refresh - Decouple TALK from suggestions handling	2026-03-19 09:53:39 -03:00
Rodrigo Rodriguez (Pragmatismo)	2fcfb05fd6	fix: USE_WEBSITE non-blocking - timeout 3s, never blocks start.bas	2026-03-18 19:41:23 -03:00
Rodrigo Rodriguez (Pragmatismo)	6e594d68dd	Fix: Wait for send_task to be ready before executing start.bas	2026-03-18 14:38:46 -03:00
Rodrigo Rodriguez (Pragmatismo)	8f073a15fd	Fix: Wait for send_task to be ready before executing start.bas so TALK messages work	2026-03-18 14:18:05 -03:00
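
Several of the 2026-04-03 commits above describe replacing nested block_on() calls with a std::thread::spawn + mpsc channel bridge. The sketch below illustrates only the shape of that bridge, using the standard library alone. The helper name run_on_isolated_thread is hypothetical and does not come from the repository. In the commits the spawned thread reportedly builds its own current-thread tokio runtime and calls block_on() there; a plain closure stands in for that step here so the example has no external dependencies.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper (not from the repository): runs `work` on a fresh OS
/// thread and blocks the caller until the result arrives over a channel.
fn run_on_isolated_thread<T, F>(work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        // Assumption: in the real code this is where a current-thread tokio
        // runtime would be built and block_on() called. Because this thread
        // was never entered by an outer runtime, nothing is nested.
        let result = work();
        // The receiver may already be gone; ignore the send error in that case.
        let _ = tx.send(result);
    });

    // If the worker panics, the sender is dropped without sending and
    // recv() returns an error instead of hanging.
    rx.recv()
        .map_err(|e| format!("worker thread ended without a result: {e}"))
}
```

A call such as run_on_isolated_thread(|| 21 * 2) yields Ok(42). Note that rx.recv() still blocks the calling thread until the worker finishes, so when called from an async task it occupies that runtime worker for the duration.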

997 commits