- Adds get_stack_path() helper: returns /opt/gbo in production (.env without botserver-stack), ./botserver-stack in dev
- Adds get_work_path() helper: returns /opt/gbo/work in production, ./botserver-stack/data/system/work in dev
- Updated 35+ files to use dynamic path resolution
- Production system container no longer needs botserver-stack directory
- Work files go to /opt/gbo/work instead of /opt/gbo/bin/botserver-stack
- Fix double_ended_iterator_last: use next_back() instead of last()
- Fix manual_clamp: use .clamp() instead of min().max()
- Fix too_many_arguments: create KbInjectionContext struct
- Fix needless_borrow: remove unnecessary & reference
- Fix let_and_return: return value directly
- Fix await_holding_lock: drop guard before await
- Fix collapsible_else_if: collapse nested if-else
All changes verified with cargo clippy (0 warnings, 0 errors)
Note: Local botserver crashes with existing panic during LocalFileMonitor initialization
This panic exists in original code too, not caused by these changes
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio runtime (including spawn_blocking). Tokio doesn't
allow nested block_on() calls.
Fix: Replace ALL block_in_place patterns with std:🧵:spawn + mpsc channel.
This creates a completely separate OS thread with its own runtime, avoiding
any nesting issues. Works from any context: async, spawn_blocking, or sync.
Files: 14 files across secrets, utils, state, calendar, analytics, email,
and all keyword handlers (universal_messaging, search, book, create_draft,
create_site, hearing/syntax, use_tool, find, admin_email, goals)
Root cause: new_current_thread().block_on() panics when called from within
an existing tokio runtime (including from spawn_blocking). Tokio doesn't
allow nested block_on() calls.
Fix: Use std:🧵:spawn to create a completely separate OS thread
with its own runtime, communicating via mpsc channel. This works from
any context: async, spawn_blocking, or sync.
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio::task::spawn_blocking because block_in_place is
designed for async worker threads, not blocking threads.
Fix: Remove all block_in_place wrappers and use new_current_thread().build().block_on()
directly. This works from both async contexts and spawn_blocking contexts.
Affected: utils.rs (get_database_url_sync, get_work_path)
Root cause: Handle::current().block_on() panics inside multi_thread runtime
with 'Cannot start a runtime from within a runtime' error.
Fix: All sync-to-async bridges now use tokio::runtime::Builder::new_current_thread()
instead of Handle::current().block_on(). Also changed SECRETS_MANAGER from
tokio::sync::RwLock to std::sync::RwLock to eliminate unnecessary async overhead.
Files: 14 files across keywords, secrets, utils, state, calendar, analytics, email
Impact: Fixes production crash during bot loading phase
- install_all() returns early if VAULT_ADDR is remote
- start_all() returns early if VAULT_ADDR is remote
- bootstrap.rs treats remote VAULT_ADDR as bootstrap_completed=true
- Prevents botserver from trying to install/start local services
when all services are running in separate containers
- Remove all std::env::var calls except VAULT_* and PORT
- get_from_env returns hardcoded defaults only (no env var reading)
- Auth config, rate limits, email, analytics, calendar all use Vault
- WORK_PATH replaced with get_work_path() helper reading from Vault
- .env on production cleaned to only VAULT_ADDR, VAULT_TOKEN, VAULT_CACERT, PORT
- All service IPs/credentials stored in Vault secret/gbo/*
- Root cause: Vault seeding writes to secret/gbo/cache but code reads gbo/system/cache
- kv2::read prepends secret/ so it looks for secret/gbo/system/cache (wrong)
- Fix: update SecretPaths to match seeding paths (gbo/cache, gbo/drive, etc.)
- Testing: compiles clean, paths now match vault kv list output
- Root cause: prod container lacks nc (netcat), causing fallback to valkey-cli ping
- valkey-cli ping hangs indefinitely when Valkey requires password auth
- Fix: use ss -tlnp as primary check (always available), nc as fallback
- Testing: verified ss is available in prod, nc is not
- nc -z checks port connectivity instantly (no auth needed)
- valkey-cli ping as fallback (hangs when password required)
- Fixes bootstrap hang on production where Valkey has Vault password
- Tables (PostgreSQL): pg_isready health check before start
- Drive (MinIO): /minio/health/live check before start
- ALM (Forgejo): HTTP health check before start
- ALM CI (Forgejo Runner): pgrep check before start
- Valkey: health check uses absolute path to valkey-cli
- Vault, Qdrant, Zitadel: already had health checks
- Result: no duplicate starts, no hangs on restart
- Use BOTSERVER_STACK_PATH/bin/cache/bin/valkey-cli instead of relying on PATH
- Remove bash /dev/tcp fallback (unreliable in restricted environments)
- Falls back to redis-cli and nc if valkey-cli unavailable
- Add seed_vault_defaults() to write default creds for all components
(drive, cache, tables, directory, email, llm, encryption, meet, vectordb, alm)
- Call seed_vault_defaults() after KV2 enable in initialize_vault_local()
- Call seed_vault_defaults() in recover_existing_vault() for recovery path
- Rewrite fetch_vault_credentials() to use SafeCommand directly instead of
safe_sh_command, avoiding '//' shell injection false positive on URLs
- Components like Drive now get credentials from Vault instead of 403 errors
- Replace safe_sh_command with SafeCommand::new("curl").args() in vault_health_check()
- The URL contains https:// which triggered '//' pattern detection in shell command
- Direct SafeCommand bypasses shell parsing, URL passed as single argument
- Add vault data directory existence check before recovery attempt
- Prevents 'Dangerous pattern // detected' errors during bootstrap
- Fix vault_health_check() stub that always returned false
- Add recover_existing_vault() to handle Vault with existing data but no init.json
- Add unseal_vault() helper to unseal with existing vault-unseal-keys
- Detect initialized Vault via health endpoint or data directory presence
- Prevents bootstrap failure when reset.sh deletes init.json but Vault data persists
Root cause: vault_health_check() was a stub returning false, causing bootstrap
to always try vault operator init on already-initialized (but sealed) Vault,
which failed with connection refused. This cascaded to all services failing
to fetch credentials from Vault.
The epoch caused a new key to be created every second, bypassing
the 'already executed' check and running start.bas multiple times,
resulting in triplicated suggestions.
- Adds KnowledgeBaseManager::with_default_config() as alias to new()
- Adds KnowledgeBaseManager::with_bot_config() to load embedding_url,
embedding_model, and qdrant config from bot's config.csv
- Updates bootstrap to use with_bot_config with default_bot_id
- Enables per-bot embedding configuration instead of global env vars
- Remove suggestions fetching from TALK function
- WebSocket handler now fetches and sends suggestions after start.bas executes
- Clear suggestions and start_bas_executed keys to allow re-run on refresh
- Decouple TALK from suggestions handling
Removed tokio::runtime::Handle::block_on() calls that were causing panics
when called from within async contexts. Now uses direct fallback to
environment variables instead.
- Add hear_channels: HashMap<Uuid, SyncSender<String>> to AppState
- HEAR now blocks the spawn_blocking thread via sync_channel recv()
- deliver_hear_input() called at top of stream_response() to unblock
- Script continues from exact HEAR position, no side-effect re-execution
- All three HEAR variants (basic, AS TYPE, AS MENU) use same mechanism
- Add --container-only flag to create container without completing full installation
- Exit immediately after container creation
- Useful for manual setup or debugging installation issues