Commit graph

1012 commits

Author SHA1 Message Date
44669c3825 fix: Fix resolve_export_path typo and remove unused PathBuf imports
All checks were successful
BotServer CI/CD / build (push) Successful in 4m28s
2026-04-04 10:23:42 -03:00
5006159008 fix: Fix last E0716 in bootstrap.rs and remove unused PathBuf imports
Some checks failed
BotServer CI/CD / build (push) Failing after 5m47s
2026-04-04 10:16:27 -03:00
be6f0306cc fix: Fix remaining E0716 borrow errors in path refactoring
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- server.rs: Use PathBuf for cert_dir
- auth_routes.rs: Use PathBuf for pat_path
- qrcode.rs: Bind get_work_path() to local var before unwrap_or
- import_export.rs: Bind get_work_path() to local var in both functions (2 occurrences)
2026-04-04 10:13:40 -03:00
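
The two E0716 fixes named in be6f0306cc and 552d58376f (own the path with PathBuf, or bind the helper's result to a local before borrowing from it) can be sketched as below. This is a minimal illustration only: the get_work_path() signature, cert_dir() and work_dir_or_default() are hypothetical stand-ins, since the real signatures are not shown in the log.

```rust
use std::path::{Path, PathBuf};

// Hypothetical stand-in for the project's helper; the real signature is not in the log.
fn get_work_path() -> Option<String> {
    Some("/opt/gbo/work".to_string())
}

// This shape is rejected with E0716: the String built by format! is a
// temporary that is dropped at the end of the statement while `dir` still
// borrows it.
//
//     let dir: &Path = Path::new(&format!("{}/certs", base));
//     println!("{}", dir.display());

// Fix 1: build an owned PathBuf so nothing borrows from a temporary.
fn cert_dir(base: &str) -> PathBuf {
    PathBuf::from(format!("{}/certs", base))
}

// Fix 2: bind the returned value to a local first, then borrow from the local.
fn work_dir_or_default() -> PathBuf {
    let work = get_work_path();
    let dir: &str = work.as_deref().unwrap_or("./work");
    Path::new(dir).to_path_buf()
}
```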
552d58376f fix: Fix compilation errors from path refactoring
Some checks failed
BotServer CI/CD / build (push) Failing after 1m27s
- bootstrap_utils.rs: Change Vec<(&'static str,...)> to Vec<(String,...)> to avoid dangling references
- bootstrap_manager.rs: Use name.as_str() for safe_pkill
- setup.rs: Use PathBuf instead of Path::new with format!
- directory/bootstrap.rs: Use PathBuf for pat_dir
- main.rs: Use PathBuf for vault_init_path_early
2026-04-04 10:04:00 -03:00
7d8f141fc2 refactor: Replace all hardcoded ./botserver-stack paths with get_stack_path()/get_work_path()
Some checks failed
BotServer CI/CD / build (push) Failing after 1m28s
- Adds get_stack_path() helper: returns /opt/gbo in production (.env without botserver-stack), ./botserver-stack in dev
- Adds get_work_path() helper: returns /opt/gbo/work in production, ./botserver-stack/data/system/work in dev
- Updated 35+ files to use dynamic path resolution
- Production system container no longer needs botserver-stack directory
- Work files go to /opt/gbo/work instead of /opt/gbo/bin/botserver-stack
2026-04-04 09:24:44 -03:00
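
A rough sketch of the path resolution described in 7d8f141fc2, using only the four paths the commit message lists. How the real helpers detect production is not shown (the message only mentions the .env file), so the mode is passed in explicitly here; Mode, stack_path() and work_path() are hypothetical names.

```rust
use std::path::PathBuf;

/// Deployment mode. Assumption: the real code derives this from .env;
/// here the caller supplies it.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Production,
    Dev,
}

/// Root of the stack: /opt/gbo in production, ./botserver-stack in dev.
fn stack_path(mode: Mode) -> PathBuf {
    match mode {
        Mode::Production => PathBuf::from("/opt/gbo"),
        Mode::Dev => PathBuf::from("./botserver-stack"),
    }
}

/// Work directory: /opt/gbo/work in production,
/// ./botserver-stack/data/system/work in dev.
fn work_path(mode: Mode) -> PathBuf {
    match mode {
        Mode::Production => stack_path(mode).join("work"),
        Mode::Dev => stack_path(mode).join("data/system/work"),
    }
}
```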
c05e40d35b fix: Use anyhow::anyhow! instead of .into() for error type
All checks were successful
BotServer CI/CD / build (push) Successful in 4m19s
2026-04-04 08:28:54 -03:00
0d3cfbe0f7 fix: Replace Runtime::new().block_on() with thread::spawn in AuthConfig
Some checks failed
BotServer CI/CD / build (push) Failing after 1m14s
- AuthConfig::from_env() was creating a new Runtime and calling block_on
  directly, causing panic when main tokio runtime is already active
- Now uses std::thread::spawn + new_current_thread().block_on() pattern
- Follows AGENTS.md pattern for async-from-sync bridges
2026-04-04 08:25:43 -03:00
6ec82c27a6 fix: Replace futures::executor::block_on with thread::spawn in SET USER
All checks were successful
BotServer CI/CD / build (push) Successful in 4m25s
- Fixes panic: Cannot start a runtime from within a runtime
- set_user.rs was using futures::executor::block_on directly in Rhai callback
- Now uses std::thread::spawn + new_current_thread().block_on() pattern
- This is called during bootstrap and was causing startup crash
2026-04-04 08:01:04 -03:00
2a042d400b fix: Replace Handle::try_current().block_on() with thread::spawn pattern
All checks were successful
BotServer CI/CD / build (push) Successful in 2m38s
- Fixes panic: Cannot start a runtime from within a runtime
- kb_statistics.rs: Wrap all async calls in std::thread::spawn
- post_to.rs: Replace Handle::try_current with thread::spawn + mpsc
- Removes dead Handle::try_current checks from sync functions
- Follows AGENTS.md pattern for async-from-sync callbacks
2026-04-04 07:35:03 -03:00
4d7297243e Fix clippy warnings: reduce 17 warnings to 0
All checks were successful
BotServer CI/CD / build (push) Successful in 6m58s
- Fix double_ended_iterator_last: use next_back() instead of last()
- Fix manual_clamp: use .clamp() instead of min().max()
- Fix too_many_arguments: create KbInjectionContext struct
- Fix needless_borrow: remove unnecessary & reference
- Fix let_and_return: return value directly
- Fix await_holding_lock: drop guard before await
- Fix collapsible_else_if: collapse nested if-else

All changes verified with cargo clippy (0 warnings, 0 errors)
Note: Local botserver crashes with existing panic during LocalFileMonitor initialization
This panic exists in original code too, not caused by these changes
2026-04-03 22:34:43 -03:00
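
Two of the clippy fixes listed in 4d7297243e (double_ended_iterator_last and manual_clamp) look roughly like this in isolation. The function names and the 1..=100 bounds are invented for illustration and do not come from the repository.

```rust
/// double_ended_iterator_last: next_back() takes the final element from the
/// back of a double-ended iterator, where last() would walk every element.
fn last_segment(path: &str) -> Option<&str> {
    path.split('/').next_back()
}

/// manual_clamp: clamp(lo, hi) replaces a hand-written min().max() chain.
fn bounded_limit(requested: u32) -> u32 {
    requested.clamp(1, 100)
}
```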
9f55e864ff ci: force rebuild
All checks were successful
BotServer CI/CD / build (push) Successful in 2m31s
2026-04-03 21:42:33 -03:00
eb98574c8a fix(runtime): use TransferResult instead of Result in transfer_to_human
All checks were successful
BotServer CI/CD / build (push) Successful in 5m19s
2026-04-03 20:49:17 -03:00
3f94d23e1f fix(runtime): replace Handle::current().block_on() with std::thread::spawn in transfer_to_human
Some checks failed
BotServer CI/CD / build (push) Failing after 1m18s
- Handle::current().block_on() panics when called from within a runtime
- replaced all 5 occurrences with std::thread::spawn + mpsc::channel
- matches the pattern already used across other keyword files
2026-04-03 20:43:48 -03:00
8019107ebf fix: remove last remaining block_in_place in TALK TO keyword
Some checks failed
BotServer CI/CD / build (push) Failing after 19m14s
This was the only block_in_place left causing the production panic during
bot compilation. Replaced with std::thread::spawn + mpsc channel pattern.
2026-04-03 18:35:27 -03:00
6f183c63d2 feat: dual-mode service configs - Vault first, fallback to DB/localhost
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
All services now try Vault first (remote/distributed mode), then fall back
to database config, then localhost defaults (local/dev mode).

Services fixed:
- Qdrant/VectorDB: kb_indexer.rs, kb_statistics.rs, bootstrap_utils.rs, kb_context.rs
- LLM/Embedding: email/vectordb.rs (was hardcoded localhost:8082)
- All services: security/integration.rs (postgres, cache, drive, directory, qdrant, llm)

Pattern: SecretsManager::get_X_config_sync() → DB config → localhost default
2026-04-03 15:01:37 -03:00
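
The fallback order in 6f183c63d2 (Vault, then database config, then a localhost default) amounts to a lazy chain of optional sources. The sketch below assumes each source can be modelled as a closure returning Option<String>; resolve_url() is a hypothetical name, not the project's SecretsManager API.

```rust
/// Consults the sources in priority order and stops at the first that
/// yields a value. `from_db` is only called when `from_vault` returns None.
fn resolve_url(
    from_vault: impl FnOnce() -> Option<String>,
    from_db: impl FnOnce() -> Option<String>,
    localhost_default: &str,
) -> String {
    from_vault()
        .or_else(from_db)
        .unwrap_or_else(|| localhost_default.to_string())
}
```

Passing closures rather than values keeps the lower-priority lookups from running at all when Vault answers.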
f097f000d8 Fix: nested runtime panic in AuthConfig::from_env()
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
Root cause: AuthConfig::from_env() was creating a new tokio runtime
with Runtime::new() inside an existing runtime during initialization.

Impact: Botserver crashed with "Cannot start a runtime from within a
runtime" panic right after CORS layer initialization.

Fix: Use new_current_thread() + std::thread::spawn pattern (same as
get_database_url_sync fix) to create an isolated thread for async operations.

Files: src/security/auth_api/config.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-03 14:02:08 -03:00
61642343a8 fix: replace all block_in_place with std::thread::spawn to fix nested runtime panic
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio runtime (including spawn_blocking). Tokio doesn't
allow nested block_on() calls.

Fix: Replace ALL block_in_place patterns with std::thread::spawn + mpsc channel.
This creates a completely separate OS thread with its own runtime, avoiding
any nesting issues. Works from any context: async, spawn_blocking, or sync.

Files: 14 files across secrets, utils, state, calendar, analytics, email,
and all keyword handlers (universal_messaging, search, book, create_draft,
create_site, hearing/syntax, use_tool, find, admin_email, goals)
2026-04-03 12:54:36 -03:00
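
The std::thread::spawn + mpsc channel bridge that 61642343a8 and the neighbouring commits describe has roughly the following shape. To stay std-only, a plain closure stands in for the async work; in the pattern as described, the closure body would build its own runtime with tokio::runtime::Builder::new_current_thread() and call block_on() on the future. run_isolated() is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs `job` on a fresh OS thread and blocks the caller until the result
/// arrives over the channel.
fn run_isolated<T, F>(job: F) -> Result<T, mpsc::RecvError>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Assumption: real code would create a current-thread runtime here
        // and block_on the future; the new thread has no ambient runtime,
        // so nothing is nested.
        let _ = tx.send(job());
    });
    // If the worker panics before sending, the sender is dropped and
    // recv() returns Err instead of hanging.
    rx.recv()
}
```

Note that the caller still blocks on recv(), so this trades the nested-runtime panic for a blocked calling thread.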
4bdf46bdfc fix: use Result instead of Option for runtime builder in get_work_path
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 12:16:08 -03:00
b2a9c8213d fix: use std::thread::spawn for sync-to-async bridges to avoid nested block_on panic
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: new_current_thread().block_on() panics when called from within
an existing tokio runtime (including from spawn_blocking). Tokio doesn't
allow nested block_on() calls.

Fix: Use std::thread::spawn to create a completely separate OS thread
with its own runtime, communicating via mpsc channel. This works from
any context: async, spawn_blocking, or sync.
2026-04-03 12:12:59 -03:00
21170faea9 fix: remove block_in_place wrappers that panic inside spawn_blocking
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio::task::spawn_blocking because block_in_place is
designed for async worker threads, not blocking threads.

Fix: Remove all block_in_place wrappers and use new_current_thread().build().block_on()
directly. This works from both async contexts and spawn_blocking contexts.

Affected: utils.rs (get_database_url_sync, get_work_path)
2026-04-03 12:05:18 -03:00
c2982f2a33 fix: remove unused handle variable warning in get_database_url_sync
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:31:50 -03:00
263ca4ed11 fix: use new_current_thread runtime in get_database_url_sync to prevent nested block_on panic
All checks were successful
BotServer CI/CD / build (push) Successful in 6m30s
2026-04-03 09:26:23 -03:00
f6a864aa67 fix: replace nested runtime block_on with new_current_thread to prevent panic
All checks were successful
BotServer CI/CD / build (push) Successful in 5m32s
Root cause: Handle::current().block_on() panics inside multi_thread runtime
with 'Cannot start a runtime from within a runtime' error.

Fix: All sync-to-async bridges now use tokio::runtime::Builder::new_current_thread()
instead of Handle::current().block_on(). Also changed SECRETS_MANAGER from
tokio::sync::RwLock to std::sync::RwLock to eliminate unnecessary async overhead.

Files: 14 files across keywords, secrets, utils, state, calendar, analytics, email
Impact: Fixes production crash during bot loading phase
2026-04-03 09:17:23 -03:00
eece6831b4 Fix: initialize secrets manager when remote Vault detected, even without init.json
All checks were successful
BotServer CI/CD / build (push) Successful in 5m9s
- main.rs: Skip init.json check when VAULT_ADDR points to remote server
- This allows botserver to read database credentials from Vault in production
- Without this fix, database URL falls back to localhost and connection fails
2026-04-03 08:22:06 -03:00
65e7db5acd Skip local service install/start when remote Vault detected
All checks were successful
BotServer CI/CD / build (push) Successful in 5m48s
- install_all() returns early if VAULT_ADDR is remote
- start_all() returns early if VAULT_ADDR is remote
- bootstrap.rs treats remote VAULT_ADDR as bootstrap_completed=true
- Prevents botserver from trying to install/start local services
  when all services are running in separate containers
2026-04-03 07:36:15 -03:00
e992ed3b39 Enforce Vault-only secrets: remove env var fallbacks, all secrets from Vault
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
- Remove all std::env::var calls except VAULT_* and PORT
- get_from_env returns hardcoded defaults only (no env var reading)
- Auth config, rate limits, email, analytics, calendar all use Vault
- WORK_PATH replaced with get_work_path() helper reading from Vault
- .env on production cleaned to only VAULT_ADDR, VAULT_TOKEN, VAULT_CACERT, PORT
- All service IPs/credentials stored in Vault secret/gbo/*
2026-04-03 07:11:40 -03:00
5d88013ee3 Fix get_from_env: read actual env vars instead of hardcoded localhost values
All checks were successful
BotServer CI/CD / build (push) Successful in 4m3s
2026-04-02 21:17:19 -03:00
98b204b12e Fix health checks: replace nc with ss -tln for non-root environments
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-02 18:15:07 -03:00
7b4753af0d fix: init_redis tries both no-password and password URLs for Valkey
All checks were successful
BotServer CI/CD / build (push) Successful in 27s
- Root cause: Valkey in prod runs without password but Vault stores one
- Previous code only tried password URL, got AUTH failed
- Fix: try no-password URL first, then password URL as fallback
- Also removed unused cache_url variable and cleaned up retry logic
2026-04-02 07:36:16 -03:00
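
The connection order in 7b4753af0d (no-password URL first, password URL as fallback) can be sketched without a Redis client by injecting the connect attempt as a closure. The redis://:password@host:port URL shape and both function names are assumptions for illustration.

```rust
/// Candidate URLs in the order described: without a password first, then
/// with the password if one is known.
fn candidate_urls(host: &str, port: u16, password: Option<&str>) -> Vec<String> {
    let mut urls = vec![format!("redis://{host}:{port}")];
    if let Some(pw) = password {
        urls.push(format!("redis://:{pw}@{host}:{port}"));
    }
    urls
}

/// Returns the first URL for which `try_connect` succeeds, or None if all fail.
fn first_working_url<'a, E, F>(urls: &'a [String], mut try_connect: F) -> Option<&'a str>
where
    F: FnMut(&str) -> Result<(), E>,
{
    urls.iter()
        .map(String::as_str)
        .find(|&url| try_connect(url).is_ok())
}
```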
dae0feb6a5 fix: SecretPaths match Vault seeding paths (gbo/cache not gbo/system/cache)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m49s
- Root cause: Vault seeding writes to secret/gbo/cache but code reads gbo/system/cache
- kv2::read prepends secret/ so it looks for secret/gbo/system/cache (wrong)
- Fix: update SecretPaths to match seeding paths (gbo/cache, gbo/drive, etc.)
- Testing: compiles clean, paths now match vault kv list output
2026-04-02 07:16:32 -03:00
f118c74cf1 fix: init_redis uses async Vault call instead of sync block_on (fixes panic)
All checks were successful
BotServer CI/CD / build (push) Successful in 5m40s
- Root cause: get_cache_config() uses runtime.block_on() which panics
  when called from within an async runtime
- Fix: call SecretsManager::get_secret() directly with .await
- Testing: compiles clean, no runtime nesting issues
2026-04-02 06:59:21 -03:00
b3edf21d21 fix: init_redis fetches cache password from Vault (fixes connection timeout)
All checks were successful
BotServer CI/CD / build (push) Successful in 4m59s
- Root cause: init_redis() used redis://localhost:6379 without password
- Valkey requires authentication, causing connection timeouts
- Fix: use get_cache_config() from SecretsManager to build URL with password
- Falls back to env vars (CACHE_URL/REDIS_URL/VALKEY_URL) if set
2026-04-01 20:17:37 -03:00
3c9e4ba6e7 fix: cache_health_check uses ss instead of nc (nc missing in prod container)
All checks were successful
BotServer CI/CD / build (push) Successful in 4m42s
- Root cause: prod container lacks nc (netcat), causing fallback to valkey-cli ping
- valkey-cli ping hangs indefinitely when Valkey requires password auth
- Fix: use ss -tlnp as primary check (always available), nc as fallback
- Testing: verified ss is available in prod, nc is not
2026-04-01 20:06:13 -03:00
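
A port check based on ss output, as in 3c9e4ba6e7, reduces to scanning the listing for a LISTEN row whose local address ends in the port. The sketch below only parses text that has already been captured; is_listening() is a hypothetical helper, and it assumes the usual column order of ss -tln (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port).

```rust
/// Returns true if `ss -tln` style output contains a LISTEN socket on `port`.
fn is_listening(ss_output: &str, port: u16) -> bool {
    let suffix = format!(":{port}");
    ss_output
        .lines()
        .filter(|line| line.starts_with("LISTEN"))
        // The fourth whitespace-separated column is the local address:port.
        .filter_map(|line| line.split_whitespace().nth(3))
        .any(|local| local.ends_with(&suffix))
}
```

Matching on the ":port" suffix rather than the bare number avoids treating 16379 as a hit for 6379.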
d098961142 fix: Bootstrap checks stack/.env path in addition to ./.env
All checks were successful
BotServer CI/CD / build (push) Successful in 4m39s
- Production has .env in botserver-stack/.env not ./.env
- Checks both locations to detect completed bootstrap
- Fixes E0716: use let bindings for Path borrows
2026-04-01 19:30:08 -03:00
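
The two-location check from d098961142 might look like the following, assuming that the presence of a .env file is what marks bootstrap as completed (the log does not show the exact condition). bootstrap_completed() and its base parameter are hypothetical.

```rust
use std::path::Path;

/// True if a .env file exists either directly under `base` or under
/// `base/botserver-stack`.
fn bootstrap_completed(base: &Path) -> bool {
    // Named bindings keep the joined paths alive for the whole check,
    // in the spirit of the "let bindings for Path borrows" note above.
    let root_env = base.join(".env");
    let stack_env = base.join("botserver-stack").join(".env");
    root_env.exists() || stack_env.exists()
}
```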
8fd3254334 fix: Bootstrap checks stack/.env path in addition to ./.env
Some checks failed
BotServer CI/CD / build (push) Failing after 1m25s
- Production has .env in botserver-stack/.env not ./.env
- Checks both locations to detect completed bootstrap
- Prevents full re-bootstrap on restart in production
2026-04-01 19:26:32 -03:00
318367d439 fix: Valkey health check uses nc first (avoids password hang)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m58s
- nc -z checks port connectivity instantly (no auth needed)
- valkey-cli ping as fallback (hangs when password required)
- Fixes bootstrap hang on production where Valkey has Vault password
2026-04-01 18:52:04 -03:00
c26e483cc9 fix: All services check health before starting (idempotent bootstrap)
All checks were successful
BotServer CI/CD / build (push) Successful in 4m9s
- Tables (PostgreSQL): pg_isready health check before start
- Drive (MinIO): /minio/health/live check before start
- ALM (Forgejo): HTTP health check before start
- ALM CI (Forgejo Runner): pgrep check before start
- Valkey: health check uses absolute path to valkey-cli
- Vault, Qdrant, Zitadel: already had health checks
- Result: no duplicate starts, no hangs on restart
2026-04-01 18:28:54 -03:00
ba7f1ba5eb fix: Valkey health check uses absolute path to valkey-cli
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- Use BOTSERVER_STACK_PATH/bin/cache/bin/valkey-cli instead of relying on PATH
- Remove bash /dev/tcp fallback (unreliable in restricted environments)
- Falls back to redis-cli and nc if valkey-cli unavailable
2026-04-01 18:11:26 -03:00
68ef554132 fix: Vault as single source of truth - credentials + location for all services
All checks were successful
BotServer CI/CD / build (push) Successful in 4m53s
- Qdrant health check: recognize 'healthz check passed' response (fixes 45s timeout)
- seed_vault_defaults: add host/port/url/grpc_port for ALL 10 services
- fetch_vault_credentials: fetch ALL services via generic loop (drive, cache, tables, vectordb, directory, llm, meet, alm, encryption)
- vectordb URL: fix https://localhost:6334 -> http://localhost:6333 in all config getters
- get_from_env: add host/port/grpc_port for vectordb fallback
- Tested: .reset (fresh install) + .restart (idempotent) - zero errors
2026-04-01 16:46:16 -03:00
fb2e5242da fix: Vault seeding, service health checks, and restart idempotency
All checks were successful
BotServer CI/CD / build (push) Successful in 55m52s
- Replace hardcoded passwords with generate_random_string() for all Vault-seeded services
- Add valkey-cli, nc to SafeCommand allowlist; fix PATH in all 4 execution methods
- Fix empty Vault KV values ('none' placeholder) preventing 'Failed to parse K=V' errors
- Fix special chars in generated passwords triggering shell injection false positives
- Add ALM app.ini creation with absolute paths for Forgejo CLI
- Increase Qdrant timeout 15s→45s, ALM wait 5s→20s
- Persist file_states and kb_states to disk for .bas/KB idempotency across restarts
- Add duplicate check to use_website registration (debug log for existing)
- Remove dead code (SERVER_START_EPOCH, server_epoch)
- Add generate_random_string() to shared mod.rs, remove duplicates
2026-04-01 12:22:57 -03:00
3e46a16469 fix: Seed default credentials into Vault after initialization
Some checks failed
BotServer CI/CD / build (push) Failing after 3h13m28s
- Add seed_vault_defaults() to write default creds for all components
  (drive, cache, tables, directory, email, llm, encryption, meet, vectordb, alm)
- Call seed_vault_defaults() after KV2 enable in initialize_vault_local()
- Call seed_vault_defaults() in recover_existing_vault() for recovery path
- Rewrite fetch_vault_credentials() to use SafeCommand directly instead of
  safe_sh_command, avoiding '//' shell injection false positive on URLs
- Components like Drive now get credentials from Vault instead of 403 errors
2026-03-31 22:19:09 -03:00
9919a8321c fix: Use SafeCommand directly for vault health check to avoid shell injection false positive
All checks were successful
BotServer CI/CD / build (push) Successful in 6m46s
- Replace safe_sh_command with SafeCommand::new("curl").args() in vault_health_check()
- The URL contains https:// which triggered '//' pattern detection in shell command
- Direct SafeCommand bypasses shell parsing, URL passed as single argument
- Add vault data directory existence check before recovery attempt
- Prevents 'Dangerous pattern // detected' errors during bootstrap
2026-03-31 21:34:04 -03:00
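
The reason 9919a8321c avoids the false positive is that a directly spawned process receives the URL as a single argv entry, with no shell parsing in between. The sketch uses std::process::Command as a stand-in for the project's SafeCommand, whose API is not shown here; health_check_command() and the chosen curl flags are illustrative.

```rust
use std::process::Command;

/// Builds (but does not run) a curl invocation. The URL is passed as one
/// argument, so characters such as "//" are never interpreted by a shell.
fn health_check_command(url: &str) -> Command {
    let mut cmd = Command::new("curl");
    // -s: silent, -o /dev/null: discard body, -w: print only the HTTP status.
    cmd.args(["-s", "-o", "/dev/null", "-w", "%{http_code}", url]);
    cmd
}
```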
07a6c1edb3 Merge commit '582ea634'
All checks were successful
BotServer CI/CD / build (push) Successful in 7m38s
2026-03-31 21:10:25 -03:00
582ea634e7 fix: Vault bootstrap recovery for sealed but initialized instances
- Fix vault_health_check() stub that always returned false
- Add recover_existing_vault() to handle Vault with existing data but no init.json
- Add unseal_vault() helper to unseal with existing vault-unseal-keys
- Detect initialized Vault via health endpoint or data directory presence
- Prevents bootstrap failure when reset.sh deletes init.json but Vault data persists

Root cause: vault_health_check() was a stub returning false, causing bootstrap
to always try vault operator init on already-initialized (but sealed) Vault,
which failed with connection refused. This cascaded to all services failing
to fetch credentials from Vault.
2026-03-31 20:49:29 -03:00
4ae16017ff Merge commit '644dfe2d'
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-03-31 19:57:57 -03:00
644dfe2d19 fix: Improve .gbdialog file detection for nested paths
2026-03-31 19:57:33 -03:00
2fa59057fa fix: Resolve migration error, Vault 403, cache timeout, and shell injection false positives
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- Fix migration 6.2.5: Create lost_reason column before VIEW that references it
- Fix Vault 403: Enable KV2 secrets engine after initialization
- Fix cache timeout: Increase Valkey readiness wait from 12s to 30s
- Fix command_guard: Remove () from forbidden chars (safe in std::process::Command)
2026-03-31 19:55:16 -03:00
b83b4ffc4d fix: Remove server_epoch() from start_bas_executed Redis key
The epoch caused a new key to be created every second, bypassing
the 'already executed' check and running start.bas multiple times,
resulting in triplicated suggestions.
2026-03-21 20:40:25 -03:00
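
Why the epoch defeated the "already executed" check in b83b4ffc4d can be seen with a set standing in for Redis: a key that embeds the current second is new on every call. The key layout and function names below are assumptions for illustration, not the project's actual key format.

```rust
use std::collections::HashSet;

/// Key that embeds the epoch second: a different key every second.
fn key_with_epoch(session: &str, epoch_secs: u64) -> String {
    format!("start_bas_executed:{session}:{epoch_secs}")
}

/// Stable key: the same session always maps to the same key.
fn key_stable(session: &str) -> String {
    format!("start_bas_executed:{session}")
}

/// Returns true the first time a key is seen, meaning start.bas would run.
/// The HashSet is a stand-in for the Redis keyspace.
fn mark_executed(seen: &mut HashSet<String>, key: String) -> bool {
    seen.insert(key)
}
```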
1132983064 feat(kb): add with_bot_config to load embedding from bot config
- Adds KnowledgeBaseManager::with_default_config() as alias to new()
- Adds KnowledgeBaseManager::with_bot_config() to load embedding_url,
  embedding_model, and qdrant config from bot's config.csv
- Updates bootstrap to use with_bot_config with default_bot_id
- Enables per-bot embedding configuration instead of global env vars
2026-03-21 18:55:36 -03:00
622f1222dc fix(websocket): force start.bas execution on connection to restore chat on page reload while preventing duplicate execution
2026-03-21 16:38:03 -03:00