Commit graph

4130 commits

Author SHA1 Message Date
eb98574c8a fix(runtime): use TransferResult instead of Result in transfer_to_human
All checks were successful
BotServer CI/CD / build (push) Successful in 5m19s
2026-04-03 20:49:17 -03:00
3f94d23e1f fix(runtime): replace Handle::current().block_on() with std::thread::spawn in transfer_to_human
Some checks failed
BotServer CI/CD / build (push) Failing after 1m18s
- Handle::current().block_on() panics when called from within a runtime
- replaced all 5 occurrences with std::thread::spawn + mpsc::channel
- matches the pattern already used across other keyword files
2026-04-03 20:43:48 -03:00
f2f81415e4 fix(ci): use systemctl stop/start instead of killall/nohup
All checks were successful
BotServer CI/CD / build (push) Successful in 46s
2026-04-03 20:39:42 -03:00
684bd87683 fix(ci): remove error masking, show all errors in console
Some checks failed
BotServer CI/CD / build (push) Failing after 3s
2026-04-03 20:36:25 -03:00
72bc18b7de fix(ci): separate deploy steps - backup, kill, transfer, start
All checks were successful
BotServer CI/CD / build (push) Successful in 46s
2026-04-03 20:35:38 -03:00
bf704c0f6e fix(ci): use systemctl restart instead of killall
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-03 20:31:45 -03:00
4bf3da36bb fix(ci): wrap all SSH with timeout, combine steps, remove set -e
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-03 20:04:31 -03:00
452e674e09 fix(ci): use killall -9 with fuser fallback for reliable process kill
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-03 20:01:11 -03:00
bf140a870e fix(ci): resolve deploy step hanging on pkill
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- pgrep -f botserver matched the SSH command itself causing deadlock
- replaced with pkill -f '/opt/gbo/bin/botserver' || true
- added SSH keepalive (ServerAliveInterval=10, ServerAliveCountMax=3)
- added Step 7: explicitly start botserver after deploy
- fixed unquoted SSH_ARGS causing argument splitting
2026-04-03 19:51:30 -03:00
8019107ebf fix: remove last remaining block_in_place in TALK TO keyword
Some checks failed
BotServer CI/CD / build (push) Failing after 19m14s
This was the only block_in_place left causing the production panic during
bot compilation. Replaced with std::thread::spawn + mpsc channel pattern.
2026-04-03 18:35:27 -03:00
6f183c63d2 feat: dual-mode service configs - Vault first, fallback to DB/localhost
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
All services now try Vault first (remote/distributed mode), then fall back
to database config, then localhost defaults (local/dev mode).

Services fixed:
- Qdrant/VectorDB: kb_indexer.rs, kb_statistics.rs, bootstrap_utils.rs, kb_context.rs
- LLM/Embedding: email/vectordb.rs (was hardcoded localhost:8082)
- All services: security/integration.rs (postgres, cache, drive, directory, qdrant, llm)

Pattern: SecretsManager::get_X_config_sync() → DB config → localhost default
2026-04-03 15:01:37 -03:00
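The resolution order in the commit above (Vault, then database config, then a localhost default) can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_endpoint`, `ConfigSource`, and the closure parameters are hypothetical names, and the two closures stand in for the real `SecretsManager` and database lookups.

```rust
/// Where a resolved endpoint came from (illustrative names only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ConfigSource {
    Vault,
    Database,
    LocalDefault,
}

/// Hypothetical resolver. `vault` and `db` stand in for lookups that
/// return `None` when that source is unavailable or has no entry.
fn resolve_endpoint(
    vault: impl FnOnce() -> Option<String>,
    db: impl FnOnce() -> Option<String>,
    local_default: &str,
) -> (String, ConfigSource) {
    // Remote/distributed mode: Vault wins whenever it has a value.
    if let Some(url) = vault() {
        return (url, ConfigSource::Vault);
    }
    // Next, fall back to configuration stored in the database.
    if let Some(url) = db() {
        return (url, ConfigSource::Database);
    }
    // Local/dev mode: nothing configured, so use the localhost default.
    (local_default.to_string(), ConfigSource::LocalDefault)
}
```

Taking the lookups as closures means the database is only queried when Vault has no answer, which matches the "try Vault first" wording of the commit message.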
edff5de662 ci: trigger after fixing all permissions to gbuser
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-03 14:27:25 -03:00
750f93e731 ci: trigger after fixing workspace permissions
2026-04-03 14:27:05 -03:00
6e1701b0f3 ci: trigger build with runner 21 online
Some checks failed
BotServer CI/CD / build (push) Failing after 0s
2026-04-03 14:26:08 -03:00
f097f000d8 Fix: nested runtime panic in AuthConfig::from_env()
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
Root cause: AuthConfig::from_env() was creating a new tokio runtime
with Runtime::new() inside an existing runtime during initialization.

Impact: Botserver crashed with "Cannot start a runtime from within a
runtime" panic right after CORS layer initialization.

Fix: Use new_current_thread() + std::thread::spawn pattern (same as
get_database_url_sync fix) to create an isolated thread for async operations.

Files: src/security/auth_api/config.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-03 14:02:08 -03:00
61642343a8 fix: replace all block_in_place with std::thread::spawn to fix nested runtime panic
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio runtime (including spawn_blocking). Tokio doesn't
allow nested block_on() calls.

Fix: Replace ALL block_in_place patterns with std::thread::spawn + mpsc channel.
This creates a completely separate OS thread with its own runtime, avoiding
any nesting issues. Works from any context: async, spawn_blocking, or sync.

Files: 14 files across secrets, utils, state, calendar, analytics, email,
and all keyword handlers (universal_messaging, search, book, create_draft,
create_site, hearing/syntax, use_tool, find, admin_email, goals)
2026-04-03 12:54:36 -03:00
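The thread-plus-channel bridge described in the commit above might look roughly like the sketch below. To stay self-contained it uses a tiny std-only `block_on` as a stand-in for building a Tokio `new_current_thread` runtime on the spawned thread. The helper name `run_on_separate_thread` is hypothetical and not taken from the repository.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{mpsc, Arc};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Wakes the parked worker thread when the future can make progress.
struct ThreadWaker(thread::Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal std-only executor. In the real code this role is assumed to be
/// played by a Tokio current-thread runtime created on the worker thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical helper: drives async work to completion on a dedicated OS
/// thread and hands the result back over an mpsc channel.
fn run_on_separate_thread<T, F, Fut>(make_future: F) -> Result<T, mpsc::RecvError>
where
    F: FnOnce() -> Fut + Send + 'static,
    Fut: Future<Output = T> + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The future is created and driven entirely on this thread, so the
        // caller's runtime (if any) is never asked to block_on anything.
        let value = block_on(make_future());
        // The receiver may already be gone; ignoring the error is fine here.
        let _ = tx.send(value);
    });
    // Blocks the calling thread until the worker sends its result.
    rx.recv()
}
```

A call such as `run_on_separate_thread(|| async { 21 * 2 })` returns the future's output wrapped in `Ok`. Note that `rx.recv()` still blocks the calling thread, so from an async worker this trades a panic for a blocked thread; that trade-off is inherent to the pattern rather than specific to this sketch.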
4bdf46bdfc fix: use Result instead of Option for runtime builder in get_work_path
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 12:16:08 -03:00
b2a9c8213d fix: use std::thread::spawn for sync-to-async bridges to avoid nested block_on panic
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: new_current_thread().block_on() panics when called from within
an existing tokio runtime (including from spawn_blocking). Tokio doesn't
allow nested block_on() calls.

Fix: Use std::thread::spawn to create a completely separate OS thread
with its own runtime, communicating via mpsc channel. This works from
any context: async, spawn_blocking, or sync.
2026-04-03 12:12:59 -03:00
21170faea9 fix: remove block_in_place wrappers that panic inside spawn_blocking
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
Root cause: block_in_place + new_current_thread().block_on() panics when
called from within tokio::task::spawn_blocking because block_in_place is
designed for async worker threads, not blocking threads.

Fix: Remove all block_in_place wrappers and use new_current_thread().build().block_on()
directly. This works from both async contexts and spawn_blocking contexts.

Affected: utils.rs (get_database_url_sync, get_work_path)
2026-04-03 12:05:18 -03:00
c2982f2a33 fix: remove unused handle variable warning in get_database_url_sync
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:31:50 -03:00
b628313b4c ci: trigger fresh run after runner 20 registration
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:18:41 -03:00
6f89b72a80 ci: trigger test with host runner
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:14:29 -03:00
b61926dbd6 ci: trigger test with debug runner
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:13:30 -03:00
8f1d6411b5 ci: trigger test with runner 19 (Docker label)
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 11:00:08 -03:00
354e3402fa ci: trigger test with Docker label fix
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
2026-04-03 10:45:34 -03:00
7ed341de71 ci: trigger test with Docker installed
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
2026-04-03 10:41:53 -03:00
10f34b7f2c ci: trigger test after runner restart
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
2026-04-03 10:39:35 -03:00
9440ba46d3 ci: fix deploy SSH to use explicit key for gbuser
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
- Add SSH_KEY variable with -i flag for gbuser identity
- Fix all ssh commands in deploy and verify steps
- Job 902 proved build works with sccache (106s)
- Deploy was failing because gbuser had no SSH key auth to system container
2026-04-03 10:17:40 -03:00
1a5a1298f7 ci: trigger test for gbuser runner
Some checks failed
BotServer CI/CD / build (push) Failing after 1s
2026-04-03 09:50:27 -03:00
71cb8dee2e ci: fix PATH to include rustup toolchain bin directory
All checks were successful
BotServer CI/CD / build (push) Successful in 1m46s
2026-04-03 09:45:02 -03:00
1779f4f2fd ci: run as gbuser, use sccache, rename ci to data
Some checks failed
BotServer CI/CD / build (push) Failing after 2s
- Change runner service from root to gbuser
- Add sccache for build caching (RUSTC_WRAPPER=sccache)
- Rename /opt/gbo/ci to /opt/gbo/data for consistency
- Persist gb-ws clone instead of re-cloning every build
- Add sccache --show-stats to build output for monitoring
2026-04-03 09:40:16 -03:00
263ca4ed11 fix: use new_current_thread runtime in get_database_url_sync to prevent nested block_on panic
All checks were successful
BotServer CI/CD / build (push) Successful in 6m30s
2026-04-03 09:26:23 -03:00
f6a864aa67 fix: replace nested runtime block_on with new_current_thread to prevent panic
All checks were successful
BotServer CI/CD / build (push) Successful in 5m32s
Root cause: Handle::current().block_on() panics inside multi_thread runtime
with 'Cannot start a runtime from within a runtime' error.

Fix: All sync-to-async bridges now use tokio::runtime::Builder::new_current_thread()
instead of Handle::current().block_on(). Also changed SECRETS_MANAGER from
tokio::sync::RwLock to std::sync::RwLock to eliminate unnecessary async overhead.

Files: 14 files across keywords, secrets, utils, state, calendar, analytics, email
Impact: Fixes production crash during bot loading phase
2026-04-03 09:17:23 -03:00
eece6831b4 Fix: initialize secrets manager when remote Vault detected, even without init.json
All checks were successful
BotServer CI/CD / build (push) Successful in 5m9s
- main.rs: Skip init.json check when VAULT_ADDR points to remote server
- This allows botserver to read database credentials from Vault in production
- Without this fix, database URL falls back to localhost and connection fails
2026-04-03 08:22:06 -03:00
65e7db5acd Skip local service install/start when remote Vault detected
All checks were successful
BotServer CI/CD / build (push) Successful in 5m48s
- install_all() returns early if VAULT_ADDR is remote
- start_all() returns early if VAULT_ADDR is remote
- bootstrap.rs treats remote VAULT_ADDR as bootstrap_completed=true
- Prevents botserver from trying to install/start local services
  when all services are running in separate containers
2026-04-03 07:36:15 -03:00
e992ed3b39 Enforce Vault-only secrets: remove env var fallbacks, all secrets from Vault
Some checks are pending
BotServer CI/CD / build (push) Waiting to run
- Remove all std::env::var calls except VAULT_* and PORT
- get_from_env returns hardcoded defaults only (no env var reading)
- Auth config, rate limits, email, analytics, calendar all use Vault
- WORK_PATH replaced with get_work_path() helper reading from Vault
- .env on production cleaned to only VAULT_ADDR, VAULT_TOKEN, VAULT_CACERT, PORT
- All service IPs/credentials stored in Vault secret/gbo/*
2026-04-03 07:11:40 -03:00
5d88013ee3 Fix get_from_env: read actual env vars instead of hardcoded localhost values
All checks were successful
BotServer CI/CD / build (push) Successful in 4m3s
2026-04-02 21:17:19 -03:00
98b204b12e Fix health checks: replace nc with ss -tln for non-root environments
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-02 18:15:07 -03:00
11c161fc1d Update botserver
All checks were successful
BotServer CI/CD / build (push) Successful in 27s
2026-04-02 17:03:12 -03:00
521b9b7da4 Update forgejo workflow
Some checks failed
BotServer CI/CD / build (push) Failing after 19s
2026-04-02 16:15:01 -03:00
00e5a3a5ff ci: add Step 7 to restart botserver service after deploy
Some checks failed
BotServer CI/CD / build (push) Failing after 17s
2026-04-02 16:01:15 -03:00
c041fe9cd3 ci: trigger build
All checks were successful
BotServer CI/CD / build (push) Successful in 28s
2026-04-02 15:48:36 -03:00
7b4753af0d fix: init_redis tries both no-password and password URLs for Valkey
All checks were successful
BotServer CI/CD / build (push) Successful in 27s
- Root cause: Valkey in prod runs without password but Vault stores one
- Previous code only tried password URL, got AUTH failed
- Fix: try no-password URL first, then password URL as fallback
- Also removed unused cache_url variable and cleaned up retry logic
2026-04-02 07:36:16 -03:00
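The "no-password URL first, then password URL" ordering from the commit above could be sketched as below. Both function names are hypothetical, and `try_connect` is a stand-in for whatever the real `init_redis` does to test a connection. The `redis://:password@host:port` form is the conventional Redis URL shape, and the sketch assumes the password needs no percent-encoding.

```rust
/// Builds candidate cache URLs in the order described in the commit:
/// unauthenticated first, then the password form if a password is known.
/// Assumption: the password contains no characters that need URL escaping.
fn candidate_cache_urls(host: &str, port: u16, password: Option<&str>) -> Vec<String> {
    let mut urls = vec![format!("redis://{host}:{port}")];
    if let Some(pw) = password.filter(|p| !p.is_empty()) {
        urls.push(format!("redis://:{pw}@{host}:{port}"));
    }
    urls
}

/// Returns the first URL for which `try_connect` reports success.
/// `try_connect` is a placeholder for a real connection attempt.
fn first_working_url(
    urls: &[String],
    mut try_connect: impl FnMut(&str) -> bool,
) -> Option<String> {
    urls.iter().find(|url| try_connect(url.as_str())).cloned()
}
```

Keeping URL construction separate from the connection attempt makes the ordering easy to test without a running Valkey instance.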
dae0feb6a5 fix: SecretPaths match Vault seeding paths (gbo/cache not gbo/system/cache)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m49s
- Root cause: Vault seeding writes to secret/gbo/cache but code reads gbo/system/cache
- kv2::read prepends secret/ so it looks for secret/gbo/system/cache (wrong)
- Fix: update SecretPaths to match seeding paths (gbo/cache, gbo/drive, etc.)
- Testing: compiles clean, paths now match vault kv list output
2026-04-02 07:16:32 -03:00
f118c74cf1 fix: init_redis uses async Vault call instead of sync block_on (fixes panic)
All checks were successful
BotServer CI/CD / build (push) Successful in 5m40s
- Root cause: get_cache_config() uses runtime.block_on() which panics
  when called from within an async runtime
- Fix: call SecretsManager::get_secret() directly with .await
- Testing: compiles clean, no runtime nesting issues
2026-04-02 06:59:21 -03:00
b3edf21d21 fix: init_redis fetches cache password from Vault (fixes connection timeout)
All checks were successful
BotServer CI/CD / build (push) Successful in 4m59s
- Root cause: init_redis() used redis://localhost:6379 without password
- Valkey requires authentication, causing connection timeouts
- Fix: use get_cache_config() from SecretsManager to build URL with password
- Falls back to env vars (CACHE_URL/REDIS_URL/VALKEY_URL) if set
2026-04-01 20:17:37 -03:00
3c9e4ba6e7 fix: cache_health_check uses ss instead of nc (nc missing in prod container)
All checks were successful
BotServer CI/CD / build (push) Successful in 4m42s
- Root cause: prod container lacks nc (netcat), causing fallback to valkey-cli ping
- valkey-cli ping hangs indefinitely when Valkey requires password auth
- Fix: use ss -tlnp as primary check (always available), nc as fallback
- Testing: verified ss is available in prod, nc is not
2026-04-01 20:06:13 -03:00
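A port check based on `ss` output, as described in the commit above, might be sketched as follows. This is an assumption about the approach, not the repository's code: the function names are hypothetical, the parser expects the usual `ss -tln` column layout (local address in the fourth column), and it uses `-tln` rather than `-tlnp` because the process column is not needed to test for a listener.

```rust
use std::process::Command;

/// Returns true if `ss -tln`-style output shows a listener on `port`.
fn port_in_ss_output(ss_output: &str, port: u16) -> bool {
    ss_output
        .lines()
        // Skip the header row and any non-listening sockets.
        .filter(|line| line.starts_with("LISTEN"))
        // The local address is the fourth whitespace-separated column.
        .filter_map(|line| line.split_whitespace().nth(3))
        // Split on the last ':' so IPv6 forms such as `[::]:22` also work.
        .filter_map(|local| local.rsplit_once(':'))
        .any(|(_, p)| p.parse::<u16>().ok() == Some(port))
}

/// Runs `ss -tln` and checks the port; returns false if `ss` cannot be run.
fn port_is_listening(port: u16) -> bool {
    Command::new("ss")
        .arg("-tln")
        .output()
        .map(|out| port_in_ss_output(&String::from_utf8_lossy(&out.stdout), port))
        .unwrap_or(false)
}
```

Because this only inspects listening sockets, it never opens a connection to Valkey and so cannot hang waiting on password authentication, which is the failure mode the commit describes for `valkey-cli ping`.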
d098961142 fix: Bootstrap checks stack/.env path in addition to ./.env
All checks were successful
BotServer CI/CD / build (push) Successful in 4m39s
- Production has .env in botserver-stack/.env not ./.env
- Checks both locations to detect completed bootstrap
- Fixes E0716: use let bindings for Path borrows
2026-04-01 19:30:08 -03:00
8fd3254334 fix: Bootstrap checks stack/.env path in addition to ./.env
Some checks failed
BotServer CI/CD / build (push) Failing after 1m25s
- Production has .env in botserver-stack/.env not ./.env
- Checks both locations to detect completed bootstrap
- Prevents full re-bootstrap on restart in production
2026-04-01 19:26:32 -03:00
318367d439 fix: Valkey health check uses nc first (avoids password hang)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m58s
- nc -z checks port connectivity instantly (no auth needed)
- valkey-cli ping as fallback (hangs when password required)
- Fixes bootstrap hang on production where Valkey has Vault password
2026-04-01 18:52:04 -03:00