# Production Environment Guide (Compact)

## CRITICAL RULES: READ FIRST

NEVER INCLUDE CREDENTIALS OR COMPANY INFORMATION HERE; THIS GUIDE IS COMPANY AGNOSTIC.

Always manage services with `systemctl` inside the `system` Incus container. Never run `/opt/gbo/bin/botserver` or `/opt/gbo/bin/botui` directly: they will fail because they won't load the `.env` file containing Vault credentials and paths. The correct commands are `sudo incus exec system -- systemctl start|stop|restart|status botserver` and the same for `ui`. Systemctl handles environment loading, auto-restart, logging, and dependencies.

Never push secrets (API keys, passwords, tokens) to git. Never commit `init.json` (it contains Vault unseal keys). All secrets must come from Vault; only `VAULT_*` variables are allowed in `.env`. Never deploy manually via scp or ssh; always use CI/CD. Always push all submodules (botserver, botui, botlib) before or alongside the main repo. Always ask before pushing to ALM.

---

## Infrastructure Overview

The host machine is ``, accessed via `ssh user@`, running Incus (an LXD fork) as hypervisor. All services run inside named Incus containers. You enter containers with `sudo incus exec -- ` and list them with `sudo incus list`.

The containers and their roles are:

- `system` runs botserver on port 5858 and botui on port 5859
- `alm-ci` runs the Forgejo Actions CI runner
- `alm` hosts the Forgejo git server
- `tables` runs PostgreSQL on port 5432
- `cache` runs Valkey/Redis on port 6379
- `drive` runs MinIO object storage on port 9100
- `vault` runs HashiCorp Vault on port 8200
- `vector` runs Qdrant on port 6333

Externally, botserver is reachable at `https://` and botui at `https://`. Internally, botui's `BOTSERVER_URL` must be `http://localhost:5858`, never the external HTTPS URL, because the Rust proxy runs server-side and needs direct localhost access.
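The service-management rule from the critical rules can be wrapped in a small shell helper so that nobody is tempted to launch the binaries by hand. This is only a sketch: the function name `svc` and the `DRY_RUN` switch are hypothetical conveniences, not part of the platform.

```bash
# Hypothetical wrapper: manage botserver/ui only through systemctl inside the
# `system` container. Set DRY_RUN=1 to print the command instead of running it.
svc() (
  action="$1"; unit="$2"
  case "$action" in
    start|stop|restart|status) ;;
    *) echo "usage: svc start|stop|restart|status botserver|ui" >&2; exit 2 ;;
  esac
  case "$unit" in
    botserver|ui) ;;
    *) echo "unknown service: $unit" >&2; exit 2 ;;
  esac
  # Build the exact command documented in the rules above.
  set -- sudo incus exec system -- systemctl "$action" "$unit"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
)
```

For example, `svc restart botserver` restarts the service on the host, while `DRY_RUN=1 svc status ui` only prints the command it would run.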
---

## Services Detail

Botserver runs as user `gbuser`, binary at `/opt/gbo/bin/botserver`, logs at `/opt/gbo/logs/out.log` and `/opt/gbo/logs/err.log`, systemd unit at `/etc/systemd/system/botserver.service`, env loaded from `/opt/gbo/bin/.env`. Bot BASIC scripts live under `/opt/gbo/data/.gbai/.gbdialog/*.bas`; compiled AST cache goes to `/opt/gbo/work/`.

The directory service runs Zitadel as user `root`, binary at `/opt/gbo/bin/zitadel`, logs at `/opt/gbo/logs/zitadel.log`, systemd unit at `/etc/systemd/system/directory.service`, and loads its environment from the service configuration. Zitadel provides identity management and OAuth2 services for the platform.

Internally, Zitadel listens on port 8080 within the directory container. For external access:

- Via public domain (HTTPS): `https://` (configured through proxy container)
- Via host IP (HTTP): `http://:9000` (direct container port forwarding)
- Via container IP (HTTP): `http://:9000` (direct container access)

Access the Zitadel console at `https:///ui/console` with admin credentials.

Zitadel implements the v1 Management API (deprecated) and the v2 Organization/User services. Always use the v2 endpoints under `/v2/organizations` and `/v2/users` for all operations.

The botserver bootstrap also manages: Vault (secrets), PostgreSQL (database), Valkey (cache, password auth), MinIO (object storage), Zitadel (identity provider), and llama.cpp (LLM).

To obtain a PAT for Zitadel API access, read `/opt/gbo/conf/directory/admin-pat.txt` in the directory container. Use it with curl by setting the Authorization header, `Authorization: Bearer $(cat /opt/gbo/conf/directory/admin-pat.txt)`, and include `-H "Host: "` for correct host resolution (set the `Host` value to your directory container IP).
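The header requirements above (bearer token plus a `Host` header carrying the directory container IP) are easy to forget, so they can be captured once in a helper. The sketch below is an assumption-laden convenience: `zitadel_api`, `ZITADEL_IP` and `DRY_RUN` are hypothetical names, and the helper talks to port 8080 as described for the container's internal listener.

```bash
# Hypothetical helper: call a Zitadel v2 endpoint with the headers described above.
# ZITADEL_IP (directory container IP) and PAT are assumed to be set by the caller.
# With DRY_RUN=1 it prints the curl arguments, one per line, with the token masked.
zitadel_api() (
  method="$1"; path="$2"; body="${3:-}"
  : "${ZITADEL_IP:?set ZITADEL_IP to the directory container IP}"
  : "${PAT:?set PAT to the admin personal access token}"
  token="$PAT"
  if [ -n "${DRY_RUN:-}" ]; then token="REDACTED"; fi
  set -- curl -sS -X "$method" "http://${ZITADEL_IP}:8080${path}" \
    -H "Authorization: Bearer ${token}" \
    -H "Host: ${ZITADEL_IP}"
  # Only requests with a JSON body need the Content-Type header.
  if [ -n "$body" ]; then
    set -- "$@" -H "Content-Type: application/json" -d "$body"
  fi
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
)
```

With `PAT` and `ZITADEL_IP` exported, a call such as `zitadel_api POST /v2/users '{"query": {"offset": 0, "limit": 1}}'` sends the request; prefixing it with `DRY_RUN=1` shows the arguments without contacting the server.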
---

## Directory Management (Zitadel)

### Getting Admin PAT (Personal Access Token)

```bash
# Get the admin PAT from directory container
PAT=$(ssh administrator@ "sudo incus exec directory -- cat /opt/gbo/conf/directory/admin-pat.txt")
```

### User Management via API (v2)

**Create a Human User:**

```bash
curl -X POST "http://:8080/v2/users/human" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: " \
  -d '{
    "username": "testuser",
    "profile": {"givenName": "Test", "familyName": "User"},
    "email": {"email": "test@example.com", "isVerified": true},
    "password": {"password": "SecurePass123!", "changeRequired": false}
  }'
```

**List Users:**

```bash
curl -X POST "http://:8080/v2/users" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: " \
  -d '{"query": {"offset": 0, "limit": 100}}'
```

**Update User Password:**

```bash
curl -X POST "http://:8080/v2/users//password" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: " \
  -d '{
    "newPassword": {"password": "NewPass123!", "changeRequired": false}
  }'
```

**Delete User:**

```bash
curl -X DELETE "http://:8080/v2/users/" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: "
```

### Directory Quick Reference

| Task | Command |
|------|---------|
| Get PAT | `sudo incus exec directory -- cat /opt/gbo/conf/directory/admin-pat.txt` |
| Check health | `curl -sf http://:8080/debug/healthz` |
| Console UI | `http://:9000/ui/console` |
| Create user | `POST /v2/users/human` |
| List users | `POST /v2/users` |
| Update password | `POST /v2/users/{id}/password` |

### Zitadel API v2 Usage with PAT

**Important:** Zitadel API v2 requires a valid Personal Access Token (PAT) for authentication. The PAT must have the appropriate scopes for the operations you want to perform.
**Using PAT with curl:**

```bash
# Set your PAT as an environment variable
PAT=""

# Include the required headers in all API calls
curl -X POST "http://:8080/v2/organizations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: " \
  -d '{
    "name": "pragmatismo"
  }'
```

**Critical Headers:**

- `Authorization: Bearer $PAT`: your PAT
- `Host: `: required for gRPC-gateway routing
- `Content-Type: application/json`: for POST/PUT/PATCH requests

**Common API v2 Endpoints:**

Create Organization:

```bash
curl -X POST "http://10.157.134.240:8080/v2/organizations" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: 10.157.134.240" \
  -d '{
    "name": "organization-name"
  }'
```

List Organizations (uses the `_search` endpoint and requires a body with a query):

```bash
curl -X POST "http://10.157.134.240:8080/v2/organizations/_search" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: 10.157.134.240" \
  -d '{
    "query": {
      "offset": 0,
      "limit": 100
    }
  }'
```

Create Human User:

```bash
curl -X POST "http://10.157.134.240:8080/v2/users/human" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: 10.157.134.240" \
  -d '{
    "username": "johndoe",
    "profile": {
      "givenName": "John",
      "familyName": "Doe"
    },
    "email": {
      "email": "john@example.com",
      "isVerified": true
    },
    "password": {
      "password": "SecurePass123!",
      "changeRequired": false
    }
  }'
```

**Testing PAT Validity:**

```bash
# Test if PAT is valid by calling users endpoint
curl -X POST "http://10.157.134.240:8080/v2/users" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PAT" \
  -H "Host: 10.157.134.240" \
  -d '{"query": {"offset": 0, "limit": 1}}'

# If you get {"code":16,"message":"Errors.Token.Invalid (AUTH-7fs1e)"}, the PAT is invalid
```

**Generating a New PAT via Web Console:**

1. Access: `http://:9000/ui/console`
2. Login with admin credentials
3. Navigate to your profile (top right corner)
4. Go to "Personal Access Tokens"
5. Click "Create"
6. Name the token and select expiration
7. Copy the token (you won't see it again!)
8. Update `/opt/gbo/conf/directory/admin-pat.txt` with the new token

### Production Credentials

**Admin Account:**

- Username: `admin`
- Password: `Admin123!`
- Access: `https:///ui/console`

**Test User Account (created via API):**

- Username: `rodriguez`
- Password: `SecurePass2026!`
- User ID: `368981346720188144`
- Access: Use with any bot login page

---

### Zitadel Setup & Initialization

**Database Configuration:**

Zitadel connects to PostgreSQL with these credentials (set in `directory.service`):

- Database: `PROD-DIRECTORY`
- Host: `10.157.134.174` (tables container)
- Port: `5432`
- User: `postgres`
- Password: `67a690df` (from Vault: `secret/gbo/tables`)

**Current Production Settings:**

- Container IP: `10.157.134.240`
- Internal port: `8080`
- External port: `9000`
- Masterkey: `MasterkeyNeedsToHave32Characters` (CHANGE THIS IN PRODUCTION!)
- TLS mode: `disabled`
- External domain: `10.157.134.240`

**Initialization File:**

Location: `/opt/gbo/conf/directory/zitadel-init-steps.yaml`

```yaml
FirstInstance:
  InstanceName: "BotServer"
  DefaultLanguage: "en"
  PatPath: "/opt/gbo/conf/directory/admin-pat.txt"
  Org:
    Name: "BotServer"
    Machine:
      Machine:
        Username: "admin-sa"
        Name: "Admin Service Account"
      Pat:
        ExpirationDate: "2099-01-01T00:00:00Z"
    Human:
      UserName: "admin"
      FirstName: "Admin"
      LastName: "User"
      Email:
        Address: "admin@localhost"
        Verified: true
      Password: "Admin123!"
      PasswordChangeRequired: false
```

**To Reinitialize Zitadel (if database is empty or corrupted):**

```bash
# 1. Stop the service
sudo incus exec directory -- systemctl stop directory

# 2. Drop and recreate the database
sudo incus exec tables -- psql -h localhost -U postgres -d postgres -c "DROP DATABASE IF EXISTS \"PROD-DIRECTORY\";"
sudo incus exec tables -- psql -h localhost -U postgres -d postgres -c "CREATE DATABASE \"PROD-DIRECTORY\";"

# 3. Run initialization
sudo incus exec directory -- bash -c '
  export ZITADEL_DATABASE_POSTGRES_HOST=10.157.134.174
  export ZITADEL_DATABASE_POSTGRES_PORT=5432
  export ZITADEL_DATABASE_POSTGRES_DATABASE=PROD-DIRECTORY
  export ZITADEL_DATABASE_POSTGRES_USER_USERNAME=postgres
  export ZITADEL_DATABASE_POSTGRES_USER_PASSWORD=67a690df
  export ZITADEL_DATABASE_POSTGRES_USER_SSL_MODE=disable
  /opt/gbo/bin/zitadel setup init \
    --config /opt/gbo/conf/directory/zitadel-init-steps.yaml \
    --masterkey MasterkeyNeedsToHave32Characters \
    --tlsMode disabled
'

# 4. Start the service
sudo incus exec directory -- systemctl start directory

# 5. Verify health
curl -sf http://10.157.134.240:8080/debug/healthz
```

**Zitadel Database Schema:**

The database uses multiple schemas:

- `system`: system tables and configuration
- `projections`: read-only projection tables (orgs, users, sessions, etc.)
- `eventstore`: event sourcing tables
- `adminapi`, `auth`, `logstore`, `cache`, `queue`: specialized schemas

To query organizations:

```bash
sudo incus exec tables -- psql -h localhost -U postgres -d PROD-DIRECTORY -c \
  "SELECT id, name FROM projections.orgs1;"
```

---

### Zitadel Troubleshooting

**Database Connection Errors:**

If logs show `failed SASL auth: FATAL: password authentication failed for user "postgres"`:

```bash
# Check systemd unit has correct credentials
sudo incus exec directory -- cat /etc/systemd/system/directory.service

# Verify Vault has the correct credentials
TOKEN="${VAULT_TOKEN}"
sudo incus exec system -- curl -s --cacert /opt/gbo/conf/system/certificates/ca/ca.crt \
  -H "X-Vault-Token: $TOKEN" \
  https://10.157.134.250:8200/v1/secret/data/gbo/tables

# If credentials changed, update systemd unit and restart
sudo incus exec directory -- systemctl daemon-reload
sudo incus exec directory -- systemctl restart directory
```

**Empty Database (No Organizations):**

If the database was initialized but tables are missing:

```bash
# Check if tables exist
sudo incus exec tables -- psql -h localhost -U postgres -d PROD-DIRECTORY -c \
  "SELECT tablename FROM pg_tables WHERE schemaname = 'projections' LIMIT 5;"

# If no tables, reinitialize using the steps above
```

**PAT Token Invalid:**

If API calls return `Errors.Token.Invalid (AUTH-7fs1e)`:

```bash
# Check if PAT file exists
sudo incus exec directory -- cat /opt/gbo/conf/directory/admin-pat.txt

# If missing or expired, regenerate via console or API:
# 1. Login to console: http://:9000/ui/console
# 2. Go to Profile → Personal Access Tokens → Create
# 3. Save the new token to admin-pat.txt
```

**Health Check Fails:**

```bash
# Check service status
sudo incus exec directory -- systemctl status directory

# Check logs
sudo incus exec directory -- tail -50 /opt/gbo/logs/stderr.log
sudo incus exec directory -- tail -50 /opt/gbo/logs/stdout.log

# Verify database connectivity
sudo incus exec directory -- pg_isready -h 10.157.134.174 -p 5432 -U postgres
```

**Migration Errors:**

If migrations fail or the database is in a bad state:

```bash
# Stop service
sudo incus exec directory -- systemctl stop directory

# Drop and recreate database
sudo incus exec tables -- psql -h localhost -U postgres -d postgres -c "DROP DATABASE IF EXISTS \"PROD-DIRECTORY\";"
sudo incus exec tables -- psql -h localhost -U postgres -d postgres -c "CREATE DATABASE \"PROD-DIRECTORY\";"

# Reinitialize (see initialization steps above)
```

**Systemd Unit Configuration:**

The `directory.service` unit contains all environment variables:

```ini
[Unit]
Description=Directory (Zitadel)
After=network.target

[Service]
User=root
Group=root
WorkingDirectory=/opt/gbo
Environment=ZITADEL_DATABASE_POSTGRES_HOST=10.157.134.174
Environment=ZITADEL_DATABASE_POSTGRES_PORT=5432
Environment=ZITADEL_DATABASE_POSTGRES_DATABASE=PROD-DIRECTORY
Environment=ZITADEL_DATABASE_POSTGRES_USER_USERNAME=postgres
Environment=ZITADEL_DATABASE_POSTGRES_USER_PASSWORD=67a690df
Environment=ZITADEL_DATABASE_POSTGRES_USER_SSL_MODE=disable
Environment=ZITADEL_EXTERNALSECURE=false
Environment=ZITADEL_EXTERNALDOMAIN=10.157.134.240
Environment=ZITADEL_EXTERNALPORT=9000
Environment=ZITADEL_TLS_ENABLED=false
ExecStart=/opt/gbo/bin/zitadel start --masterkey MasterkeyNeedsToHave32Characters --tlsMode disabled --externalDomain 10.157.134.240 --externalPort 9000
Restart=always
RestartSec=5
StandardOutput=append:/opt/gbo/logs/stdout.log
StandardError=append:/opt/gbo/logs/stderr.log

[Install]
WantedBy=multi-user.target
```

---

## Common Operations

**Check status:** `sudo incus exec system -- systemctl status botserver --no-pager` (same for `ui`). To check process existence: `sudo incus exec system -- pgrep -f botserver`.

**View logs:** For systemd journal: `sudo incus exec system -- journalctl -u botserver --no-pager -n 50`. For application logs: `sudo incus exec system -- tail -50 /opt/gbo/logs/out.log` or `err.log`. For live tail: `sudo incus exec system -- tail -f /opt/gbo/logs/out.log`.

**Restart:** `sudo incus exec system -- systemctl restart botserver` and same for `ui`. Never run the binary directly.

**Emergency manual deploy:** Kill the old process with `sudo incus exec system -- killall botserver`, copy the new binary from `/opt/gbo/ci/botserver/target/debug/botserver` to `/opt/gbo/bin/botserver`, set permissions with `chmod +x` and `chown gbuser:gbuser`, then start with `systemctl start botserver`.

**Transfer bot files:** Archive locally with `tar czf /tmp/bots.tar.gz -C /opt/gbo/data .gbai`, copy to host with `scp`, then extract inside container with `sudo incus exec system -- bash -c 'tar xzf /tmp/bots.tar.gz -C /opt/gbo/data/'`. Clear compiled cache with `find /opt/gbo/data -name "*.ast" -delete` and same for `/opt/gbo/work`.

**Snapshots:** `sudo incus snapshot list system` to list, `sudo incus snapshot restore system ` to restore.

---

## CI/CD Pipeline

Repositories exist on both GitHub and the internal ALM (Forgejo). The four repos are `gb` (main workspace), `botserver`, `botui`, and `botlib`.
Always push submodules first (`cd botserver && git push alm main && git push origin main`), then update submodule references in the root repo and push that too.

The CI runner container (`alm-ci`) runs Debian 12 Bookworm with glibc 2.36, same as the `system` container. Binaries compiled on the CI runner are compatible with the system container.

The CI workflow (`botserver/.forgejo/workflows/botserver.yaml`) builds in alm-ci (which has the Rust toolchain) and deploys the binary to the system container. The workflow triggers on pushes to `main`, clones repos, builds in alm-ci, transfers the binary via scp, and verifies botserver is running.

### ALM/CI Debugging & Monitoring

**Access ALM/CI containers:**

```bash
ssh administrator@
sudo incus exec alm-ci -- bash    # CI runner container
sudo incus exec tables -- bash    # PostgreSQL (ALM database)
sudo incus exec system -- bash    # botserver container
```

**Check CI runner status:**

```bash
# Runner process
sudo incus exec alm-ci -- ps aux | grep forgejo

# Runner logs
sudo incus exec alm-ci -- cat /opt/gbo/logs/forgejo-runner.log

# If runner is down, restart (wrapped in bash -c so every step runs inside the container):
sudo incus exec alm-ci -- bash -c 'pkill -9 forgejo; sleep 2; cd /opt/gbo/bin && nohup ./forgejo-runner daemon --config config.yaml >> /opt/gbo/logs/forgejo-runner.log 2>&1 &'
```

**Monitor CI runs in database:**

```bash
# Status codes: 0=pending, 1=success, 2=failure, 3=cancelled, 6=running
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "SELECT id, status, commit_sha, created FROM action_run ORDER BY id DESC LIMIT 5;"'

# Check specific run jobs
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "SELECT id, status, name FROM action_run_job WHERE run_id = ;"'

# Check tasks
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "SELECT id, status FROM action_task WHERE repo_id = 3 ORDER BY id DESC LIMIT 3;"'

# Reset stuck run to re-trigger
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "UPDATE action_task SET status = 0 WHERE id = ; UPDATE action_run_job SET status = 0 WHERE id = ; UPDATE action_run SET status = 0 WHERE id = ;"'
```

**Fix common CI issues:**

```bash
# /tmp permission denied for build.log
sudo incus exec alm-ci -- chmod 1777 /tmp
# bash -c keeps both touch and chmod inside the container
sudo incus exec alm-ci -- bash -c 'touch /tmp/build.log && chmod 666 /tmp/build.log'

# Clean old CI runs (keep recent)
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "DELETE FROM action_run WHERE id < ;"'
sudo incus exec tables -- bash -c 'export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c "DELETE FROM action_run_job WHERE run_id < ;"'

# Check deploy.log missing error - fix workflow step
# The Save deploy log step expects /tmp/deploy.log which the workflow doesn't create
# Fix: ensure deploy step outputs to /tmp/deploy.log
```

**Watch CI in real-time:**

```bash
# Tail runner logs
sudo incus exec alm-ci -- tail -f /opt/gbo/logs/forgejo-runner.log

# Check if new builds appear
watch -n 5 'sudo incus exec tables -- bash -c "export PGPASSWORD=; psql -h localhost -U postgres -d PROD-ALM -c \"SELECT id, status, created FROM action_run ORDER BY id DESC LIMIT 3;\""'

# Verify botserver deployed correctly
sudo incus exec system -- /opt/gbo/bin/botserver --version 2>&1 | head -3
sudo incus exec system -- tail -5 /opt/gbo/logs/err.log
```

**CI Workflow Structure:**

1. Setup Git (disable SSL verify, add safe directories)
2. Setup Workspace (clone/merge gb workspace Cargo.toml)
3. Install system dependencies
4. Clean up workspaces
5. Build BotServer (output to /tmp/build.log)
6. Save build log
7. Deploy via ssh tar gzip
8. Verify botserver started
9. Save deploy log

---

## DriveMonitor & Bot Configuration

DriveMonitor is a background service inside botserver that watches MinIO buckets and syncs changes to the local filesystem and database every 10 seconds.
It monitors three directory types per bot: the `.gbdialog/` folder for BASIC scripts (downloads and recompiles on change), the `.gbot/` folder for `config.csv` (syncs to the `bot_configuration` database table), and the `.gbkb/` folder for knowledge base documents (downloads and indexes for vector search).

Bot configuration is stored in two PostgreSQL tables inside the `botserver` database. The `bot_configuration` table holds key-value pairs with columns `bot_id`, `config_key`, `config_value`, `config_type`, `is_encrypted`, and `updated_at`. The `gbot_config_sync` table tracks sync state with columns `bot_id`, `config_file_path`, `last_sync_at`, `file_hash`, and `sync_count`.

The `config.csv` format is a plain CSV with no header: each line is `key,value`, for example `llm-provider,groq` or `theme-color1,#cc0000`. DriveMonitor syncs it when the file ETag changes in MinIO, on botserver startup, or after a restart.

**Check config status:** Query `bot_configuration` via `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT config_key, config_value FROM bot_configuration WHERE bot_id = (SELECT id FROM bots WHERE name = '') ORDER BY config_key;"`. Check sync state via the `gbot_config_sync` table. Inspect the bucket directly with `sudo incus exec drive -- /opt/gbo/bin/mc cat local/.gbai/.gbot/config.csv`.

**Debug DriveMonitor:** Monitor live logs with `sudo incus exec system -- tail -f /opt/gbo/logs/out.log | grep -E "(DRIVE_MONITOR|check_gbot|config)"`. An empty `gbot_config_sync` table means DriveMonitor has not synced yet. If no new log entries appear after 30 seconds, the loop may be stuck; restart botserver with systemctl to clear the state.

**Common config issues:** If config.csv is missing from the bucket, create and upload it with `mc cp`.
If the database shows stale values, restart botserver to force a fresh sync, or as a temporary fix update the database directly with `UPDATE bot_configuration SET config_value = 'groq', updated_at = NOW() WHERE ...`. To force a re-sync without restarting, copy config.csv over itself with `mc cp local/... local/...` to change the ETag.

---

## MinIO (Drive) Operations

All bot files live in MinIO buckets. Use the `mc` CLI at `/opt/gbo/bin/mc` from inside the `drive` container. The bucket structure per bot is: `{bot}.gbai/` as root, `{bot}.gbai/{bot}.gbdialog/` for BASIC scripts, `{bot}.gbai/{bot}.gbot/` for config.csv, and `{bot}.gbai/{bot}.gbkb/` for knowledge base folders.

Common mc commands:

- `mc ls local/` lists all buckets
- `mc ls local/salesianos.gbai/` lists a bucket
- `mc cat local/.../start.bas` prints a file
- `mc cp local/.../file /tmp/file` downloads
- `mc cp /tmp/file local/.../file` uploads (this triggers DriveMonitor recompile)
- `mc stat local/.../config.csv` shows ETag and metadata
- `mc mb local/newbot.gbai` creates a bucket
- `mc rb local/oldbot.gbai` removes an empty bucket

If mc is not found, use the full path `/opt/gbo/bin/mc`. If alias `local` is not configured, check with `mc alias list` (`mc config host list` on older mc releases). If MinIO is not running, check with `sudo incus exec drive -- systemctl status minio`.

---

## Vault Security Architecture

HashiCorp Vault is the single source of truth for all secrets. Botserver reads `VAULT_ADDR` and `VAULT_TOKEN` from `/opt/gbo/bin/.env` at startup, initializes a TLS/mTLS client, then reads credentials from Vault paths. If Vault is unavailable, it falls back to defaults. The `.env` file must only contain `VAULT_*` variables plus `PORT`, `DATA_DIR`, `WORK_DIR`, and `LOAD_ONLY`.
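The `.env` allowlist above lends itself to an automated check. The following is a minimal sketch, not part of botserver: the function name `env_disallowed_keys` is hypothetical, and it assumes the file uses plain `KEY=value` lines (optionally prefixed with `export`).

```bash
# Sketch: print variable names in a .env file that fall outside the allowlist
# described above (VAULT_* plus PORT, DATA_DIR, WORK_DIR, LOAD_ONLY).
# Comment and blank lines are ignored; an optional leading `export` is stripped.
# Like grep, the function returns 0 when it found something to report.
env_disallowed_keys() {
  sed -e 's/^[[:space:]]*//' -e 's/^export[[:space:]]\{1,\}//' "$1" \
    | grep -v -e '^#' -e '^$' \
    | sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' \
    | grep -v -E '^(VAULT_[A-Za-z0-9_]*|PORT|DATA_DIR|WORK_DIR|LOAD_ONLY)$'
}
```

Run inside the `system` container, `env_disallowed_keys /opt/gbo/bin/.env` prints nothing when the file is clean; any name it prints is a candidate for moving into Vault.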
**Global Vault paths:**

- `gbo/tables` holds PostgreSQL credentials
- `gbo/drive` holds MinIO access key and secret
- `gbo/cache` holds Valkey password
- `gbo/llm` holds LLM URL and API keys
- `gbo/directory` holds Zitadel config
- `gbo/email` holds SMTP credentials
- `gbo/vectordb` holds Qdrant config
- `gbo/jwt` holds JWT signing secret
- `gbo/encryption` holds the master encryption key

Organization-scoped secrets follow patterns like `gbo/orgs/{org_id}/bots/{bot_id}` and tenant infrastructure uses `gbo/tenants/{tenant_id}/infrastructure`.

**Credential resolution:** For any service, botserver checks the most specific Vault path first (org+bot level), falls back to a default bot path, then falls back to the global path, and only uses environment variables as a last resort in development.

**Verify Vault health:** `sudo incus exec vault -- curl -k -sf https://localhost:8200/v1/sys/health` should return JSON with `"sealed":false`. To read a secret: set `VAULT_ADDR`, `VAULT_TOKEN`, and `VAULT_CACERT` then run `vault kv get secret/gbo/tables`. To test from the system container, use curl with `--cacert /opt/gbo/conf/system/certificates/ca/ca.crt` and `-H "X-Vault-Token: "`.

**init.json** is stored at `/opt/gbo/bin/botserver-stack/conf/vault/vault-conf/init.json` and contains the root token and 5 unseal keys (3 needed to unseal). Never commit this file to git. Store it encrypted in a secure location.

**Vault troubleshooting (cannot connect):** Check that the vault container's systemd unit is running, verify the token in `.env` is not expired with `vault token lookup`, confirm the CA cert path in `.env` matches the actual file location, and test network connectivity from system to vault container. To generate a new token: `vault token create -policy="botserver" -ttl="8760h" -format=json` then update `.env` and restart botserver.
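The health check described above can be turned into a scripted probe. This is a dependency-free sketch rather than an official tool: `vault_is_unsealed` and `check_vault` are hypothetical names, and the JSON is matched with `grep` for simplicity (a JSON parser such as `jq` would be more robust if it is available).

```bash
# Sketch: read the /v1/sys/health JSON on stdin and succeed only if it
# reports "sealed":false. Whitespace is stripped first so formatting does not matter.
vault_is_unsealed() {
  tr -d '[:space:]' | grep -q '"sealed":false'
}

# Assumed usage against the health endpoint quoted above. With `curl -f`,
# an error status from a sealed or unreachable Vault yields no body,
# so the check fails in those cases as well.
check_vault() {
  if sudo incus exec vault -- curl -k -sf https://localhost:8200/v1/sys/health | vault_is_unsealed; then
    echo "vault: unsealed"
  else
    echo "vault: sealed or unreachable" >&2
    return 1
  fi
}
```

Calling `check_vault` from the host gives a single pass/fail line that can be reused in monitoring scripts.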
To get the database credentials through the Vault KV v2 HTTP API, run: `ssh user@ "sudo incus exec system -- curl -s --cacert /opt/gbo/conf/system/certificates/ca/ca.crt -H 'X-Vault-Token: ' https://:8200/v1/secret/data/gbo/tables 2>/dev/null"`

**Vault troubleshooting (secrets missing):** Run `vault kv get secret/gbo/tables` (and other paths) to check if secrets exist. If a path returns NOT FOUND, add secrets with `vault kv put secret/gbo/tables host= port=5432 database=botserver username=gbuser password=` and similar for other paths.

**Vault sealed after restart:** Run `vault operator unseal `, repeat with key2 and key3 (3 of 5 keys from init.json), then verify with `vault status`.

**TLS certificate errors:** Confirm `/opt/gbo/conf/system/certificates/ca/ca.crt` exists in the system container. If missing, copy it from the vault container using `incus file pull vault/opt/gbo/conf/vault/ca.crt /tmp/ca.crt` then place it at the expected path.

**Vault snapshots:** Stop vault, run `sudo incus snapshot create vault backup-$(date +%Y%m%d-%H%M)`, start vault. Restore with `sudo incus snapshot restore vault ` while vault is stopped.

---

## Incus Container Network Configuration

### Static IPv4 Address Assignment

New containers may not receive an IPv4 address automatically. To assign permanent static IPs:

**Step 1: Set static IP on the container device**

```bash
# Choose an unused IP in the 10.157.134.x range
sudo incus config device set eth0 ipv4.address 10.157.134.
```

**Step 2: Configure network inside the container**

```bash
sudo incus exec -- bash -c 'cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 10.157.134.
    netmask 255.255.255.0
    gateway 10.157.134.1
    dns-nameservers 8.8.8.8 8.8.4.4
EOF'
```

**Step 3: Restart the container**

```bash
sudo incus restart
```

**Step 4: Verify IPv4 assignment**

```bash
sudo incus list -c n4
sudo incus exec -- ip addr show eth0
```

### Common Network Issues

| Problem | Symptom | Fix |
|---------|---------|-----|
| No IPv4 | Container shows empty IPV4 column | Set static IP via `incus config device set` |
| IP conflict | "IP address already defined on another NIC" | Choose different IP, check `incus list` |
| Can't reach internet | DNS fails inside container | Configure DNS in `/etc/network/interfaces` |
| IPv6 only | Has IPv6 but no IPv4 | Add static IPv4 config as above |
| DHCP not working | dhclient fails or returns 169.254.x.x | Use static IP assignment instead |

### Container IP Reference

Standard IP assignments (10.157.134.x range):

- `system`: 10.157.134.196
- `tables`: 10.157.134.174
- `vault`: 10.157.134.250
- `cache`: 10.157.134.230
- `drive`: 10.157.134.206
- `directory`: 10.157.134.240
- `llm`: 10.157.134.205
- `vectordb`: 10.157.134.210
- `models`: 10.157.134.251 (reserved)
- `dns`: 10.157.134.214
- `proxy`: 10.157.134.241
- `email`: 10.157.134.40
- `meet`: 10.157.134.220

### Creating a New Container with Static IP

```bash
# Create container
sudo incus launch images:debian/12

# Set static IP (before first boot is best)
sudo incus config device set eth0 ipv4.address 10.157.134.

# Configure networking inside container
sudo incus exec -- bash -c 'cat > /etc/network/interfaces << EOF
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
    address 10.157.134.
    netmask 255.255.255.0
    gateway 10.157.134.1
    dns-nameservers 8.8.8.8
EOF'

# Restart to apply
sudo incus restart

# Verify
sudo incus list
```

---

## Troubleshooting Quick Reference

**GLIBC mismatch (`GLIBC_2.39 not found`):** The binary was compiled on the CI runner (glibc 2.41) rather than inside the system container (glibc 2.36).
The CI workflow must SSH into the system container to build. Check `botserver.yaml` to confirm this.

**botserver won't start:** Run `sudo incus exec system -- ldd /opt/gbo/bin/botserver | grep "not found"` to check for missing libraries. Run `sudo incus exec system -- timeout 10 /opt/gbo/bin/botserver 2>&1` to see startup errors. Confirm `/opt/gbo/data/` exists and is accessible.

**botui can't reach botserver:** Check that the `ui.service` systemd file has `BOTSERVER_URL=http://localhost:5858`, not the external HTTPS URL. Fix with `sed -i 's|BOTSERVER_URL=.*|BOTSERVER_URL=http://localhost:5858|'` on the service file, then `systemctl daemon-reload` and `systemctl restart ui`.

**Suggestions not showing:** Confirm bot `.bas` files exist under `/opt/gbo/data/.gbai/.gbdialog/`. Check logs for compilation errors. Clear the AST cache in `/opt/gbo/work/` and restart botserver.

**IPv6 DNS timeouts on external APIs (Groq, Cloudflare):** The container's DNS may return AAAA records without IPv6 connectivity. The container should have `IPV6=no` in its network config and `gai.conf` set appropriately. Check for `RES_OPTIONS=inet4` in `botserver.service` if issues persist.

**Logs show development paths instead of `/opt/gbo/data/`:** Botserver is using hardcoded dev paths. Check `.env` has `DATA_DIR=/opt/gbo/data/` and `WORK_DIR=/opt/gbo/work/`, verify the systemd unit has `EnvironmentFile=/opt/gbo/bin/.env`, and confirm Vault is reachable so service discovery works. Expected startup log lines include `info watcher:Watching data directory /opt/gbo/data` and `info botserver:BotServer started successfully on port 5858`.

**Migrations not running after push:** If `stat /opt/gbo/bin/botserver` shows an old timestamp and the `__diesel_schema_migrations` table has no new entries, CI did not rebuild. Make a trivial code change (e.g., add a comment) in botserver and push again to force a rebuild.
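For the GLIBC mismatch entry above, the comparison between what a binary requires and what the container provides can be scripted. The sketch below uses hypothetical helper names and assumes GNU `sort -V` is available; `objdump` comes from binutils, which may not be installed in the system container.

```bash
# Highest GLIBC_x.y symbol version found in text on stdin
# (for example the output of `objdump -T <binary>`).
max_glibc_required() {
  grep -o 'GLIBC_[0-9][0-9.]*' | sed 's/^GLIBC_//' | sort -u -V | tail -n 1
}

# Succeeds when version $1 <= version $2 (version-aware comparison via sort -V).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Assumed usage inside the system container:
#   need=$(objdump -T /opt/gbo/bin/botserver | max_glibc_required)
#   have=$(ldd --version | head -n 1 | grep -o '[0-9][0-9.]*$')
#   version_le "$need" "$have" || echo "binary needs glibc $need, container has $have"
```

If the final line prints a message, the binary was built against a newer glibc than the container ships and needs to be rebuilt in a matching environment.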
---

## Drive (MinIO) File Operations Cheatsheet

All `mc` commands run inside the `drive` container with `PATH` set: `sudo incus exec drive -- bash -c 'export PATH=/opt/gbo/bin:$PATH && mc '`. If `local` alias is missing, create it with credentials from Vault path `gbo/drive`.

**List bucket contents recursively:** `mc ls local/.gbai/ --recursive`

**Read a file from Drive:** `mc cat local/.gbai/.gbdialog/start.bas`

**Download a file:** `mc cp local/.gbai/.gbdialog/start.bas /tmp/start.bas`

**Upload a file to Drive (triggers DriveMonitor recompile):** Transfer file to host via `scp`, push into drive container with `sudo incus file push /tmp/file drive/tmp/file`, then `mc put /tmp/file local/.gbai/.gbdialog/start.bas`

**Full upload workflow example (updating config.csv):**

```bash
# 1. Download current config from Drive
ssh user@host "sudo incus exec drive -- bash -c 'export PATH=/opt/gbo/bin:\$PATH && mc cat local/salesianos.gbai/salesianos.gbot/config.csv'" > /tmp/config.csv

# 2. Edit locally (change model, keys, etc.)
sed -i 's/llm-model,old-model/llm-model,new-model/' /tmp/config.csv

# 3. Push edited file back to Drive
scp /tmp/config.csv user@host:/tmp/config.csv
ssh user@host "sudo incus file push /tmp/config.csv drive/tmp/config.csv"
ssh user@host "sudo incus exec drive -- bash -c 'export PATH=/opt/gbo/bin:\$PATH && mc put /tmp/config.csv local/salesianos.gbai/salesianos.gbot/config.csv'"

# 4. Wait ~15 seconds, then verify DriveMonitor picked up the change
ssh user@host "sudo incus exec system -- bash -c 'grep -i \"Model:\" /opt/gbo/logs/err.log | tail -3'"
```

**Force re-sync of config.csv** (change ETag without content change): `mc cp local/.gbai/.gbot/config.csv local/.gbai/.gbot/config.csv`

**Create a new bot bucket:** `mc mb local/newbot.gbai`

**Check MinIO health:** `sudo incus exec drive -- bash -c '/opt/gbo/bin/mc admin info local'`

---

## Logging Quick Reference

**Application logs** (searchable, timestamped, most useful): `sudo incus exec system -- tail -f /opt/gbo/logs/err.log` (errors and debug) or `/opt/gbo/logs/out.log` (stdout). The systemd journal only captures process lifecycle events, not application output.

**Search logs for specific bot activity:** `grep -i "salesianos\|llm\|Model:\|KB\|USE_KB\|drive_monitor" /opt/gbo/logs/err.log | tail -30`

**Check which LLM model a bot is using:** `grep "Model:" /opt/gbo/logs/err.log | tail -5`

**Check DriveMonitor config sync:** `grep "check_gbot\|config.csv\|should_sync" /opt/gbo/logs/err.log | tail -20`

**Check KB/vector operations:** `grep -i "gbkb\|qdrant\|embedding\|index" /opt/gbo/logs/err.log | tail -20`

**Live tail with filter:** `sudo incus exec system -- bash -c 'tail -f /opt/gbo/logs/err.log | grep --line-buffered -i "salesianos\|error\|KB"'`

---

## Program Access Cheatsheet

| Program | Container | Path | Notes |
|---------|-----------|------|-------|
| botserver | system | `/opt/gbo/bin/botserver` | Run via systemctl only |
| botui | system | `/opt/gbo/bin/botui` | Run via systemctl only |
| mc (MinIO Client) | drive | `/opt/gbo/bin/mc` | Must set `PATH=/opt/gbo/bin:$PATH` |
| psql | tables | `/usr/bin/psql` | `psql -h localhost -U postgres -d botserver` |
| vault | vault | `/opt/gbo/bin/vault` | Needs `VAULT_ADDR`, `VAULT_TOKEN`, `VAULT_CACERT` |
| zitadel | directory | `/opt/gbo/bin/zitadel` | Runs as root on port 8080 internally |

**Quick psql query (bot config):** `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT config_key, config_value FROM bot_configuration WHERE bot_id = (SELECT id FROM bots WHERE name = 'salesianos') ORDER BY config_key;"`

**Quick psql query (active KBs for session):** `sudo incus exec tables -- psql -h localhost -U postgres -d botserver -c "SELECT * FROM session_kb_associations WHERE session_id = '' AND is_active = true;"`

---

## BASIC Compilation Architecture

Compilation and runtime are now strictly separated.

**Compilation** happens only in `BasicCompiler` inside DriveMonitor when it detects `.bas` file changes. The output is a fully preprocessed `.ast` file written to `work/.gbai/.gbdialog/.ast`.

**Runtime** (start.bas, TOOL_EXEC, automation, schedule) loads only `.ast` files and calls `ScriptService::run()`, which does `engine.compile() + eval_ast_with_scope()` on the already-preprocessed Rhai source, with no preprocessing at runtime.

The `.ast` file has all transforms applied: `USE KB "cartas"` becomes `USE_KB("cartas")`, `IF/END IF` → `if/{ }`, `WHILE/WEND` → `while/{ }`, `BEGIN TALK/END TALK` → function calls, `SAVE`, `FOR EACH/NEXT`, `SELECT CASE`, `SET SCHEDULE`, `WEBHOOK`, `USE WEBSITE`, `LLM` keyword expansion, variable predeclaration, and keyword lowercasing.

Runtime never calls `compile()`, `compile_tool_script()`, or `compile_preprocessed()`; those methods no longer exist.

**Tools (TOOL_EXEC) load `.ast` only:** there is no `.bas` fallback. If an `.ast` file is missing, the tool fails with "Failed to read tool .ast file". DriveMonitor must have compiled it first.

**Suggestion deduplication** uses Redis `SADD` (set) instead of `RPUSH` (list). This prevents duplicate suggestion buttons when `start.bas` runs multiple times per session. The key format is `suggestions:{bot_id}:{session_id}` and `get_suggestions` uses `SMEMBERS` to read it.
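Because tools fail when their `.ast` file is missing, it can help to list scripts that DriveMonitor has not compiled yet. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the work directory mirrors the data directory layout with `.bas` replaced by `.ast`, which matches the paths quoted above but should be verified against a real deployment.

```bash
# Map a .bas path under the data directory to the .ast path expected under
# the work directory. The mirrored layout is an assumption (see lead-in).
# $1 = data dir, $2 = work dir, $3 = path to a .bas file under the data dir
ast_path_for() (
  base="${1%/}"
  rel="${3#"$base"/}"
  printf '%s/%s.ast\n' "${2%/}" "${rel%.bas}"
)

# Print every .bas script under the data dir that has no compiled .ast yet.
missing_ast() {
  find "${1%/}" -type f -name '*.bas' | while IFS= read -r bas; do
    ast="$(ast_path_for "$1" "$2" "$bas")"
    [ -f "$ast" ] || printf '%s\n' "$bas"
  done
}
```

Inside the `system` container, `missing_ast /opt/gbo/data /opt/gbo/work` would print the scripts still waiting for compilation; an empty result suggests every script has a matching `.ast`.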