diff --git a/prompts/prod.md b/prompts/prod.md
new file mode 100644
index 0000000..d7a7c37
--- /dev/null
+++ b/prompts/prod.md
@@ -0,0 +1,259 @@
+# Production Environment Guide
+
+## Infrastructure
+
+### Servers
+
+| Host | IP | Purpose |
+|------|-----|---------|
+| `system` | `10.157.134.196` | Main botserver + botui container |
+| `alm-ci` | `10.157.134.200` | CI/CD runner (Forgejo Actions) |
+| `alm` | `10.157.134.34` | Forgejo git server |
+| `dns` | `10.157.134.214` | DNS container |
+| `drive` | `10.157.134.206` | Drive storage |
+| `email` | `10.157.134.40` | Email service |
+| `proxy` | `10.157.134.241` | Reverse proxy |
+| `tables` | `10.157.134.174` | PostgreSQL |
+| `table-editor` | `10.157.134.184` | Table editor |
+| `webmail` | `10.157.134.86` | Webmail |
+
+### Port Mapping (system container)
+
+| Service | Internal Port | External URL |
+|---------|--------------|--------------|
+| botserver | `5858` | `https://system.pragmatismo.com.br` |
+| botui | `5859` | `https://chat.pragmatismo.com.br` |
+
+### Access
+
+```bash
+# SSH to host
+ssh administrator@63.141.255.9
+
+# Execute inside system container
+sudo incus exec system -- bash -c 'command'
+
+# SSH from host to container (used by CI)
+ssh -o StrictHostKeyChecking=no system "command"
+```
+
+## Services
+
+### botserver.service
+
+- **Binary**: `/opt/gbo/bin/botserver`
+- **Port**: `5858`
+- **User**: `gbuser`
+- **Logs**: `/opt/gbo/logs/out.log`, `/opt/gbo/logs/err.log`
+- **Config**: `/etc/systemd/system/botserver.service`
+- **Env**: `PORT=5858`
+
+### ui.service
+
+- **Binary**: `/opt/gbo/bin/botui`
+- **Port**: `5859`
+- **Config**: `/etc/systemd/system/ui.service`
+- **Env**: `BOTSERVER_URL=http://localhost:5858`
+  - ⚠️ MUST be `http://localhost:5858` — NOT `https://system.pragmatismo.com.br`
+  - Rust proxy runs server-side, needs direct localhost access
+  - JS client uses relative URLs through `chat.pragmatismo.com.br`
+
+### Data Directory
+
+- **Path**: `/opt/gbo/data/`
+- **Structure**: `.gbai/.gbdialog/*.bas`
+- **Bots**: cristo, fema, jucees, oerlabs, poupatempo, pragmatismogb, salesianos, sentient, seplagse
+- **Work dir**: `/opt/gbo/work/` (compiled .ast cache)
+
+### Stack Services (managed by botserver bootstrap)
+
+- **Vault**: Secrets management
+- **PostgreSQL**: Database (port 5432)
+- **Valkey**: Cache (port 6379, password auth)
+- **MinIO**: Object storage
+- **Zitadel**: Identity provider
+- **LLM**: llama.cpp
+
+## CI/CD Pipeline
+
+### Repositories
+
+| Repo | ALM URL | GitHub URL |
+|------|---------|------------|
+| gb | `https://alm.pragmatismo.com.br/GeneralBots/gb.git` | `git@github.com:GeneralBots/gb.git` |
+| botserver | `https://alm.pragmatismo.com.br/GeneralBots/BotServer.git` | `git@github.com:GeneralBots/botserver.git` |
+| botui | `https://alm.pragmatismo.com.br/GeneralBots/BotUI.git` | `git@github.com:GeneralBots/botui.git` |
+| botlib | `https://alm.pragmatismo.com.br/GeneralBots/botlib.git` | `git@github.com:GeneralBots/botlib.git` |
+
+### Push Order
+
+```bash
+# 1. Push submodules first
+cd botserver && git push alm main && git push origin main && cd ..
+cd botui && git push alm main && git push origin main && cd ..
+
+# 2. Update root workspace references
+git add botserver botui botlib
+git commit -m "Update submodules: "
+git push alm main && git push origin main
+```
+
+### Build Environment
+
+- **CI runner**: `alm-ci` container (Debian Trixie, glibc 2.41)
+- **Target**: `system` container (Debian 12 Bookworm, glibc 2.36)
+- **⚠️ GLIBC MISMATCH**: Building on CI runner produces binaries incompatible with system container
+- **Solution**: CI workflow transfers source to system container and builds there via SSH
+
+### Workflow File
+
+- **Location**: `botserver/.forgejo/workflows/botserver.yaml`
+- **Triggers**: Push to `main` branch
+- **Steps**:
+  1. Setup workspace on CI runner (clone repos)
+  2. Transfer source to system container via `tar | ssh`
+  3. Build inside system container (matches glibc 2.36)
+  4. Deploy binary inside container
+  5. Verify botserver is running
+
+## Common Operations
+
+### Check Service Status
+
+```bash
+# From host
+sudo incus exec system -- systemctl status botserver --no-pager
+sudo incus exec system -- systemctl status ui --no-pager
+
+# Check if running
+sudo incus exec system -- pgrep -f botserver
+sudo incus exec system -- pgrep -f botui
+```
+
+### View Logs
+
+```bash
+# Systemd journal
+sudo incus exec system -- journalctl -u botserver --no-pager -n 50
+sudo incus exec system -- journalctl -u ui --no-pager -n 50
+
+# Application logs
+sudo incus exec system -- tail -50 /opt/gbo/logs/out.log
+sudo incus exec system -- tail -50 /opt/gbo/logs/err.log
+
+# Live tail
+sudo incus exec system -- tail -f /opt/gbo/logs/out.log
+```
+
+### Restart Services
+
+```bash
+sudo incus exec system -- systemctl restart botserver
+sudo incus exec system -- systemctl restart ui
+```
+
+### Manual Deploy (emergency)
+
+```bash
+# Kill old process
+sudo incus exec system -- killall botserver
+
+# Copy binary (from host CI workspace or local)
+sudo incus exec system -- cp /opt/gbo/ci/botserver/target/debug/botserver /opt/gbo/bin/botserver
+sudo incus exec system -- chmod +x /opt/gbo/bin/botserver
+sudo incus exec system -- chown gbuser:gbuser /opt/gbo/bin/botserver
+
+# Start service
+sudo incus exec system -- systemctl start botserver
+```
+
+### Transfer Bot Files to Production
+
+```bash
+# From local to prod host
+tar czf /tmp/bots.tar.gz -C /opt/gbo/data .gbai
+scp /tmp/bots.tar.gz administrator@63.141.255.9:/tmp/
+
+# From host to container
+sudo incus exec system -- bash -c 'tar xzf /tmp/bots.tar.gz -C /opt/gbo/data/'
+
+# Clear compiled cache
+sudo incus exec system -- find /opt/gbo/data -name "*.ast" -delete
+sudo incus exec system -- find /opt/gbo/work -name "*.ast" -delete
+```
+
+### Snapshots
+
+```bash
+# List snapshots
+sudo incus snapshot list system
+
+# Restore snapshot
+sudo incus snapshot restore system
+```
+
+## Troubleshooting
+
+### GLIBC Version Mismatch
+
+**Symptom**: `GLIBC_2.39 not found` or `GLIBC_2.38 not found`
+
+**Cause**: Binary compiled on CI runner (glibc 2.41) but runs in system container (glibc 2.36)
+
+**Fix**: CI workflow must build inside the system container. Check `botserver.yaml` uses SSH to build in container.
+
+### botserver Not Starting
+
+```bash
+# Check binary
+sudo incus exec system -- ldd /opt/gbo/bin/botserver | grep "not found"
+
+# Check direct execution
+sudo incus exec system -- timeout 10 /opt/gbo/bin/botserver 2>&1
+
+# Check data directory
+sudo incus exec system -- ls -la /opt/gbo/data/
+```
+
+### botui Can't Reach botserver
+
+```bash
+# Check BOTSERVER_URL
+sudo incus exec system -- grep BOTSERVER_URL /etc/systemd/system/ui.service
+
+# Must be http://localhost:5858, NOT https://system.pragmatismo.com.br
+# Fix:
+sudo incus exec system -- sed -i 's|BOTSERVER_URL=.*|BOTSERVER_URL=http://localhost:5858|' /etc/systemd/system/ui.service
+sudo incus exec system -- systemctl daemon-reload
+sudo incus exec system -- systemctl restart ui
+```
+
+### Suggestions Not Showing
+
+```bash
+# Check bot files exist
+sudo incus exec system -- ls -la /opt/gbo/data/.gbai/.gbdialog/
+
+# Check for compilation errors
+sudo incus exec system -- tail -50 /opt/gbo/logs/out.log | grep -i "error\|fail\|compile"
+
+# Clear cache and restart
+sudo incus exec system -- find /opt/gbo/work -name "*.ast" -delete
+sudo incus exec system -- systemctl restart botserver
+```
+
+### IPv6 DNS Issues
+
+**Symptom**: External API calls (Groq, Cloudflare) timeout
+
+**Cause**: Container DNS returns AAAA records but no IPv6 connectivity
+
+**Fix**: Container has `IPV6=no` in network config and `gai.conf` labels. If issues persist, check `RES_OPTIONS=inet4` in botserver.service.
+
+## Security
+
+- **NEVER** push secrets to git
+- **NEVER** commit files to root with credentials
+- **Vault** is single source of truth for secrets
+- **CI/CD** is the only deployment method — never manually scp binaries
+- **ALM** is production — ask before pushing
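
Review note on the GLIBC mismatch described under Build Environment and Troubleshooting: a pre-deploy check could catch an incompatible binary before it reaches `/opt/gbo/bin/`. The sketch below is only an illustration and is not part of the repository. The function names `max_glibc_required` and `glibc_compatible` are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils or BusyBox).

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not taken from the repo or from botserver.yaml.

# Read text on stdin (for example the output of `objdump -T <binary>`) and
# print the highest GLIBC_x.y symbol version mentioned in it.
max_glibc_required() {
  grep -oE 'GLIBC_[0-9]+(\.[0-9]+)+' \
    | sed 's/^GLIBC_//' \
    | sort -V \
    | tail -n 1
}

# Return 0 when a binary requiring glibc version $1 can run on a system
# that provides glibc version $2, and non-zero otherwise.
glibc_compatible() {
  local required="$1" available="$2" highest
  # sort -V orders version strings; the requirement is met when the
  # available version is the highest (or equal) of the two.
  highest="$(printf '%s\n%s\n' "$required" "$available" | sort -V | tail -n 1)"
  [ "$highest" = "$available" ]
}

# Possible usage on the host, assuming objdump (binutils) is installed
# inside the system container, which the guide does not state:
#   required="$(sudo incus exec system -- objdump -T /opt/gbo/bin/botserver | max_glibc_required)"
#   glibc_compatible "$required" 2.36 || echo "binary needs glibc $required"
```

With the versions quoted in the guide, a binary that needs `GLIBC_2.39` fails the check against the system container's glibc 2.36, which matches the documented symptom.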