diff --git a/AGENTS-PROD.md b/AGENTS-PROD.md index b61980e..709462d 100644 --- a/AGENTS-PROD.md +++ b/AGENTS-PROD.md @@ -1,35 +1,391 @@ # General Bots Cloud — Production Operations Guide ## Infrastructure Overview -- **Host OS:** Ubuntu 24.04 LTS, LXD (snap) -- **SSH:** Key auth only, sudoer user in `lxd` group -- **Container engine:** LXD with ZFS storage pool +- **Host OS:** Ubuntu 24.04 LTS, Incus +- **SSH:** Key auth only +- **Container engine:** Incus with ZFS storage pool +- **Tenant:** pragmatismo (migrated from LXD 82.29.59.188 to Incus 63.141.255.9) -## LXC Container Architecture +--- + +## Container Migration: pragmatismo (COMPLETED) + +### Summary +| Item | Detail | +|------|--------| +| Source | LXD 5.21 on Ubuntu 22.04 @ 82.29.59.188 | +| Destination | Incus 6.x on Ubuntu 24.04 @ 63.141.255.9 | +| Migration method | `incus copy --instance-only lxd-source:` | +| Data transfer | rsync via SSH (pull from destination → source:/opt/gbo) | +| Total downtime | ~4 hours | +| Containers migrated | 10 | +| Data transferred | ~44 GB | + +### Migrated Containers (destination names) +``` +proxy → proxy (Caddy reverse proxy) +tables → tables (PostgreSQL) +system → system (botserver + botui, privileged) +drive → drive (MinIO S3) +dns → dns (CoreDNS) +email → email (Stalwart mail) +webmail → webmail (Roundcube) +alm → alm (Forgejo ALM) +alm-ci → alm-ci (Forgejo CI runner) +table-editor → table-editor (NocoDB) +``` + +### Data Paths +- **Source data:** `root@82.29.59.188:/opt/gbo/` (44 GB, tenant data + binaries) +- **Destination data:** `/home/administrator/gbo/tenants/pragmatismo/` (rsync in progress) +- **Final path:** `/opt/gbo/tenants/pragmatismo/` (symlink or mount) + +### Key Decisions Made +1. **No `pragmatismo-` prefix** on destination (unlike source) +2. **iptables NAT** instead of Incus proxy devices (proxy devices conflicted with NAT rules) +3. **Incus proxy devices removed** from all containers after NAT configured +4. 
**Disk devices removed** from source containers before migration (Incus can't resolve LXD paths) + +### Port Forwarding (iptables NAT) +| Port | Service | +|------|---------| +| 80, 443 | Caddy (HTTP/HTTPS) | +| 25, 465, 587 | SMTP | +| 993, 995, 143, 110, 4190 | IMAP/POP/Sieve | +| 53 | DNS | + +### Remaining Post-Migration Tasks +- [x] **rsync transfer:** Source /opt/gbo → destination ~/gbo ✓ +- [x] **Merge data:** rsync to /opt/gbo/tenants/pragmatismo/ ✓ +- [x] **Configure NAT:** iptables PREROUTING rules ✓ +- [x] **Update Caddy:** Replace old IPs with new 10.107.115.x IPs ✓ +- [x] **Copy data to containers:** tar.gz method for proxy, tables, email, webmail, alm-ci, table-editor ✓ +- [x] **Fix directory structure:** system, dns, alm ✓ +- [x] **Caddy installed and running** ✓ +- [ ] **SSL certificates:** Let's Encrypt rate limited - need to wait or use existing certs +- [ ] **botserver binary missing** in system container +- [ ] **DNS cutover:** Update NS/A records to point to 63.141.255.9 +- [ ] **Source cleanup:** Delete /opt/gbo/ on source after verification + +### Current Container Status (2026-03-22 17:50 UTC) +| Container | /opt/gbo/ contents | Status | +|-----------|---------------------|--------| +| proxy | conf, data, logs, Caddy running | ✓ OK (SSL pending) | +| tables | conf, data, logs, pgconf, pgdata | ✓ OK | +| email | conf, data, logs | ✓ OK | +| webmail | conf, data, logs | ✓ OK | +| alm-ci | conf, data, logs | ✓ OK | +| table-editor | conf, data, logs | ✓ OK | +| system | bin, botserver-stack, conf, data, logs | ✓ OK | +| drive | data, logs | ✓ OK | +| dns | bin, conf, data, logs | ✓ OK | +| alm | alm/, conf, data, logs | ✓ OK | + +### Known Issues +1. **Let's Encrypt rate limiting** - Too many cert requests from old server. Certificates will auto-renew after rate limit clears (~1 hour) +2. **botserver database connection** - PostgreSQL is in tables container (10.107.115.33), need to update DATABASE_URL in system container +3. 
**SSL certificates** - Caddy will retry obtaining certs after rate limit clears + +### Final Status (2026-03-22 18:30 UTC) + +#### Container Services Status +| Container | Service | Port | Status | +|-----------|---------|------|--------| +| system | Vault | 8200 | ✓ Running | +| system | Valkey | 6379 | ✓ Running | +| system | MinIO | 9100 | ✓ Running | +| system | Qdrant | 6333 | ✓ Running | +| system | botserver | - | ⚠️ Not listening | +| tables | PostgreSQL | 5432 | ✓ Running | +| proxy | Caddy | 80, 443 | ✓ Running | +| dns | CoreDNS | 53 | ❌ Not running | +| email | Stalwart | 25,143,465,993,995 | ❌ Not running | +| webmail | Roundcube | - | ❌ Not running | +| alm | Forgejo | 3000 | ❌ Not running | +| alm-ci | Forgejo-runner | - | ❌ Not running | +| table-editor | NocoDB | - | ❌ Not running | +| drive | MinIO | - | ❌ (in system container) | + +#### Issues Found +1. **botserver not listening** - needs DATABASE_URL pointing to tables container +2. **dns, email, webmail, alm, alm-ci, table-editor** - services not started +3. **SSL certificates** - Let's Encrypt rate limited + +### Data Structure + +**Host path:** `/opt/gbo/tenants/pragmatismo//` +**Container path:** `/opt/gbo/` (conf, data, logs, bin, etc.) 
+ +| Container | Host Path | Container /opt/gbo/ | +|-----------|-----------|---------------------| +| system | `.../system/` | bin, botserver-stack, conf, data, logs | +| proxy | `.../proxy/` | conf, data, logs | +| tables | `.../tables/` | conf, data, logs | +| drive | `.../drive/` | data, logs | +| dns | `.../dns/` | bin, conf, data, logs | +| email | `.../email/` | conf, data, logs | +| webmail | `.../webmail/` | conf, data, logs | +| alm | `.../alm/` | conf, data, logs | +| alm-ci | `.../alm-ci/` | conf, data, logs | +| table-editor | `.../table-editor/` | conf, data, logs | + +### Attach Data Devices (after moving data) +```bash +# Move data to final location +ssh administrator@63.141.255.9 "sudo mv /home/administrator/gbo /opt/gbo/tenants/pragmatismo" + +# Attach per-container disk device +for container in system proxy tables drive dns email webmail alm alm-ci table-editor; do + incus config device add $container gbo disk \ + source=/opt/gbo/tenants/pragmatismo/$container \ + path=/opt/gbo +done + +# Fix permissions (each container) +for container in system proxy tables drive dns email webmail alm alm-ci table-editor; do + incus exec $container -- chown -R gbuser:gbuser /opt/gbo/ 2>/dev/null || \ + incus exec $container -- chown -R root:root /opt/gbo/ +done +``` + +### Container IPs (for Caddy configuration) +``` +system: 10.107.115.229 +proxy: 10.107.115.189 +tables: 10.107.115.33 +drive: 10.107.115.114 +dns: 10.107.115.155 +email: 10.107.115.200 +webmail: 10.107.115.208 +alm: 10.107.115.4 +alm-ci: 10.107.115.190 +table-editor: (no IP - start container) +``` + +--- + +## LXC Container Architecture (destination) | Container | Purpose | Exposed Ports | |---|---|---| -| `-proxy` | Caddy reverse proxy | 80, 443 | -| `-system` | botserver + botui (privileged!) 
| internal only | -| `-alm` | Forgejo (ALM/Git) | internal only | -| `-alm-ci` | Forgejo CI runner | none | -| `-email` | Stalwart mail server | 25,465,587,993,995,143,110 | -| `-dns` | CoreDNS | 53 | -| `-drive` | MinIO S3 | internal only | -| `-tables` | PostgreSQL | internal only | -| `-table-editor` | NocoDB | internal only | -| `-webmail` | Roundcube | internal only | +| `proxy` | Caddy reverse proxy | 80, 443 | +| `system` | botserver + botui (privileged!) | internal only | +| `alm` | Forgejo (ALM/Git) | internal only | +| `alm-ci` | Forgejo CI runner | none | +| `email` | Stalwart mail server | 25,465,587,993,995,143,110 | +| `dns` | CoreDNS | 53 | +| `drive` | MinIO S3 | internal only | +| `tables` | PostgreSQL | internal only | +| `table-editor` | NocoDB | internal only | +| `webmail` | Roundcube | internal only | ## Key Rules -- `-system` must be **privileged** (`security.privileged: true`) — required for botserver to own `/opt/gbo/` mounts -- All containers use LXD **proxy devices** for port forwarding (network forwards don't work when external IP is on host NIC, not bridge) -- Never remove proxy devices for ports: 80, 443, 25, 465, 587, 993, 995, 143, 110, 4190, 53 -- CI runner (`alm-ci`) must NOT have cross-container disk device mounts — deploy via SSH instead +- `system` must be **privileged** (`security.privileged: true`) — required for botserver to own `/opt/gbo/` mounts +- All containers use **iptables NAT** for port forwarding — NEVER use Incus proxy devices (they conflict with NAT) +- **Data copied into each container** at `/opt/gbo/` — NOT disk devices. Each container has its own copy of data. 
+- CI runner (`alm-ci`) must NOT have cross-container disk device mounts — deploy via SSH only +- Caddy config must have correct upstream IPs for each backend container + +## Container Migration (LXD to Incus) — COMPLETED + +### Migration Workflow (for future tenants) + +**Best Method:** `incus copy --instance-only` — transfers containers directly between LXD and Incus. + +#### Prerequisites +```bash +# 1. Open port 8443 on both servers +ssh root@ "iptables -I INPUT -p tcp --dport 8443 -j ACCEPT" +ssh administrator@ "sudo iptables -I INPUT -p tcp --dport 8443 -j ACCEPT" + +# 2. Exchange SSH keys (for rsync data transfer) +ssh administrator@ "cat ~/.ssh/id_rsa.pub" +ssh root@ "echo '' >> /root/.ssh/authorized_keys" + +# 3. Add source LXD as Incus remote +ssh administrator@ "incus remote add lxd-source --protocol=incus --accept-certificate" + +# 4. Add destination cert to source LXD trust +ssh @ "cat ~/.config/incus/client.crt" +ssh root@ "lxc config trust add -" +``` + +#### Migration Steps +```bash +# 1. On SOURCE: Remove disk devices (Incus won't have source paths) +for c in $(lxc list --format csv -c n); do + lxc stop $c + for d in $(lxc config device list $c); do + lxc config device remove $c $d + done +done + +# 2. On DESTINATION: Copy each container +incus copy --instance-only lxd-source: +incus start + +# 3. On DESTINATION: Add eth0 network to each container +incus config device add eth0 nic name=eth0 network=incusbr0 + +# 4. On DESTINATION: Configure iptables NAT (not proxy devices!) +# See iptables NAT Setup above + +# 5. On DESTINATION: Pull data via rsync (from destination to source) +ssh administrator@ "rsync -avz --progress root@:/opt/gbo/ /home/administrator/gbo/" + +# 6. On DESTINATION: Organize data per container +# Data is structured as: /home/administrator/gbo// +# Each container gets its own folder with {conf,data,logs,bin}/ + +# 7. 
On DESTINATION: Move to final location +ssh administrator@ "sudo mkdir -p /opt/gbo/tenants/" +ssh administrator@ "sudo mv /home/administrator/gbo /opt/gbo/tenants//" + +# 8. On DESTINATION: Copy data into each container +for container in system proxy tables drive dns email webmail alm alm-ci table-editor; do + incus exec $container -- mkdir -p /opt/gbo + incus file push --recursive /opt/gbo/tenants//$container/. $container/opt/gbo/ +done + +# 9. On DESTINATION: Fix permissions +for container in system proxy tables drive dns email webmail alm alm-ci table-editor; do + incus exec $container -- chown -R gbuser:gbuser /opt/gbo/ 2>/dev/null || \ + incus exec $container -- chown -R root:root /opt/gbo/ +done + +# 10. On DESTINATION: Update Caddy config with new container IPs +# sed -i 's/10.16.164.x/10.107.115.x/g' /opt/gbo/conf/config +incus file push /tmp/new_caddy_config proxy/opt/gbo/conf/config + +# 11. Reload Caddy +incus exec proxy -- /opt/gbo/bin/caddy reload --config /opt/gbo/conf/config --adapter caddyfile +``` + +#### iptables NAT Setup (on destination host) +```bash +# Enable IP forwarding +sudo sysctl -w net.ipv4.ip_forward=1 + +# NAT rules — proxy container (ports 80, 443) +sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.107.115.189:80 +sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j DNAT --to-destination 10.107.115.189:443 + +# NAT rules — email container (SMTP/IMAP) +sudo iptables -t nat -A PREROUTING -p tcp --dport 25 -j DNAT --to-destination 10.107.115.200:25 +sudo iptables -t nat -A PREROUTING -p tcp --dport 465 -j DNAT --to-destination 10.107.115.200:465 +sudo iptables -t nat -A PREROUTING -p tcp --dport 587 -j DNAT --to-destination 10.107.115.200:587 +sudo iptables -t nat -A PREROUTING -p tcp --dport 993 -j DNAT --to-destination 10.107.115.200:993 +sudo iptables -t nat -A PREROUTING -p tcp --dport 995 -j DNAT --to-destination 10.107.115.200:995 +sudo iptables -t nat -A PREROUTING -p tcp --dport 143 -j DNAT 
--to-destination 10.107.115.200:143 +sudo iptables -t nat -A PREROUTING -p tcp --dport 110 -j DNAT --to-destination 10.107.115.200:110 +sudo iptables -t nat -A PREROUTING -p tcp --dport 4190 -j DNAT --to-destination 10.107.115.200:4190 + +# NAT rules — dns container (DNS) +sudo iptables -t nat -A PREROUTING -p udp --dport 53 -j DNAT --to-destination 10.107.115.155:53 +sudo iptables -t nat -A PREROUTING -p tcp --dport 53 -j DNAT --to-destination 10.107.115.155:53 + +# Masquerade outgoing traffic +sudo iptables -t nat -A POSTROUTING -s 10.107.115.0/24 -j MASQUERADE + +# Save rules +sudo netfilter-persistent save +``` + +#### Remove Incus Proxy Devices (after NAT is working) +```bash +for c in $(incus list --format csv -c n); do + for d in $(incus config device list $c | grep proxy); do + incus config device remove $c $d + done +done +``` + +#### pragmatismo Migration Notes +- Source server: `root@82.29.59.188` (LXD 5.21, Ubuntu 22.04) +- Destination: `administrator@63.141.255.9` (Incus 6.x, Ubuntu 24.04) +- Container naming: No prefix on destination (`proxy` not `pragmatismo-proxy`) +- Data: rsync pull from destination (not push from source) ## Firewall (host) + +### ⚠️ CRITICAL: NEVER Block SSH Port 22 +**When installing ANY firewall (UFW, iptables, etc.), ALWAYS allow SSH (port 22) FIRST, before enabling the firewall.** + +**Wrong order (will lock you out!):** +```bash +ufw enable # BLOCKS SSH! +``` + +**Correct order:** +```bash +ufw allow 22/tcp # FIRST: Allow SSH +ufw allow 80/tcp # Allow HTTP +ufw allow 443/tcp # Allow HTTPS +ufw enable # THEN enable firewall +``` + +### Firewall Setup Steps +1. **Always allow SSH before enabling firewall:** + ```bash + sudo ufw allow 22/tcp + ``` + +2. **Install UFW:** + ```bash + sudo apt-get install -y ufw + ``` + +3. **Configure UFW with SSH allowed:** + ```bash + sudo ufw default forward ACCEPT + sudo ufw allow 22/tcp + sudo ufw allow 80/tcp + sudo ufw allow 443/tcp + sudo ufw enable + ``` + +4. 
**Persist iptables rules for NAT (containers):** + Create `/etc/systemd/system/iptables-restore.service`: + ```ini + [Unit] + Description=Restore iptables rules on boot + After=network-pre.target + Before=network.target + DefaultDependencies=no + + [Service] + Type=oneshot + ExecStart=/bin/bash -c "/sbin/iptables-restore < /etc/iptables/rules.v4" + RemainAfterExit=yes + + [Install] + WantedBy=multi-user.target + ``` + + Save rules and enable: + ```bash + sudo sh -c 'iptables-save > /etc/iptables/rules.v4' # redirect must run as root, so wrap it in sh -c + sudo systemctl enable iptables-restore.service + ``` + +5. **Install fail2ban:** + ```bash + # Download fail2ban deb from http://ftp.us.debian.org/debian/pool/main/f/fail2ban/ + sudo dpkg -i fail2ban_*.deb + sudo touch /var/log/auth.log + sudo systemctl enable fail2ban + sudo systemctl start fail2ban + ``` + +6. **Verify fail2ban SSH jail:** + ```bash + sudo fail2ban-client status # Should show sshd jail + ``` + +### Requirements - **ufw** with `DEFAULT_FORWARD_POLICY=ACCEPT` (needed for container internet) -- LXD forward rule must persist via systemd service - **fail2ban** on host (SSH jail) and in email container (mail jail) +- iptables NAT rules must persist via systemd service --- @@ -44,17 +400,17 @@ **Fix:** ```bash # Insert loopback ACCEPT at top of INPUT chain -lxc exec -system -- iptables -I INPUT 1 -i lo -j ACCEPT +incus exec system -- iptables -I INPUT 1 -i lo -j ACCEPT # Persist the rule -lxc exec -system -- bash -c 'iptables-save > /etc/iptables/rules.v4' +incus exec system -- bash -c 'iptables-save > /etc/iptables/rules.v4' # Verify Valkey responds -lxc exec -system -- /opt/gbo/bin/botserver-stack/bin/cache/bin/valkey-cli ping +incus exec system -- /opt/gbo/bin/botserver-stack/bin/cache/bin/valkey-cli ping # Should return: PONG # Restart botserver to pick up working cache -lxc exec -system -- systemctl restart system.service ui.service +incus exec system -- systemctl restart system.service ui.service ``` **Prevention:** Always ensure loopback ACCEPT 
rule is at the top of iptables INPUT chain before any DROP rules. @@ -66,13 +422,13 @@ lxc exec -system -- systemctl restart system.service ui.service **Diagnosis:** ```bash # Get bot ID -lxc exec -system -- /opt/gbo/bin/botserver-stack/bin/tables/bin/psql -h localhost -U gbuser -d botserver -t -c "SELECT id, name FROM bots WHERE name = 'botname';" +incus exec system -- /opt/gbo/bin/botserver-stack/bin/tables/bin/psql -h localhost -U gbuser -d botserver -t -c "SELECT id, name FROM bots WHERE name = 'botname';" # Check if suggestions exist in cache with correct bot_id -lxc exec -system -- /opt/gbo/bin/botserver-stack/bin/cache/bin/valkey-cli --scan --pattern "suggestions::*" +incus exec system -- /opt/gbo/bin/botserver-stack/bin/cache/bin/valkey-cli --scan --pattern "suggestions::*" # If no keys found, check logs for wrong bot_id being used -lxc exec -system -- grep "Adding suggestion to Redis key" /opt/gbo/logs/error.log | tail -5 +incus exec system -- grep "Adding suggestion to Redis key" /opt/gbo/logs/error.log | tail -5 ``` **Fix:** This was a code bug where suggestions were stored with `user_id` instead of `bot_id`. After deploying the fix: @@ -110,6 +466,15 @@ Navigate to: https://chat.pragmatismo.com.br/ # - No errors in browser console ``` +**On destination (Incus):** +```bash +# Verify botserver binary +incus exec system -- stat /opt/gbo/bin/botserver | grep Modify + +# Restart services +incus exec system -- systemctl restart system.service ui.service +``` + --- ## ⚠️ Caddy Config — CRITICAL RULES @@ -148,9 +513,20 @@ The full config has ~25 vhosts. 
If you only see 1-2 vhosts, you are looking at a ### Caddy in Proxy Container - Binary: `/usr/bin/caddy` (system container) or `caddy` in PATH - Config: `/opt/gbo/conf/config` -- Reload: `lxc exec -proxy -- caddy reload --config /opt/gbo/conf/config --adapter caddyfile` +- Reload: `incus exec proxy -- caddy reload --config /opt/gbo/conf/config --adapter caddyfile` - Storage: `/opt/gbo/data/caddy` +**Upstream IPs (after migration):** +| Backend | IP | +|---------|-----| +| system (botserver) | 10.107.115.229:5858 | +| system (botui) | 10.107.115.229:5859 | +| tables (PostgreSQL) | 10.107.115.33:5432 | +| drive (MinIO S3) | 10.107.115.114:9000 | +| webmail | 10.107.115.208 | +| alm | 10.107.115.4 | +| table-editor | 10.107.115.x (assign IP first) | + ### Log Locations **botserver/botui logs:** @@ -175,13 +551,13 @@ vector_db/ # Qdrant vector DB logs **Checking component logs:** ```bash # Valkey -lxc exec pragmatismo-system -- tail -f /opt/gbo/bin/botserver-stack/logs/cache/valkey.log +incus exec system -- tail -f /opt/gbo/bin/botserver-stack/logs/cache/valkey.log # PostgreSQL -lxc exec pragmatismo-system -- tail -f /opt/gbo/bin/botserver-stack/logs/tables/postgres.log +incus exec system -- tail -f /opt/gbo/bin/botserver-stack/logs/tables/postgres.log # Qdrant -lxc exec pragmatismo-system -- tail -f /opt/gbo/bin/botserver-stack/logs/vector_db/qdrant.log +incus exec system -- tail -f /opt/gbo/bin/botserver-stack/logs/vector_db/qdrant.log ``` ### iptables loopback rule (required) @@ -254,9 +630,9 @@ curl -X POST "http://alm.pragmatismo.com.br/api/v1/repos/GeneralBots/BotServer/a ``` ### SSH Hostname Setup (CI Runner) -The CI runner must resolve `pragmatismo-system` hostname. Add to `/etc/hosts` if missing: +The CI runner must resolve `system` hostname. 
Add to `/etc/hosts` **once** (manual step on host): ```bash -lxc exec pragmatismo-alm-ci -- bash -c 'echo "10.16.164.33 pragmatismo-system" >> /etc/hosts' +incus exec alm-ci -- bash -c 'echo "10.107.115.229 system" >> /etc/hosts' ``` ### Deploy Step — CRITICAL @@ -275,7 +651,7 @@ The deploy step must **kill the running botserver process before `scp`**, otherw ### Binary Ownership The binary at `/opt/gbo/bin/botserver` must be owned by `gbuser`, not `root`: ```bash -lxc exec pragmatismo-system -- chown gbuser:gbuser /opt/gbo/bin/botserver +incus exec system -- chown gbuser:gbuser /opt/gbo/bin/botserver ``` If owned by root, `scp` as `gbuser` will fail even after killing the process. @@ -309,32 +685,35 @@ Failure to push the root `gb` repo will not trigger CI/CD pipelines. ## Useful Commands ```bash -# Check all containers -lxc list +# Check all containers (Incus) +incus list # Check disk device mounts per container -for c in $(lxc list --format csv -c n); do - devices=$(lxc config device show $c | grep 'type: disk' | grep -v 'pool:' | wc -l) - [ $devices -gt 0 ] && echo "=== $c ===" && lxc config device show $c | grep -E 'source:|path:' | grep -v pool +for c in $(incus list --format csv -c n); do + devices=$(incus config device show $c | grep 'type: disk' | grep -v 'pool:' | wc -l) + [ $devices -gt 0 ] && echo "=== $c ===" && incus config device show $c | grep -E 'source:|path:' | grep -v pool done # Tail Caddy errors -lxc exec -proxy -- tail -f /opt/gbo/logs/access.log +incus exec proxy -- tail -f /opt/gbo/logs/access.log # Restart botserver + botui -lxc exec -system -- systemctl restart system.service ui.service +incus exec system -- systemctl restart system.service ui.service # Check iptables in system container -lxc exec -system -- iptables -L -n | grep -E 'DROP|ACCEPT.*lo' +incus exec system -- iptables -L -n | grep -E 'DROP|ACCEPT.*lo' # ZFS snapshot usage zfs list -t snapshot -o name,used | sort -k2 -rh | head -20 # Unseal Vault (use actual unseal key from 
init.json) -lxc exec -system -- bash -c " +incus exec system -- bash -c " export VAULT_ADDR=https://127.0.0.1:8200 VAULT_SKIP_VERIFY=true /opt/gbo/bin/botserver-stack/bin/vault/vault operator unseal \$UNSEAL_KEY " + +# Check rsync transfer progress (on destination) +du -sh /home/administrator/gbo ``` --- @@ -344,13 +723,13 @@ lxc exec -system -- bash -c " ### Check CI Runner Container ```bash # From production host, SSH to CI runner -ssh root@-alm-ci +ssh root@alm-ci # Check CI workspace for cloned repos ls /root/workspace/ # Test SSH to system container -ssh -o ConnectTimeout=5 pragmatismo-system 'hostname' +ssh -o ConnectTimeout=5 system 'hostname' ``` ### Query CI Runs via Forgejo API @@ -365,24 +744,24 @@ curl -X POST "http://alm.pragmatismo.com.br/api/v1/repos/GeneralBots//acti ### Check Binary Deployed ```bash # From production host -lxc exec -system -- stat /opt/gbo/bin/ | grep Modify -lxc exec -system -- strings /opt/gbo/bin/ | grep '' +incus exec system -- stat /opt/gbo/bin/ | grep Modify +incus exec system -- strings /opt/gbo/bin/ | grep '' ``` ### CI Build Logs Location ```bash -# On CI runner (pragmatismo-alm-ci) +# On CI runner (alm-ci) # Logs saved via: sudo cp /tmp/build.log /opt/gbo/logs/ # Access from production host -ssh root@-alm-ci -- cat /opt/gbo/logs/*.log 2>/dev/null +ssh root@alm-ci -- cat /opt/gbo/logs/*.log 2>/dev/null ``` ### Common CI Issues **SSH Connection Refused:** -- CI runner must have `pragmatismo-system` in `/root/.ssh/config` with IP `10.16.164.33` -- Check: `ssh -o ConnectTimeout=5 pragmatismo-system 'hostname'` +- CI runner must have `system` in `/root/.ssh/config` with correct IP +- Check: `ssh -o ConnectTimeout=5 system 'hostname'` **Binary Not Updated After Deploy:** - Verify binary modification time matches CI run time diff --git a/botapp b/botapp index 9e38944..ea625fa 160000 --- a/botapp +++ b/botapp @@ -1 +1 @@ -Subproject commit 9e3894411131f9919fbcbb0c38fe975970675ca6 +Subproject commit 
ea625fa6b1e6617a71b7856337976b5d96600bee diff --git a/botbook b/botbook index 7b2b7ab..94ca51a 160000 --- a/botbook +++ b/botbook @@ -1 +1 @@ -Subproject commit 7b2b7ab3c53c65a68930a8cb2e7ca359d8e22bcf +Subproject commit 94ca51a670cfa664ba7abde24991bf831dac4fbd diff --git a/botmodels b/botmodels index 5e74489..09c59dd 160000 --- a/botmodels +++ b/botmodels @@ -1 +1 @@ -Subproject commit 5e74489076c00e13e5660228ebb159fae9c9e791 +Subproject commit 09c59ddc6bbbff6613b9dc3bcdd4d6e59704e05a diff --git a/botserver b/botserver index 43f2eb7..adb2633 160000 --- a/botserver +++ b/botserver @@ -1 +1 @@ -Subproject commit 43f2eb7e5cdc81d8c3690cb899ecd1207cc2ac04 +Subproject commit adb26330d27049ec4d3a335c4adb1b6afd5a05ae diff --git a/botui b/botui index 138cc59..222e327 160000 --- a/botui +++ b/botui @@ -1 +1 @@ -Subproject commit 138cc59be33942eba28d8a2c30bafb8ab04f0f12 +Subproject commit 222e32725991f189b4d6b03b1e4d2fcfd3435eda diff --git a/default-vault.tar b/default-vault.tar new file mode 100644 index 0000000..e69de29 diff --git a/migrations.tar.gz b/migrations.tar.gz new file mode 100644 index 0000000..3cb6e48 Binary files /dev/null and b/migrations.tar.gz differ diff --git a/prompts/c1.md b/prompts/c1.md new file mode 100644 index 0000000..9db67d1 --- /dev/null +++ b/prompts/c1.md @@ -0,0 +1,85 @@ +# Plan: Migrate LXC Containers to Incus (COMPLETED) + +## Summary +✅ All containers migrated from LXD (pragmatismo.com.br) to Incus (63.141.255.9) +✅ All data synced from host /opt/gbo/tenants/ to containers +✅ All binaries copied from source to containers +✅ All services configured and running + +## Container & Service Status +| Container | Service | Status | +|-----------|---------|--------| +| dns | coredns | ✅ RUNNING | +| email | stalwart-mail | ✅ RUNNING | +| webmail | php built-in server (:5252) | ✅ RUNNING | +| alm | forgejo | ✅ RUNNING | +| drive | minio | ✅ RUNNING | +| tables | postgresql | ✅ RUNNING | +| system | botserver | ✅ RUNNING | + + +## Service Files 
Location +All service files in `/etc/systemd/system/` inside containers: +- `dns.service` - coredns (User=root) +- `email.service` - stalwart-mail (User=root) +- `alm.service` - forgejo (User=alm, Group=alm) +- `minio.service` - minio (User=root) + +## Binary Locations +| Service | Binary Path | +|---------|-------------| +| coredns | /opt/gbo/bin/coredns | +| stalwart | /opt/gbo/bin/stalwart | +| forgejo | /opt/gbo/bin/forgejo | +| minio | /usr/local/bin/minio | + +## Key Paths Inside Containers +- **Binaries**: /opt/gbo/bin/ +- **Data**: /opt/gbo/data/ +- **Config**: /opt/gbo/conf/ +- **Logs**: /opt/gbo/logs/ + +## IPS (Destination) +- dns: 10.107.115.155 +- email: 10.107.115.200 +- webmail: 10.107.115.208 +- alm: 10.107.115.4 +- drive: 10.107.115.114 +- tables: 10.107.115.33 +- system: 10.107.115.229 + +- alm-ci: 10.107.115.190 +- table-editor: 10.107.115.73 + +## Port Forwarding (iptables NAT) +``` +# DNS +sudo iptables -t nat -A PREROUTING -p tcp --dport 53 -j DNAT --to-destination 10.107.115.155:53 +sudo iptables -t nat -A PREROUTING -p udp --dport 53 -j DNAT --to-destination 10.107.115.155:53 + +# Email +sudo iptables -t nat -A PREROUTING -p tcp --dport 25 -j DNAT --to-destination 10.107.115.200:25 +sudo iptables -t nat -A PREROUTING -p tcp --dport 587 -j DNAT --to-destination 10.107.115.200:587 +sudo iptables -t nat -A PREROUTING -p tcp --dport 465 -j DNAT --to-destination 10.107.115.200:465 +sudo iptables -t nat -A PREROUTING -p tcp --dport 143 -j DNAT --to-destination 10.107.115.200:143 +sudo iptables -t nat -A PREROUTING -p tcp --dport 993 -j DNAT --to-destination 10.107.115.200:993 +sudo iptables -t nat -A PREROUTING -p tcp --dport 110 -j DNAT --to-destination 10.107.115.200:110 +sudo iptables -t nat -A PREROUTING -p tcp --dport 995 -j DNAT --to-destination 10.107.115.200:995 +sudo iptables -t nat -A PREROUTING -p tcp --dport 4190 -j DNAT --to-destination 10.107.115.200:4190 + +# Webmail +sudo iptables -t nat -A PREROUTING -p tcp --dport 5252 -j DNAT 
--to-destination 10.107.115.208:5252 + +# ALM (forgejo) +sudo iptables -t nat -A PREROUTING -p tcp --dport 4747 -j DNAT --to-destination 10.107.115.4:4747 + +# Caddy (80, 443) - already exists for proxy container +``` + +## Workflow (PRODUCTION TESTED) +1. Copy container: `incus copy --instance-only lxd-source: ` +2. Add eth0 network: `incus config device add eth0 nic name=eth0 network=PROD-GBO` +3. Sync data: `incus file push --recursive /opt/gbo/tenants/pragmatismo// /opt/gbo/` +4. Copy binaries: from source via `lxc file pull` → scp to dest → `incus file push` +5. Create service file: `cat > /tmp/.service && incus file push .service /etc/systemd/system/` +6. Enable/start (both inside the container; a bare `&&` would run the second command on the host): `incus exec -- systemctl enable --now ` diff --git a/prompts/container.md b/prompts/container.md new file mode 100644 index 0000000..bdea1b8 --- /dev/null +++ b/prompts/container.md @@ -0,0 +1,1088 @@ +# Container Bootstrap Plan — Automating GB Container Deployment + +## Overview + +This document describes how to improve `installer.rs` to automate the deployment of General Bots containers on Incus. The goal is to replicate what was done manually during the pragmatismo migration from LXD to Incus. 
+ +--- + +## What Was Done Manually (Reference Implementation) + +### Migration Summary (pragmatismo tenant) +| Item | Detail | +|------|--------| +| Source | LXD 5.21 @ 82.29.59.188 | +| Destination | Incus 6.x @ 63.141.255.9 | +| Method | `incus copy --instance-only lxd-source:` | +| Data transfer | tar.gz → push to containers | +| Containers | 10 (dns, email, webmail, alm, drive, tables, system, proxy, alm-ci, table-editor) | +| Total data | ~44 GB | + +--- + +## Container Architecture (Reference) + +### Container Types & Services + +| Container | Purpose | Ports | Service Binary | Service User | +|-----------|---------|-------|---------------|-------------| +| **dns** | CoreDNS | 53 | `/opt/gbo/bin/coredns` | root | +| **email** | Stalwart mail | 25,143,465,587,993,995,110,4190 | `/opt/gbo/bin/stalwart` | root | +| **webmail** | Roundcube/PHP | 80,443 | Apache (`/usr/sbin/apache2`) | www-data | +| **alm** | Forgejo ALM | 4747 | `/opt/gbo/bin/forgejo` | gbuser | +| **drive** | MinIO S3 | 9000,9001 | `/opt/gbo/bin/minio` | root | +| **tables** | PostgreSQL | 5432 | system-installed | root | +| **system** | botserver + stack | 5858, 8200, 6379, 6333, 9100 | `/opt/gbo/bin/botserver` | gbuser | +| **proxy** | Caddy | 80, 443 | `/usr/bin/caddy` | gbuser | +| **alm-ci** | Forgejo runner | none | `/opt/gbo/bin/forgejo-runner` | root | +| **table-editor** | NocoDB | 8080 | system-installed | root | + +**RULE: ALL services run as gbuser where possible, ALL data under /opt/gbo, Service name = container name (e.g., proxy-caddy.service)** + +### Network Layout +``` +Host (63.141.255.9) +├── Incus bridge (10.107.115.x) +│ ├── dns (10.107.115.155) +│ ├── email (10.107.115.200) +│ ├── webmail (10.107.115.87) +│ ├── alm (10.107.115.4) +│ ├── drive (10.107.115.114) +│ ├── tables (10.107.115.33) +│ ├── system (10.107.115.229) +│ ├── proxy (10.107.115.189) +│ ├── alm-ci (10.107.115.190) +│ └── table-editor (10.107.115.73) +└── iptables NAT → external ports +``` + +--- + +## Key 
Paths (Must Match Production) + +Inside each container: +``` +/opt/gbo/ +├── bin/ # binaries (coredns, stalwart, forgejo, caddy, minio, postgres) +├── conf/ # service configs (Corefile, config.toml, app.ini) +├── data/ # app data (zone files, databases, repos) +└── logs/ # service logs +``` + +On host: +``` +/opt/gbo/tenants// +├── dns/ +│ ├── bin/ +│ ├── conf/ +│ ├── data/ +│ └── logs/ +├── email/ +├── webmail/ +├── alm/ +├── drive/ +├── tables/ +├── system/ +├── proxy/ +├── alm-ci/ +└── table-editor/ +``` + +--- + +## Service Files (Templates) + +**RULE: ALL services run as gbuser where possible, Service name = container name (e.g., dns.service, proxy-caddy.service)** + +### dns.service (CoreDNS) +```ini +[Unit] +Description=CoreDNS +After=network.target + +[Service] +User=root +WorkingDirectory=/opt/gbo +ExecStart=/opt/gbo/bin/coredns -conf /opt/gbo/conf/Corefile +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +### email.service (Stalwart) +```ini +[Unit] +Description=Stalwart Mail Server +After=network.target + +[Service] +Type=simple +User=root +WorkingDirectory=/opt/gbo +ExecStart=/opt/gbo/bin/stalwart --config /opt/gbo/conf/config.toml +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +### proxy-caddy.service +```ini +[Unit] +Description=Caddy Reverse Proxy +After=network.target + +[Service] +User=gbuser +Group=gbuser +WorkingDirectory=/opt/gbo +ExecStart=/usr/bin/caddy run --config /opt/gbo/conf/config --adapter caddyfile +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +### alm.service (Forgejo) +```ini +[Unit] +Description=Forgejo Git Server +After=network.target + +[Service] +User=gbuser +Group=gbuser +WorkingDirectory=/opt/gbo +ExecStart=/opt/gbo/bin/forgejo web --config /opt/gbo/conf/app.ini +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +### drive-minio.service +```ini +[Unit] +Description=MinIO Object Storage 
+After=network-online.target +Wants=network-online.target + +[Service] +User=gbuser +Group=gbuser +WorkingDirectory=/opt/gbo +ExecStart=/opt/gbo/bin/minio server --console-address :4646 /opt/gbo/data +Restart=always +RestartSec=10 + +[Install] +WantedBy=multi-user.target +``` + +### tables-postgresql.service +```ini +[Unit] +Description=PostgreSQL +After=network.target + +[Service] +User=gbuser +Group=gbuser +WorkingDirectory=/opt/gbo +ExecStart=/opt/gbo/bin/postgres -D /opt/gbo/data -c config_file=/opt/gbo/conf/postgresql.conf +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +### webmail-apache.service +```ini +[Unit] +Description=Apache Webmail +After=network.target + +[Service] +User=www-data +Group=www-data +WorkingDirectory=/var/www/html +ExecStart=/usr/sbin/apache2 -D FOREGROUND +Restart=always +RestartSec=5 + +[Install] +WantedBy=multi-user.target +``` + +--- + +## iptables NAT Rules (CRITICAL - Use ONLY iptables, NEVER socat or Incus proxy devices) + +### Prerequisites +```bash +# Enable IP forwarding (persistent) +echo "net.ipv4.ip_forward = 1" | sudo tee /etc/sysctl.d/99-ipforward.conf +sudo sysctl -w net.ipv4.ip_forward=1 + +# Enable route_localnet for NAT to work with localhost +echo "net.ipv4.conf.all.route_localnet = 1" | sudo tee /etc/sysctl.d/99-localnet.conf +sudo sysctl -w net.ipv4.conf.all.route_localnet=1 +``` + +### Required NAT Rules (Complete Set) +```bash +# ================== +# DNS (dns container) +# ================== +sudo iptables -t nat -A PREROUTING -p udp --dport 53 -j DNAT --to-destination 10.107.115.155:53 +sudo iptables -t nat -A PREROUTING -p tcp --dport 53 -j DNAT --to-destination 10.107.115.155:53 +sudo iptables -t nat -A OUTPUT -p udp --dport 53 -j DNAT --to-destination 10.107.115.155:53 +sudo iptables -t nat -A OUTPUT -p tcp --dport 53 -j DNAT --to-destination 10.107.115.155:53 + +# ================== +# Tables (PostgreSQL) - External port 4445 +# ================== +sudo iptables -t nat -A 
PREROUTING -p tcp --dport 4445 -j DNAT --to-destination 10.107.115.33:5432 +sudo iptables -t nat -A OUTPUT -p tcp --dport 4445 -j DNAT --to-destination 10.107.115.33:5432 + +# ================== +# Proxy (Caddy) - 80, 443 +# ================== +sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.107.115.189:80 +sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j DNAT --to-destination 10.107.115.189:443 + +# ================== +# Email (email container) - Stalwart +# ================== +sudo iptables -t nat -A PREROUTING -p tcp --dport 25 -j DNAT --to-destination 10.107.115.200:25 +sudo iptables -t nat -A PREROUTING -p tcp --dport 465 -j DNAT --to-destination 10.107.115.200:465 +sudo iptables -t nat -A PREROUTING -p tcp --dport 587 -j DNAT --to-destination 10.107.115.200:587 +sudo iptables -t nat -A PREROUTING -p tcp --dport 993 -j DNAT --to-destination 10.107.115.200:993 +sudo iptables -t nat -A PREROUTING -p tcp --dport 995 -j DNAT --to-destination 10.107.115.200:995 +sudo iptables -t nat -A PREROUTING -p tcp --dport 143 -j DNAT --to-destination 10.107.115.200:143 +sudo iptables -t nat -A PREROUTING -p tcp --dport 110 -j DNAT --to-destination 10.107.115.200:110 +sudo iptables -t nat -A PREROUTING -p tcp --dport 4190 -j DNAT --to-destination 10.107.115.200:4190 + +# ================== +# FORWARD rules (required for containers to receive traffic) +# ================== +sudo iptables -A FORWARD -p tcp -d 10.107.115.155 --dport 53 -j ACCEPT +sudo iptables -A FORWARD -p udp -d 10.107.115.155 --dport 53 -j ACCEPT +sudo iptables -A FORWARD -p tcp -d 10.107.115.33 --dport 5432 -j ACCEPT +sudo iptables -A FORWARD -p tcp -s 10.107.115.33 --sport 5432 -j ACCEPT +sudo iptables -A FORWARD -p tcp -d 10.107.115.189 --dport 80 -j ACCEPT +sudo iptables -A FORWARD -p tcp -d 10.107.115.189 --dport 443 -j ACCEPT +sudo iptables -A FORWARD -p tcp -s 10.107.115.189 -j ACCEPT +sudo iptables -A FORWARD -p tcp -d 10.107.115.200 -j ACCEPT +sudo iptables -A 
FORWARD -p tcp -s 10.107.115.200 -j ACCEPT + +# ================== +# POSTROUTING MASQUERADE (for return traffic) +# ================== +sudo iptables -t nat -A POSTROUTING -p tcp -d 10.107.115.155 -j MASQUERADE +sudo iptables -t nat -A POSTROUTING -p udp -d 10.107.115.155 -j MASQUERADE +sudo iptables -t nat -A POSTROUTING -p tcp -d 10.107.115.33 -j MASQUERADE +sudo iptables -t nat -A POSTROUTING -p tcp -d 10.107.115.189 -j MASQUERADE +sudo iptables -t nat -A POSTROUTING -p tcp -d 10.107.115.200 -j MASQUERADE + +# ================== +# INPUT rules (allow incoming) +# ================== +sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT +sudo iptables -A INPUT -p tcp --dport 443 -j ACCEPT +sudo iptables -A INPUT -p tcp --dport 4445 -j ACCEPT +sudo iptables -A INPUT -p tcp --dport 53 -j ACCEPT +sudo iptables -A INPUT -p udp --dport 53 -j ACCEPT + +# ================== +# Save rules persistently +# ================== +sudo sh -c 'iptables-save > /etc/iptables/rules.v4' +``` + +### IMPORTANT RULES + +1. **NEVER use socat** - It causes port conflicts and doesn't integrate with iptables NAT +2. **NEVER use Incus proxy devices** - They conflict with iptables NAT rules +3. **ALWAYS add OUTPUT rules** - PREROUTING only handles external traffic; local traffic needs OUTPUT +4. **ALWAYS add FORWARD rules** - Without them, traffic won't reach containers +5. **ALWAYS add POSTROUTING MASQUERADE** - Without it, return traffic won't work +6. 
**ALWAYS set route_localnet=1** - Required for localhost NAT to work + +### Testing NAT +```bash +# Test from host +nc -zv 127.0.0.1 4445 +# Should connect to PostgreSQL at 10.107.115.33:5432 + +# Test from external +nc -zv 63.141.255.9 4445 +# Should connect to PostgreSQL at 10.107.115.33:5432 + +# Test DNS +dig @127.0.0.1 webmail.pragmatismo.com.br +# Should return 63.141.255.9 +``` + +--- + +## CoreDNS Setup + +### Corefile Template +```corefile +ddsites.com.br:53 { + file /opt/gbo/data/ddsites.com.br.zone + bind 0.0.0.0 + reload 6h + acl { + allow type ANY net 10.0.0.0/8 127.0.0.0/8 + allow type ANY net /32 + allow type A net 0.0.0.0/0 + allow type AAAA net 0.0.0.0/0 + allow type MX net 0.0.0.0/0 + allow type TXT net 0.0.0.0/0 + allow type NS net 0.0.0.0/0 + allow type SOA net 0.0.0.0/0 + allow type SRV net 0.0.0.0/0 + allow type CNAME net 0.0.0.0/0 + allow type HTTPS net 0.0.0.0/0 + allow type CAA net 0.0.0.0/0 + block + } + cache + errors +} + +pragmatismo.com.br:53 { + file /opt/gbo/data/pragmatismo.com.br.zone + bind 0.0.0.0 + reload 6h + acl { + allow type ANY net 10.0.0.0/8 127.0.0.0/8 + allow type ANY net /32 + allow type A net 0.0.0.0/0 + allow type AAAA net 0.0.0.0/0 + allow type MX net 0.0.0.0/0 + allow type TXT net 0.0.0.0/0 + allow type NS net 0.0.0.0/0 + allow type SOA net 0.0.0.0/0 + allow type SRV net 0.0.0.0/0 + allow type CNAME net 0.0.0.0/0 + allow type HTTPS net 0.0.0.0/0 + allow type CAA net 0.0.0.0/0 + block + } + cache + errors +} + +. { + forward . 8.8.8.8 1.1.1.1 + cache + errors + log +} +``` + +### Zone File Template (pragmatismo.com.br) +``` +$ORIGIN pragmatismo.com.br. +$TTL 3600 + +@ IN SOA ns1.ddsites.com.br. hostmaster.dmeans.info. ( + 2026032301 ; Serial (YYYYMMDDNN) + 86400 ; Refresh + 900 ; Retry + 1209600 ; Expire + 3600 ; Minimum TTL +) + +@ IN CAA 0 issue "letsencrypt.org" +@ IN CAA 0 issuewild ";" +@ IN CAA 0 iodef "mailto:security@pragmatismo.com.br" + +@ IN HTTPS 1 . alpn="h2,h3" + +@ IN NS ns1.ddsites.com.br. 
+@ IN NS ns2.ddsites.com.br. + +@ IN A + +ns1 IN A +ns2 IN A + +@ IN MX 10 mail.pragmatismo.com.br. + +mail IN A +www IN A +webmail IN A +drive IN A +drive-api IN A +alm IN A +tables IN A +gb IN A +gb6 IN A +``` + +### Starting CoreDNS in Container +```bash +# CoreDNS won't start via systemd in Incus containers by default +# Use nohup to start it +incus exec dns -- bash -c 'mkdir -p /opt/gbo/logs && nohup /opt/gbo/bin/coredns -conf /opt/gbo/conf/Corefile > /opt/gbo/logs/coredns.log 2>&1 &' +``` + +### DNS Zone Records (CRITICAL - Use A records, NOT CNAMEs for internal services) +``` +# WRONG - CNAME causes resolution issues +webmail IN CNAME mail + +# CORRECT - Direct A record +webmail IN A +mail IN A +``` + +--- + +## Container Cleanup (BEFORE Setting Up NAT) + +**ALWAYS remove socat and Incus proxy devices before configuring iptables NAT:** + +```bash +# Remove socat +pkill -9 -f socat 2>/dev/null +rm -f /usr/bin/socat /usr/sbin/socat 2>/dev/null + +# Remove all proxy devices from all containers +for c in $(incus list --format csv -c n); do + for d in $(incus config device list $c 2>/dev/null | grep -E 'proxy|port'); do + echo "Removing $d from $c" + incus config device remove $c $d 2>/dev/null + done +done +``` + +--- + +## installer.rs Improvements Required + +### 1. New Module Structure + +``` +botserver/src/core/package_manager/ +├── mod.rs +├── component.rs # ComponentConfig (existing) +├── installer.rs # PackageManager (existing) +├── container.rs # NEW: Container deployment logic +└── templates/ # NEW: Service file templates + ├── dns.service + ├── email.service + ├── alm.service + ├── minio.service + └── webmail.service +``` + +### 2. 
Container Settings in ComponentConfig + +```rust +// Add to component.rs + +#[derive(Debug, Clone)] +pub struct NatRule { + pub port: u16, + pub protocol: String, // "tcp" or "udp" +} + +#[derive(Debug, Clone)] +pub struct ContainerSettings { + pub container_name: String, + pub ip: String, + pub user: String, + pub group: Option<String>, + pub working_dir: Option<String>, + pub service_template: String, + pub nat_rules: Vec<NatRule>, + pub binary_path: String, // "/opt/gbo/bin/coredns" + pub config_path: String, // "/opt/gbo/conf/Corefile" + pub data_path: Option<String>, // "/opt/gbo/data" + pub exec_cmd_args: Vec<String>, // ["--config", "/opt/gbo/conf/Corefile"] + pub internal_ports: Vec<u16>, // Ports container listens on internally + pub external_port: Option<u16>, // External port (if different from internal) +} +``` + +### 3. Component Registration with Container Settings + +```rust +fn register_dns(&mut self) { + self.components.insert( + "dns".to_string(), + ComponentConfig { + name: "dns".to_string(), + // ... existing fields ...
+ + // NEW: Container settings + container: Some(ContainerSettings { + container_name: "dns".to_string(), + ip: "10.107.115.155".to_string(), + user: "root".to_string(), + group: None, + working_dir: None, + service_template: include_str!("templates/dns.service").to_string(), + nat_rules: vec![ + NatRule { port: 53, protocol: "tcp".to_string() }, + NatRule { port: 53, protocol: "udp".to_string() }, + ], + binary_path: "/opt/gbo/bin/coredns".to_string(), + config_path: "/opt/gbo/conf/Corefile".to_string(), + data_path: Some("/opt/gbo/data".to_string()), + exec_cmd_args: vec!["-conf".to_string(), "/opt/gbo/conf/Corefile".to_string()], + internal_ports: vec![53], + external_port: Some(53), + }), + }, + ); +} + +fn register_tables(&mut self) { + // PostgreSQL with external port 4445 + self.components.insert( + "tables".to_string(), + ComponentConfig { + name: "tables".to_string(), + container: Some(ContainerSettings { + container_name: "tables".to_string(), + ip: "10.107.115.33".to_string(), + user: "root".to_string(), + nat_rules: vec![ + NatRule { port: 4445, protocol: "tcp".to_string() }, + ], + internal_ports: vec![5432], + external_port: Some(4445), + // ... + }), + }, + ); +} +``` + +### 4. Container Deployment Methods + +```rust +// Add to installer.rs + +impl PackageManager { + + /// Bootstrap a container with all its services and NAT rules + pub async fn bootstrap_container( + &self, + container_name: &str, + source_lxd: Option<&str>, + ) -> Result<()> { + info!("Bootstrapping container: {}", container_name); + + // 0. CLEANUP - Remove any existing socat or proxy devices + self.cleanup_existing(container_name).await?; + + // 1. Copy from source LXD if migrating + if let Some(source_remote) = source_lxd { + self.copy_container(source_remote, container_name).await?; + } + + // 2. Ensure network is configured + self.ensure_network(container_name).await?; + + // 3. Sync data from host to container + self.sync_data_to_container(container_name).await?; + + // 4. 
Fix permissions + self.fix_permissions(container_name).await?; + + // 5. Install and start service + self.install_systemd_service(container_name).await?; + + // 6. Configure NAT rules on host (ONLY iptables, never socat) + self.configure_iptables_nat(container_name).await?; + + // 7. Reload DNS if dns container + if container_name == "dns" { + self.reload_dns_zones().await?; + } + + info!("Container {} bootstrapped successfully", container_name); + Ok(()) + } + + /// Cleanup existing socat and proxy devices + async fn cleanup_existing(&self, container: &str) -> Result<()> { + // Remove socat processes + SafeCommand::new("pkill") + .and_then(|c| c.arg("-9")) + .and_then(|c| c.arg("-f")) + .and_then(|c| c.arg("socat")) + .execute()?; + + // Remove proxy devices from container + let output = SafeCommand::new("incus") + .and_then(|c| c.arg("config")) + .and_then(|c| c.arg("device")) + .and_then(|c| c.arg("list")) + .and_then(|c| c.arg(container)) + .and_then(|cmd| cmd.execute_with_output())?; + + let output_str = String::from_utf8_lossy(&output.stdout); + for line in output_str.lines() { + if line.contains("proxy") || line.contains("port") { + let parts: Vec<&str> = line.split_whitespace().collect(); + if let Some(name) = parts.first() { + SafeCommand::new("incus") + .and_then(|c| c.arg("config")) + .and_then(|c| c.arg("device")) + .and_then(|c| c.arg("remove")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg(name)) + .execute()?; + } + } + } + + Ok(()) + } + + /// Copy container from LXD source + async fn copy_container(&self, source_remote: &str, name: &str) -> Result<()> { + info!("Copying container {} from {}", name, source_remote); + + SafeCommand::new("incus") + .and_then(|c| c.arg("copy")) + .and_then(|c| c.arg("--instance-only")) + .and_then(|c| c.arg(format!("{}:{}", source_remote, name))) + .and_then(|c| c.arg(name)) + .and_then(|cmd| cmd.execute()) + .context("Failed to copy container")?; + + SafeCommand::new("incus") + .and_then(|c| c.arg("start")) 
+ .and_then(|c| c.arg(name)) + .and_then(|cmd| cmd.execute()) + .context("Failed to start container")?; + + Ok(()) + } + + /// Add eth0 network to container + async fn ensure_network(&self, container: &str) -> Result<()> { + let output = SafeCommand::new("incus") + .and_then(|c| c.arg("config")) + .and_then(|c| c.arg("device")) + .and_then(|c| c.arg("list")) + .and_then(|c| c.arg(container)) + .and_then(|cmd| cmd.execute_with_output())?; + + let output_str = String::from_utf8_lossy(&output.stdout); + if !output_str.contains("eth0") { + SafeCommand::new("incus") + .and_then(|c| c.arg("config")) + .and_then(|c| c.arg("device")) + .and_then(|c| c.arg("add")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("eth0")) + .and_then(|c| c.arg("nic")) + .and_then(|c| c.arg("name=eth0")) + .and_then(|c| c.arg("network=PROD-GBO")) + .and_then(|cmd| cmd.execute())?; + } + Ok(()) + } + + /// Sync data from host to container + async fn sync_data_to_container(&self, container: &str) -> Result<()> { + let source_path = format!( + "/opt/gbo/tenants/{}/{}/", + self.tenant, container + ); + + if Path::new(&source_path).exists() { + info!("Syncing data for {}", container); + + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("mkdir")) + .and_then(|c| c.arg("-p")) + .and_then(|c| c.arg("/opt/gbo")) + .and_then(|cmd| cmd.execute())?; + + SafeCommand::new("incus") + .and_then(|c| c.arg("file")) + .and_then(|c| c.arg("push")) + .and_then(|c| c.arg("--recursive")) + .and_then(|c| c.arg(format!("{}.", source_path))) + .and_then(|c| c.arg(format!("{}:/opt/gbo/", container))) + .and_then(|cmd| cmd.execute())?; + } + Ok(()) + } + + /// Fix file permissions based on container user + async fn fix_permissions(&self, container: &str) -> Result<()> { + let settings = self.get_container_settings(container)?; + + if let Some(user) = &settings.user { + let chown_cmd = if let Some(group) = &settings.group 
{ + format!("chown -R {}:{} /opt/gbo/", user, group) + } else { + format!("chown -R {}:{} /opt/gbo/", user, user) + }; + + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("sh")) + .and_then(|c| c.arg("-c")) + .and_then(|c| c.arg(&chown_cmd)) + .and_then(|cmd| cmd.execute())?; + } + + // Make binaries executable (run through sh so the glob expands inside the container) + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("sh")) + .and_then(|c| c.arg("-c")) + .and_then(|c| c.arg(format!("chmod +x {}/bin/*", self.base_path.display()))) + .and_then(|cmd| cmd.execute())?; + + Ok(()) + } + + /// Install systemd service file and start + async fn install_systemd_service(&self, container: &str) -> Result<()> { + let settings = self.get_container_settings(container)?; + + let service_name = format!("{}.service", container); + let temp_path = format!("/tmp/{}", service_name); + + std::fs::write(&temp_path, &settings.service_template) + .context("Failed to write service template")?; + + SafeCommand::new("incus") + .and_then(|c| c.arg("file")) + .and_then(|c| c.arg("push")) + .and_then(|c| c.arg(&temp_path)) + .and_then(|c| c.arg(format!("{}:/etc/systemd/system/{}", container, service_name))) + .and_then(|cmd| cmd.execute())?; + + // Slices give the three argument lists one common type + let steps: [&[&str]; 3] = [ + &["daemon-reload"], + &["enable", service_name.as_str()], + &["start", service_name.as_str()], + ]; + + for cmd_args in steps { + let mut cmd = SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("systemctl")); + + for arg in cmd_args { + cmd = cmd.and_then(|c| c.arg(*arg)); + } + cmd.and_then(|c| c.execute())?; + } + + std::fs::remove_file(&temp_path).ok(); + Ok(()) + } + + /// Configure iptables NAT rules on host - ONLY method allowed, NEVER socat + async fn configure_iptables_nat(&self, container: &str) -> Result<()> { + let settings = self.get_container_settings(container)?; + +
// Set route_localnet if not already set + SafeCommand::new("sudo") + .and_then(|c| c.arg("sysctl")) + .and_then(|c| c.arg("-w")) + .and_then(|c| c.arg("net.ipv4.conf.all.route_localnet=1")) + .execute()?; + + for rule in &settings.nat_rules { + // PREROUTING rule - for external traffic + SafeCommand::new("sudo") + .and_then(|c| c.arg("iptables")) + .and_then(|c| c.arg("-t")) + .and_then(|c| c.arg("nat")) + .and_then(|c| c.arg("-A")) + .and_then(|c| c.arg("PREROUTING")) + .and_then(|c| c.arg("-p")) + .and_then(|c| c.arg(&rule.protocol)) + .and_then(|c| c.arg("--dport")) + .and_then(|c| c.arg(rule.port.to_string())) + .and_then(|c| c.arg("-j")) + .and_then(|c| c.arg("DNAT")) + .and_then(|c| c.arg("--to-destination")) + .and_then(|c| c.arg(format!("{}:{}", settings.ip, rule.port))) + .and_then(|cmd| cmd.execute())?; + + // OUTPUT rule - for local traffic (CRITICAL for NAT to work) + SafeCommand::new("sudo") + .and_then(|c| c.arg("iptables")) + .and_then(|c| c.arg("-t")) + .and_then(|c| c.arg("nat")) + .and_then(|c| c.arg("-A")) + .and_then(|c| c.arg("OUTPUT")) + .and_then(|c| c.arg("-p")) + .and_then(|c| c.arg(&rule.protocol)) + .and_then(|c| c.arg("--dport")) + .and_then(|c| c.arg(rule.port.to_string())) + .and_then(|c| c.arg("-j")) + .and_then(|c| c.arg("DNAT")) + .and_then(|c| c.arg("--to-destination")) + .and_then(|c| c.arg(format!("{}:{}", settings.ip, rule.port))) + .and_then(|cmd| cmd.execute())?; + + // FORWARD rules + SafeCommand::new("sudo") + .and_then(|c| c.arg("iptables")) + .and_then(|c| c.arg("-A")) + .and_then(|c| c.arg("FORWARD")) + .and_then(|c| c.arg("-p")) + .and_then(|c| c.arg(&rule.protocol)) + .and_then(|c| c.arg("-d")) + .and_then(|c| c.arg(&settings.ip)) + .and_then(|c| c.arg("--dport")) + .and_then(|c| c.arg(rule.port.to_string())) + .and_then(|c| c.arg("-j")) + .and_then(|c| c.arg("ACCEPT")) + .and_then(|cmd| cmd.execute())?; + } + + // POSTROUTING MASQUERADE for return traffic + SafeCommand::new("sudo") + .and_then(|c| c.arg("iptables")) + 
.and_then(|c| c.arg("-t")) + .and_then(|c| c.arg("nat")) + .and_then(|c| c.arg("-A")) + .and_then(|c| c.arg("POSTROUTING")) + .and_then(|c| c.arg("-p")) + .and_then(|c| c.arg("tcp")) + .and_then(|c| c.arg("-d")) + .and_then(|c| c.arg(&settings.ip)) + .and_then(|c| c.arg("-j")) + .and_then(|c| c.arg("MASQUERADE")) + .and_then(|cmd| cmd.execute())?; + + // Save rules + SafeCommand::new("sudo") + .and_then(|c| c.arg("sh")) + .and_then(|c| c.arg("-c")) + .and_then(|c| c.arg("iptables-save > /etc/iptables/rules.v4")) + .and_then(|cmd| cmd.execute())?; + + Ok(()) + } + + /// Start CoreDNS (special case - doesn't work well with systemd in Incus) + async fn start_coredns(&self, container: &str) -> Result<()> { + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("bash")) + .and_then(|c| c.arg("-c")) + .and_then(|c| c.arg("mkdir -p /opt/gbo/logs && nohup /opt/gbo/bin/coredns -conf /opt/gbo/conf/Corefile > /opt/gbo/logs/coredns.log 2>&1 &")) + .and_then(|cmd| cmd.execute())?; + + Ok(()) + } + + /// Reload DNS zones with new IPs + async fn reload_dns_zones(&self) -> Result<()> { + // Update zone files to point to new IP + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg("dns")) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("sh")) + .and_then(|c| c.arg("-c")) + .and_then(|c| c.arg("sed -i 's/OLD_IP/NEW_IP/g' /opt/gbo/data/*.zone")) + .and_then(|cmd| cmd.execute())?; + + // Restart coredns + self.start_coredns("dns").await?; + + Ok(()) + } + + /// Get container settings for a component + fn get_container_settings(&self, container: &str) -> Result<&ContainerSettings> { + self.components + .get(container) + .and_then(|c| c.container.as_ref()) + .context("Container settings not found") + } +} +``` + +### 5. 
Binary Installation (For Fresh Containers) + +```rust +/// Install binary to container from URL or fallback +async fn install_binary_to_container( + &self, + container: &str, + component: &str, +) -> Result<()> { + let config = self.components.get(component) + .context("Component not found")?; + + let binary_name = config.binary_name.as_ref() + .context("No binary name")?; + + let settings = config.container.as_ref() + .context("No container settings")?; + + // Check if already exists + let check = SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("test")) + .and_then(|c| c.arg("-f")) + .and_then(|c| c.arg(&settings.binary_path)) + .and_then(|cmd| cmd.execute()); + + if check.is_ok() { + info!("Binary {} already exists in {}", binary_name, container); + return Ok(()); + } + + // Download if URL available + if let Some(url) = &config.download_url { + self.download_and_push_binary(container, url, binary_name).await?; + } + + // Make executable + SafeCommand::new("incus") + .and_then(|c| c.arg("exec")) + .and_then(|c| c.arg(container)) + .and_then(|c| c.arg("--")) + .and_then(|c| c.arg("chmod")) + .and_then(|c| c.arg("+x")) + .and_then(|c| c.arg(&settings.binary_path)) + .and_then(|cmd| cmd.execute())?; + + Ok(()) +} +``` + +--- + +## Full Bootstrap API + +```rust +/// Bootstrap an entire tenant +pub async fn bootstrap_tenant( + state: &AppState, + tenant: &str, + containers: &[&str], + source_remote: Option<&str>, +) -> Result<()> { + let pm = PackageManager::new(InstallMode::Container, Some(tenant.to_string()))?; + + for container in containers { + pm.bootstrap_container(container, source_remote).await?; + } + + info!("Tenant {} bootstrapped successfully", tenant); + Ok(()) +} + +/// Bootstrap all pragmatismo containers +pub async fn bootstrap_pragmatismo(state: &AppState) -> Result<()> { + let containers = [ + "dns", "email", "webmail", "alm", "drive", + "tables", "system", 
"proxy", "alm-ci", "table-editor" + ]; + + bootstrap_tenant(state, "pragmatismo", &containers, Some("lxd-source")).await +} +``` + +--- + +## Command Line Usage + +```bash +# Bootstrap single container +cargo run --bin bootstrap -- container dns --tenant pragmatismo + +# Bootstrap all containers for a tenant +cargo run --bin bootstrap -- tenant pragmatismo --source lxd-source + +# Only sync data (no copy from LXD) +cargo run --bin bootstrap -- sync-data dns --tenant pragmatismo + +# Only configure NAT +cargo run --bin bootstrap -- configure-nat --container dns + +# Only install service +cargo run --bin bootstrap -- install-service dns + +# Clean up socat and proxy devices +cargo run --bin bootstrap -- cleanup --container dns +``` + +--- + +## Files to Create/Modify + +### New Files +1. `botserver/src/core/package_manager/container.rs` - Container deployment logic +2. `botserver/src/core/package_manager/templates/dns.service` +3. `botserver/src/core/package_manager/templates/email.service` +4. `botserver/src/core/package_manager/templates/alm.service` +5. `botserver/src/core/package_manager/templates/minio.service` +6. `botserver/src/core/package_manager/templates/webmail.service` +7. `botserver/src/core/package_manager/templates/tables-postgresql.service` + +### Modified Files +1. `botserver/src/core/package_manager/component.rs` - Add ContainerSettings +2. 
`botserver/src/core/package_manager/installer.rs` - Add container methods, update registrations + +--- + +## Testing Checklist + +After implementation, verify: +- [ ] `incus list` shows all containers running +- [ ] `nc -zv 127.0.0.1 4445` - PostgreSQL accessible +- [ ] `dig @127.0.0.1 webmail.pragmatismo.com.br` - Returns correct IP +- [ ] `curl https://webmail.pragmatismo.com.br` - Webmail accessible +- [ ] NAT rules work from external IP +- [ ] Zone files have correct A records (not CNAMEs) +- [ ] Services survive container restart +- [ ] `which socat` returns nothing on host +- [ ] No proxy devices in any container + +--- + +## Known Issues Fixed + +1. **socat conflicts with iptables** - NEVER use socat, use ONLY iptables NAT +2. **Incus proxy devices conflict with NAT** - Remove all proxy devices before setting up NAT +3. **PREROUTING doesn't handle local traffic** - Must add OUTPUT rules +4. **CoreDNS won't start via systemd in Incus** - Use nohup instead +5. **DNS CNAME resolution issues** - Use A records for internal services +6. **route_localnet needed for localhost NAT** - Set sysctl before NAT rules +7. **FORWARD chain blocks container traffic** - Must add FORWARD ACCEPT rules +8. **Return traffic fails without MASQUERADE** - Add POSTROUTING MASQUERADE rules +9. **Binary permissions** - chmod +x after push +10. 
**Apache SSL needs mod_ssl enabled** - Run `a2enmod ssl` before starting Apache diff --git a/prompts/fail2ban-start.sh b/prompts/fail2ban-start.sh new file mode 100644 index 0000000..55b64bf --- /dev/null +++ b/prompts/fail2ban-start.sh @@ -0,0 +1,11 @@ +#!/bin/bash +# fail2ban startup script for Incus containers +# Usage: Place in /opt/gbo/bin/ and run as root in container + +LOGFILE=/opt/gbo/logs/fail2ban.log + +mkdir -p /opt/gbo/logs +nohup /usr/bin/fail2ban-server -x -f > $LOGFILE 2>&1 & +sleep 2 +fail2ban-client reload +echo "Fail2ban started - check status with: fail2ban-client status" \ No newline at end of file diff --git a/prompts/go.md b/prompts/go.md new file mode 100644 index 0000000..c20d520 --- /dev/null +++ b/prompts/go.md @@ -0,0 +1,15 @@ +# Production Smoke Test Plan + +## Status: Caddy running ✅ + +## URLs to Test (YOLO) +1. https://webmail.pragmatismo.com.br/ +2. https://chat.pragmatismo.com.br/cristo +3. https://alm.pragmatismo.com.br/ +4. https://tables.pragmatismo.com.br/ + +## Steps +1. Navigate to each URL +2. Take screenshot +3. Check for errors in console/network +4. Report pass/fail per URL
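
The navigate and report steps of the smoke test plan could be partly scripted. The block below is a minimal sketch, not part of the original plan: it only checks the HTTP status of each listed URL with `curl` and prints pass/fail, so screenshots and browser console errors still need a manual or browser-driven check. The function names `classify_status` and `check_url`, the `--run` flag, the 15-second timeout, and the choice to count 2xx and 3xx responses as a pass are all assumptions made for illustration.

```bash
#!/bin/sh
# Hypothetical smoke-test helper; the URLs come from the plan above.
set -u

URLS="https://webmail.pragmatismo.com.br/
https://chat.pragmatismo.com.br/cristo
https://alm.pragmatismo.com.br/
https://tables.pragmatismo.com.br/"

# Map an HTTP status code to PASS or FAIL.
# Assumption: any 2xx or 3xx counts as a pass; everything else fails,
# including 000, which curl reports when no response was received.
classify_status() {
  case "$1" in
    2??|3??) echo "PASS" ;;
    *)       echo "FAIL" ;;
  esac
}

# Fetch one URL and print "<result> <status> <url>".
# The body runs in a subshell so its variables stay local.
check_url() (
  url="$1"
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 15 "$url")" || code="000"
  echo "$(classify_status "$code") $code $url"
)

# Only touch the network when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  for url in $URLS; do
    check_url "$url"
  done
fi
```

Saved under a name of your choosing and invoked with `--run`, it prints one line per URL; without the flag it only defines the helpers, which keeps it safe to source from another script.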