- Use /tmp/persistent-botserver instead of /opt/gbo/data
- Use /tmp/sccache and /tmp/cargo for caches
- Runner has full permissions on /tmp
- Fixes 'Operation not permitted' on chown
- Remove su - and sudo -u which require passwords
- Set HOME=/home/gbuser and USER=gbuser as env vars
- Run git/cargo with proper HOME prefix
- Fix runner hanging on authentication
- Move CI workspace to /home/gbuser/persistent-botserver
- Cache now in /home/gbuser/.cache and /home/gbuser/.cargo
- No more permission conflicts with /opt/gbo/data
- Replace sudo with su - gbuser -c for proper user switching
- Simplify permission handling with chown/chmod as root
- Use su - for all gbuser operations
- Use sudo -u gbuser for all git operations
- Add chown/chmod for cache directories
- Use git pull instead of fetch/reset for cleaner updates
- Ensure consistent gbuser ownership
- Add AzureGPT5Client struct for Responses API
- Add AzureGPT5 to LLMProviderType enum
- Detect provider via azuregpt5 or gpt5 in llm-provider config
- Fix gpt_oss_120b.rs chars.peek() issue
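The provider detection described above could look roughly like the sketch below. It is not the actual implementation: the `Other` variant, the `detect_provider` name, and the case-insensitive substring rule are assumptions made for illustration. Only `AzureGPT5`, `LLMProviderType`, and the `azuregpt5`/`gpt5` config values come from these notes.

```rust
/// Trimmed-down, hypothetical version of the provider enum.
/// Only `AzureGPT5` is named in the notes; `Other` is a placeholder
/// for every other provider.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LLMProviderType {
    AzureGPT5,
    Other,
}

/// Maps the `llm-provider` config value to a provider type.
/// Assumption: matching is a case-insensitive substring check.
pub fn detect_provider(llm_provider: &str) -> LLMProviderType {
    let value = llm_provider.trim().to_ascii_lowercase();
    // Both spellings are checked explicitly for readability, even
    // though "azuregpt5" already contains "gpt5".
    if value.contains("azuregpt5") || value.contains("gpt5") {
        LLMProviderType::AzureGPT5
    } else {
        LLMProviderType::Other
    }
}
```

A substring check is permissive (a value such as `my-gpt5-proxy` would also match), so an exact comparison may be preferable if the config values are well known.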
Move detailed LLM and DriveMonitor logs from info! to trace!
to reduce noise in production logs:
- bot/mod.rs: LLM chunk logs, streaming start, abort
- llm/mod.rs: LLM Request Details, provider creation logs
These logs are useful for debugging but generate a lot of noise in production.
With trace!, they only appear when RUST_LOG=trace is set.
Widen the KB search to capture more relevant context,
especially in documents with multiple entities,
such as lists of phone extensions.
- inject_kb_context: 5 -> 20 results
- think_kb: 10 -> 25 results
- search_active_websites: 5 -> 20 results
Bug: Using break instead of continue when encountering low-relevance
results caused the search to stop prematurely, missing potentially
relevant chunks in subsequent results.
- Changed break to continue when score < 0.4 in search_single_collection
- Changed break to continue when score < 0.4 in search_single_kb
- Lowered threshold from 0.5 to 0.4 for consistency
This ensures all search results are evaluated, not just those before
the first low-relevance result.
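The difference between the two loops can be sketched as follows. `SearchHit`, `MIN_SCORE`, and both function names are hypothetical stand-ins, not the real types used by search_single_collection or search_single_kb. Only the 0.4 threshold and the break/continue change come from the description above.

```rust
/// Hypothetical stand-in for one search result.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub text: String,
    pub score: f32,
}

/// Relevance threshold mentioned in the notes.
pub const MIN_SCORE: f32 = 0.4;

/// Old behaviour: the first low-relevance hit ends the whole loop,
/// so later hits are never looked at.
pub fn filter_with_break(hits: &[SearchHit]) -> Vec<SearchHit> {
    let mut kept = Vec::new();
    for hit in hits {
        if hit.score < MIN_SCORE {
            break;
        }
        kept.push(hit.clone());
    }
    kept
}

/// New behaviour: a low-relevance hit is skipped, but the loop
/// keeps evaluating the remaining hits.
pub fn filter_with_continue(hits: &[SearchHit]) -> Vec<SearchHit> {
    let mut kept = Vec::new();
    for hit in hits {
        if hit.score < MIN_SCORE {
            continue;
        }
        kept.push(hit.clone());
    }
    kept
}
```

The two versions only differ when results are not strictly ordered by descending score. With a sorted list, everything after the first low score would be skipped anyway.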
- Broadcast channel allows multiple subscribers for cancellation
- Aborts LLM task when user sends new message
- Properly stops LLM generation when cancelled
- Add active_streams HashMap to AppState to track streaming sessions
- Create cancellation channel for each streaming session
- Cancel existing streaming when new message arrives
- Prevents overlapping responses and improves UX
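A minimal, std-only sketch of the "cancel the previous stream when a new message arrives" idea is shown below. It deliberately swaps the broadcast channel for a shared `AtomicBool` flag so the example has no external dependencies. The real code uses a broadcast channel, and the `start_stream` and `stream_chunks` names and the `String` session key are assumptions.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical, reduced AppState: one cancellation flag per session.
#[derive(Default)]
pub struct AppState {
    pub active_streams: Mutex<HashMap<String, Arc<AtomicBool>>>,
}

impl AppState {
    /// Registers a new streaming session. If a stream was already
    /// active for this session, its flag is set so it can stop.
    /// Returns the flag the new stream should poll.
    pub fn start_stream(&self, session_id: &str) -> Arc<AtomicBool> {
        let flag = Arc::new(AtomicBool::new(false));
        let mut streams = self.active_streams.lock().unwrap();
        if let Some(previous) = streams.insert(session_id.to_string(), Arc::clone(&flag)) {
            previous.store(true, Ordering::SeqCst);
        }
        flag
    }
}

/// Stand-in for the streaming loop: stops emitting chunks as soon
/// as the cancellation flag is set.
pub fn stream_chunks(chunks: &[&str], cancelled: &AtomicBool) -> Vec<String> {
    let mut sent = Vec::new();
    for chunk in chunks {
        if cancelled.load(Ordering::SeqCst) {
            break;
        }
        sent.push(chunk.to_string());
    }
    sent
}
```

A flag has to be polled between chunks, whereas a broadcast channel lets several async tasks await the cancellation event, which is presumably why the channel was chosen here.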
Thinking signals ({"type":"thinking"} and {"type":"thinking_clear"})
were leaking into the final HTML response when they appeared in the
middle or end of chunks, concatenated with regular content.
The previous check only looked at the start of chunks with
chunk.trim().starts_with('{'), which missed embedded signals.
Solution:
- Use regex to find ALL thinking signal JSON objects anywhere in the chunk
- Send each thinking signal separately to the frontend
- Remove thinking signals from the chunk before content processing
- Skip to next iteration if chunk contained only thinking signals
This prevents thinking signals from appearing in the final HTML output
and ensures they're properly handled by the frontend thinking indicator.
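The extraction step can be sketched without the regex crate by searching for the two literal payloads quoted above. This is a simplified stand-in for the regex-based fix, not the fix itself: it assumes the signals are always serialized exactly as shown (no whitespace, no extra fields), and the `split_thinking_signals` name is hypothetical.

```rust
/// The two signal payloads quoted in the notes, matched literally.
const THINKING_SIGNALS: [&str; 2] = [
    r#"{"type":"thinking"}"#,
    r#"{"type":"thinking_clear"}"#,
];

/// Splits a chunk into (signals found, remaining content).
/// Signals are removed wherever they occur: start, middle, or end.
pub fn split_thinking_signals(chunk: &str) -> (Vec<&'static str>, String) {
    let mut signals = Vec::new();
    let mut content = String::new();
    let mut rest = chunk;
    loop {
        // Find the earliest occurrence of either signal in the remainder.
        let next = THINKING_SIGNALS
            .iter()
            .filter_map(|s| rest.find(*s).map(|pos| (pos, *s)))
            .min_by_key(|(pos, _)| *pos);
        match next {
            Some((pos, signal)) => {
                // Keep the text before the signal, record the signal,
                // and continue scanning after it.
                content.push_str(&rest[..pos]);
                signals.push(signal);
                rest = &rest[pos + signal.len()..];
            }
            None => {
                content.push_str(rest);
                break;
            }
        }
    }
    (signals, content)
}
```

The caller would forward each returned signal to the frontend on its own and skip content processing when the remaining content is empty after trimming.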
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>