Commit graph

4391 commits

Author SHA1 Message Date
148ad0cc7c Fix KB search: remove score threshold to improve results
All checks were successful
BotServer CI/CD / build (push) Successful in 2m54s
2026-04-15 14:04:11 -03:00
dd699db19e fix: improve assistant message history logging with preview
All checks were successful
BotServer CI/CD / build (push) Successful in 3m34s
2026-04-15 13:47:36 -03:00
bd6ca9439f fix: strip html/markdown from assistant messages and improve error logging
All checks were successful
BotServer CI/CD / build (push) Successful in 4m30s
2026-04-15 13:32:42 -03:00
adbf84f812 refactor: move verbose logs from info! to trace!
All checks were successful
BotServer CI/CD / build (push) Successful in 3m25s
Move detailed LLM and DriveMonitor logs from info! to trace!
to reduce log noise in production:

- bot/mod.rs: LLM chunk logs, streaming start, abort
- llm/mod.rs: LLM Request Details, provider creation logs

These logs are useful for debugging but generate a lot of noise in production.
With trace! they only appear when RUST_LOG=trace is configured.
2026-04-15 12:41:31 -03:00
d1cd7513d7 increase: KB result limit from 5/10 to 20/25
All checks were successful
BotServer CI/CD / build (push) Successful in 2m50s
Widens KB search coverage to capture more relevant context,
especially in documents with multiple entities, such as
lists of phone extensions.

- inject_kb_context: 5 -> 20 results
- think_kb: 10 -> 25 results
- search_active_websites: 5 -> 20 results
2026-04-15 11:02:31 -03:00
5338ffab12 fix: use continue instead of break on low-relevance KB search results
All checks were successful
BotServer CI/CD / build (push) Successful in 4m9s
Bug: Using break instead of continue when encountering low-relevance
results caused the search to stop prematurely, missing potentially
relevant chunks in subsequent results.

- Changed break to continue when score < 0.4 in search_single_collection
- Changed break to continue when score < 0.4 in search_single_kb
- Lowered threshold from 0.5 to 0.4 for consistency

This ensures all search results are evaluated, not just those before
the first low-relevance result.
2026-04-15 10:19:13 -03:00
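The continue-vs-break fix above can be sketched std-only (names and types are illustrative, not the actual BotServer code):

```rust
/// Results are not guaranteed to be sorted by relevance, so a low-scoring hit
/// must be skipped (`continue`) rather than treated as the end of the list
/// (`break`), which was the bug. Threshold matches the commit (0.4).
const MIN_SCORE: f32 = 0.4;

pub fn filter_relevant(results: &[(f32, &str)]) -> Vec<String> {
    let mut kept = Vec::new();
    for &(score, chunk) in results {
        if score < MIN_SCORE {
            // `break` here discarded every later result; `continue` only
            // skips this one.
            continue;
        }
        kept.push(chunk.to_string());
    }
    kept
}
```

With `break`, the high-scoring third result below would never have been evaluated.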
dd15899ac3 fix: Use broadcast channel for LLM streaming cancellation
All checks were successful
BotServer CI/CD / build (push) Successful in 5m48s
- Broadcast channel allows multiple subscribers for cancellation
- Aborts LLM task when user sends new message
- Properly stops LLM generation when cancelled
2026-04-15 09:44:42 -03:00
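A std-only sketch of the cancellation idea described above. The actual fix uses a tokio::sync::broadcast channel so multiple subscribers observe the same signal; a shared atomic flag plays that role in this simplified stand-in:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Cloneable cancellation handle; every clone sees the same flag, the way
/// every broadcast receiver sees the same cancel message.
#[derive(Clone)]
pub struct CancelToken(Arc<AtomicBool>);

impl CancelToken {
    pub fn new() -> Self {
        CancelToken(Arc::new(AtomicBool::new(false)))
    }
    /// Called when the user sends a new message while a stream is active.
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// The streaming loop checks the token between chunks and stops early,
/// which is how LLM generation for a superseded message gets dropped.
pub fn stream_chunks(chunks: &[&str], token: &CancelToken) -> Vec<String> {
    let mut sent = Vec::new();
    for chunk in chunks {
        if token.is_cancelled() {
            break;
        }
        sent.push(chunk.to_string());
    }
    sent
}
```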
9db784fd5c feat: Cancel streaming LLM when user sends new message
All checks were successful
BotServer CI/CD / build (push) Successful in 6m4s
- Add active_streams HashMap to AppState to track streaming sessions
- Create cancellation channel for each streaming session
- Cancel existing streaming when new message arrives
- Prevents overlapping responses and improves UX
2026-04-15 07:37:07 -03:00
01d4f47a93 fix: strip GPT-oSS thinking content from response chunks
All checks were successful
BotServer CI/CD / build (push) Successful in 4m18s
2026-04-14 19:57:13 -03:00
fc68b21252 fix: support flexible JSON order for GPT-oSS thinking signals
All checks were successful
BotServer CI/CD / build (push) Successful in 2m36s
2026-04-14 19:42:48 -03:00
8a6970734e fix: Extract thinking signals from anywhere in chunk to prevent leakage
All checks were successful
BotServer CI/CD / build (push) Successful in 3m47s
Thinking signals ({"type":"thinking"} and {"type":"thinking_clear"})
were leaking into the final HTML response when they appeared in the
middle or end of chunks, concatenated with regular content.

The previous check only looked at the start of chunks with
chunk.trim().starts_with('{'), which missed embedded signals.

Solution:
- Use regex to find ALL thinking signal JSON objects anywhere in the chunk
- Send each thinking signal separately to the frontend
- Remove thinking signals from the chunk before content processing
- Skip to next iteration if chunk contained only thinking signals

This prevents thinking signals from appearing in the final HTML output
and ensures they're properly handled by the frontend thinking indicator.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-04-14 18:56:24 -03:00
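The embedded-signal problem above can be sketched std-only. The commit uses a regex to catch signals anywhere in a chunk; this version removes the two literal forms named in the message, which is enough to show why a `chunk.trim().starts_with('{')` check misses them:

```rust
/// The two thinking-signal literals named in the commit message.
const SIGNALS: [&str; 2] = [
    r#"{"type":"thinking"}"#,
    r#"{"type":"thinking_clear"}"#,
];

/// Returns the chunk with all embedded signals removed, plus the signals
/// found, so each can be forwarded to the frontend separately.
pub fn strip_thinking_signals(chunk: &str) -> (String, Vec<String>) {
    let mut content = chunk.to_string();
    let mut found = Vec::new();
    for sig in SIGNALS {
        while let Some(pos) = content.find(sig) {
            found.push(sig.to_string());
            content.replace_range(pos..pos + sig.len(), "");
        }
    }
    (content, found)
}
```

A starts-with check handles `{"type":"thinking"}Hello` but not `Hello{"type":"thinking"}`, which is exactly the mid-chunk leakage the commit fixes.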
1f743766a8 fix: key-order agnostic signal detection in backend
All checks were successful
BotServer CI/CD / build (push) Successful in 3m11s
2026-04-14 17:57:48 -03:00
21591e22dd fix: remove unused backend code
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-14 17:56:56 -03:00
c8514eabe7 fix: restore chunk flow by refining tool detection
All checks were successful
BotServer CI/CD / build (push) Successful in 3m39s
2026-04-14 17:52:58 -03:00
fc0144c67c fix: compile errors in internal signal detection
All checks were successful
BotServer CI/CD / build (push) Successful in 3m22s
2026-04-14 17:32:58 -03:00
c7f5f95a37 fix: robust internal signal detection in orchestrator
Some checks failed
BotServer CI/CD / build (push) Failing after 4m16s
2026-04-14 17:24:17 -03:00
3d6db4b46f fix: orchestrator must not swallow thinking events into tool buffer
All checks were successful
BotServer CI/CD / build (push) Successful in 3m23s
2026-04-14 17:18:03 -03:00
62cdf1c638 fix: handle GLM 4.7 reasoning_content and chat_template_kwargs
All checks were successful
BotServer CI/CD / build (push) Successful in 3m29s
2026-04-14 17:14:04 -03:00
44026ba073 fix: restore suggestions during direct tool execution
All checks were successful
BotServer CI/CD / build (push) Successful in 3m36s
2026-04-14 17:04:45 -03:00
ba3e2675ef feat: stateful thinking tag stripping for Kimi, Minimax and DeepSeek stream
All checks were successful
BotServer CI/CD / build (push) Successful in 3m42s
2026-04-14 16:15:31 -03:00
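The stateful stripping above can be sketched as a small state machine. The `<think>`/`</think>` tag pair is an assumption for illustration; the point is that the "inside thinking" flag must survive from one streamed chunk to the next:

```rust
pub struct ThinkStripper {
    /// True while we are between an opening and closing thinking tag,
    /// possibly across several chunks.
    inside: bool,
}

impl ThinkStripper {
    pub fn new() -> Self {
        ThinkStripper { inside: false }
    }

    /// Feed one streamed chunk; returns only the user-visible text.
    pub fn feed(&mut self, chunk: &str) -> String {
        let mut out = String::new();
        let mut rest = chunk;
        loop {
            if self.inside {
                match rest.find("</think>") {
                    Some(pos) => {
                        self.inside = false;
                        rest = &rest[pos + "</think>".len()..];
                    }
                    // Tag still unclosed: drop the whole tail, keep state.
                    None => return out,
                }
            } else {
                match rest.find("<think>") {
                    Some(pos) => {
                        out.push_str(&rest[..pos]);
                        self.inside = true;
                        rest = &rest[pos + "<think>".len()..];
                    }
                    None => {
                        out.push_str(rest);
                        return out;
                    }
                }
            }
        }
    }
}
```

A stateless per-chunk filter cannot handle a tag that opens in one chunk and closes in a later one, which is the "unclosed thinking tags" case the previous commit mentions. (A tag name split mid-chunk would need additional buffering not shown here.)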
8ccc4e1c5e fix: update Minimax and DeepSeek handlers to strip unclosed thinking tags
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-14 16:13:41 -03:00
a6f825526f fix: LLM duplicate URL path and episodic memory bot context/defaults
All checks were successful
BotServer CI/CD / build (push) Successful in 3m53s
2026-04-14 15:55:49 -03:00
9743ba90b3 trigger ci
All checks were successful
BotServer CI/CD / build (push) Successful in 1m0s
2026-04-14 15:46:13 -03:00
c9d5bf361a refactor: drive monitor to use postgres instead of json
All checks were successful
BotServer CI/CD / build (push) Successful in 7m19s
2026-04-14 15:31:30 -03:00
45d5a444eb fix: DriveFileRepository compilation errors
All checks were successful
BotServer CI/CD / build (push) Successful in 7m25s
- Add Debug derive for DriveFileRepository
- Clone etag/last_modified for upsert to avoid move errors
- Fix max fail_count query to handle nullable integer
2026-04-14 14:50:47 -03:00
09f4c876b4 Update HTML rendering: buffer chunks and render visual elements only
Some checks failed
BotServer CI/CD / build (push) Failing after 4m19s
2026-04-14 14:39:08 -03:00
73d9531563 fix: buffer HTML chunks to avoid flashing, flush on closing tags
All checks were successful
BotServer CI/CD / build (push) Successful in 8m7s
2026-04-14 14:22:07 -03:00
f06c071b2c fix: drive monitor file state tracking
All checks were successful
BotServer CI/CD / build (push) Successful in 2m55s
2026-04-14 14:05:48 -03:00
32f8a10825 fix: normalize episodic/compact roles to system in all LLM providers
All checks were successful
BotServer CI/CD / build (push) Successful in 4m1s
2026-04-14 13:47:18 -03:00
d6527a438b fix: normalize roles to system for bedrock and vertex LLM providers
2026-04-14 13:44:12 -03:00
f04745ae1c fix: DriveMonitor loop performance and WebSocket blocking
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- Remove excessive trace/debug logging in hot loops
- Fix broadcast_theme_change lock contention by cloning channels before iterating
- Increase default sleep interval from 10s to 30s
- Remove [MODULE] prefixes from log messages
- Fix PDF re-download bug by using only last_modified (not ETag) for change detection
- Re-enable DriveMonitor in bootstrap (was disabled for testing)
2026-04-14 13:42:23 -03:00
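The lock-contention fix for broadcast_theme_change can be sketched std-only (the signature and channel types here are illustrative, not the actual BotServer ones): clone the senders out of the mutex first, then send without holding the lock.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};

/// Sending while holding the lock kept it held for the duration of every
/// send; cloning the senders first shrinks the critical section to a
/// single collect.
pub fn broadcast_theme_change(
    channels: &Arc<Mutex<HashMap<String, Sender<String>>>>,
    msg: &str,
) {
    // Hold the lock only long enough to clone the senders.
    let senders: Vec<Sender<String>> =
        channels.lock().unwrap().values().cloned().collect();
    // Lock released; iterate and send freely.
    for tx in senders {
        let _ = tx.send(msg.to_string());
    }
}
```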
a884c650a3 fix: CI deploy reliability — stop before transfer, enable after, health endpoint fix
All checks were successful
BotServer CI/CD / build (push) Successful in 1m3s
2026-04-14 10:37:23 -03:00
679bf05504 fix: Kimi K2.5 factory + LLM chunk traces
All checks were successful
BotServer CI/CD / build (push) Successful in 4m35s
- Kimi factory: add max_tokens=16384, temperature=1.0, top_p=1.0,
  and chat_template_kwargs.thinking=true for kimi models
- Add chunk count traces in stream_response so we see LLM progress
  immediately in logs: 'LLM chunk #N received (len=X)'
- Keep generic stream parser clean — model-specific logic lives in
  the request builder (Kimi factory pattern)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-14 10:20:02 -03:00
03f060680e fix: CI git path for BOTSERVER_COMMIT + deploy health check wait
All checks were successful
BotServer CI/CD / build (push) Successful in 4m16s
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-14 10:01:41 -03:00
d20ecdb89c fix: enterprise-grade reliability — three changes
Some checks failed
BotServer CI/CD / build (push) Failing after 6s
1. CI: restart system container instead of just systemctl restart botserver
   - ensures full env reload, Vault re-auth, DriveMonitor fresh state

2. Health endpoint: add 'commit' field with short git SHA
   - build.rs passes BOTSERVER_COMMIT from CI via rustc-env
   - Both /health and /api/health now report the running commit

3. WebSocket recv_task: spawn stream_response in separate tokio task
   - prevents one hung LLM from freezing all message processing
   - each WebSocket connection can now handle multiple messages
     concurrently regardless of LLM latency

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-14 09:51:54 -03:00
251ee9e106 chore: disable DriveMonitor temporarily for WebSocket/LLM testing
All checks were successful
BotServer CI/CD / build (push) Successful in 7m30s
DriveMonitor polling may be consuming resources and interfering with
LLM response delivery. Disabling to isolate the chat pipeline.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-14 09:18:49 -03:00
3159d04414 fix: spawn LLM response in separate task to prevent recv_task blocking
All checks were successful
BotServer CI/CD / build (push) Successful in 5m3s
Previously the recv_task awaited stream_response() directly, which
froze the entire WebSocket message receiver while the LLM ran (30s+).
This meant a second user message couldn't be processed until the
first LLM call finished — a race condition that locked the session.

Now stream_response runs in its own tokio::spawn, keeping recv_task
free to handle new messages immediately. Also fixed borrow/lifetime
issue by cloning the response channel sender out of the lock scope.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-14 08:59:10 -03:00
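The shape of the fix above can be sketched with std threads standing in for tokio's recv_task and tokio::spawn (the real code is async; names here are illustrative):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// The receiver loop must never run the slow response path inline. Each
/// message's work is handed to its own thread so the loop moves straight
/// on to the next message, mirroring the tokio::spawn fix.
pub fn recv_loop(messages: Vec<String>) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    for msg in messages {
        // Clone the sender outside any lock scope, as the commit notes,
        // then give the slow work its own thread.
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(10)); // simulated LLM latency
            let _ = tx.send(format!("reply to {msg}"));
        });
    }
    rx
}
```

Awaiting the work inline would have serialized the replies behind the slowest one, which is the session lock-up the commit describes.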
dc97813614 fix: revert stream timeout that broke message processing
All checks were successful
BotServer CI/CD / build (push) Successful in 4m40s
2026-04-14 02:11:46 -03:00
ed3406dd80 revert: restore working LLM streaming code from 260a13e7
All checks were successful
BotServer CI/CD / build (push) Successful in 11m59s
The recent LLM changes (timeouts, tool call accumulation, extra logging)
broke the WebSocket message flow. Reverting to the known working version.
2026-04-14 01:15:20 -03:00
301a7dda33 Add LLM stream timeout and debug logs
All checks were successful
BotServer CI/CD / build (push) Successful in 4m8s
2026-04-14 00:55:43 -03:00
da9facf036 fix: add 5s connect_timeout to LLM HTTP client so unreachable APIs fail fast
All checks were successful
BotServer CI/CD / build (push) Successful in 3m52s
Without connect_timeout, reqwest can hang for the full 60s timeout
when the remote server is unreachable (DNS, TCP connect, etc.).
Now fails in 5s max for connection issues, 30s for full request.

This means one user's LLM failure no longer blocks new users for
a full minute — the channel closes quickly and the WebSocket is freed.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 23:54:50 -03:00
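The fail-fast behavior the commit adds via reqwest's connect_timeout rests on the same OS-level mechanism std exposes directly; a sketch with std::net shows the semantics, using a reserved non-routable test address (203.0.113.1, TEST-NET-3) purely for illustration:

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::{Duration, Instant};

/// Without a connect timeout, an unreachable host can stall for the full
/// request timeout; with one, the connect phase fails within the bound.
pub fn try_connect_fast(addr: &str, connect_timeout: Duration) -> (bool, Duration) {
    let addr: SocketAddr = addr.parse().expect("valid socket address");
    let start = Instant::now();
    let ok = TcpStream::connect_timeout(&addr, connect_timeout).is_ok();
    (ok, start.elapsed())
}
```

In the commit this split is 5s for the connect phase and 30s for the full request, so a dead endpoint frees the WebSocket in seconds instead of a minute.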
3ec72f6121 fix: add 60s timeout to OpenAI-compatible HTTP client preventing LLM deadlock
All checks were successful
BotServer CI/CD / build (push) Successful in 4m2s
reqwest::Client::new() has no timeout — when external APIs (NVIDIA,
Groq, etc.) hang or throttle, the request blocks forever, freezing the
entire response pipeline for the user.

Also add std::time::Duration import to llm/mod.rs.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 23:31:12 -03:00
25d6d2fd57 fix: eliminate LLM keyword deadlock with isolated worker thread
All checks were successful
BotServer CI/CD / build (push) Successful in 3m32s
The previous fix used Handle::current().block_on() which deadlocks when
the Rhai engine runs on a Tokio worker thread — it blocks the very
thread the async task needs to make progress.

New approach: spawn a dedicated background thread with its own
single-threaded Tokio runtime, communicate via mpsc channel with a
45s timeout. This completely isolates the LLM runtime from the
caller's runtime, eliminating any possibility of thread starvation
or nested-runtime deadlock.

Also remove unused 'trace' import from llm/mod.rs.

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 23:20:10 -03:00
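The isolated-worker pattern above can be sketched std-only. In the real fix the spawned thread builds its own single-threaded Tokio runtime and block_on()s the LLM call; plain std work stands in for that here, and the 45s timeout is shortened:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a dedicated thread and waits on an mpsc channel with a
/// timeout, so the caller (here, the Rhai engine on a Tokio worker thread)
/// never block_on()s its own runtime.
pub fn run_isolated<F, T>(work: F, timeout: Duration) -> Option<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The real code creates its own Tokio runtime inside this thread;
        // either way, nothing here touches the caller's runtime.
        let _ = tx.send(work());
    });
    // recv_timeout bounds the wait instead of risking a nested-runtime
    // deadlock or unbounded block.
    rx.recv_timeout(timeout).ok()
}
```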
b3fd90b056 fix: remove blocking recv_timeout from LLM keyword
All checks were successful
BotServer CI/CD / build (push) Successful in 3m41s
2026-04-13 23:01:54 -03:00
6468588f58 fix: remove LLM streaming lock that caused deadlocks
All checks were successful
BotServer CI/CD / build (push) Successful in 3m40s
2026-04-13 22:51:29 -03:00
7d911194f3 fix: disable all thinking detection to prevent deadlock
All checks were successful
BotServer CI/CD / build (push) Successful in 3m36s
2026-04-13 22:47:27 -03:00
f48f87cadc debug: add processing traces
All checks were successful
BotServer CI/CD / build (push) Successful in 3m29s
2026-04-13 22:34:27 -03:00
99909de75d fix: disable thinking detection to prevent deadlock
All checks were successful
BotServer CI/CD / build (push) Successful in 3m19s
2026-04-13 22:26:31 -03:00
318d199d6c fix: clear thinking indicator on stream complete
All checks were successful
BotServer CI/CD / build (push) Successful in 3m21s
2026-04-13 22:19:10 -03:00
200b026efe fix: add thinking indicator and 30s timeout to prevent deadlock
All checks were successful
BotServer CI/CD / build (push) Successful in 3m16s
2026-04-13 21:40:50 -03:00