reqwest::Client::new() has no timeout — when external APIs (NVIDIA,
Groq, etc.) hang or throttle, the request blocks forever, freezing the
entire response pipeline for the user.
Also add std::time::Duration import to llm/mod.rs.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
The previous fix used Handle::current().block_on() which deadlocks when
the Rhai engine runs on a Tokio worker thread — it blocks the very
thread the async task needs to make progress.
New approach: spawn a dedicated background thread with its own
single-threaded Tokio runtime, communicate via mpsc channel with a
45s timeout. This completely isolates the LLM runtime from the
caller's runtime, eliminating any possibility of thread starvation
or nested-runtime deadlock.
Also remove unused 'trace' import from llm/mod.rs.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Show thinking indicator while LLM is in reasoning mode
- Skip reasoning content (thinking text) from user response
- Only show actual HTML content after thinking ends
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- GLM4.7 and Kimi K2.5 send response in 'reasoning_content' field, 'content' is null
- Prefer 'content' for normal models, fallback to 'reasoning_content' for reasoning models
- Fixes blank white screen when using z-ai/glm4.7 model
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Move all preprocessing transforms (convert_multiword_keywords, preprocess_llm_keyword,
convert_while_wend_syntax, predeclare_variables) into BasicCompiler::preprocess_basic
so .ast files are fully preprocessed by Drive Monitor
- Replace ScriptService compile/compile_preprocessed/compile_tool_script with
single run(ast_content) that does engine.compile() + eval_ast_with_scope()
- Remove .bas fallback in tool_executor and start.bas paths - .ast only
- Remove dead code: preprocess_basic_script, normalize_variables_to_lowercase,
convert_save_for_tools, parse_save_parts, normalize_word
- Fix: USE KB 'cartas' in tool .ast now correctly converted to USE_KB('cartas')
during compilation, ensuring KB context injection works after tool execution
- Fix: add trace import in llm/mod.rs
- Add tokio timeout to SSE stream reads in OpenAI client (60s)
- Prevents indefinite hang when Kimi/Nvidia stops responding
- Add scanning AtomicBool to prevent concurrent check_gbkb_changes calls
- Skip GBKB scan entirely when all KBs already indexed in Qdrant
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- drive_monitor: replace hardcoded salesianos.gbot with dynamic bot_name
- llm/mod.rs: stop falling back to reasoning_content as content
- llm/claude.rs: same fix for Claude handler
- deepseek_r3: export strip_think_tags for reuse
- gpt_oss_20b: use strip_think_tags so all models strip tags
- gpt_oss_120b: use strip_think_tags so all models strip tags
1. Fix model.starts_with('') always true - was limiting ALL models to 768 tokens
(local llama limit), truncating system prompts and KB context. Now only
applies when model=='local' or empty string, default is 32k tokens.
2. Fix create_bot_from_drive missing NOT NULL columns (llm_provider,
context_provider) - bots auto-created from S3 buckets failed to persist.
3. Fix S3 endpoint URL construction missing port 9100.
4. Fix Vault seed: vectordb.url was empty string, now defaults to
http://localhost:6333.
5. Fix Vault credential regeneration on recovery - added vault_seeds_exist().
6. Fix CA cert path for Vault TLS (botserver-stack vs botserver-stack).
7. Add bot verification after insert in create_bot_from_drive.
- Add Redis-based tracking to prevent start.bas from running repeatedly
when clicking suggestion buttons. start.bas now executes only once per
session with a 24-hour expiration on the tracking key.
- Add generic tool executor (ToolExecutor) for parsing and executing
tool calls from any LLM provider. Works with Claude, OpenAI, and
other providers that use standard tool calling formats.
- Update both start.bas execution paths (WebSocket handler and LLM
message handler) to check Redis before executing.
- Fix suggestion duplication by clearing suggestions from Redis after
fetching them.
- Add rate limiter for LLM API calls.
- Improve error handling and logging throughout.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Fix ConfigManager to treat 'none', 'null', 'n/a', and empty values as placeholders
and fall back to default bot's configuration instead of using these as literal values
- Fix ConfigManager to detect local file paths (e.g., .gguf, .bin, ../) and fall back
to default bot's model when using remote API, allowing bots to keep local model
config for local LLM server while automatically using remote model for API calls
- Fix get_default_bot() to return the bot actually named 'default' instead of
the first active bot by ID, ensuring consistent fallback behavior
- Add comprehensive debug logging to trace LLM configuration from database to API call
This fixes the issue where bots with incomplete or local LLM configuration would
fail with 401/400 errors when trying to use remote API, instead of automatically
falling back to the default bot's configuration from config.csv.
Closes: #llm-config-fallback
- Add llm to default features in Cargo.toml
- Fix duplicate smart_router module declaration
- Remove unused LLMProvider import and fix unused variable warnings
- Fix move error in enhanced_llm.rs by cloning state separately for each closure
- Improve code formatting and consistency
- Fix match arms with identical bodies by consolidating patterns
- Fix case-insensitive file extension comparisons using eq_ignore_ascii_case
- Fix unnecessary Debug formatting in log/format macros
- Fix clone_from usage instead of clone assignment
- Fix let...else patterns where appropriate
- Fix format! append to String using write! macro
- Fix unwrap_or with function calls to use unwrap_or_else
- Add missing fields to manual Debug implementations
- Fix duplicate code in if blocks
- Add type aliases for complex types
- Rename struct fields to avoid common prefixes
- Various other clippy warning fixes
Note: Some 'unused async' warnings remain for functions that are
called with .await but don't contain await internally - these are
kept async for API compatibility.
This commit introduces comprehensive documentation and implementation
for multi-agent orchestration capabilities:
- Add IMPLEMENTATION-PLAN.md with 4-phase roadmap
- Add Kubernetes deployment manifests (deployment.yaml, hpa.yaml)
- Add database migrations for multi-agent tables (6.1.1, 6.1.2)
- Implement A2A protocol for agent-to-agent communication
- Implement user memory keywords for cross-session persistence
- Implement model routing for dynamic L
The sqlx database library has been removed from the project along with
associated database-specific code that was no longer being used. This
includes removal of various sqlx-related dependencies from Cargo.lock
and cleanup of database connection pool references.