- Kimi factory: add max_tokens=16384, temperature=1.0, top_p=1.0,
and chat_template_kwargs.thinking=true for kimi models
- Add chunk count traces in stream_response so we see LLM progress
immediately in logs: 'LLM chunk #N received (len=X)'
- Keep generic stream parser clean — model-specific logic lives in
the request builder (Kimi factory pattern)
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Previously the recv_task awaited stream_response() directly, which
froze the entire WebSocket message receiver while the LLM ran (30s+).
This meant a second user message couldn't be processed until the
first LLM call finished — a race condition that locked the session.
Now stream_response runs in its own tokio::spawn, keeping recv_task
free to handle new messages immediately. Also fixed borrow/lifetime
issue by cloning the response channel sender out of the lock scope.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
- Move all preprocessing transforms (convert_multiword_keywords, preprocess_llm_keyword,
convert_while_wend_syntax, predeclare_variables) into BasicCompiler::preprocess_basic
so .ast files are fully preprocessed by Drive Monitor
- Replace ScriptService compile/compile_preprocessed/compile_tool_script with
single run(ast_content) that does engine.compile() + eval_ast_with_scope()
- Remove .bas fallback in tool_executor and start.bas paths - .ast only
- Remove dead code: preprocess_basic_script, normalize_variables_to_lowercase,
convert_save_for_tools, parse_save_parts, normalize_word
- Fix: USE KB 'cartas' in tool .ast now correctly converted to USE_KB('cartas')
during compilation, ensuring KB context injection works after tool execution
- Fix: add trace import in llm/mod.rs
- Log full LLM response preview (500 chars) with has_html detection
- Log WebSocket send with message type, completeness, and content preview
- Use clone() for chunk in BotResponse to ensure accurate logging
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
When a tool button like Cartas activates a KB via USE KB, instead of
returning just the tool result (empty/label), the handler now checks
if session has active KBs. If so and result is empty/trivial,
falls through to the full LLM pipeline which injects KB context.
- Disable LocalFileMonitor and ConfigWatcher - use S3/MinIO only
- Filter S3 buckets to gbo-*.gbai prefix
- Auto-create bots in database when new S3 buckets discovered
- Change file paths to use work directory instead of /opt/gbo/data
- Add RunQueryDsl import for Diesel queries
- Remove LocalFileMonitor and ConfigWatcher for /opt/gbo/data
- Remove /opt/gbo/data from mount_all_bots() scanning
- Change start.bas, tables.bas, and tool paths to use work directory
- Filter drive buckets to only gbo-* prefix
- Remove unused create_bot_simple method
- Fix all warnings (unused imports, variables, dead code)
- Adds get_stack_path() helper: returns /opt/gbo in production (.env without botserver-stack), ./botserver-stack in dev
- Adds get_work_path() helper: returns /opt/gbo/work in production, ./botserver-stack/data/system/work in dev
- Updated 35+ files to use dynamic path resolution
- Production system container no longer needs botserver-stack directory
- Work files go to /opt/gbo/work instead of /opt/gbo/bin/botserver-stack
- Fix double_ended_iterator_last: use next_back() instead of last()
- Fix manual_clamp: use .clamp() instead of min().max()
- Fix too_many_arguments: create KbInjectionContext struct
- Fix needless_borrow: remove unnecessary & reference
- Fix let_and_return: return value directly
- Fix await_holding_lock: drop guard before await
- Fix collapsible_else_if: collapse nested if-else
All changes verified with cargo clippy (0 warnings, 0 errors)
Note: Local botserver crashes with existing panic during LocalFileMonitor initialization
This panic exists in original code too, not caused by these changes
The epoch caused a new key to be created every second, bypassing
the 'already executed' check and running start.bas multiple times,
resulting in triplicated suggestions.
- Remove suggestions fetching from TALK function
- WebSocket handler now fetches and sends suggestions after start.bas executes
- Clear suggestions and start_bas_executed keys to allow re-run on refresh
- Decouple TALK from suggestions handling
- Add hear_channels: HashMap<Uuid, SyncSender<String>> to AppState
- HEAR now blocks the spawn_blocking thread via sync_channel recv()
- deliver_hear_input() called at top of stream_response() to unblock
- Script continues from exact HEAR position, no side-effect re-execution
- All three HEAR variants (basic, AS TYPE, AS MENU) use same mechanism