Commit graph

140 commits

Author SHA1 Message Date
8ddcde4830 fix: detect NVIDIA API as GLM provider, handle full URL path
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
2026-04-13 16:18:00 -03:00
498c771d7b feat: add thinking indicator for reasoning models (GLM4.7, Kimi K2.5)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m27s
- Show thinking indicator while LLM is in reasoning mode
- Skip reasoning content (thinking text) from user response
- Only show actual HTML content after thinking ends

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 15:35:22 -03:00
3e99235a49 fix: support reasoning models (GLM4.7, Kimi K2.5) - use reasoning_content when content is null
All checks were successful
BotServer CI/CD / build (push) Successful in 3m19s
- GLM4.7 and Kimi K2.5 send response in 'reasoning_content' field, 'content' is null
- Prefer 'content' for normal models, fallback to 'reasoning_content' for reasoning models
- Fixes blank white screen when using z-ai/glm4.7 model

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 15:18:21 -03:00
c5d30adebe revert: restore llm/mod.rs to stable April 9 version
All checks were successful
BotServer CI/CD / build (push) Successful in 3m26s
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 15:07:19 -03:00
f8b47d1ac2 refactor: unify BASIC compilation into BasicCompiler only, runtime uses ScriptService::run() on pre-compiled .ast
Some checks failed
BotServer CI/CD / build (push) Has been cancelled
- Move all preprocessing transforms (convert_multiword_keywords, preprocess_llm_keyword,
  convert_while_wend_syntax, predeclare_variables) into BasicCompiler::preprocess_basic
  so .ast files are fully preprocessed by Drive Monitor
- Replace ScriptService compile/compile_preprocessed/compile_tool_script with
  single run(ast_content) that does engine.compile() + eval_ast_with_scope()
- Remove .bas fallback in tool_executor and start.bas paths - .ast only
- Remove dead code: preprocess_basic_script, normalize_variables_to_lowercase,
  convert_save_for_tools, parse_save_parts, normalize_word
- Fix: USE KB 'cartas' in tool .ast now correctly converted to USE_KB('cartas')
  during compilation, ensuring KB context injection works after tool execution
- Fix: add trace import in llm/mod.rs
2026-04-13 14:05:55 -03:00
723407cfd6 fix: add 60s timeout to LLM stream reads and add concurrent scan guard
All checks were successful
BotServer CI/CD / build (push) Successful in 3m53s
- Add tokio timeout to SSE stream reads in OpenAI client (60s)
- Prevents indefinite hang when Kimi/Nvidia stops responding
- Add scanning AtomicBool to prevent concurrent check_gbkb_changes calls
- Skip GBKB scan entirely when all KBs already indexed in Qdrant

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 12:58:11 -03:00
d1652fc413 feat: add build_date to health endpoint for CI deploy verification
All checks were successful
BotServer CI/CD / build (push) Successful in 4m21s
- Add BOTSERVER_BUILD_DATE env var to /api/health response
- Set build date during CI compilation via environment variable
- Enables checking deployed binary age without SSH access

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-13 11:49:10 -03:00
dd68cdbe6c fix: remove hardcoded salesianos, strip think tags globally, block reasoning_content leak
All checks were successful
BotServer CI/CD / build (push) Successful in 6m38s
- drive_monitor: replace hardcoded salesianos.gbot with dynamic bot_name
- llm/mod.rs: stop falling back to reasoning_content as content
- llm/claude.rs: same fix for Claude handler
- deepseek_r3: export strip_think_tags for reuse
- gpt_oss_20b: use strip_think_tags so all models strip tags
- gpt_oss_120b: use strip_think_tags so all models strip tags
2026-04-13 09:04:22 -03:00
1977c4c0af fix: extract base URL for embedding health checks
All checks were successful
BotServer CI/CD / build (push) Successful in 4m2s
- Add extract_base_url() helper to parse scheme://host:port from full URLs
- Fix health check to use base URL instead of full endpoint path
- Allows embedding-url config like http://host:port/v1/embeddings to work correctly
- Health check now goes to http://host:port/health instead of http://host:port/v1/embeddings/health
2026-04-12 19:33:35 -03:00
2f3dd957e3 fix: resolve kb_collections and kb_group_associations imports for directory feature
All checks were successful
BotServer CI/CD / build (push) Successful in 7m50s
- Extract kb_collections and kb_group_associations into dedicated schema/kb.rs module
- Gate kb module behind rbac feature (directory depends on rbac)
- Remove duplicate definitions from research.rs
- Fix import paths in directory/groups/kbs.rs
- Remove dead rbac_kb imports from settings/rbac.rs
- Gate llm::local module behind llm feature to fix missing set_embedding_server_ready

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
2026-04-12 12:48:42 -03:00
180bab0358 fix: mark embedding server ready when already running
All checks were successful
BotServer CI/CD / build (push) Successful in 3m36s
Previously, ensure_llama_servers_running() would return early
when both LLM and embedding servers were already running, without
calling set_embedding_server_ready(true). This caused DriveMonitor
to skip KB indexing with 'Embedding server not yet marked ready'.

Fix: call set_embedding_server_ready(true) before returning early
when servers are already running.
2026-04-12 10:27:23 -03:00
be3e4c4e54 Fix: Handle 'reasoning' field from NVIDIA kimi-k2.5 model
All checks were successful
BotServer CI/CD / build (push) Successful in 3m6s
2026-04-11 22:58:50 -03:00
47cb470c8e Fix: Handle reasoning_content from NVIDIA reasoning models (gpt-oss-120b)
All checks were successful
BotServer CI/CD / build (push) Successful in 3m16s
2026-04-11 22:30:39 -03:00
a4a3837c4c fix: critical bugs - LLM context truncation, bot creation, S3 endpoint, vectordb seed
All checks were successful
BotServer CI/CD / build (push) Successful in 3m28s
1. Fix model.starts_with('') always true - was limiting ALL models to 768 tokens
   (local llama limit), truncating system prompts and KB context. Now only
   applies when model=='local' or empty string, default is 32k tokens.

2. Fix create_bot_from_drive missing NOT NULL columns (llm_provider,
   context_provider) - bots auto-created from S3 buckets failed to persist.

3. Fix S3 endpoint URL construction missing port 9100.

4. Fix Vault seed: vectordb.url was empty string, now defaults to
   http://localhost:6333.

5. Fix Vault credential regeneration on recovery - added vault_seeds_exist().

6. Fix CA cert path for Vault TLS (botserver-stack vs botserver-stack).

7. Add bot verification after insert in create_bot_from_drive.
2026-04-11 17:56:03 -03:00
5576378b3f Update botserver: Multiple improvements across core modules
All checks were successful
BotServer CI/CD / build (push) Successful in 10m41s
2026-04-11 07:33:32 -03:00
2b2b386f5e Fix duplicate endpoint path in LLM URL
All checks were successful
BotServer CI/CD / build (push) Successful in 5m58s
2026-04-09 22:51:32 -03:00
90c14bcd09 Fix DETECT: use bot-specific DB pool, add anonymous auth when directory disabled
All checks were successful
BotServer CI/CD / build (push) Successful in 12m42s
2026-04-06 13:37:23 -03:00
7d8f141fc2 refactor: Replace all hardcoded ./botserver-stack paths with get_stack_path()/get_work_path()
Some checks failed
BotServer CI/CD / build (push) Failing after 1m28s
- Adds get_stack_path() helper: returns /opt/gbo in production (.env without botserver-stack), ./botserver-stack in dev
- Adds get_work_path() helper: returns /opt/gbo/work in production, ./botserver-stack/data/system/work in dev
- Updated 35+ files to use dynamic path resolution
- Production system container no longer needs botserver-stack directory
- Work files go to /opt/gbo/work instead of /opt/gbo/bin/botserver-stack
2026-04-04 09:24:44 -03:00
ab1f2df476 Read Drive config from Vault at runtime with fallback defaults
Some checks failed
BotServer CI / build (push) Failing after 7m26s
2026-03-17 00:00:36 -03:00
49e9f419f3 Fix: export observability module from llm/mod.rs
Some checks failed
BotServer CI / build (push) Has been cancelled
2026-03-16 22:00:55 -03:00
ef426b7a50 LXD proxy and container improvements
Some checks failed
BotServer CI / build (push) Failing after 7m5s
2026-03-15 15:50:02 -03:00
13892b3157 Fix tenant-org-bot relationship and CRM lead form 2026-03-12 18:19:18 -03:00
e98de24fe6 chore: update submodules
All checks were successful
BotServer CI / build (push) Successful in 9m56s
2026-03-10 19:39:31 -03:00
1053c86a73 fix: whatsapp dynamic routing and openai tool call accumulation
All checks were successful
BotServer CI / build (push) Successful in 13m40s
2026-03-10 17:19:17 -03:00
786d404938 feat: handle 429 rate limit in OpenAI non-stream generate
All checks were successful
BotServer CI / build (push) Successful in 11m7s
2026-03-10 15:26:10 -03:00
f34d401c2e feat: handle 429 rate limit in OpenAI client
Some checks failed
BotServer CI / build (push) Has been cancelled
2026-03-10 15:21:40 -03:00
260a13e77d refactor: apply various fixes across botserver
Some checks failed
BotServer CI / build (push) Has been cancelled
2026-03-10 15:15:21 -03:00
82bfd0a443 Fix Bedrock config for OpenAI GPT-OSS models
All checks were successful
BotServer CI / build (push) Successful in 12m35s
2026-03-10 12:36:24 -03:00
c523cee177 Use Redis to track last sent time per WhatsApp recipient
All checks were successful
BotServer CI / build (push) Successful in 13m37s
- Store last_sent timestamp in Redis (whatsapp:last_sent:<phone>)
- Always wait 6 seconds between messages to same recipient
- Persists across restarts
2026-03-09 21:00:45 -03:00
9f35863bff Simplify hallucination detector: only stop if 50+ repetitions
All checks were successful
BotServer CI / build (push) Successful in 9m8s
- Simple: count pattern repetitions, stop at 50
- Async API with Redis-backed counting
- 60-second window for cleanup
2026-03-09 20:02:29 -03:00
032de108fd Relax hallucination detector: ignore Markdown separators, increase thresholds
- Ignore ---, ***, ___ (legitimate Markdown)
- Increase consecutive threshold: 5 → 10
- Increase occurrence threshold: 8 → 15
- Increase token threshold: 10 → 15
2026-03-09 19:50:34 -03:00
77c35ccde5 feat: add WhatsApp rate limiting and LLM hallucination detection
All checks were successful
BotServer CI / build (push) Successful in 11m51s
2026-03-09 17:22:47 -03:00
c072fb936e fix(llm): load system-prompt from config.csv correctly
All checks were successful
BotServer CI / build (push) Successful in 17m27s
- Move system_prompt retrieval inside spawn_blocking closure
- Include system_prompt in the return tuple to fix scope issue
- Add trace logging for debugging system-prompt loading
- GLM-5 and other LLM providers now correctly receive custom system prompts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-09 11:55:05 -03:00
8500949fcd fix: Lower KB search thresholds and add Cloudflare AI embedding support
- Lower score_threshold in kb_indexer.rs from 0.5 to 0.3
- Lower website search threshold in kb_context.rs from 0.6 to 0.4
- Lower KB search threshold in kb_context.rs from 0.7 to 0.5
- Add Cloudflare AI (/ai/run/) URL detection in cache.rs
- Add Cloudflare AI request format ({"text": ...}) in cache.rs
- Add Cloudflare AI response parsing (result.data) in cache.rs

This fixes the issue where KB search returned 0 results even with
114 chunks indexed. The high thresholds were filtering out all results.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-05 00:06:17 -03:00
e389178a36 feat: Add HuggingFace embedding API support with authentication
- Added api_key field to LocalEmbeddingService for authentication
- Added embedding-key config parameter support in bootstrap
- Smart URL handling: doesn't append /embedding for full URLs (HuggingFace, OpenAI, etc.)
- Detects HuggingFace URLs and uses correct request format (inputs instead of input/model)
- Handles multiple response formats:
  - HuggingFace: direct array [0.1, 0.2, ...]
  - Standard/OpenAI: {"data": [{"embedding": [...]}]}
- Added Authorization header with Bearer token when api_key is provided
- Improved error messages with full response details

Fixes embedding errors when using HuggingFace Inference API
2026-03-04 16:54:25 -03:00
d5b877f8e8 fix: Improve embedding error handling and add semantic cache toggle
- Enhanced error messages in LocalEmbeddingService to show actual HTTP status and response
- Added semantic-cache-enabled config parameter to disable semantic matching when embedding service unavailable
- Improved error logging with full response details for debugging production issues
- Prevents 'Invalid embedding response' errors by allowing graceful fallback
2026-03-04 16:49:22 -03:00
8f495c75ec WIP: Local changes before merging master into main 2026-03-01 07:40:11 -03:00
1856215d05 chore: update dependencies and formatting
All checks were successful
BotServer CI / build (push) Successful in 7m30s
2026-02-22 15:55:39 -03:00
de017241f2 fix: Complete security remediation - RCE and SSRF fixes
All checks were successful
BotServer CI / build (push) Successful in 7m34s
- Fixed RCE vulnerability in trusted_shell_script_arg execution
- Fixed SSRF vulnerability in GET command with internal IP blocking
- Updated SafeCommand to use explicit positional arguments

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 01:14:14 +00:00
e143968179 feat: Add JWT secret rotation and health verification
SEC-02: Implement credential rotation security improvements

- Add JWT secret rotation to rotate-secret command
- Generate 64-character HS512-compatible secrets
- Automatic .env backup with timestamp
- Atomic file updates via temp+rename pattern
- Add health verification for rotated credentials
- Route rotate-secret, rotate-secrets, vault commands in CLI
- Add verification attempts for database and JWT endpoints

Security improvements:
- JWT_SECRET now rotatable (previously impossible)
- Automatic rollback via backup files
- Health checks catch configuration errors
- Clear warnings about token invalidation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-19 19:42:41 +00:00
98813fbdc8 chore: Fix warnings and clean TODO refs 2026-02-19 12:18:39 +00:00
ac5b814536 fix(security): Fix unsafe code, CORS logic, and expect usage 2026-02-19 12:06:05 +00:00
b1118f977d fix: Correct parameter names in tool .bas files to match database schema
- Tool 06: Change tipoExibicao to tipoDescricao (matches pedidos_uso_imagem table)
- Tool 07: Change tipoExibicao to categoriaDescricao (matches licenciamentos table)
- Both tools now compile and execute successfully with database inserts

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 17:51:47 +00:00
0a1bd25869 fix: Increase default n_predict to 512 for DeepSeek R1 reasoning
All checks were successful
BotServer CI / build (push) Successful in 9m26s
DeepSeek R1 model outputs reasoning_content first, then content.
With n_predict=50, responses were truncated during reasoning phase.
Increased to 512 to allow full reasoning + response.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 20:27:35 +00:00
1cee912b72 fix: Correct LLM model paths and remove unnecessary cd command
- Change model paths to use ./data/llm/ instead of relative paths from build dir
- Remove cd command when starting llama-server to keep botserver root as cwd
- This fixes model loading when servers are started from different directories
- Both LLM and embedding servers now start successfully

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 20:15:17 +00:00
0c9665dd8b fix: Enable vector_db by default with health check and fallback to local LLM
- Add vector_db_health_check() function to verify Qdrant availability
- Add wait loop for vector_db startup in bootstrap (15 seconds)
- Fallback to local LLM when external URL configured but no API key provided
- Prevent external LLM (api.z.ai) usage without authentication key

This fixes the production issues:
- Qdrant vector database not available at https://localhost:6333
- External LLM being used instead of local when no key is configured
- Ensures vector_db is properly started and ready before use

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 14:54:17 +00:00
307809bbdd fix: Handle empty config values for LLM server arguments
All checks were successful
BotServer CI / build (push) Successful in 8m3s
The config_manager.get_config() can return Ok("") for empty config values,
which would pass through unwrap_or_else() without using the default.

Added checks after config retrieval to use defaults when config values
are empty strings:
- gpu_layers: "20" (default for GPU layers)
- n_moe: "4" (default for MoE)
- parallel: "1" (default for parallel)
- n_predict: "50" (default for predict)
- n_ctx_size: "32000" (default for context size)

This fixes the error: "error while handling argument --n-gpu-layers: stoi"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 13:17:26 +00:00
58adf7c4ae fix: Set default llm_server_path and correct model file paths
Some checks failed
BotServer CI / build (push) Has been cancelled
When no default.gbai/config.csv exists, the system now:
- Sets default llm_server_path to ./botserver-stack/bin/llm/build/bin
- Uses correct relative paths to model files: ../../../../data/llm/
- Uses actual model filenames from 3rdparty.toml

This fixes the issue where LLM/embedding servers couldn't find model files
because the paths were constructed incorrectly.

Model filenames:
- LLM: DeepSeek-R1-Distill-Qwen-1.5B-Q3_K_M.gguf
- Embedding: bge-small-en-v1.5-f32.gguf

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 13:11:14 +00:00
0e6e2bfc6d fix: Correct default LLM model to deepseek-small
All checks were successful
BotServer CI / build (push) Successful in 8m57s
Changed the default LLM model from glm-4 to deepseek-small to match
the model defined in 3rdparty.toml ([models.deepseek_small]).

This ensures that when no default.gbai/config.csv exists, the system
uses the correct default local model.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 12:56:15 +00:00
337bef3bad fix: Use default local LLM models when config is empty
Some checks failed
BotServer CI / build (push) Has been cancelled
When no default.gbai/config.csv exists or when llm-model/embedding-model
config is empty, the system now uses default local models instead of
skipping server startup.

Changes:
- Default LLM model: glm-4
- Default Embedding model: bge-small-en-v1.5
- Logs when using defaults

This fixes the issue where the "default" bot would fail to load LLM
and Embedding services when no config.csv was present, causing the
error: "not loading embedding neither llm local for default bot"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-15 12:54:40 +00:00