Implement THINK KB keyword for explicit knowledge base reasoning
Some checks failed: BotServer CI / build (push), failing after 59s

- Add botserver/src/basic/keywords/think_kb.rs with structured KB search
- Register THINK KB in the keywords module and BASIC engine
- Add comprehensive documentation in ALWAYS.md and the botbook
- Include confidence scoring, multi-KB support, and error handling
- Add unit tests and an example usage script

Parent 22028651f2, commit 25ea1965a4. 9 changed files with 688 additions and 2 deletions.

ALWAYS.md (new file, 333 lines)
# ALWAYS.md - THINK KB Keyword Documentation

## Overview

The `THINK KB` keyword provides explicit knowledge base reasoning capabilities, allowing bots to perform structured semantic searches across active knowledge bases and return detailed results for analysis and decision-making.

## Syntax

```basic
results = THINK KB "query_text"
results = THINK KB query_variable
```

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `query_text` | String | The question or search query to execute against active KBs |
| `query_variable` | Variable | Variable containing the search query |

## Return Value

The `THINK KB` keyword returns a structured object containing:

```json
{
  "results": [
    {
      "content": "Relevant text content from document",
      "source": "path/to/source/document.pdf",
      "kb_name": "knowledge_base_name",
      "relevance": 0.85,
      "tokens": 150
    }
  ],
  "summary": "Brief summary of findings",
  "confidence": 0.78,
  "total_results": 5,
  "sources": ["doc1.pdf", "doc2.md"],
  "query": "original search query",
  "kb_count": 2
}
```

### Result Fields

| Field | Type | Description |
|-------|------|-------------|
| `results` | Array | Array of search results with content and metadata |
| `summary` | String | Human-readable summary of the search findings |
| `confidence` | Number | Overall confidence score (0.0 to 1.0) |
| `total_results` | Number | Total number of results found |
| `sources` | Array | List of unique source documents |
| `query` | String | The original search query |
| `kb_count` | Number | Number of knowledge bases searched |

### Individual Result Fields

| Field | Type | Description |
|-------|------|-------------|
| `content` | String | The relevant text content from the document |
| `source` | String | Path to the source document |
| `kb_name` | String | Name of the knowledge base containing this result |
| `relevance` | Number | Relevance score (0.0 to 1.0) |
| `tokens` | Number | Estimated token count for this content |
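How the per-result `relevance` scores roll up into the top-level `confidence` is not specified here. A plausible aggregation, sketched in Rust with hypothetical names (the actual formula lives in `think_kb.rs` and may differ), blends the best match with the mean:

```rust
/// Hypothetical aggregation: the top hit dominates, the remaining
/// scores temper it. Not the shipped think_kb.rs formula.
fn overall_confidence(relevances: &[f64]) -> f64 {
    if relevances.is_empty() {
        return 0.0;
    }
    let best = relevances.iter().cloned().fold(f64::MIN, f64::max);
    let mean = relevances.iter().sum::<f64>() / relevances.len() as f64;
    // Weight the strongest match most, clamp into the documented 0.0-1.0 range.
    (0.7 * best + 0.3 * mean).clamp(0.0, 1.0)
}
```

With scores 0.85, 0.60 and 0.40 this blend gives 0.78, the value shown in the example object above; treat that as a property of the sketch, not a guarantee about the server's formula.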
## Examples

### Basic Knowledge Base Query

```basic
' Activate knowledge bases first
USE KB "company_policies"
USE KB "hr_handbook"

' Perform structured search
results = THINK KB "What is the remote work policy?"

' Access results
TALK results.summary
PRINT "Confidence: " + results.confidence
PRINT "Found " + results.total_results + " results"

' Process individual results
FOR i = 0 TO results.results.length - 1
    result = results.results[i]
    PRINT "Source: " + result.source
    PRINT "Relevance: " + result.relevance
    PRINT "Content: " + result.content
    PRINT "---"
NEXT i
```

### Decision Making with Confidence Thresholds

```basic
USE KB "technical_docs"
USE KB "troubleshooting"

query = "How to fix database connection errors?"
results = THINK KB query

IF results.confidence > 0.8 THEN
    TALK "I found reliable information: " + results.summary
    ' Show top result
    IF results.total_results > 0 THEN
        top_result = results.results[0]
        TALK "Best match from: " + top_result.source
        TALK top_result.content
    END IF
ELSE IF results.confidence > 0.5 THEN
    TALK "I found some relevant information, but I'm not completely certain: " + results.summary
ELSE
    TALK "I couldn't find reliable information about: " + query
    TALK "You might want to consult additional resources."
END IF
```

### Comparative Analysis

```basic
USE KB "product_specs"
USE KB "competitor_analysis"

' Compare multiple queries
queries = ["pricing strategy", "feature comparison", "market positioning"]

FOR i = 0 TO queries.length - 1
    query = queries[i]
    results = THINK KB query

    PRINT "=== Analysis: " + query + " ==="
    PRINT "Confidence: " + results.confidence
    PRINT "Sources: " + results.sources.length
    PRINT "Summary: " + results.summary
    PRINT ""
NEXT i
```

### Source-Based Filtering

```basic
USE KB "legal_documents"

results = THINK KB "contract termination clauses"

' Filter by source type
pdf_results = []
FOR i = 0 TO results.results.length - 1
    result = results.results[i]
    IF result.source CONTAINS ".pdf" THEN
        pdf_results.push(result)
    END IF
NEXT i

TALK "Found " + pdf_results.length + " results from PDF documents"
```

### Dynamic Query Building

```basic
USE KB "customer_support"

customer_issue = HEAR "What's your issue?"
priority = HEAR "What's the priority level?"

' Build contextual query
query = customer_issue + " priority:" + priority + " resolution steps"
results = THINK KB query

IF results.confidence > 0.7 THEN
    TALK "Here's what I found for your " + priority + " priority issue:"
    TALK results.summary

    ' Show most relevant result
    IF results.total_results > 0 THEN
        best_result = results.results[0]
        TALK "From " + best_result.source + ":"
        TALK best_result.content
    END IF
ELSE
    TALK "I need to escalate this issue. Let me connect you with a human agent."
END IF
```

## Advanced Usage Patterns

### Multi-Stage Reasoning

```basic
USE KB "research_papers"
USE KB "case_studies"

' Stage 1: Find general information
general_results = THINK KB "machine learning applications"

' Stage 2: Drill down based on initial findings
IF general_results.confidence > 0.6 THEN
    ' Extract key terms from top results for refined search
    specific_query = "deep learning " + general_results.results[0].content.substring(0, 50)
    specific_results = THINK KB specific_query

    TALK "General overview: " + general_results.summary
    TALK "Specific details: " + specific_results.summary
END IF
```

### Quality Assessment

```basic
USE KB "quality_standards"

results = THINK KB "ISO certification requirements"

' Assess result quality
high_quality_results = []
FOR i = 0 TO results.results.length - 1
    result = results.results[i]
    IF result.relevance > 0.8 AND result.tokens > 100 THEN
        high_quality_results.push(result)
    END IF
NEXT i

IF high_quality_results.length > 0 THEN
    TALK "Found " + high_quality_results.length + " high-quality matches"
ELSE
    TALK "Results may need verification from additional sources"
END IF
```

## Differences from Automatic KB Search

| Feature | Automatic (USE KB) | Explicit (THINK KB) |
|---------|--------------------|---------------------|
| **Trigger** | User questions search automatically | Explicit keyword execution |
| **Control** | Automatic, behind the scenes | Full programmatic control |
| **Results** | Injected into LLM context | Structured data for processing |
| **Analysis** | LLM interprets automatically | Bot can analyze before responding |
| **Confidence** | Not exposed | Explicit confidence scoring |
| **Filtering** | Not available | Full result filtering and processing |

## Performance Considerations

- **Search Time**: 100-500 ms depending on KB size and query complexity
- **Memory Usage**: Results are cached for the session duration
- **Token Limits**: Token limits are respected automatically (default: 2000 tokens)
- **Concurrent Searches**: All active KBs are searched in parallel
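Result caching is handled server-side for the session. The idea can be sketched as a query-keyed map (illustrative Rust with hypothetical names; the server's real cache is internal to the KB service):

```rust
use std::collections::HashMap;

/// Illustrative session cache keyed by query text.
struct KbResultCache {
    entries: HashMap<String, String>, // query -> serialized result object
}

impl KbResultCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Return the cached result for `query`, running `search` only on a miss.
    fn get_or_search<F: FnOnce() -> String>(&mut self, query: &str, search: F) -> String {
        self.entries
            .entry(query.to_string())
            .or_insert_with(search)
            .clone()
    }
}
```

Repeated identical queries within a session then pay the 100-500 ms search cost only once.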
## Error Handling

```basic
TRY
    results = THINK KB user_query
    IF results.total_results = 0 THEN
        TALK "No information found for: " + user_query
    END IF
CATCH error
    TALK "Search failed: " + error.message
    TALK "Please try a different query or check if knowledge bases are active"
END TRY
```

## Best Practices

1. **Activate Relevant KBs First**: Use `USE KB` to activate appropriate knowledge bases
2. **Check Confidence Scores**: Use confidence thresholds for decision making
3. **Handle Empty Results**: Always check `total_results` before accessing the results array
4. **Filter by Relevance**: Consider filtering out results below 0.5 relevance
5. **Limit Result Processing**: Process only the top N results to avoid performance issues
6. **Cache Results**: Store results in variables for multiple uses
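Practices 4 and 5 amount to a filter-sort-truncate pass. A minimal Rust sketch (field names mirror the documented result shape; the function itself is illustrative, not a server API):

```rust
/// Keep only results at or above `min_relevance`, best-first, at most `n`.
#[derive(Clone)]
struct KbResult {
    content: String,
    relevance: f64,
}

fn top_relevant(mut results: Vec<KbResult>, min_relevance: f64, n: usize) -> Vec<KbResult> {
    // Drop weak matches first, then keep only the n best.
    results.retain(|r| r.relevance >= min_relevance);
    results.sort_by(|a, b| b.relevance.partial_cmp(&a.relevance).unwrap());
    results.truncate(n);
    results
}
```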
## Integration with Other Keywords

### With LLM Keyword

```basic
results = THINK KB "technical specifications"
IF results.confidence > 0.7 THEN
    context = "Based on: " + results.summary
    response = LLM "Explain this in simple terms" WITH CONTEXT context
    TALK response
END IF
```

### With Decision Making

```basic
policy_results = THINK KB "expense policy limits"
IF policy_results.confidence > 0.8 THEN
    ' Use structured data for automated decisions
    FOR i = 0 TO policy_results.results.length - 1
        result = policy_results.results[i]
        IF result.content CONTAINS "maximum $500" THEN
            SET expense_limit = 500
            BREAK
        END IF
    NEXT i
END IF
```

## Troubleshooting

### No Results Returned

```basic
results = THINK KB query
IF results.total_results = 0 THEN
    PRINT "Active KBs searched: " + results.kb_count
    PRINT "Try: USE KB 'collection_name' first"
END IF
```

### Low Confidence Scores

- Refine query terms to be more specific
- Check if the relevant documents are in the active KBs
- Consider expanding the search to additional knowledge bases
- Verify document quality and indexing

### Performance Issues

- Limit concurrent THINK KB calls
- Use more specific queries to reduce result sets
- Consider caching results for repeated queries
- Monitor token usage in results

## See Also

- [USE KB](./keyword-use-kb.md) - Activate knowledge bases for automatic search
- [CLEAR KB](./keyword-clear-kb.md) - Deactivate knowledge bases
- [KB Statistics](./keyword-kb-statistics.md) - Knowledge base metrics
- [Knowledge Base System](../03-knowledge-ai/README.md) - Technical architecture
- [Semantic Search](../03-knowledge-ai/semantic-search.md) - Search algorithms
Submodule botbook updated: e7dab66130a0993e3a5039e342fa79581df148a9 → 24b77d39819b329adf3585d8f0fb65306901ed3a
Second submodule updated: d1cb6b758cbed905c1415f135a0327a20e51aeec → 7ef1efa047748bcdec99987cee15299d255abb7d

prompts/always.md (new file, 304 lines)
# Always-On Memory KB — Implementation Plan

## What This Is

The Google always-on-memory-agent runs 3 sub-agents continuously:

- **IngestAgent** — extracts structured facts from any input (text, files, images, audio, video)
- **ConsolidateAgent** — runs on a timer, finds connections between memories, generates insights
- **QueryAgent** — answers questions by synthesizing all memories with citations

The key insight: **no vector DB, no embeddings** — just an LLM that reads, thinks, and writes structured memory to SQLite. It's active, not passive.

We integrate this as a new KB mode in the existing BASIC keyword system:

```basic
USE KB "customer-notes" ALWAYS ON
```

This turns a KB into a living memory that continuously ingests, consolidates, and is always available in every session for that bot — no `USE KB` needed per session.

---

## Architecture in GB Context

```
BASIC script:
  USE KB "notes" ALWAYS ON
        │
        ▼
always_on_kb table (bot_id, kb_name, mode=always_on, consolidate_interval_mins)
        │
        ├── IngestWorker (tokio task per bot)
        │     watches: /opt/gbo/data/{bot}.gbai/{kb}.gbkb/inbox/
        │     on new file → LLM extract → kb_memories table
        │
        ├── ConsolidateWorker (tokio interval per bot)
        │     reads: unconsolidated kb_memories
        │     LLM finds connections → kb_consolidations table
        │
        └── QueryEnhancer (in BotOrchestrator::stream_response)
              before LLM call → fetch relevant memories + consolidations
              inject as system context
```

---

## Database Migration

**File:** `botserver/migrations/6.2.6-always-on-kb/up.sql`

```sql
-- Always-on KB configuration per bot
CREATE TABLE IF NOT EXISTS always_on_kb (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    bot_id UUID NOT NULL REFERENCES bots(id) ON DELETE CASCADE,
    kb_name TEXT NOT NULL,
    consolidate_interval_mins INTEGER NOT NULL DEFAULT 30,
    is_active BOOLEAN NOT NULL DEFAULT true,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(bot_id, kb_name)
);

-- Individual memory entries (from IngestAgent)
CREATE TABLE IF NOT EXISTS kb_memories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    bot_id UUID NOT NULL REFERENCES bots(id) ON DELETE CASCADE,
    kb_name TEXT NOT NULL,
    summary TEXT NOT NULL,
    entities JSONB NOT NULL DEFAULT '[]',
    topics JSONB NOT NULL DEFAULT '[]',
    importance FLOAT NOT NULL DEFAULT 0.5,
    source TEXT, -- file path or "api" or "chat"
    is_consolidated BOOLEAN NOT NULL DEFAULT false,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_kb_memories_bot_kb ON kb_memories(bot_id, kb_name);
CREATE INDEX idx_kb_memories_consolidated ON kb_memories(is_consolidated) WHERE is_consolidated = false;
CREATE INDEX idx_kb_memories_importance ON kb_memories(importance DESC);

-- Consolidation insights (from ConsolidateAgent)
CREATE TABLE IF NOT EXISTS kb_consolidations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    bot_id UUID NOT NULL REFERENCES bots(id) ON DELETE CASCADE,
    kb_name TEXT NOT NULL,
    insight TEXT NOT NULL,
    memory_ids JSONB NOT NULL DEFAULT '[]', -- UUIDs of source memories
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_kb_consolidations_bot_kb ON kb_consolidations(bot_id, kb_name);
```

---

## New BASIC Keyword

**File:** `botserver/src/basic/keywords/always_on_kb.rs`

```rust
// Registers: USE KB "name" ALWAYS ON
// Syntax: ["USE", "KB", "$expr$", "ALWAYS", "ON"]
pub fn register_always_on_kb_keyword(
    engine: &mut Engine,
    state: Arc<AppState>,
    session: Arc<UserSession>,
) -> Result<(), Box<EvalAltResult>> {
    engine.register_custom_syntax(
        ["USE", "KB", "$expr$", "ALWAYS", "ON"],
        true,
        move |context, inputs| {
            let kb_name = context.eval_expression_tree(&inputs[0])?.to_string();
            let bot_id = session.bot_id;
            let pool = state.conn.clone();

            // Blocking DB write runs on a helper thread, off the script engine.
            std::thread::spawn(move || {
                if let Ok(mut conn) = pool.get() {
                    let _ = diesel::sql_query(
                        "INSERT INTO always_on_kb (bot_id, kb_name) VALUES ($1, $2)
                         ON CONFLICT (bot_id, kb_name) DO UPDATE SET is_active = true",
                    )
                    .bind::<diesel::sql_types::Uuid, _>(bot_id)
                    .bind::<diesel::sql_types::Text, _>(&kb_name)
                    .execute(&mut conn);
                }
            })
            .join()
            .ok();

            Ok(Dynamic::UNIT)
        },
    )?;
    Ok(())
}
```

Register in `botserver/src/basic/mod.rs` alongside `register_use_kb_keyword`.

---

## AlwaysOnKbService

**File:** `botserver/src/learn/always_on_kb.rs`

Three async tasks per active always-on KB:

### IngestWorker
```rust
// Watches: /opt/gbo/data/{bot_name}.gbai/{kb_name}.gbkb/inbox/
// On new file:
// 1. Read file content (text/pdf/image via existing FileContentExtractor)
// 2. Call LLM: "Extract: summary, entities[], topics[], importance(0-1)"
// 3. INSERT INTO kb_memories
// 4. Move file to /processed/
async fn ingest_worker(bot_id: Uuid, kb_name: String, state: Arc<AppState>)
```

### ConsolidateWorker
```rust
// Runs every `consolidate_interval_mins`
// 1. SELECT * FROM kb_memories WHERE bot_id=$1 AND kb_name=$2 AND is_consolidated=false LIMIT 20
// 2. Call LLM: "Find connections and generate insights from these memories: {memories}"
// 3. INSERT INTO kb_consolidations (insight, memory_ids)
// 4. UPDATE kb_memories SET is_consolidated=true WHERE id IN (...)
async fn consolidate_worker(bot_id: Uuid, kb_name: String, interval_mins: u64, state: Arc<AppState>)
```

### QueryEnhancer (inject into BotOrchestrator)
```rust
// Called in BotOrchestrator::stream_response before LLM call
// 1. SELECT always_on_kb WHERE bot_id=$1 AND is_active=true
// 2. For each: fetch top-N memories by importance + recent consolidations
// 3. Prepend to system prompt:
//    "## Memory Context\n{memories}\n## Insights\n{consolidations}"
pub async fn build_memory_context(bot_id: Uuid, pool: &DbPool) -> String
```
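The prompt layout from step 3 can be sketched as a pure formatting function. This is an illustrative simplification of `build_memory_context` (which would fetch rows from Postgres first); the function name and inputs are hypothetical:

```rust
/// Format memory summaries and consolidation insights into the
/// system-prompt block described above. Empty inputs yield an
/// empty string so nothing is injected.
fn format_memory_context(memories: &[&str], insights: &[&str]) -> String {
    if memories.is_empty() && insights.is_empty() {
        return String::new();
    }
    let mut ctx = String::from("## Memory Context\n");
    for m in memories {
        ctx.push_str(&format!("- {m}\n"));
    }
    ctx.push_str("## Insights\n");
    for i in insights {
        ctx.push_str(&format!("- {i}\n"));
    }
    ctx
}
```

Returning an empty string on no data matches the `!memory_ctx.is_empty()` guard used in the BotOrchestrator integration below.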
### Service Startup
```rust
// In botserver/src/main_module/server.rs, after bootstrap:
// SELECT * FROM always_on_kb WHERE is_active=true
// For each row: tokio::spawn ingest_worker + consolidate_worker
pub async fn start_always_on_kb_workers(state: Arc<AppState>)
```

---

## HTTP API

**File:** `botserver/src/learn/always_on_kb.rs` (add routes)

```
GET    /api/kb/always-on                      → list all always-on KBs for bot
POST   /api/kb/always-on/ingest               → { kb_name, text, source } manual ingest
POST   /api/kb/always-on/consolidate          → { kb_name } trigger manual consolidation
GET    /api/kb/always-on/:kb_name/memories    → list memories (paginated)
GET    /api/kb/always-on/:kb_name/insights    → list consolidation insights
DELETE /api/kb/always-on/:kb_name/memory/:id  → delete a memory
POST   /api/kb/always-on/:kb_name/clear       → delete all memories
```

---

## BotOrchestrator Integration

**File:** `botserver/src/core/bot/mod.rs`

In `stream_response()`, before building the LLM prompt:

```rust
// Inject always-on memory context
let memory_ctx = always_on_kb::build_memory_context(bot_id, &state.conn).await;
if !memory_ctx.is_empty() {
    system_prompt = format!("{}\n\n{}", memory_ctx, system_prompt);
}
```

---

## Inbox File Watcher

The IngestWorker uses the existing `LocalFileMonitor` pattern from `botserver/src/drive/local_file_monitor.rs`.

Watch path: `/opt/gbo/data/{bot_name}.gbai/{kb_name}.gbkb/inbox/`

Supported types (reuse existing `FileContentExtractor`): `.txt`, `.md`, `.pdf`, `.json`, `.csv`, `.png`, `.jpg`, `.mp3`, `.wav`, `.mp4`
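A minimal sketch of the inbox-to-processed flow, assuming a simple std-only directory scan (the real IngestWorker reuses `LocalFileMonitor` and runs LLM extraction before moving each file):

```rust
use std::fs;
use std::path::Path;

/// Move every regular file from inbox/ to processed/, returning the
/// moved file names. The real worker would extract content and
/// INSERT INTO kb_memories before each move.
fn drain_inbox(inbox: &Path, processed: &Path) -> std::io::Result<Vec<String>> {
    fs::create_dir_all(processed)?;
    let mut moved = Vec::new();
    for entry in fs::read_dir(inbox)? {
        let entry = entry?;
        if entry.file_type()?.is_file() {
            let name = entry.file_name();
            fs::rename(entry.path(), processed.join(&name))?;
            moved.push(name.to_string_lossy().into_owned());
        }
    }
    Ok(moved)
}
```

Moving files out of the inbox keeps the scan idempotent: a crash between extraction and move at worst re-ingests one file.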
---

## LLM Prompts

### Ingest prompt
```
Extract structured information from this content.
Return JSON: { "summary": "...", "entities": [...], "topics": [...], "importance": 0.0-1.0 }
Content: {content}
```

### Consolidate prompt
```
You are a memory consolidation agent. Review these memories and find connections.
Return JSON: { "insight": "...", "connected_memory_ids": [...] }
Memories: {memories_json}
```

Use the existing `LlmClient` in `botserver/src/llm/mod.rs` with the bot's configured model.

---

## BASIC Usage Examples

```basic
' Enable always-on memory for this bot
USE KB "meeting-notes" ALWAYS ON

' Manual ingest from BASIC (optional)
' (future keyword: ADD MEMORY "text" TO KB "meeting-notes")

' Query is automatic — memories injected into every LLM call
TALK "What did we discuss last week?"
' → LLM sees memory context automatically
```

---

## File Structure

```
botserver/
├── migrations/
│   └── 6.2.6-always-on-kb/
│       ├── up.sql              ← always_on_kb, kb_memories, kb_consolidations tables
│       └── down.sql
├── src/
│   ├── basic/keywords/
│   │   └── always_on_kb.rs     ← USE KB "x" ALWAYS ON keyword
│   ├── learn/
│   │   └── always_on_kb.rs     ← IngestWorker, ConsolidateWorker, QueryEnhancer, HTTP API
│   └── main_module/
│       └── server.rs           ← start_always_on_kb_workers() on startup
```

---

## Implementation Order

1. Migration (`up.sql`) — tables
2. `always_on_kb.rs` keyword — register syntax
3. `learn/always_on_kb.rs` — `build_memory_context()` + HTTP API
4. `IngestWorker` + `ConsolidateWorker` tokio tasks
5. `BotOrchestrator` integration — inject memory context
6. `start_always_on_kb_workers()` in server startup
7. Register keyword in `basic/mod.rs`
8. Botbook doc + i18n keys

---

## Key Differences from Google's Implementation

| Google ADK | GB Implementation |
|------------|-------------------|
| Python + ADK framework | Rust + existing AppState |
| SQLite | PostgreSQL (existing pool) |
| Gemini Flash-Lite only | Any configured LLM via LlmClient |
| Standalone process | Embedded tokio tasks in botserver |
| File inbox only | File inbox + HTTP API + future chat auto-ingest |
| Manual query | Auto-injected into every bot LLM call |
| No BASIC integration | `USE KB "x" ALWAYS ON` keyword |
New binary files (not shown):

- prompts/botserver-installers/minio (0 bytes)
- prompts/botserver-installers/valkey-8.1.5-jammy-x86_64.tar.gz
- prompts/botserver-installers/vault_1.15.4_linux_amd64.zip

prompts/prod.md (new file, 49 lines)
# Production: Vault Container via LXD Socket

## Current Setup

- **botserver binary**: Already at `/opt/gbo/tenants/pragmatismo/system/bin/botserver` (inside the pragmatismo-system container)
- **Target**: Install Vault in a NEW container on the **HOST** LXD (outside pragmatismo-system)
- **Connection**: botserver uses an LXD socket proxy (`/tmp/lxd.sock` → host LXD)

## Execution Plan

### Step 1: Pull latest botserver code on pragmatismo-system

```bash
cd /opt/gbo/tenants/pragmatismo/system
git pull alm main
```

### Step 2: Build botserver (if needed)

```bash
cargo build -p botserver
cp target/debug/botserver /opt/gbo/tenants/pragmatismo/system/bin/botserver
```

### Step 3: Install Vault container via botserver (FROM pragmatismo-system)

```bash
/opt/gbo/tenants/pragmatismo/system/bin/botserver install vault --container
```

**This runs INSIDE the pragmatismo-system container but installs Vault on the HOST LXD.**

### Step 4: Verify Vault is running on host

```bash
# From pragmatismo-system, via socket proxy
lxc list

# Or directly on host (from Proxmox)
lxc list
```

### Step 5: Update botserver to use external Vault

After Vault is installed in its own container, update `/opt/gbo/tenants/pragmatismo/system/bin/.env`:

```
VAULT_ADDR=https://<vault-container-ip>:8200
```