Guided Discovery
Not every query has a single right answer. When a search is ambiguous, ToolCairn’s clarification engine asks targeted questions to understand what you actually need before committing to a recommendation.
The Problem
Consider the query “vector database”. This could mean an embedded library like ChromaDB, a hosted service like Pinecone, or a specialized engine like Milvus. Returning a single tool would be a guess. Returning all 12 candidates would be noise. ToolCairn needs a middle path.
How It Works
After Stage 1 retrieval, ToolCairn analyzes the candidate distribution. If the confidence score is below the threshold (typically 0.7), it generates clarification questions based on where candidates diverge:
- Candidates span multiple languages → ask which language
- Candidates span multiple categories → ask which category
- Candidates have different deployment models → ask about deployment
- Candidates have mixed licenses → ask about license requirements
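The divergence check described above can be sketched roughly as follows. This is an illustration under assumptions, not ToolCairn's implementation: the `Candidate` shape, its field names, and the helpers `needsClarification` and `divergingDimensions` are hypothetical. Only the 0.7 threshold and the four dimensions come from the text.

```typescript
// Hypothetical candidate shape; the field names are assumptions for illustration.
interface Candidate {
  name: string;
  language: string;
  category: string;
  deployment: string;
  licenseFamily: "permissive" | "copyleft";
}

type Dimension = "language" | "category" | "deployment" | "licenseFamily";

const DIMENSIONS: Dimension[] = ["language", "category", "deployment", "licenseFamily"];

// The text gives 0.7 as the typical threshold.
const CONFIDENCE_THRESHOLD = 0.7;

// Clarification is considered only when confidence falls below the threshold.
function needsClarification(confidence: number, threshold: number = CONFIDENCE_THRESHOLD): boolean {
  return confidence < threshold;
}

// Return the dimensions on which candidates take two or more distinct values.
function divergingDimensions(candidates: Candidate[]): Dimension[] {
  return DIMENSIONS.filter((dim) => {
    const values = candidates.map((c) => String(c[dim]));
    const distinct = values.filter((v, i) => values.indexOf(v) === i);
    return distinct.length >= 2;
  });
}

// Made-up candidates that differ only in language and deployment model.
const sample: Candidate[] = [
  { name: "tool-a", language: "python", category: "vector-db", deployment: "embedded", licenseFamily: "permissive" },
  { name: "tool-b", language: "go", category: "vector-db", deployment: "hosted", licenseFamily: "permissive" },
];

if (needsClarification(0.55)) {
  console.log(divergingDimensions(sample)); // [ 'language', 'deployment' ]
}
```

Each diverging dimension would then map to one clarification question. Dimensions on which all candidates agree produce no question.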
The Flow
Here’s the complete interaction flow between an agent and the ToolCairn MCP server during a guided discovery session:
Agent                         ToolCairn MCP Server
│                                       │
│  search_tools("vector database")      │
│ ────────────────────────────────────▶ │
│                                       │
│                            ┌──────────┴─────────┐
│                            │ Stage 1: Retrieval │
│                            │ 12 candidates span │
│                            │ 3 deployment types │
│                            │ Confidence < 0.7   │
│                            └──────────┬─────────┘
│                                       │
│ ◀────── clarification response ────── │
│   "What deployment model?"            │
│   options: [embedded, hosted,         │
│             specialized]              │
│                                       │
│  search_tools_respond("embedded")     │
│ ────────────────────────────────────▶ │
│                                       │
│                            ┌──────────┴─────────┐
│                            │ Stage 2–4: Filter, │
│                            │ Rerank, Select     │
│                            └──────────┬─────────┘
│                                       │
│ ◀─────── final recommendation ─────── │
│   "ChromaDB" (score: 0.91)            │
│                                       │
Question Types
Clarification questions are generated dynamically based on candidate variance. The four question types are:
Language
"What language is your project in?"
Triggered when: Candidates written in 2+ different languages
Category
"What type of tool are you looking for?"
Triggered when: Candidates span multiple functional categories
License
"Do you need a specific license type?"
Triggered when: Mix of permissive and copyleft licenses in candidates
Deployment
"What deployment model do you need?"
Triggered when: Candidates include embedded, hosted, and self-hosted options
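As a rough sketch of how a question type might turn into a concrete question, the snippet below builds a question object only when candidates disagree on an attribute. The question wording is taken from the list above. The `buildQuestion` helper, the `Question` and `Option` types, and the choice to reuse each value as its label are assumptions made for illustration.

```typescript
type QuestionId = "language" | "category" | "license" | "deployment";

interface Option {
  value: string;
  label: string;
}

interface Question {
  id: QuestionId;
  text: string;
  options: Option[];
}

// Question wording for the four types listed above.
const QUESTION_TEXT: Record<QuestionId, string> = {
  language: "What language is your project in?",
  category: "What type of tool are you looking for?",
  license: "Do you need a specific license type?",
  deployment: "What deployment model do you need?",
};

// Hypothetical helper: returns a question when the candidate values for this
// attribute contain at least two distinct entries, otherwise null.
function buildQuestion(id: QuestionId, candidateValues: string[]): Question | null {
  const distinct = candidateValues
    .filter((v, i) => candidateValues.indexOf(v) === i)
    .sort();
  if (distinct.length < 2) {
    return null; // candidates agree, so there is nothing to ask
  }
  return {
    id,
    text: QUESTION_TEXT[id],
    // Assumption: the label simply repeats the value.
    options: distinct.map((v) => ({ value: v, label: v })),
  };
}

const languageQuestion = buildQuestion("language", ["python", "go", "python"]);
const categoryQuestion = buildQuestion("category", ["vector-db", "vector-db"]);
console.log(languageQuestion !== null, categoryQuestion === null);
```

Here the language values differ, so a question with two options is produced, while the category values agree and no question is generated.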
Clarification Response
When clarification is needed, the MCP server returns a structured response containing a session ID and one or more questions. Agents answer by calling search_tools_respond. The response has the following structure:
// Clarification response structure
{
  "type": "clarification",
  "session_id": "sess_abc123",
  "questions": [
    {
      "id": "deployment",
      "text": "What deployment model do you need?",
      "options": [
        { "value": "embedded", "label": "Embedded (in-process)" },
        { "value": "hosted", "label": "Hosted / managed service" },
        { "value": "self-hosted", "label": "Self-hosted server" }
      ]
    },
    {
      "id": "language",
      "text": "What language is your project in?",
      "options": [
        { "value": "python", "label": "Python" },
        { "value": "typescript", "label": "TypeScript" },
        { "value": "go", "label": "Go" }
      ]
    }
  ]
}
Rich Context Shortcut
If you already know the context (language, deployment model, etc.), you can bypass clarification entirely by including it in the initial search_tools call. The pipeline skips directly to Stage 2 with your filters pre-applied:
// Skip clarification by providing context upfront
search_tools({
  query: "vector database",
  context: {
    language: "python",
    deployment: "embedded",
    use_case: "RAG pipeline for LLM app"
  }
})
// → Goes straight to Stage 2–4 with filters pre-applied
// → Returns final recommendation immediately
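If context was not supplied upfront and a clarification comes back, an agent can still use the same context values to answer it. The sketch below mirrors the clarification response structure shown earlier. The `RespondPayload` shape and the `answerFromContext` helper are assumptions: the source only shows `search_tools_respond` being called with a single answer, so the real argument format may differ.

```typescript
interface Option {
  value: string;
  label: string;
}

interface Question {
  id: string;
  text: string;
  options: Option[];
}

interface ClarificationResponse {
  type: "clarification";
  session_id: string;
  questions: Question[];
}

// Hypothetical argument shape for search_tools_respond.
interface RespondPayload {
  session_id: string;
  answers: { [questionId: string]: string };
}

// Answer each question from context the agent already holds. Questions with no
// known answer, or whose known answer is not among the offered options, are
// listed in `unanswered` so the agent can ask the user instead.
function answerFromContext(
  response: ClarificationResponse,
  context: { [key: string]: string }
): { payload: RespondPayload; unanswered: string[] } {
  const answers: { [questionId: string]: string } = {};
  const unanswered: string[] = [];
  for (const q of response.questions) {
    const known = context[q.id];
    if (known !== undefined && q.options.some((o) => o.value === known)) {
      answers[q.id] = known;
    } else {
      unanswered.push(q.id);
    }
  }
  return { payload: { session_id: response.session_id, answers }, unanswered };
}

const clarification: ClarificationResponse = {
  type: "clarification",
  session_id: "sess_abc123",
  questions: [
    {
      id: "deployment",
      text: "What deployment model do you need?",
      options: [
        { value: "embedded", label: "Embedded (in-process)" },
        { value: "hosted", label: "Hosted / managed service" },
      ],
    },
    {
      id: "language",
      text: "What language is your project in?",
      options: [
        { value: "python", label: "Python" },
        { value: "go", label: "Go" },
      ],
    },
  ],
};

const result = answerFromContext(clarification, { deployment: "embedded" });
console.log(result.payload.answers, result.unanswered);
```

In this example the agent knows the deployment model but not the language, so `deployment` is answered and `language` is left for the user.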