Guided Discovery
Not every query has a single right answer. When a search is ambiguous, ToolPilot’s clarification engine asks targeted questions to understand what you actually need before committing to a recommendation.
The Problem
Consider the query “vector database”. This could mean an embedded library like ChromaDB, a hosted service like Pinecone, or a specialized engine like Milvus. Returning a single tool would be a guess. Returning all 12 candidates would be noise. ToolPilot needs a middle path.
How It Works
After Stage 1 retrieval, ToolPilot analyzes the candidate distribution. If the confidence score is below the threshold (typically 0.7), it generates clarification questions based on where candidates diverge:
- Candidates span 3 languages → ask which language
- Candidates span multiple categories → ask which category
- Candidates have different deployment models → ask about deployment
- Candidates have mixed licenses → ask about license requirements
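The trigger logic above can be sketched as follows. This is an illustrative reconstruction, not ToolPilot's actual internals: the type and function names are assumed, and the 0.7 threshold is the typical value mentioned above.

```typescript
// Hypothetical sketch of the Stage 1 clarification trigger.
interface Candidate {
  name: string;
  language: string;
  category: string;
  deployment: string;
  license: string;
}

// Collect the distinct values of one facet across all candidates.
function distinctValues(candidates: Candidate[], facet: keyof Candidate): Set<string> {
  return new Set(candidates.map((c) => c[facet]));
}

// When confidence is below the threshold, ask only about the facets
// where the candidates actually diverge (2+ distinct values).
function clarificationFacets(
  candidates: Candidate[],
  confidence: number,
  threshold = 0.7
): string[] {
  if (confidence >= threshold) return []; // confident enough: no questions
  const facets: (keyof Candidate)[] = ["language", "category", "deployment", "license"];
  return facets.filter((f) => distinctValues(candidates, f).size >= 2);
}
```

With two candidates that agree on language and category but differ in deployment and license, only those two facets would be asked about.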
The Flow
Here’s the complete interaction flow between an agent and the ToolPilot MCP server during a guided discovery session:
Agent ToolPilot MCP Server
│ │
│ search_tools("vector database") │
│ ──────────────────────────────────▶ │
│ │
│ ┌─────────┴─────────┐
│ │ Stage 1: Retrieval │
│ │ 12 candidates span │
│ │ 3 deployment types │
│ │ Confidence < 0.7 │
│ └─────────┬─────────┘
│ │
│ ◀──── clarification response ──── │
│ "What deployment model?" │
│ options: [embedded, hosted, │
│       self-hosted]                 │
│ │
│ search_tools_respond("embedded") │
│ ──────────────────────────────────▶ │
│ │
│ ┌─────────┴─────────┐
│ │ Stage 2–4: Filter, │
│ │ Rerank, Select │
│ └─────────┬─────────┘
│ │
│ ◀──── final recommendation ────── │
│ "ChromaDB" (score: 0.91) │
│                                     │
Question Types
Clarification questions are generated dynamically based on candidate variance. The four question types are:
Language
"What language is your project in?"
Triggered when: Candidates are written in 2+ different languages
Category
"What type of tool are you looking for?"
Triggered when: Candidates span multiple functional categories
License
"Do you need a specific license type?"
Triggered when: Mix of permissive and copyleft licenses in candidates
Deployment
"What deployment model do you need?"
Triggered when: Candidates include embedded, hosted, and self-hosted options
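Since questions are generated dynamically, one way to picture the mechanics is building a question object from the candidate spread for a given facet. This is a sketch under assumed names; the option set is simply the distinct values present among the candidates.

```typescript
// Illustrative only: generate one clarification question per divergent facet.
type Facet = "language" | "category" | "license" | "deployment";

const QUESTION_TEXT: Record<Facet, string> = {
  language: "What language is your project in?",
  category: "What type of tool are you looking for?",
  license: "Do you need a specific license type?",
  deployment: "What deployment model do you need?",
};

interface GeneratedQuestion {
  id: Facet;
  text: string;
  options: { value: string }[];
}

// Returns null when candidates agree on the facet (nothing to ask).
function buildQuestion(facet: Facet, values: string[]): GeneratedQuestion | null {
  const distinct = [...new Set(values)];
  if (distinct.length < 2) return null;
  return {
    id: facet,
    text: QUESTION_TEXT[facet],
    options: distinct.map((value) => ({ value })),
  };
}
```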
Clarification Response
When clarification is needed, the MCP server returns a structured response with a session ID and one or more questions. Agents answer using search_tools_respond:
// Clarification response structure
{
  "type": "clarification",
  "session_id": "sess_abc123",
  "questions": [
    {
      "id": "deployment",
      "text": "What deployment model do you need?",
      "options": [
        { "value": "embedded", "label": "Embedded (in-process)" },
        { "value": "hosted", "label": "Hosted / managed service" },
        { "value": "self-hosted", "label": "Self-hosted server" }
      ]
    },
    {
      "id": "language",
      "text": "What language is your project in?",
      "options": [
        { "value": "python", "label": "Python" },
        { "value": "typescript", "label": "TypeScript" },
        { "value": "go", "label": "Go" }
      ]
    }
  ]
}
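Before calling search_tools_respond, an agent can sanity-check its answers against the questions' allowed options. The helper below is a sketch assuming the response structure shown above; it is not part of ToolPilot's API.

```typescript
// Hypothetical client-side validation of clarification answers.
interface Option { value: string; label: string }
interface Question { id: string; text: string; options: Option[] }

// Returns a list of problems; an empty array means the answers are valid.
function validateAnswers(
  questions: Question[],
  answers: Record<string, string>
): string[] {
  const errors: string[] = [];
  for (const q of questions) {
    const answer = answers[q.id];
    if (answer === undefined) {
      errors.push(`missing answer for "${q.id}"`);
    } else if (!q.options.some((o) => o.value === answer)) {
      errors.push(`"${answer}" is not a valid option for "${q.id}"`);
    }
  }
  return errors;
}
```

Answering the deployment question with "embedded" passes; an option not offered in the response would be flagged before the round-trip.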
Rich Context Shortcut
If you already know the context (language, deployment model, etc.), you can bypass clarification entirely by including it in the initial search_tools call. The pipeline skips directly to Stage 2 with your filters pre-applied:
// Skip clarification by providing context upfront
search_tools({
  query: "vector database",
  context: {
    language: "python",
    deployment: "embedded",
    use_case: "RAG pipeline for LLM app"
  }
})

// → Goes straight to Stage 2–4 with filters pre-applied
// → Returns final recommendation immediately
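Conceptually, "filters pre-applied" means each provided context field narrows the candidate set before reranking. A minimal sketch, with assumed field names mirroring the call above:

```typescript
// Illustrative only: apply upfront context as Stage 2 filters.
interface Tool {
  name: string;
  language: string;
  deployment: string;
}

interface SearchContext {
  language?: string;
  deployment?: string;
}

// Keep only candidates matching every context field that was provided;
// omitted fields impose no constraint.
function applyContextFilters(candidates: Tool[], context: SearchContext): Tool[] {
  return candidates.filter(
    (t) =>
      (!context.language || t.language === context.language) &&
      (!context.deployment || t.deployment === context.deployment)
  );
}
```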