Guided Discovery

Not every query has a single right answer. When a search is ambiguous, ToolCairn’s clarification engine asks targeted questions to understand what you actually need before committing to a recommendation.

The Problem

Consider the query “vector database”. This could mean an embedded library like ChromaDB, a hosted service like Pinecone, or a specialized engine like Milvus. Returning a single tool would be a guess. Returning all 12 candidates would be noise. ToolCairn needs a middle path.

How It Works

After Stage 1 retrieval, ToolCairn analyzes the candidate distribution. If the confidence score is below the threshold (typically 0.7), it generates clarification questions based on where candidates diverge:

Candidates span multiple languages → ask which language
Candidates span multiple categories → ask which category
Candidates have different deployment models → ask about deployment
Candidates have mixed licenses → ask about license requirements
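
The rules above can be sketched as a check over candidate metadata. This is a minimal illustration, not ToolCairn's implementation: the `Candidate` shape, the `Dimension` names, and the `divergingDimensions` helper are all assumptions made for this example. It also simplifies the license rule to "more than one distinct license", whereas the real trigger may group licenses into permissive and copyleft families.

```typescript
// Hypothetical candidate shape; the field names are assumptions for this sketch.
interface Candidate {
  name: string;
  language: string;
  category: string;
  deployment: string;
  license: string;
}

type Dimension = "language" | "category" | "deployment" | "license";

const DIMENSIONS: Dimension[] = ["language", "category", "deployment", "license"];

// Collect the distinct values candidates take along one dimension.
function distinctValues(candidates: Candidate[], dim: Dimension): string[] {
  return Array.from(new Set(candidates.map((c) => c[dim]))).sort();
}

// Return the dimensions worth asking about: none when confidence clears the
// threshold, otherwise every dimension on which the candidates disagree.
function divergingDimensions(
  candidates: Candidate[],
  confidence: number,
  threshold = 0.7,
): Dimension[] {
  if (confidence >= threshold) return [];
  return DIMENSIONS.filter((dim) => distinctValues(candidates, dim).length > 1);
}
```

Each returned dimension would map to one clarification question, with `distinctValues` supplying the options to offer.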

The Flow

Here’s the complete interaction flow between an agent and the ToolCairn MCP server during a guided discovery session:

Guided discovery sequence

  Agent                          ToolCairn MCP Server
    │                                    │
    │  search_tools("vector database")   │
    │ ──────────────────────────────────▶ │
    │                                    │
    │                          ┌─────────┴─────────┐
    │                          │ Stage 1: Retrieval │
    │                          │ 12 candidates span │
    │                          │ 3 deployment types │
    │                          │ Confidence < 0.7   │
    │                          └─────────┬─────────┘
    │                                    │
    │  ◀──── clarification response ──── │
    │  "What deployment model?"          │
    │  options: [embedded, hosted,       │
    │            self-hosted]            │
    │                                    │
    │  search_tools_respond("embedded")  │
    │ ──────────────────────────────────▶ │
    │                                    │
    │                          ┌─────────┴─────────┐
    │                          │ Stage 2–4: Filter, │
    │                          │ Rerank, Select     │
    │                          └─────────┬─────────┘
    │                                    │
    │  ◀──── final recommendation ────── │
    │  "ChromaDB" (score: 0.91)          │
    │                                    │
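
On the agent side, the sequence above amounts to a loop that keeps answering until a recommendation arrives. The sketch below assumes a hypothetical `ToolCairnClient` wrapper around the two MCP tools. The `Recommendation` shape, the `answers` argument, and the `maxRounds` cap are assumptions, since this page documents only the clarification fields.

```typescript
// Only the clarification fields (type, session_id, questions) come from this
// document; the recommendation shape is an assumption for illustration.
interface Option { value: string; label: string }
interface Question { id: string; text: string; options: Option[] }
interface Clarification { type: "clarification"; session_id: string; questions: Question[] }
interface Recommendation { type: "recommendation"; tool: string; score: number }
type SearchResponse = Clarification | Recommendation;

// Hypothetical client wrapper around search_tools and search_tools_respond.
interface ToolCairnClient {
  searchTools(query: string): Promise<SearchResponse>;
  searchToolsRespond(
    sessionId: string,
    answers: Record<string, string>,
  ): Promise<SearchResponse>;
}

// Drive a guided discovery session until a recommendation is returned.
// `pick` decides which option value to send back for each question.
async function discover(
  client: ToolCairnClient,
  query: string,
  pick: (question: Question) => string,
  maxRounds = 3,
): Promise<Recommendation> {
  let response = await client.searchTools(query);
  for (let round = 0; round < maxRounds && response.type === "clarification"; round++) {
    const answers: Record<string, string> = {};
    for (const question of response.questions) {
      answers[question.id] = pick(question);
    }
    response = await client.searchToolsRespond(response.session_id, answers);
  }
  if (response.type !== "recommendation") {
    throw new Error("No recommendation after " + maxRounds + " clarification rounds");
  }
  return response;
}
```

Capping the number of rounds is a defensive choice for this sketch, so a misbehaving server cannot keep the agent in a clarification loop indefinitely.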

Question Types

Clarification questions are generated dynamically based on candidate variance. The four question types are:

Language

"What language is your project in?"

Triggered when: Candidates are written in 2+ different languages

Category

Triggered when: Candidates span multiple categories

License

"Do you need a specific license type?"

Triggered when: Candidates mix permissive and copyleft licenses

Deployment

"What deployment model do you need?"

Triggered when: Candidates include embedded, hosted, and self-hosted options

Clarification Response

When clarification is needed, the MCP server returns a structured response containing a session ID and one or more questions, each with a fixed set of options. Agents answer by calling search_tools_respond. The server's response looks like this:

Clarification response payload

json

{
  "type": "clarification",
  "session_id": "sess_abc123",
  "questions": [
    {
      "id": "deployment",
      "text": "What deployment model do you need?",
      "options": [
        { "value": "embedded", "label": "Embedded (in-process)" },
        { "value": "hosted",   "label": "Hosted / managed service" },
        { "value": "self-hosted", "label": "Self-hosted server" }
      ]
    },
    {
      "id": "language",
      "text": "What language is your project in?",
      "options": [
        { "value": "python",     "label": "Python" },
        { "value": "typescript", "label": "TypeScript" },
        { "value": "go",         "label": "Go" }
      ]
    }
  ]
}
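
Before replying, an agent may want to check its answers against the options the server offered. The following sketch types the payload above and builds a reply. The `RespondArgs` shape is hypothetical, since this page does not document the arguments of search_tools_respond, and requiring an answer to every question is likewise an assumption.

```typescript
// Types mirroring the clarification payload shown above.
interface Option { value: string; label: string }
interface Question { id: string; text: string; options: Option[] }
interface ClarificationResponse {
  type: "clarification";
  session_id: string;
  questions: Question[];
}

// Hypothetical argument shape for search_tools_respond (an assumption).
interface RespondArgs {
  session_id: string;
  answers: Record<string, string>;
}

// Build a reply, rejecting unanswered questions and values that were not offered.
function buildRespondArgs(
  clarification: ClarificationResponse,
  chosen: Record<string, string>,
): RespondArgs {
  const answers: Record<string, string> = {};
  for (const question of clarification.questions) {
    const value = chosen[question.id];
    if (value === undefined) {
      throw new Error(`Missing answer for question "${question.id}"`);
    }
    if (!question.options.some((option) => option.value === value)) {
      throw new Error(`"${value}" is not an option for "${question.id}"`);
    }
    answers[question.id] = value;
  }
  return { session_id: clarification.session_id, answers };
}
```

Validating locally catches a mistyped option value before it costs a round trip to the server.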

Rich Context Shortcut

If you already know the context (language, deployment model, etc.), you can bypass clarification entirely by including it in the initial search_tools call. The pipeline skips directly to Stage 2 with your filters pre-applied:

Skipping clarification with context

typescript

// Skip clarification by providing context upfront
search_tools({
  query: "vector database",
  context: {
    language: "python",
    deployment: "embedded",
    use_case: "RAG pipeline for LLM app"
  }
})

// → Goes straight to Stage 2–4 with filters pre-applied
// → Returns final recommendation immediately
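
One way to picture "filters pre-applied" is as a pass over the Stage 1 candidates before reranking. The sketch below is an assumption about how that could work, not a description of ToolCairn internals: the `Candidate` fields and the `applyContextFilters` helper are hypothetical, and `use_case` is treated as free text that would inform ranking rather than filtering.

```typescript
// Hypothetical candidate and context shapes for illustration only.
interface Candidate { name: string; language: string; deployment: string }
interface SearchContext { language?: string; deployment?: string; use_case?: string }

// Keep only candidates that match every structured field the caller supplied.
// Fields left undefined do not constrain the result.
function applyContextFilters(
  candidates: Candidate[],
  context: SearchContext,
): Candidate[] {
  return candidates.filter(
    (candidate) =>
      (context.language === undefined || candidate.language === context.language) &&
      (context.deployment === undefined || candidate.deployment === context.deployment),
  );
}
```

With both `language` and `deployment` supplied, the dimensions that would have prompted questions are already resolved, which is why no clarification round is needed.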
