SmallWorld Concierge · Meeting Prep

Agent Implementation Plan

Architecture decisions, tool-by-tool breakdown, API contracts, data model shapes, and suggested Linear tickets — derived from the May 5 morning planning session and the Agent Capability Catalog.

Generated 2026-05-05 · Sources: meeting transcript (8:30–9:10 AM PDT), capability catalog (20 tools) · For 11:00 AM reconvene

Key Architectural Decisions

Rule 1 — Agents Only Hit Rails API

The concierge tools should never directly access the orchestration layer or databases. All writes go through Rails API endpoints. The agent interface is decoupled from data persistence mechanics.

Rule 2 — Reads Come from Relationship Map Context

Instead of ad-hoc database queries, agents load a pre-hydrated "Relationship Map Context" object into memory. This is a cached, sparse representation of all relevant data for an account/target pair — primarily IDs and keys, minimal hydration.

  • Backed by Elasticsearch
  • Same object that hydrates the frontend page view
  • Partial updates needed (ES supports this via _update API)
  • Redis or Cloudflare KV as intermediate cache if scale demands
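
As a rough illustration of the partial-update bullet above, the sketch below builds the request that a partial write to the context document might use. The `_update` endpoint and the `doc` / `doc_as_upsert` body fields are standard Elasticsearch. The helper name `build_partial_update`, the example field values, and the choice to return a plain dict instead of sending the request are assumptions made for this example. The index name and document ID format follow the schema later in this plan.

```python
import json


def build_partial_update(index: str, account_id: int, target_company_id: int,
                         fields: dict) -> dict:
    """Describe an Elasticsearch partial update without sending it.

    Hypothetical helper: the caller would hand `path` and `body` to whatever
    HTTP client the Rails API already uses.
    """
    doc_id = f"{account_id}_{target_company_id}"
    return {
        "method": "POST",
        "path": f"/{index}/_update/{doc_id}",
        # "doc" is merged into the stored document; "doc_as_upsert" creates
        # the document when it does not exist yet.
        "body": {"doc": fields, "doc_as_upsert": True},
    }


request = build_partial_update(
    "relationship_map_contexts", 42, 1337,
    {"company_research": {"firmographic_id": 9001}},  # illustrative value
)
print(request["path"])
print(json.dumps(request["body"]))
```

One caveat worth checking before relying on this: a `doc` merge replaces arrays wholesale, so appending to an ID array would need either a scripted update or a read-modify-write step in the Rails API.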

Rule 3 — Tools Are Data-Source Agnostic

Tools don't need to know where data comes from. The system prompt for each tool type says: "Load the relationship map context first." Tools that need external data (Lima Data, LinkedIn) operate independently of the context.

Rule 4 — Sparse ID-Based Data

Tool outputs should be sparse — relationship IDs, primary/foreign keys, integers. Not fully hydrated objects. Deltas can be identified by comparing ID arrays. Granular lookups happen downstream when needed.
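
The delta idea in Rule 4 can be sketched in a few lines. This is a minimal example, assuming IDs are plain integers and that ordering inside the arrays carries no meaning; the function name `id_delta` is hypothetical.

```python
from typing import Iterable


def id_delta(previous: Iterable[int], current: Iterable[int]) -> dict[str, list[int]]:
    """Compare two sparse ID arrays and report what was added or removed."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": sorted(curr_set - prev_set),
        "removed": sorted(prev_set - curr_set),
    }


delta = id_delta([101, 202, 303], [202, 303, 404])
print(delta)  # {'added': [404], 'removed': [101]}
```

Only the IDs in `added` would need a granular downstream lookup; everything else is already known.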

Division of Labor

  • Cameron — Interface, tickets, orchestration frontend (notifications, data display)
  • Michael — Orchestration API, Rails API endpoints, Lima Data investigation

Data Flow Architecture

flowchart TB
    subgraph CF["Cloudflare (Orchestration)"]
        ORCH["Orchestrator"]
        KV["KV Cache<br/>tool I/O"]
        D1["D1<br/>orchestration state"]
    end
    subgraph RAILS["Rails (Application)"]
        API["Rails API<br/>/api/v1/concierge/*"]
        PG["PostgreSQL"]
        SJ["Sidekiq<br/>async jobs"]
    end
    subgraph ES_LAYER["Search & Context"]
        ES["Elasticsearch<br/>Relationship Map Context"]
    end
    subgraph EXT["External"]
        LIMA["Lima Data API"]
        LI["LinkedIn"]
        WEB["Web Scrape"]
    end
    subgraph UI["Frontend"]
        REACT["React App"]
    end
    ORCH -->|"tool write"| API
    API -->|"persist"| PG
    API -->|"index/update"| ES
    SJ -->|"rebuild context"| ES
    ORCH -->|"cache I/O"| KV
    ORCH -->|"read context"| ES
    ORCH -->|"firmographic"| LIMA
    ORCH -->|"graph data"| LI
    REACT -->|"read context"| API
    API -->|"serve context"| ES
    REACT -->|"actions"| API

Concierge Data Flow — Tools write through Rails API, read from Elasticsearch context
sequenceDiagram
    participant O as Orchestrator
    participant ES as Elasticsearch
    participant API as Rails API
    participant DB as PostgreSQL
    participant KV as CF KV

    Note over O: Tool triggered (continuous/event/on-demand)
    O->>ES: Load Relationship Map Context
    ES-->>O: Sparse ID-based context object
    O->>O: Execute tool logic
    O->>KV: Cache tool input + output
    O->>API: POST /api/v1/concierge/{tool_id}/results
    API->>DB: Persist results
    API->>ES: Update context (partial)
    API-->>O: 200 OK + updated IDs
    
Typical tool execution sequence

Relationship Map Context

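The central pre-hydrated object that agents load into memory. Backed by Elasticsearch, keyed by the account_id / target_company_id pair (document ID {account_id}_{target_company_id}). Also hydrates the frontend account view. Document size should not be a constraint: Elasticsearch's default HTTP request limit (http.max_content_length) is 100 MB and Lucene's hard ceiling is roughly 2 GB, and a sparse, ID-based context should stay far below both.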

RelationshipMapContext (Elasticsearch Document)
// Index: relationship_map_contexts
// Document ID: {account_id}_{target_company_id}
{
  "account_id": number,
  "target_company_id": number,
  "updated_at": "2026-05-05T15:30:00Z",

  // Discovery layer — sparse ID arrays
  "network": {
    "direct_connections": [relationship_id, ...],
    "ex_employer_connections": [relationship_id, ...],
    "prior_coworker_connections": [relationship_id, ...],
    "shared_cohort_connections": [relationship_id, ...],
    "shared_investor_paths": [path_id, ...],
    "third_degree_paths": [path_id, ...],
    "hq_area_connectors": [relationship_id, ...],
    "industry_veterans": [relationship_id, ...]
  },

  // Key personnel at target
  "target_personnel": [
    {
      "person_id": number,
      "name": "Jane Smith",
      "title": "VP Engineering",
      "department": "Engineering",
      "is_decision_maker": boolean,
      "qualified": boolean | null
    }
  ],

  // Targeting layer
  "ranked_paths": [
    {
      "prospect_id": number,
      "connector_id": number,
      "relationship_id": number,
      "score": number,
      "strength": "strong" | "moderate" | "weak" | null,
      "path_type": "direct" | "second_degree" | "third_degree"
    }
  ],

  // Engagement layer
  "offers_for_help": [offer_id, ...],
  "pending_asks": [ask_id, ...],
  "connector_responses": [response_id, ...],

  // Intelligence layer
  "company_research": {
    "firmographic_id": number | null,
    "last_refreshed": "2026-05-05T00:00:00Z" | null
  },
  "personnel_changes": [
    { "person_id": number, "change_type": "new_hire" | "departure" | "promotion", "detected_at": "ISO8601" }
  ]
}
Tool I/O Cache (Cloudflare KV)
// Key pattern: {tool_id}:{content_hash(input)}
// Example: map_account_network:a1b2c3d4
// Note: key includes hash of full input object to handle
// tools keyed by prospect_id, relationship_id, person_name, etc.
{
  "tool_id": "map_account_network",
  "input": { "account_id": 42, "target_company_id": 1337 },
  "output": { "relationship_ids": [101, 202, 303] },
  "executed_at": "2026-05-05T15:30:00Z",
  "ttl": 86400
}
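
A minimal sketch of the key pattern above, assuming SHA-256 over a canonical JSON encoding as the content hash. The source only says `content_hash(input)`, so the hash function, the eight-character truncation (chosen to match the `a1b2c3d4` example), and the name `cache_key` are all assumptions.

```python
import hashlib
import json


def cache_key(tool_id: str, tool_input: dict, digest_chars: int = 8) -> str:
    """Build a '{tool_id}:{hash}' key that ignores input key order."""
    # sort_keys plus fixed separators give one canonical encoding per input.
    canonical = json.dumps(tool_input, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{tool_id}:{digest[:digest_chars]}"


key_a = cache_key("map_account_network", {"account_id": 42, "target_company_id": 1337})
key_b = cache_key("map_account_network", {"target_company_id": 1337, "account_id": 42})
print(key_a == key_b)  # True: same input, different key order
```

Truncating the digest keeps keys short but raises the collision odds; a longer prefix is a cheap change if that becomes a concern.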

API Contract Principles

Rails → Concierge Interface

The Rails API exposes the endpoints that the Cloudflare orchestrator calls. The orchestrator is the only consumer of the write path; the status path is also polled by the frontend.

  • Write path: POST /api/v1/concierge/{tool_id}/results — tool persists its output
  • Read path: GET /api/v1/concierge/context/{account_id}/{target_company_id} — returns relationship map context (or orchestrator reads directly from ES)
  • Status path: GET /api/v1/concierge/status/{orchestration_id} — orchestration status for frontend polling
POST /api/v1/concierge/{tool_id}/results
// Request body — tool submits its output to Rails for persistence
// Headers: Authorization: Bearer {service_jwt}
// Headers: Idempotency-Key: {tool_id}:{request_id}
{
  "account_id": number,
  "target_company_id": number,
  "tool_id": "map_account_network",
  "request_id": "uuid",
  "result": {
    // tool-specific output shape
    "relationship_ids": [101, 202, 303]
  },
  "executed_at": "2026-05-05T15:30:00Z"
}

// Response — Rails persists + updates ES context
{
  "status": "ok",
  "context_updated": true,
  "ids_added": 3
}
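
The write contract above could be exercised from the orchestrator side roughly as follows. This is a sketch using only the Python standard library: it builds the request but does not send it. The function name `build_results_request`, the `base_url` value, and the token placeholder are assumptions; the path, headers, and body fields come from the contract above.

```python
import json
import urllib.request
import uuid
from typing import Optional


def build_results_request(base_url: str, service_jwt: str, tool_id: str,
                          account_id: int, target_company_id: int,
                          result: dict, executed_at: str,
                          request_id: Optional[str] = None) -> urllib.request.Request:
    """Assemble the POST /results request for one tool execution."""
    request_id = request_id or str(uuid.uuid4())
    payload = {
        "account_id": account_id,
        "target_company_id": target_company_id,
        "tool_id": tool_id,
        "request_id": request_id,
        "result": result,
        "executed_at": executed_at,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/concierge/{tool_id}/results",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {service_jwt}",
            # Reusing the same key on a retry lets Rails detect the duplicate.
            "Idempotency-Key": f"{tool_id}:{request_id}",
            "Content-Type": "application/json",
        },
    )


req = build_results_request(
    "https://rails.example.internal", "<service-jwt>", "map_account_network",
    42, 1337, {"relationship_ids": [101, 202, 303]},
    "2026-05-05T15:30:00Z", request_id="req-1",
)
print(req.get_method(), req.full_url)
```

Passing an explicit `request_id` on retries is what makes the idempotency key useful; generating a fresh UUID per attempt would defeat it.
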
flowchart LR
    subgraph TOOL["Tool Execution"]
        T1["map_account_network"]
        T2["qualify_target_prospects"]
        T3["personalize_connector_outreach"]
    end

    subgraph IFACE["Concierge Interface"]
        W["POST /results"]
        R["GET /context"]
        S["GET /status"]
    end

    subgraph STORE["Persistence"]
        PG["PostgreSQL"]
        ES["Elasticsearch"]
    end

    T1 -->|write| W
    T2 -->|write| W
    T3 -->|write| W
    W --> PG
    W -->|partial update| ES
    T1 -.->|read| R
    T2 -.->|read| R
    T3 -.->|read| R
    R -.-> ES
    
All tools share the same write/read interface

Discovery

Card legend: Occurrence · Complexity (S/M/L/XL) · Relevance

Map Account Network

Discover everyone on the customer team's collective network who works at the target account.

Continuous L High
Input: account_id, target_company_id
Output: { connector_id, prospect_id, relationship_id }[]
Sources: LinkedIn, internal graph
Writes to: Relationship Map Context → network.direct_connections
Reads from: Rails API (existing prospect/relationship data)
Linear: Implement map_account_network tool
Build the core network mapping tool. Accept account_id + target_company_id, query existing relationship data via Rails API, return sparse array of relationship IDs. Persist to ES context via POST /results. Cache I/O in KV. This is the foundational tool — most other discovery tools depend on its output.
SW-CONC-001 · map_account_network

Map Key Personnel

Identify and stratify key people at the target — leadership, decision-makers, board, advisors, recent hires/departures. Runs without connectors.

Continuous M High
Input: target_company_id
Output: { person_id, name, title, department, is_decision_maker }[]
Sources: LinkedIn, web scrape, firmographic APIs (Lima Data)
Writes to: Relationship Map Context → target_personnel
Linear: Implement map_key_personnel tool
Enrich target company with key personnel data. Pull from LinkedIn + Lima Data. Stratify by role (decision-maker, advisor, recent hire). Persist person records and update ES context. No connector data needed — purely target-side.
SW-CONC-002 · map_key_personnel

Find HQ-Area Connectors

Search for people in the customer's network who live in the target's HQ metro with dense local relationships.

On-demand M Medium
Input: account_id, target_company_id
Output: relationship_id[]
Sources: LinkedIn, internal graph
Writes to: Context → network.hq_area_connectors
Linear: Implement find_hq_area_connectors tool
Filter connectors by geographic proximity to target HQ. Requires geo data on connector profiles. Read from context, filter, write back filtered IDs.
SW-CONC-003 · find_hq_area_connectors

Find Industry Veterans

Surface connectors whose career history is rooted in the target's industry.

On-demand M Medium
Input: account_id, target_company_id, industry_codes[]
Output: relationship_id[]
Sources: LinkedIn, internal graph, firmographic APIs
Writes to: Context → network.industry_veterans
Linear: Implement find_industry_veterans tool
Match connectors by industry experience overlap with target company. Leverage Lima Data for industry classification. Return filtered relationship IDs.
SW-CONC-004 · find_industry_veterans

Find Ex-Employer Connectors

Surface connectors who previously worked at the target company.

On-demand S High
Input: account_id, target_company_id
Output: relationship_id[]
Sources: LinkedIn, internal graph
Writes to: Context → network.ex_employer_connections
Linear: Implement find_ex_employer_connectors tool
Simple graph query: find connectors with employment history at target. Lowest complexity discovery tool — good candidate for first implementation after map_account_network.
SW-CONC-005 · find_ex_employer_connectors

Find Prior-Coworker Connectors

Surface connectors who overlapped at the same company at the same time as a prospect. Filterable by department, size, era.

On-demand M High
Input: account_id, prospect_id, optional filters
Output: relationship_id[]
Sources: LinkedIn, internal graph
Writes to: Context → network.prior_coworker_connections
Linear: Implement find_prior_coworker_connectors tool
Temporal overlap query: find connectors who worked at the same company during overlapping date ranges. Requires employment date data. Supports department/size filters.
SW-CONC-006 · find_prior_coworker_connectors
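
The temporal overlap check at the heart of this tool might look like the sketch below. The `Stint` shape and function names are hypothetical, and the example returns connector person IDs; mapping those to relationship IDs is left to the real graph query.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Stint:
    """One employment period. `end=None` means the person still works there."""
    person_id: int
    company_id: int
    start: date
    end: Optional[date] = None


def overlaps(a: Stint, b: Stint, today: Optional[date] = None) -> bool:
    """True when both stints are at the same company and share at least one day."""
    if a.company_id != b.company_id:
        return False
    today = today or date.today()
    a_end = a.end or today
    b_end = b.end or today
    return a.start <= b_end and b.start <= a_end


def prior_coworkers(prospect_stints: list[Stint], connector_stints: list[Stint],
                    today: Optional[date] = None) -> list[int]:
    """Connector person IDs whose stints overlap any stint of the prospect."""
    return sorted({
        c.person_id
        for p in prospect_stints
        for c in connector_stints
        if overlaps(p, c, today)
    })


prospect = [Stint(1, 10, date(2015, 1, 1), date(2018, 6, 30))]
connectors = [
    Stint(7, 10, date(2018, 1, 1), date(2020, 1, 1)),  # overlaps in 2018
    Stint(8, 10, date(2019, 1, 1)),                    # joined after prospect left
    Stint(9, 11, date(2016, 1, 1), date(2017, 1, 1)),  # different company
]
print(prior_coworkers(prospect, connectors, today=date(2026, 5, 5)))  # [7]
```

Department and size filters would slot in as extra predicates next to `overlaps`, provided the employment data carries those fields.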

Find Shared-Cohort Connectors

Surface connectors sharing non-employer cohorts — same school, exec program, fellowship, accelerator batch.

On-demand M Medium
Input: account_id, prospect_id
Output: relationship_id[]
Sources: LinkedIn, internal graph
Writes to: Context → network.shared_cohort_connections
Linear: Implement find_shared_cohort_connectors tool
Query non-employment affiliations: education, fellowships (YC, Endeavor), military, accelerators. Requires structured cohort data on profiles.
SW-CONC-007 · find_shared_cohort_connectors

Find Shared-Investor Paths

Surface paths through investors, board members, or advisors whose portfolios overlap the customer's network and target's cap table.

On-demand L Medium
Input: account_id, target_company_id
Output: path_id[]
Sources: Cap-table data, internal graph
Writes to: Context → network.shared_investor_paths
Linear: Implement find_shared_investor_paths tool
Graph traversal through investor/board/advisor nodes. Requires cap table data source. Higher complexity — defer until core discovery tools are stable.
SW-CONC-008 · find_shared_investor_paths

Explore Third-Degree Paths

Find and verify viable paths to prospects through one extra hop when no direct connector exists.

Continuous L High
Input: account_id, target_company_id, prospect_id
Output: path_id[] (each path = chain of relationship IDs)
Sources: LinkedIn, internal graph
Writes to: Context → network.third_degree_paths
Linear: Implement explore_third_degree_paths tool
Multi-hop graph traversal. High complexity — requires path verification (are intermediaries willing to connect?). Core value prop but defer until first/second degree tools are proven.
SW-CONC-009 · explore_third_degree_paths
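
A bounded multi-hop search is one way to approach the traversal described above. The sketch below assumes the graph is available as an in-memory adjacency map of person IDs, which is a simplification, and it does not attempt path verification. The function name `find_paths` and the `max_hops` parameter are hypothetical, as is counting a "third-degree" path as three hops.

```python
def find_paths(graph: dict[int, set[int]], start: int, goal: int,
               max_hops: int = 3) -> list[list[int]]:
    """Return every simple path from start to goal using at most max_hops edges."""
    paths: list[list[int]] = []
    stack: list[list[int]] = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget spent without reaching the goal
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in path:  # never revisit a person on the same path
                stack.append(path + [neighbor])
    # Shortest paths first, then a stable numeric order.
    return sorted(paths, key=lambda p: (len(p), p))


graph = {1: {2, 3}, 2: {4}, 3: {4}, 4: {5}}
print(find_paths(graph, start=1, goal=5))              # [[1, 2, 4, 5], [1, 3, 4, 5]]
print(find_paths(graph, start=1, goal=5, max_hops=2))  # []
```

Each returned chain would still need the verification step (are the intermediaries willing to connect?) before it becomes a usable path_id.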

Path to Named Person

User-initiated path-finding for a specific person not yet in the graph. Agent locates, enriches, and finds a path.

On-demand M High
Input: account_id, person_name, optional company, title
Output: { person_id, path_ids[] }
Sources: LinkedIn, web scrape, internal graph
Writes to: Creates person record + paths in context
Linear: Implement path_to_named_person tool
On-demand person lookup + enrichment + path-finding. Combines person resolution (fuzzy name match) with graph traversal. High user-facing value — "find me a path to Peggy Smith at Acme."
SW-CONC-010 · path_to_named_person

Targeting

Qualify Target Prospects

Identify which people at the target are worth pursuing given buying centers and ICP.

Continuous M High
Input: target_company_id, icp_criteria
Output: { person_id, qualified: boolean, score }[]
Sources: LinkedIn, firmographic APIs
Writes to: Context → target_personnel[].qualified
Linear: Implement qualify_target_prospects tool
Score and qualify target personnel against ICP criteria. Reads target_personnel from context, enriches via firmographic data, writes qualification flags back. Depends on map_key_personnel output.
SW-CONC-011 · qualify_target_prospects

Verify Relationship Strength

Ask connectors to rate how well they know specific prospects — ensure a one-time meeting isn't treated as a strong tie.

Continuous M High
Input: relationship_id
Output: { relationship_id, strength: "strong"|"moderate"|"weak" }
Sources: Connector survey responses
Writes to: Context → ranked_paths[].strength
Linear: Implement verify_relationship_strength tool
Trigger strength verification surveys to connectors. Process responses, normalize to strong/moderate/weak. Critical for path ranking accuracy. Depends on engagement layer (connector outreach infrastructure).
SW-CONC-012 · verify_relationship_strength

Rank Introduction Paths

Score all available paths by connector quality, willingness, strength, and adjacency signals.

Continuous M High
Input: account_id, target_company_id
Output: ranked_paths[] with scores
Sources: Internal graph (relationship map context)
Writes to: Context → ranked_paths
Linear: Implement rank_introduction_paths tool
Pure context-based computation — reads all network arrays + strength data from context, applies scoring algorithm, writes ranked paths back. No external data needed. Depends on discovery + strength verification.
SW-CONC-013 · rank_introduction_paths
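
The plan does not define the scoring algorithm, so the sketch below is only a placeholder showing the shape of a context-only ranking pass. The weights are invented for illustration, and it uses just two of the signals listed above (strength and path type); connector quality and willingness are left out.

```python
# Hypothetical weights, not taken from the planning session.
STRENGTH_WEIGHT = {"strong": 1.0, "moderate": 0.6, "weak": 0.3, None: 0.1}
PATH_TYPE_WEIGHT = {"direct": 1.0, "second_degree": 0.7, "third_degree": 0.4}


def score_path(path: dict) -> float:
    """Multiply the strength and path-type weights into one score."""
    strength = STRENGTH_WEIGHT[path.get("strength")]
    path_type = PATH_TYPE_WEIGHT[path["path_type"]]
    return round(strength * path_type, 3)


def rank_paths(paths: list[dict]) -> list[dict]:
    """Return copies of the paths with a score attached, best first."""
    scored = [{**p, "score": score_path(p)} for p in paths]
    return sorted(scored, key=lambda p: p["score"], reverse=True)


paths = [
    {"relationship_id": 101, "strength": "weak", "path_type": "direct"},
    {"relationship_id": 202, "strength": "strong", "path_type": "third_degree"},
    {"relationship_id": 303, "strength": "strong", "path_type": "direct"},
    {"relationship_id": 404, "strength": None, "path_type": "second_degree"},
]
ranked = rank_paths(paths)
print([p["relationship_id"] for p in ranked])  # [303, 202, 101, 404]
```

Because the function reads only fields that already live in ranked_paths, it needs no external calls, which matches the "pure context-based computation" framing.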

Engagement

Solicit Offers for Help

Nudge connectors to volunteer ways they could help — flips the social dynamic so reps don't have to ask directly.

Continuous M High
Input: relationship_id, context from relationship map
Output: offer_id (created offer record)
Sources: Connector survey
Writes to: Context → offers_for_help
Linear: Implement solicit_offers_for_help tool
Generate contextual solicitation messages using relationship map data. "We have a gap in legal department — you could really help there." Track offer state. Uses context without re-fetching.
SW-CONC-014 · solicit_offers_for_help

Personalize Connector Outreach

Tailor every outbound message using work history, shared experience, prior offers, and request relevance.

Event-driven M High
Input: relationship_id, prospect_id, context
Output: personalized message draft
Sources: Internal graph, LinkedIn
Reads from: Full relationship map context (no re-fetching)
Linear: Implement personalize_connector_outreach tool
LLM-powered message personalization. Reads full context object — connector history, target gaps, prior offers — and generates tailored outreach. Example: "We've been looking for this kind of introduction. We have this gap in legal."
SW-CONC-015 · personalize_connector_outreach

Draft Connector Asks

Generate the connector-facing message with right framing, context, prospect, and strength.

Event-driven M High
Input: relationship_id, prospect_id, ask_type
Output: { ask_id, message_draft, connector_id, prospect_id }
Sources: Internal graph
Writes to: Context → pending_asks
Linear: Implement draft_connector_asks tool
Generate ready-to-send connector ask messages. Depends on ranked_paths and strength data. Customer shouldn't have to write the message themselves.
SW-CONC-016 · draft_connector_asks

Process Connector Responses

Classify and route incoming responses. Surface positives, capture negatives with reasoning, feed signal back into ranking.

Event-driven M High
Input: response_payload (from survey/email)
Output: { response_id, classification, signal_updates }
Sources: Connector survey responses
Writes to: Context → connector_responses, feeds back into ranked_paths
Linear: Implement process_connector_responses tool
Ingest and classify connector responses (positive/negative/partial). Update strength data and path rankings. Closes the feedback loop for the targeting layer.
SW-CONC-017 · process_connector_responses

Intelligence

Lima Data Investigation Required

The firmographic tools below depend on Lima Data API. Michael is investigating their API — need to validate: available endpoints, rate limits, token costs, department-level data availability. This is a blocker for research_target_company, map_key_personnel, and find_industry_veterans.

Research Target Company

Pull firmographic data — size, revenue, funding, industry, locations, tech stack, recent news.

On-demand S High
Input: target_company_id or company_name
Output: { firmographic_id, size, revenue, funding, industry, locations, tech_stack }
Sources: Lima Data API, web scrape, news
Writes to: Context → company_research
Linear: Implement research_target_company tool
Lowest complexity tool. Pull firmographic data from Lima Data, cache result. Blocked on Lima Data API investigation. Good candidate for early implementation once API access is confirmed.
SW-CONC-018 · research_target_company

Find Contact Info

Discover email, phone, and direct LinkedIn URL for identified people at the target.

On-demand M High
Input: person_id or person_id[]
Output: { person_id, email, phone, linkedin_url }[]
Sources: Email inference, LinkedIn, web scrape
Writes to: Person records (via Rails API)
Linear: Implement find_contact_info tool
Contact enrichment for target personnel. Email inference patterns, LinkedIn URL resolution. Batch-capable. Writes directly to person records via Rails API.
SW-CONC-019 · find_contact_info

Track Personnel Changes

Continuously monitor job changes so paths stay current and new arrivals trigger outreach windows.

Continuous M High
Input: target_company_id (runs on schedule)
Output: { person_id, change_type, detected_at }[]
Sources: LinkedIn
Writes to: Context → personnel_changes, triggers re-evaluation of paths
Linear: Implement track_personnel_changes tool
Scheduled job monitoring LinkedIn for personnel changes at target companies. Detect new hires, departures, promotions. Trigger events for downstream tools (new arrival = outreach window).
SW-CONC-020 · track_personnel_changes
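
Change detection can be framed as a diff between two scheduled snapshots of the target's personnel. The sketch below assumes each snapshot is a map of person_id to title, and it treats any title change as a "promotion", which is a simplification (lateral moves and demotions would be mislabeled). The function name `detect_changes` is hypothetical.

```python
def detect_changes(previous: dict[int, str], current: dict[int, str],
                   detected_at: str) -> list[dict]:
    """Diff two {person_id: title} snapshots into personnel_changes entries."""
    def entry(person_id: int, change_type: str) -> dict:
        return {"person_id": person_id, "change_type": change_type,
                "detected_at": detected_at}

    changes = []
    for person_id in sorted(current.keys() - previous.keys()):
        changes.append(entry(person_id, "new_hire"))
    for person_id in sorted(previous.keys() - current.keys()):
        changes.append(entry(person_id, "departure"))
    for person_id in sorted(previous.keys() & current.keys()):
        if previous[person_id] != current[person_id]:
            changes.append(entry(person_id, "promotion"))
    return changes


before = {1: "Engineer", 2: "VP Sales", 3: "CFO"}
after = {1: "Senior Engineer", 3: "CFO", 4: "Head of Legal"}
changes = detect_changes(before, after, "2026-05-05T15:30:00Z")
print([(c["person_id"], c["change_type"]) for c in changes])
# [(4, 'new_hire'), (2, 'departure'), (1, 'promotion')]
```

Each `new_hire` entry is the natural trigger for the downstream "outreach window" event.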

Build Sequencing

gantt
    title Concierge Tool Build Order
    dateFormat YYYY-MM-DD
    axisFormat %b %d

    section Foundation
    Relationship Map Context (ES schema + Rails API)  :crit, f1, 2026-05-06, 3d
    Tool I/O Cache (KV setup)                         :f2, after f1, 1d

    section Discovery (Phase 1)
    map_account_network                               :crit, d1, after f2, 3d
    find_ex_employer_connectors                       :d2, after d1, 2d
    map_key_personnel                                 :d3, after d1, 3d

    section Intelligence
    research_target_company                           :i1, after f2, 2d
    find_contact_info                                 :i2, after d3, 2d

    section Discovery (Phase 2)
    find_prior_coworker_connectors                    :d4, after d2, 3d
    find_hq_area_connectors                           :d5, after d2, 2d
    find_industry_veterans                            :d6, after i1, 2d

    section Targeting
    qualify_target_prospects                           :t1, after d3, 2d
    verify_relationship_strength                      :t2, after d4, 3d
    rank_introduction_paths                           :t3, after t2, 2d

    section Engagement
    solicit_offers_for_help                            :e1, after t3, 3d
    personalize_connector_outreach                    :e2, after e1, 3d
    draft_connector_asks                              :e3, after e2, 2d
    process_connector_responses                       :e4, after e3, 3d

    section Discovery (Phase 3)
    find_shared_cohort_connectors                     :d7, after d4, 2d
    find_shared_investor_paths                        :d8, after d7, 3d
    explore_third_degree_paths                        :d9, after t3, 4d
    path_to_named_person                              :d10, after d9, 3d

    section Continuous
    track_personnel_changes                           :c1, after i2, 3d
    
Suggested dependency-aware build sequence

| Phase | Tools | Priority | Notes |
|---|---|---|---|
| 0: Foundation | ES schema, Rails API, KV setup | Blocking | Must land first. Defines contract for all tools. |
| 1: Core Discovery | map_account_network, find_ex_employer, map_key_personnel | Critical | Foundational data. Everything downstream depends on these. |
| 2: Intelligence | research_target_company, find_contact_info | High | Blocked on Lima Data API. Can parallelize with Discovery Phase 2. |
| 3: Targeting | qualify, verify_strength, rank_paths | High | Requires discovery output. Path ranking is the core value prop. |
| 4: Engagement | solicit, personalize, draft_asks, process_responses | High | Requires targeting output. LLM-powered message generation. |
| 5: Advanced Discovery | cohort, investor paths, 3rd degree, named person | Medium | Higher complexity, lower urgency. Build after core loop is proven. |
| 6: Continuous | track_personnel_changes | Medium | Scheduled job. Can be built anytime after foundation. |
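
A dependency-aware order like the one above can be checked mechanically. The sketch below transcribes a subset of the Gantt dependencies into a map and sorts it with the standard library's `graphlib`. The map covers only part of the chart, and the node names for the two foundation tasks are shorthand invented here.

```python
from graphlib import TopologicalSorter

# tool -> prerequisites, transcribed from a subset of the Gantt chart
DEPENDENCIES = {
    "relationship_map_context": set(),
    "tool_io_cache": {"relationship_map_context"},
    "map_account_network": {"tool_io_cache"},
    "find_ex_employer_connectors": {"map_account_network"},
    "map_key_personnel": {"map_account_network"},
    "research_target_company": {"tool_io_cache"},
    "find_contact_info": {"map_key_personnel"},
    "qualify_target_prospects": {"map_key_personnel"},
    "find_prior_coworker_connectors": {"find_ex_employer_connectors"},
    "verify_relationship_strength": {"find_prior_coworker_connectors"},
    "rank_introduction_paths": {"verify_relationship_strength"},
}


def build_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(DEPENDENCIES)
for step, tool in enumerate(order, start=1):
    print(step, tool)
```

Several orders are valid for the same graph, so the output is one acceptable sequence, not the only one. The useful property is that a mistaken circular dependency fails loudly.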

Architectural Proposals

The monolithic "Relationship Map Context" doc creates write contention when multiple continuous tools update the same document simultaneously. Below: a key-clustering analysis of all 20 tools, followed by three proposals for splitting the data into separate ES indices.

Tool Key Clustering

Every tool has a natural primary key — the entity it operates on. Mapping all 20 tools by their actual key pattern reveals four clean clusters with zero overlap:

flowchart LR
    subgraph DEAL["Deal Context<br/>account_id × target_company_id"]
        D1["map_account_network"]
        D2["find_hq_area_connectors"]
        D3["find_industry_veterans"]
        D4["find_ex_employer_connectors"]
        D5["find_shared_investor_paths"]
        D6["rank_introduction_paths"]
    end
    subgraph TARGET["Target Profile<br/>target_company_id"]
        T1["map_key_personnel"]
        T2["qualify_target_prospects"]
        T3["research_target_company"]
        T4["track_personnel_changes"]
    end
    subgraph PROSPECT["Prospect Paths<br/>account_id × prospect_id"]
        P1["find_prior_coworker"]
        P2["find_shared_cohort"]
        P3["explore_third_degree"]
        P4["path_to_named_person"]
    end
    subgraph EDGE["Relationship Edge<br/>relationship_id"]
        E1["verify_strength"]
        E2["solicit_offers"]
        E3["personalize_outreach"]
        E4["draft_asks"]
        E5["process_responses"]
        E6["find_contact_info"]
    end
    DEAL -.->|"prospect_ids"| PROSPECT
    TARGET -.->|"person_ids"| EDGE
    PROSPECT -.->|"relationship_ids"| EDGE
    DEAL -.->|"target_company_id"| TARGET

20 tools clustered by primary key — four natural entities with clear data flow between them

| Entity | Primary Key | Tools (writers) | Write Pattern | Read Pattern |
|---|---|---|---|---|
| Deal Context | account_id:target_company_id | map_account_network, find_hq_area, find_industry_vets, find_ex_employer, find_shared_investor, rank_paths | Array append (ID lists), full replace (ranked_paths) | Load full doc into orchestrator memory |
| Target Profile | target_company_id | map_key_personnel, qualify_prospects, research_company, track_changes | Upsert personnel records, overwrite firmographics | Account-agnostic: shared across all deals pursuing this target |
| Prospect Paths | account_id:prospect_id | find_prior_coworker, find_shared_cohort, explore_3rd_degree, path_to_named_person | Array append (path chains) | Per-prospect detail view, feeds into targeting |
| Relationship Edge | relationship_id | verify_strength, solicit_offers, personalize_outreach, draft_asks, process_responses, find_contact_info | Field-level updates (strength, offer state, contact info) | Per-relationship detail, hydrates engagement tools |

Proposal A: Four Indices, One Per Entity

Direct mapping — each entity cluster gets its own ES index
flowchart TB
    subgraph ES["Elasticsearch"]
        IDX1["deal_networks<br/>account:target → ID arrays"]
        IDX2["target_profiles<br/>target_company → personnel + firmographics"]
        IDX3["prospect_paths<br/>account:prospect → path chains"]
        IDX4["relationship_edges<br/>relationship_id → strength + engagement state"]
    end
    subgraph TOOLS["Tool Clusters"]
        DC["Discovery (6)"]
        TP["Intelligence (4)"]
        PP["Path Finders (4)"]
        RE["Engagement (6)"]
    end
    DC -->|write| IDX1
    TP -->|write| IDX2
    PP -->|write| IDX3
    RE -->|write| IDX4
    ORCH["Orchestrator"] -->|fan-out read| IDX1
    ORCH -->|fan-out read| IDX2
    ORCH -->|fan-out read| IDX3
    ORCH -->|fan-out read| IDX4

Proposal A — each tool cluster writes to its own index, orchestrator assembles context at read time
deal_networks — account × target network map
// Index: deal_networks
// Document ID: {account_id}_{target_company_id}
{
  "account_id": number,
  "target_company_id": number,
  "updated_at": "ISO8601",
  "direct_connections": [
    { "connector_id": number, "prospect_id": number, "relationship_id": number }
  ],
  "ex_employer_connections": [relationship_id, ...],
  "hq_area_connectors": [relationship_id, ...],
  "industry_veterans": [relationship_id, ...],
  "shared_investor_paths": [
    { "path_id": number, "hops": [person_id, ...] }
  ],
  "ranked_paths": [
    {
      "prospect_id": number,
      "connector_id": number,
      "relationship_id": number,
      "score": number,
      "path_type": "direct" | "second_degree" | "third_degree"
    }
  ]
}
target_profiles — company-level intelligence
// Index: target_profiles
// Document ID: {target_company_id}
// Shared across all accounts pursuing this target
{
  "target_company_id": number,
  "firmographics": {
    "name": "Acme Corp",
    "size": number,
    "revenue": "$50M-100M",
    "industry": "Enterprise Software",
    "hq_location": "San Francisco, CA",
    "source": "lima_data",
    "refreshed_at": "ISO8601"
  },
  "personnel": [
    {
      "person_id": number,
      "name": "Jane Smith",
      "title": "VP Engineering",
      "department": "Engineering",
      "is_decision_maker": boolean,
      "qualified": boolean | null,
      "qualification_score": number | null
    }
  ],
  "recent_changes": [
    {
      "person_id": number,
      "change_type": "new_hire" | "departure" | "promotion",
      "detected_at": "ISO8601"
    }
  ]
}
prospect_paths — per-person path discovery
// Index: prospect_paths
// Document ID: {account_id}_{prospect_id}
{
  "account_id": number,
  "prospect_id": number,
  "prior_coworker_connections": [relationship_id, ...],
  "shared_cohort_connections": [relationship_id, ...],
  "third_degree_paths": [
    {
      "path_id": number,
      "chain": [person_id, person_id, person_id],
      "verified": boolean
    }
  ],
  "named_person_match": {
    "query": "Peggy Smith",
    "resolved_person_id": number | null,
    "confidence": number
  } | null
}
relationship_edges — per-relationship engagement state
// Index: relationship_edges
// Document ID: {relationship_id}
// One doc per connector↔prospect edge
{
  "relationship_id": number,
  "connector_id": number,
  "prospect_id": number,
  "strength": "strong" | "moderate" | "weak" | null,
  "strength_verified_at": "ISO8601" | null,
  "contact_info": {
    "email": "string" | null,
    "phone": "string" | null,
    "linkedin_url": "string" | null
  },
  "offers": [
    {
      "offer_id": number,
      "type": "intro" | "intel" | "referral",
      "status": "pending" | "accepted" | "declined",
      "offered_at": "ISO8601"
    }
  ],
  "asks": [
    {
      "ask_id": number,
      "message_draft": "string",
      "status": "draft" | "sent" | "responded",
      "created_at": "ISO8601"
    }
  ],
  "response_signals": [
    {
      "response_id": number,
      "classification": "positive" | "negative" | "partial",
      "reasoning": "string",
      "received_at": "ISO8601"
    }
  ]
}

Strengths: Zero write contention, since each tool cluster writes to its own index. Clean partial updates. Each index has its own mapping optimized for its access pattern. Simple mental model: entity = index.

Tradeoffs: Assembling "full context" requires fan-out read across 4 indices. Orchestrator needs assembly recipe (which indices, which keys). More indices = more operational surface. Cross-entity joins (e.g., "get all relationship edges for a deal") require multi-index queries.
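
The "assembly recipe" mentioned in the tradeoffs could be as small as a list of (index, document ID) pairs. The sketch below is one possible shape, assuming a `fetch(index, doc_id)` callable that stands in for the real Elasticsearch read; that callable, both function names, and the in-memory `store` used in the demo are hypothetical. Index names and ID formats follow the Proposal A schemas.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

Fetch = Callable[[str, str], Optional[dict]]


def assembly_recipe(account_id: int, target_company_id: int,
                    prospect_ids: list[int],
                    relationship_ids: list[int]) -> list[tuple[str, str]]:
    """List every (index, doc_id) pair needed for one full context."""
    recipe = [
        ("deal_networks", f"{account_id}_{target_company_id}"),
        ("target_profiles", str(target_company_id)),
    ]
    recipe += [("prospect_paths", f"{account_id}_{pid}") for pid in prospect_ids]
    recipe += [("relationship_edges", str(rid)) for rid in relationship_ids]
    return recipe


def assemble_context(fetch: Fetch, recipe: list[tuple[str, str]]) -> dict:
    """Fan out the reads in parallel and group the found docs by index."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        docs = list(pool.map(lambda key: fetch(*key), recipe))
    context: dict[str, dict[str, dict]] = {}
    for (index, doc_id), doc in zip(recipe, docs):
        if doc is not None:  # missing documents are simply skipped
            context.setdefault(index, {})[doc_id] = doc
    return context


# In-memory stand-in for Elasticsearch, for demonstration only.
store = {
    ("deal_networks", "42_1337"): {"ex_employer_connections": [101]},
    ("target_profiles", "1337"): {"personnel": []},
    ("prospect_paths", "42_7"): {"prior_coworker_connections": [101]},
}
recipe = assembly_recipe(42, 1337, prospect_ids=[7], relationship_ids=[101])
context = assemble_context(lambda index, doc_id: store.get((index, doc_id)), recipe)
print(sorted(context))  # ['deal_networks', 'prospect_paths', 'target_profiles']
```

In practice a single multi-get request could replace the thread pool; the recipe itself would stay the same.
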

Proposal C: Event-Sourced with Projections

Append-only event log + projected state views

Every tool execution is recorded as an immutable event. Current state is derived by projecting events forward. Two indices: one for events, one for projected state. Full audit trail and replayability.

flowchart TB
    subgraph EVENTS["Event Store"]
        LOG["tool_events<br/>append-only log<br/>every tool execution recorded"]
    end
    subgraph STATE["Projected State"]
        S1["entity_state<br/>current state per entity<br/>type: deal | target | prospect | edge"]
    end
    subgraph PROJECTOR["Projector"]
        PR["Event Processor<br/>applies events → state"]
    end
    T1["Tool A"] -->|"emit event"| LOG
    T2["Tool B"] -->|"emit event"| LOG
    T3["Tool C"] -->|"emit event"| LOG
    LOG --> PR
    PR -->|"upsert"| S1
    ORCH["Orchestrator"] -->|"query by type + key"| S1
    AUDIT["Audit / Debug"] -->|"replay"| LOG

Proposal C — events are the source of truth, state is derived
tool_events — append-only event log
// Index: tool_events
// Document ID: auto-generated (UUID)
// Immutable — never updated or deleted
{
  "event_id": "uuid",
  "tool_id": "map_account_network",
  "entity_type": "deal" | "target" | "prospect" | "edge",
  "entity_key": "42_1337",
  "event_type": "connections_discovered",
  "payload": {
    // tool-specific output
    "relationship_ids": [101, 202, 303]
  },
  "input": {
    "account_id": 42,
    "target_company_id": 1337
  },
  "emitted_at": "ISO8601",
  "duration_ms": 1250
}
entity_state — projected current state
// Index: entity_state
// Document ID: {entity_type}_{entity_key}
// Rebuilt from events — can be destroyed and replayed
{
  "entity_type": "deal",
  "entity_key": "42_1337",
  "version": number,
  "last_event_id": "uuid",
  "state": {
    // shape depends on entity_type
    // deal → same as deal_networks from Proposal A
    // target → same as target_profiles
    // prospect → same as prospect_paths
    // edge → same as relationship_edges
  },
  "projected_at": "ISO8601"
}

Event Types by Tool Cluster

| Entity | Event Types | Emitted By |
|---|---|---|
| deal | connections_discovered, connectors_filtered, investor_paths_found, paths_ranked | 6 discovery + targeting tools |
| target | personnel_mapped, prospects_qualified, firmographics_enriched, personnel_change_detected | 4 intelligence/personnel tools |
| prospect | coworker_paths_found, cohort_paths_found, third_degree_explored, named_person_resolved | 4 path-finding tools |
| edge | strength_verified, offer_solicited, outreach_personalized, ask_drafted, response_processed, contact_found | 6 engagement tools |
StrengthsComplete audit trail — every tool execution is recorded with input, output, and timing. State is rebuildable from scratch by replaying events. Natural fit for debugging ("what happened to this deal?"). Two indices instead of four. Event log enables analytics (tool execution frequency, latency trends, failure rates).
TradeoffsMost complex to implement. Event projection logic required for each entity type. Event log grows unbounded — needs retention/compaction policy. Harder to reason about "current state" vs. scrolling through event history. Overkill if you don't need auditability. Reads still require assembling context from entity_state (same fan-out as Proposal A).

Side-by-Side Comparison

| Dimension | A: Four Indices | B: Source + Snapshot | C: Event-Sourced |
|---|---|---|---|
| ES indices | 4 | 5 (4 source + 1 snapshot) | 2 (events + state) |
| Write contention | None | None (source) + rebuild queue | None (append-only) |
| Read pattern | Fan-out across 4 indices | Single doc read from snapshot | Fan-out on entity_state by type |
| Read latency | ~4 parallel ES queries | ~1 ES query | ~4 ES queries (by entity_type) |
| Data freshness | Immediate | Debounce delay (5–10s) | Projection delay (seconds) |
| Page hydration | Requires assembly logic in frontend | Snapshot IS the page data | Requires assembly from projections |
| Audit trail | None (overwrite) | None (overwrite) | Full event replay |
| Storage cost | 1x | ~1.8x (source + snapshot duplication) | Unbounded (events grow) |
| Implementation effort | Low | Medium (+ Sidekiq rebuild job) | High (projector + event schema) |
| Matches meeting intent | Partially (no single context object) | Yes (single loadable context) | Partially (adds complexity not discussed) |

Recommendation: Proposal B

Proposal B gives you the clean write separation of four indices while preserving the "load one object into memory" pattern the morning session converged on. The snapshot materialization is a well-understood pattern that Rails + Sidekiq handles naturally. It also solves the "same object hydrates the page" requirement — the snapshot IS the page data.

Start with Proposal A's four indices as the foundation. Add the snapshot layer (Proposal B) once the first 2–3 tools are writing data and you need the composite read. This avoids premature optimization while keeping the path open.

Proposal C is worth revisiting if auditability becomes a product requirement (e.g., "show me what the concierge did to this deal over the last 30 days").

Risks & Architectural Concerns

P0: Schema Contradicts Rule 4 (Sparse Data)

The meeting established "sparse ID-based data" as a core rule, but target_personnel embeds name/title/department/is_decision_maker, and ranked_paths embeds score/strength/path_type. These are hydrated, not sparse. Either flatten to pure ID arrays with a separate lookup endpoint, or explicitly amend Rule 4 to "sparse where possible, denormalized where the read pattern demands it." Resolve before implementation or it becomes a per-tool argument.

P0: ES Concurrent Partial Updates Will Conflict

Multiple "Continuous" tools write to the same {account_id}_{target_company_id} doc simultaneously. A partial-doc _update replaces arrays wholesale, so element-level changes to nested arrays require Painless scripts, and concurrent writes to the same doc can throw version_conflict_engine_exception. Options: (a) set the retry_on_conflict parameter on _update, (b) break the doc into per-tool sub-documents within the index, (c) use optimistic concurrency control (if_seq_no / if_primary_term) with explicit retries. This is not theoretical: ranked_paths, network fields, and target_personnel will all be written by different tools on overlapping schedules.
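
Option (a) can be sketched as a request builder for the ES `_update` endpoint. This is a minimal sketch under stated assumptions: the index name `relationship_map_context` and the helper `build_scripted_update` are hypothetical, and the script simply overwrites one top-level field with a new ID array. It only builds the path and body; sending the request is left to whatever ES client Rails already uses.

```python
from typing import Any, Dict, List, Tuple


def build_scripted_update(
    index: str,
    account_id: int,
    target_company_id: int,
    field: str,
    ids: List[int],
    retry_on_conflict: int = 3,
) -> Tuple[str, Dict[str, Any]]:
    """Build the path and JSON body for POST /{index}/_update/{id}.

    The Painless script replaces a single top-level field, so two tools that
    own different fields do not overwrite each other's data. On a version
    conflict, retry_on_conflict makes ES re-fetch the doc and re-run the script.
    """
    doc_id = f"{account_id}_{target_company_id}"
    path = f"/{index}/_update/{doc_id}?retry_on_conflict={retry_on_conflict}"
    body = {
        "script": {
            "lang": "painless",
            "source": "ctx._source[params.field] = params.ids",
            "params": {"field": field, "ids": ids},
        },
        # Used only when the doc does not exist yet.
        "upsert": {field: ids},
    }
    return path, body


path, body = build_scripted_update(
    "relationship_map_context", 42, 1337, "ranked_paths", [101, 202, 303]
)
```

Keeping each tool's script scoped to its own field is what makes retries safe here: re-running the script after a conflict produces the same result regardless of what another tool wrote in between.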

P1: No Auth on Service-to-Service Boundary

Orchestrator-to-Rails crosses from Cloudflare into your infrastructure. The API contract specifies no authentication mechanism — no mTLS, signed JWT, API key, or shared secret. Also missing: idempotency keys on POST /results for safe retries, and request_id for traceability across the pipeline.
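
One way to close this gap is a shared-secret HMAC signature plus an idempotency key and a request ID on every call. The sketch below is illustrative only: the header names (`X-Request-Id`, `Idempotency-Key`, `X-Timestamp`, `X-Signature`), the canonical message layout, and the 300-second skew window are all assumptions, not part of the current API contract. The verification side would live in Rails; Python is used here only to show the protocol.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def canonical_message(method: str, path: str, timestamp: int, body: Dict[str, Any]) -> bytes:
    """Serialize the request deterministically so both sides sign the same bytes."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "\n".join([method.upper(), path, str(timestamp), payload]).encode("utf-8")


def signed_headers(
    secret: bytes, method: str, path: str, body: Dict[str, Any],
    timestamp: int, request_id: str, idempotency_key: str,
) -> Dict[str, str]:
    """Headers the orchestrator would attach to a call such as POST /results."""
    message = canonical_message(method, path, timestamp, body)
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return {
        "X-Request-Id": request_id,          # traceability across the pipeline
        "Idempotency-Key": idempotency_key,  # lets the receiver drop duplicate retries
        "X-Timestamp": str(timestamp),
        "X-Signature": signature,
    }


def verify(
    secret: bytes, method: str, path: str, body: Dict[str, Any],
    headers: Dict[str, str], now: int, max_skew_seconds: int = 300,
) -> bool:
    """Receiver-side check: reject stale timestamps, then compare signatures."""
    timestamp = int(headers["X-Timestamp"])
    if abs(now - timestamp) > max_skew_seconds:
        return False
    expected = hmac.new(
        secret, canonical_message(method, path, timestamp, body), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, headers["X-Signature"])
```

The same idempotency key should be reused across retries of one tool run, so the receiver can store it and return the original response instead of writing twice.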

P1: Hydration Gap for LLM-Powered Tools

personalize_connector_outreach reads the "full context object" — but the context is sparse IDs. Where does name/title/relationship history get hydrated for the LLM prompt? This is the critical path for all engagement tools and is currently unaddressed. Likely needs a dedicated hydration endpoint or a denormalized view specifically for LLM consumption.
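
A dedicated hydration step could look roughly like the following. This is a sketch under assumptions: `hydrate_people` is a hypothetical helper, and `fetch_people` stands in for whatever batch lookup Rails would expose (no such endpoint exists in the current contract). It takes the sparse person IDs from the context and returns records an LLM prompt can use, flagging IDs that could not be resolved rather than silently dropping them.

```python
from typing import Any, Callable, Dict, List

# Hypothetical batch lookup: person IDs in, {person_id: record} out.
PeopleFetcher = Callable[[List[int]], Dict[int, Dict[str, Any]]]


def hydrate_people(person_ids: List[int], fetch_people: PeopleFetcher) -> List[Dict[str, Any]]:
    """Turn sparse person IDs into prompt-ready records with one batch lookup."""
    unique_ids = list(dict.fromkeys(person_ids))  # dedupe, keep first-seen order
    records = fetch_people(unique_ids)
    hydrated = []
    for person_id in unique_ids:
        if person_id in records:
            hydrated.append({"person_id": person_id, **records[person_id]})
        else:
            # Surface gaps explicitly so the prompt builder can decide what to do.
            hydrated.append({"person_id": person_id, "missing": True})
    return hydrated
```

Keeping hydration behind one function means the context object itself can stay sparse, and only the engagement tools pay the lookup cost.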

P1: Missing Tools in the Catalog

  • send_connector_message: draft_connector_asks and personalize_connector_outreach produce drafts. Nothing sends them. Need an explicit send tool with approval gate, or document that sending is manual.
  • rebuild_relationship_map_context: if the ES context gets stale or corrupted, there's no regeneration mechanism.
  • person_dedup: path_to_named_person creates person records that may collide with map_key_personnel output.

P2: D1 vs Durable Objects for Orchestration

D1 is SQL — fine for status records but cannot coordinate long-running multi-step tool chains across worker invocations. Durable Objects are the CF-native primitive for stateful orchestration. Consider DO as the orchestration runtime with D1 as the persistence layer underneath.

P2: "Continuous" Trigger Policy Undefined

Eight tools are tagged "Continuous" with no trigger mechanism defined. What runs them — cron interval, delta detection, event subscription? Different tools likely need different strategies. solicit_offers_for_help running on a naive cron is a spam risk without throttling and cooldown periods.
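
One possible trigger policy combines delta detection with a per-tool cooldown. The sketch below is a hypothetical gate, not an agreed design: `should_run` and its parameters are illustrative names, and the cooldown values would need to be set per tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_run(
    last_run_at: Optional[datetime],
    cooldown: timedelta,
    input_changed: bool,
    now: datetime,
) -> bool:
    """Decide whether a "Continuous" tool should fire on this scheduler tick.

    Runs only when the tool's inputs changed (delta detection) AND the
    cooldown since its last run has elapsed (throttling).
    """
    if not input_changed:
        return False
    if last_run_at is None:
        return True  # never run before
    return now - last_run_at >= cooldown


now = datetime(2026, 5, 5, 12, 0, tzinfo=timezone.utc)
# A tool like solicit_offers_for_help would want a long cooldown per connector.
allowed = should_run(now - timedelta(days=1), timedelta(days=7), True, now)
```

With this gate, a cron tick is only a chance to run, not a command to run, which is what keeps outreach tools from spamming connectors.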

P2: React → ES Direct Read Is a Trust Boundary

The data flow diagram shows React reading directly from ES. This means ES needs its own auth/scoping layer or you expose raw index access to the frontend. Recommendation: route all frontend reads through Rails (which already has auth), or explicitly commit to an ES proxy with ACLs.

Open Questions

1. Elasticsearch Partial Updates

Can ES handle the partial update pattern we need? Each tool writes to a specific sub-field of the relationship map context. ES supports _update with partial docs and scripted updates — but need to validate this works with nested arrays of IDs.

From meeting: "A good question is can Elasticsearch handle partial updates."
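
As documented, `_update` with a partial `doc` merges objects recursively but replaces arrays and scalar values wholesale. The sketch below is a local model of that rule, not ES code, and the field names under `network` are hypothetical. It is meant to show why a tool that sends only its own ID array leaves sibling fields alone but cannot append to an array without a script. The behavior should still be validated against a real cluster.

```python
from typing import Any, Dict


def merge_partial_doc(existing: Dict[str, Any], partial: Dict[str, Any]) -> Dict[str, Any]:
    """Model of a partial-doc merge: objects merge recursively, everything else is replaced."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial_doc(merged[key], value)
        else:
            merged[key] = value  # arrays and scalars overwrite the old value
    return merged


context = {
    "network": {"connector_ids": [1, 2], "investor_ids": [9]},
    "ranked_paths": [5],
}
updated = merge_partial_doc(context, {"network": {"connector_ids": [3]}})
```

In this model `connector_ids` becomes `[3]` rather than `[1, 2, 3]`, while `investor_ids` and `ranked_paths` are untouched. Tools that need to add to an array would have to send the full array or use a scripted update.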

2. Redis/KV Intermediate Layer — When?

The meeting agreed ES is sufficient initially, with Redis/KV as a future scaling layer. But Cloudflare KV was also discussed for caching tool I/O. Need to clarify: is KV for tool I/O caching only, or also for serving the relationship map context to the orchestrator?

From meeting: "I'm not exactly sure about the appropriate conditions or configuration... whatever we build here adheres to a tight schema."

3. Context Object as Page Hydration Source

The relationship map context was described as "also the right data to hydrate this whole page." Does this mean the React frontend reads directly from ES, or does it go through Rails? If ES, do we need a separate read-only API or direct ES queries from the frontend?

From meeting: "Whatever map context is, if we play our cards right, it's also the right data to hydrate this whole page."

4. People ID Batch Performance

A batch of 1,500 people IDs is taking 10 minutes (out of ~55,000 total). Is this an Elasticsearch query issue or a SQL query bottleneck? This directly affects how map_account_network performs at scale.

From meeting: "A batch of 1,500 people is taking 10 minutes... Is it an Elasticsearch issue? Is it a SQL query issue?"
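
To put the number in perspective, here is a back-of-the-envelope estimate of a full run. It assumes batches run sequentially and every batch takes the same 10 minutes as the observed one; both are assumptions, and the helper name is hypothetical.

```python
import math
from typing import Tuple


def estimate_full_run(total_ids: int, batch_size: int, minutes_per_batch: float) -> Tuple[int, float]:
    """Return (number of batches, total minutes) for a sequential run."""
    batches = math.ceil(total_ids / batch_size)
    return batches, batches * minutes_per_batch


batches, minutes = estimate_full_run(55_000, 1_500, 10)
hours = minutes / 60
```

Under those assumptions, ~55,000 IDs is 37 batches, or 370 minutes (a little over 6 hours) per full pass, which is why the bottleneck needs diagnosing before map_account_network runs at scale.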

5. Lima Data API Scope

Which tools can Lima Data cover? Confirmed: research_target_company, map_key_personnel, find_industry_veterans. Still open: does it cover department-level data and contact info, and what are the token/rate limits?

From meeting: "Can you start looking at their API and just starting to get your head around it?"

6. Orchestration Trigger Modalities

Three trigger types were identified: continuous (scheduled), event-driven, on-demand. How does the orchestrator decide when to run continuous tools? Time-based cron? Delta detection? What events trigger event-driven tools?

From meeting: "I keep getting caught up on the different orchestration modalities."

7. D1 ↔ Rails Data Flow

The orchestrator runs on Cloudflare with D1 for state. Tools write results to Rails API. But how does orchestration state (job status, progress) flow from D1 to the Rails frontend? Direct API? Webhook? Polling?

From meeting: "How do we get from D1 into the application database?"

External Data Sources

| Source | Used By | Status | Notes |
|---|---|---|---|
| LinkedIn / Internal Graph | All discovery + targeting tools | Existing | Core data source. Already integrated. |
| Lima Data API | research_target_company, map_key_personnel, find_industry_veterans | Investigation | Michael reviewing API. Includes department + company data. |
| Connector Survey | verify_relationship_strength, solicit_offers, process_responses | Exists | Existing survey infrastructure for connector engagement. |
| Email Inference | find_contact_info | TBD | Pattern-based email discovery. May need third-party service. |
| Cap Table Data | find_shared_investor_paths | TBD | Investor/board data. Source undetermined. Crunchbase? PitchBook? |
| Web Scrape | map_key_personnel, research_target_company, path_to_named_person | TBD | Supplementary enrichment. Compliance considerations. |
| News Feeds | research_target_company | TBD | Company news monitoring. Could be RSS or news API. |

Deliverables & Deadlines

| What | Who | When |
|---|---|---|
| All 20 agent tools broken down (inputs, outputs, build requirements) | Both | EOD May 6 |
| Linear tickets created for each tool with clear plans | Cameron | May 5–6 |
| Lima Data API investigation (endpoints, limits, token costs) | Michael | Before 1 PM today |
| Orchestration interface design (notifications, data display) | Cameron | This week |
| Rails API endpoints for concierge tool writes | Michael | This week |
| Relationship Map Context ES schema definition | TBD | Before building any tools |
| Diagnose 55K people ID / 10-min batch performance issue | Michael | ASAP |

PR Discipline

One PR at a time. "Instead of having 10 PRs over the course of two weeks, sit for two weeks and have one PR." Each tool gets its own Linear ticket with clear plan, inputs, and outputs.