Version 1

Current

Created 7 days ago

Changelog

Initial version

Skill Content

# Dossier — Decision-Grade Entity Research

> **Portability:** Requires `WebSearch` + `WebFetch`, Node.js with `docx` package, and optionally `bash_tool` + `curl` for free APIs (SEC EDGAR, GitHub, ProPublica). BYOK MCPs (LinkedIn, Crunchbase, Apollo, Pitchbook, SimilarWeb) are optional enhancements. Works in Claude Code CLI natively.

## Non-Generic Framing — The Differentiator

This skill is **decision-grade entity research with hypothesis-testing**. It **refuses** to be "tell me about Microsoft". Every invocation forces the user to expose their hypothesis upfront (Q4) so the dossier *tests* it rather than confirms it.

The use case shape:

> "I'm pitching Microsoft Tuesday. My hypothesis is they're consolidating AI spend on their first-party Foundry platform. Validate or disprove, and give me three conversation hooks tied to what you find."

**NOT:**

> "Tell me about Microsoft."

The forcing Q4 — the hypothesis question — is the non-generic anchor. Skip it and the skill produces a Wikipedia summary. See [`references/hypothesis_testing_discipline.md`](references/hypothesis_testing_discipline.md) for the canon.

## Agent Integrity Rules (Research-Pack Convention)

Locked verbatim per PR #657 audit.

- **Execution discipline.** Sequential search calls. WebSearch + WebFetch have looser rate limits than Consensus but still apply 1 q/sec etiquette. Confirm response received before next call.
- **Source discipline.** Cite only sources returned by this session's tool calls. Wikipedia / training knowledge labeled `[Background — verify before quoting]` and excluded from primary findings count.
- **Three-count tracking.** Queries sent / sources received / sources cited. Plus **per-tier breakdown** (primary / secondary / tertiary) unique to dossier. Surfaced in audit log.
- **Retry policy.** On failure → wait 3s → retry once → log. After 3 consecutive failures: stop, alert user.
- **Source reliability tier.** Each citation tagged primary (official, SEC, court records) / secondary (mainstream news, trade press) / tertiary (blogs, forums). DOCX surfaces tier on every flag.

## Phase 1: Grill-Me Intake (6 forcing questions, one at a time)

### Q1 (root) — Subject identity

> **Who is the subject? Give me the exact name and, if a company, the website or LinkedIn URL. If a person, their LinkedIn URL or a unique identifier (company affiliation + role).**
>
> *Why I'm asking:* Disambiguation. There are 47 John Smiths. There are three companies called "Atlas". I need a specific entity to research.

If user gives only a name, push for a second identifier. **Refuse to proceed on ambiguous names.**

### Q2 (depends on Q1) — Subject type

> **What kind of subject is this? Pick one: person / company / nonprofit / government org / other.**
>
> *Why I'm asking:* Different source matrices apply. For people I check LinkedIn, GitHub, Scholar, news; for companies I check SEC EDGAR (if public), Crunchbase, news, GitHub for tech orgs; for nonprofits I check Form 990s on ProPublica.

Forcing choice. "Other" requires a one-line description.

### Q3 (depends on Q2) — Purpose

> **What are you preparing for? Pick one:**
>
> 1. Sales meeting / partnership pitch
> 2. Investment diligence
> 3. Acquisition diligence
> 4. Journalism / due diligence
> 5. Job interview prep
> 6. Competitive intelligence
> 7. Personal vetting (date, hire, business partner)
> 8. Other (specify)
>
> *Why I'm asking:* The purpose dictates the angle, the depth, and the red-flag sensitivity. Sales prep needs conversation hooks. Investment diligence needs traction signals. Personal vetting needs careful sensitivity boundaries.

### Q4 (depends on Q3) — **Hypothesis — MANDATORY**

> **What's your hypothesis going in? What do you already believe about this subject, and what do you want to verify or disprove?**
>
> *Why I'm asking:* This is the critical question. A dossier that just confirms what you already think is worthless. By stating your hypothesis upfront, I can search for evidence that would *disprove* it as well as evidence that supports it — and give you a verdict you can actually use.
>
> Examples:
> - "I believe Microsoft is consolidating AI spend on first-party Foundry. Verify or disprove."
> - "I think the CEO is in over their head — too much TAM talk, no traction. Test that."
> - "I believe this nonprofit's overhead ratio is sketchy. Check the 990s."
> - "I think this person is technical enough to handle a CTO role. Verify."

**MANDATORY.** If user says "I don't have one", push back **once**: "Then guess. Commit to a position you can update later. The dossier needs a hypothesis to test, otherwise it's a generic profile and won't help you make a decision." If still refused: fall back to implicit hypothesis "what's the most surprising thing I could find?" and **flag the fallback in audit log**.

This question is **the non-generic anchor**. Skip it and the skill becomes a Wikipedia summary.

### Q5 (depends on Q3) — Depth

> **Time horizon: 5-minute brief or 15-minute decision-grade dossier?**
>
> *Why I'm asking:* Brief mode caps at ~10 searches and skips the network + reputation passes. Decision-grade goes deeper on every section. Pick based on how much skin you have in this decision.

Forcing choice.

### Q6 (asked only if Q3 ∈ {journalism, personal vetting}) — Sensitivities

> **Anything sensitive to exclude? E.g., personal medical, family details, political history, or specific topics off-limits?**
>
> *Why I'm asking:* Some research contexts have ethical constraints. I'd rather know upfront than surface something you'd never share.

Skip for sales/investment/acquisition/competitive intel (low sensitivity); ask for journalism/personal vetting (high sensitivity).

**Stop condition:** After Q6 (or earlier with dependency skips), commit and start Phase 2. Never re-open intake after Phase 2 begins.
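The dependency rules in the intake above (Q6 only for journalism or personal vetting; one push-back on Q4, then a flagged fallback) can be sketched in a few lines. This is a minimal illustration, not part of the skill's shipped scripts: the function names, the purpose labels in `SENSITIVE_PURPOSES`, and the choice to treat an empty answer as a refusal are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical labels for the Q3 purposes that trigger Q6.
SENSITIVE_PURPOSES = {"journalism", "personal vetting"}

# Implicit hypothesis used when the user still refuses after one push-back.
FALLBACK_HYPOTHESIS = "what's the most surprising thing I could find?"


def intake_questions(purpose: str) -> List[str]:
    """Return the question IDs to ask, in order, for a given Q3 purpose."""
    questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]
    if purpose in SENSITIVE_PURPOSES:
        questions.append("Q6")  # sensitivities are asked only for high-sensitivity work
    return questions


def resolve_hypothesis(
    first_answer: Optional[str], second_answer: Optional[str] = None
) -> Tuple[str, bool]:
    """Apply the Q4 rule: accept the first real answer, push back once, then fall back.

    Returns (hypothesis, used_fallback). An empty or missing answer is treated
    as a refusal (an assumption of this sketch).
    """
    for answer in (first_answer, second_answer):
        if answer and answer.strip():
            return answer.strip(), False
    # Caller is expected to record the fallback in the audit log.
    return FALLBACK_HYPOTHESIS, True
```

The boolean in the return value exists so the caller cannot use the fallback without also seeing that it has to be flagged.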
## Phase 2: Subject Disambiguation

Before Phase 3, resolve the subject to a specific entity:

- For people: confirm LinkedIn URL OR (employer + role + city)
- For companies: confirm domain OR (legal name + incorporation jurisdiction)
- For nonprofits: confirm EIN OR (legal name + state)
- For government orgs: confirm official .gov URL

If still ambiguous after Q1 push-back: **halt and re-ask Q1** with disambiguating identifiers. Refuse to proceed.

## Phase 3: Source Matrix Selection

Routed by Q2 subject type. See [`references/subject_type_source_matrix.md`](references/subject_type_source_matrix.md) for the full canon.

### Person

- LinkedIn (manual fetch or LinkedIn MCP if BYOK)
- Personal website
- Twitter/X (rate-limited; degrade gracefully)
- GitHub (if technical subject)
- Google Scholar (if academic)
- News (WebSearch + WebFetch)
- Conference talk transcripts, podcasts (WebSearch)

### Company

- Official website (about, leadership, news, careers)
- SEC EDGAR (free API; 10-Ks, 10-Qs, 8-Ks for public co's)
- Crunchbase free tier (or Crunchbase MCP if BYOK)
- News (WebSearch + WebFetch)
- GitHub (for tech orgs)
- Glassdoor + Comparably (sentiment; degrade gracefully if scraping blocked)
- LinkedIn company page

### Nonprofit

- ProPublica Nonprofit Explorer (free; Form 990s)
- Official website
- News
- GuideStar (if accessible)

### Government org

- Official .gov sites
- News
- ProPublica (for federal agencies)

If a paid MCP is connected (Apollo, Pitchbook, SimilarWeb), use it but mark findings as **BYOK-sourced** in the audit log.

## Phase 4: Hypothesis-Driven Search

Every Phase 4 search MUST be classified as either:

- **Supporting evidence** (confirms hypothesis), OR
- **Disconfirming evidence** (would refute hypothesis)

**≥30% of search budget allocated to disconfirming queries.** Enforced via `scripts/disconfirming_evidence_balance.py`.
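As a rough illustration of the ≥30% rule, the check reduces to counting classifications. The sketch below is not the contents of `scripts/disconfirming_evidence_balance.py`, which this document does not show; the function name `balance_report`, its return shape, and the choice to measure the budget as queries sent are assumptions for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Union

VALID_CLASSES = {"supporting", "disconfirming"}


def balance_report(
    classifications: Iterable[str], min_percent: int = 30
) -> Dict[str, Union[int, bool]]:
    """Check whether enough queries were disconfirming.

    `classifications` holds one label per query sent. The comparison uses
    integer arithmetic so an exact 30% share is not lost to float rounding.
    """
    counts = Counter(classifications)
    unknown = set(counts) - VALID_CLASSES
    if unknown:
        # Phase 4 requires every search to carry one of the two labels.
        raise ValueError(f"unclassified searches: {sorted(unknown)}")

    total = sum(counts.values())
    disconfirming = counts["disconfirming"]
    balanced = total > 0 and disconfirming * 100 >= min_percent * total
    return {"total": total, "disconfirming": disconfirming, "balanced": balanced}
```

With ten queries, three disconfirming ones satisfy the rule and two do not. An empty query log is reported as unbalanced rather than vacuously passing.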
Example for hypothesis "Microsoft is consolidating AI spend on Foundry":

- **Supporting:** "Microsoft Foundry adoption 2026", "Microsoft AI infrastructure consolidation"
- **Disconfirming:** "Microsoft OpenAI deal renegotiation", "Microsoft AI vendor diversification", "Microsoft third-party model partnerships 2026"

This is what makes the dossier **decision-grade** rather than confirmation-biased.

For each search:

- Record via `citation_tracker.py` with classification (supporting / disconfirming)
- Apply source tier from `source_tier_classifier.py` to each result URL

## Phase 5: 12-Month Activity Timeline

Default 12-month window for activity timeline; deeper for foundational identity.

Categories:

- News (acquisitions, hires, departures, product launches)
- Funding rounds / financial events
- Controversies / legal events
- Public statements / strategy shifts

Reverse chronological. Each entry hyperlinked + tiered.

## Phase 6: Network + Reputation Signals

### Network

- **Companies:** investors (in/out), customers (named), partners
- **People:** co-founders, advisors, mentors, employers, board roles
- **Nonprofits:** funders, board, leadership

5-10 entries, ranked by **relevance to hypothesis**.

### Reputation

- Sentiment from news (recent 12 months)
- Glassdoor for companies (overall rating + 3 representative reviews)
- Peer mentions for people
- Caveat: reputation data is noisy; tier accordingly

## Phase 7: Red-Flag Pass

Surface but don't sensationalize:

- Litigation (court records → primary tier)
- Regulatory actions (SEC, DOJ, agency actions → primary)
- Unusual departures (key personnel exits within 90 days)
- Financial signals (going-concern notes in 10-Ks → primary)
- Reputation hits (sustained negative coverage → secondary)

**Each flag tiered.** Tier shows up next to every flag in the DOCX.

## Phase 8: Conversation Hook Generation

3-5 specific hooks tied to **actual findings**, not generic talking points.
See [`references/conversation_hook_quality.md`](references/conversation_hook_quality.md) for the canon.

| ❌ Generic | ✅ Finding-tied |
|---|---|
| "Ask about their roadmap" | "Mention their recent acquisition of [X] — it signals they're investing in vertical Y. Suggested framing: 'Saw the [X] announcement — how does that change your roadmap on Y?'" |
| "Ask about hiring" | "Their VP Engineering left 3 weeks ago (LinkedIn). Suggested framing: 'I noticed [name] moved on — what's the eng leadership plan?'" |
| "Talk about their values" | "They updated their pricing page last week (their official site). Suggested framing: 'Saw the pricing refresh — what drove that?'" |

Each hook:

- **The hook** (one sentence)
- **The finding it's tied to** (with hyperlink + tier)
- **Suggested framing** (verbatim phrasing user can adapt)

## Phase 9: DOCX Generation (9 Sections)

Via Node.js + `docx` library.

1. **Executive Summary** — one paragraph: who they are + why they matter + **verdict on the hypothesis** (SUPPORTED / PARTIALLY SUPPORTED / DISPROVEN / INCONCLUSIVE) + 3 things-you-should-know bullets.
2. **Identity Facts Table** — founded/born, location, size/stage, current role, key affiliations. All cells sourced; hover-text tier.
3. **Hypothesis Test** — user's hypothesis stated verbatim. Supporting evidence (3-5 bullets with hyperlinked citations). Disconfirming evidence (3-5 bullets with hyperlinked citations). Verdict paragraph (2-3 sentences explaining the weight).
4. **12-Month Activity Timeline** — News, funding, hires, departures, product launches, controversies. Reverse chronological. Each entry hyperlinked.
5. **Network Signals** — Collaborators / investors / associates. 5-10 entries, ranked by relevance to hypothesis.
6. **Reputation Signals** — Sentiment from news, Glassdoor for companies, peer mentions for people. Caveat: reputation data is noisy.
7. **Red Flags + Hidden Patterns** — Litigation, regulatory actions, unusual departures, financial signals, reputation hits. Tiered.
8. **Conversation Hooks** — 3-5 specific hooks tied to findings. Each: hook + finding + suggested framing.
9. **Source Provenance + Audit Log** — Per-source list with tier. Search summary table (#, query, classification, sources returned, sources cited). Three counts + per-tier counts. Failed searches. BYOK-MCP usage flag.

### Styling

Arial 12pt body, navy headings (#1a3a5c), light blue table headers (#e8f0f8), red red-flag callout, green conversation-hook callout.

### Hyperlink patterns

```js
const { ExternalHyperlink, TextRun } = require("docx");

// `title` is the display text of the cited source.
new ExternalHyperlink({
  link: "https://...",
  children: [new TextRun({ text: title, style: "Hyperlink" })],
});
```

## Phase 10: Deliver

- Save: `<output-dir>/dossier_<entity-slug>_<YYYY-MM-DD>.docx`
- Chat summary: file path + **verdict on hypothesis** + audit counts + tier breakdown + BYOK MCPs used (if any)
- Validate: check zip integrity with `python3 -c "import zipfile,sys; bad = zipfile.ZipFile(sys.argv[1]).testzip(); sys.exit(f'corrupt member: {bad}' if bad else 0)" <docx>` (exit code 0 and no output = intact; `testzip()` returns the first corrupt member's name instead of raising, so the result must be checked explicitly), then confirm the required sections are present

## Tooling

| Script | Role |
|---|---|
| `scripts/citation_tracker.py` | Three-count audit + supporting/disconfirming classification + source-tier tagging at `~/.dossier_sessions/<session>.json` |
| `scripts/disconfirming_evidence_balance.py` | Verifies ≥30% of search budget allocated to disconfirming queries; warns if biased |
| `scripts/source_tier_classifier.py` | URL → primary / secondary / tertiary classification via domain heuristics |

## References

- [`references/hypothesis_testing_discipline.md`](references/hypothesis_testing_discipline.md) — ≥30% rule + decision-grade vs encyclopedic (7+ sources)
- [`references/subject_type_source_matrix.md`](references/subject_type_source_matrix.md) — person/company/nonprofit/gov source matrices (7+ sources)
- [`references/conversation_hook_quality.md`](references/conversation_hook_quality.md) — finding-tied hook discipline (7+ sources)

## Error Handling

| Failure | Behavior |
|---|---|
| Subject name ambiguous | Refuse to proceed. Re-ask Q1 with disambiguating identifier. |
| User refuses to state hypothesis | Push back once. If still refused, fall back to "what's the most surprising thing I could find?" implicit hypothesis. Flag in audit. |
| Subject has zero public footprint | Surface explicitly. Suggest trying a different name, or note that the subject may be early-stage. Don't fabricate. |
| LinkedIn scrape blocked | Note in audit; fall back to WebSearch; suggest user verify manually. |
| SEC EDGAR fails | Retry once. If still failing, note "public filings not retrieved" and continue. |
| Sentiment data sparse | Mark reputation section as "limited public signal"; don't infer from training. |
| Sensitive topic surfaces (Q6 exclusion) | Exclude from DOCX. Note in chat (not in DOCX) so user knows the exclusion was honored. |
| 3 consecutive tool failures | Stop, alert user, share what has been collected so far. |
| DOCX generation fails | Save raw data as JSON fallback. |

## Anti-Patterns To Reject

- Producing a dossier without forcing Q4 hypothesis
- Allocating <30% of search budget to disconfirming evidence
- Batching intake questions
- Accepting ambiguous subject names
- Generic conversation hooks ("ask about their roadmap")
- Sensationalizing red flags (tier them, don't editorialize)
- Skipping the source-reliability tier on flags
- Fabricating coverage when LinkedIn or scraping is blocked
- Using BYOK-MCP data without flagging in audit log
- Including sensitive topics user excluded in Q6
- Confirmation-biased verdict ("SUPPORTED" without engaging with disconfirming evidence)

---

**Version:** 1.0.0

**Source spec:** [`megaprompts/12-dossier-megaprompt.md`](../../../../megaprompts/12-dossier-megaprompt.md)

**Build pattern:** Path B (direct conversion). Research-pack sibling, hypothesis-testing variant.