Multi-LLM Academic Document Pipeline Prompt

Build a LangGraph pipeline in Python that generates a 6,000–10,000 word academic article titled “From the Muses to the Mind: A Genealogy of Inspiration”, tracing the concept of inspiration from ancient philosophy through modern empirical psychology (with particular emphasis on Todd Thrash’s work).

Architecture

The pipeline follows this flow:

Two-Pass Search → Parallel Drafters → Assembly → Lead Editor (with conditional revision loop, max 3 rounds) → Editorial Polish → Finalize

Models

  • Claude Sonnet 4-6 (`claude-sonnet-4-6`): Discovery round + Ancient/Medieval drafter. `max_tokens=16384`, `temperature=0.4`
  • GPT-5.2 (`gpt-5.2`): Enlightenment/Secularization drafter. `max_tokens=16384`, `temperature=0.4`
  • Gemini 3 Pro (`gemini-3-pro-preview`): Modern Psychology drafter (Thrash). `max_tokens=16384`, `temperature=0.4`
  • Claude Opus 4-6 (`claude-opus-4-6`): Lead editor + polish pass. `max_tokens=32768`, `temperature=0.3`

API Key Management

  • Use a .env file with python-dotenv (auto-loads from script directory, then cwd)
  • Required keys: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`
  • Optional: TAVILY_API_KEY (free tier, 1,000 searches/month) — pipeline falls back to LLM-generated research briefs if unavailable
  • Fail fast on startup with clear error messages showing missing keys + URLs to obtain them
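A minimal fail-fast check (run after python-dotenv has loaded the `.env`) might look like the sketch below. The key names come from the spec above; the helper names and the sign-up URLs are illustrative:

```python
import os

# Required keys and where to obtain them (URLs are illustrative).
REQUIRED_KEYS = {
    "ANTHROPIC_API_KEY": "https://console.anthropic.com/",
    "OPENAI_API_KEY": "https://platform.openai.com/api-keys",
    "GOOGLE_API_KEY": "https://aistudio.google.com/apikey",
}

def missing_keys(env: dict) -> list[str]:
    """Return the required keys that are absent (or empty) in the given env."""
    return [k for k in REQUIRED_KEYS if not env.get(k)]

def fail_fast(env=os.environ) -> None:
    """Abort on startup with a clear message listing each missing key."""
    missing = missing_keys(env)
    if missing:
        lines = [f"  {k}  ->  get one at {REQUIRED_KEYS[k]}" for k in missing]
        raise SystemExit("Missing API keys:\n" + "\n".join(lines))
```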

Search Node (Two-Pass)

Pass 1: Discovery

  • Claude Sonnet identifies major figures, works, concepts, and search queries across 8 research vectors: ancient_greek_roman, medieval_early_christian, renaissance_early_modern, enlightenment, romanticism, nineteenth_century_naturalism, modern_psychology_cognitive_science, cross_cutting_concepts
  • No hard-coded names — the discovery round surfaces relevant scholars/texts dynamically
  • Use a system message to force JSON-only output (“You are a JSON-only response API”)
  • Do NOT use assistant message prefill — Claude Sonnet 4-6 does not support it

Pass 2: Deep Dive

  • If Tavily is available and discovery yielded queries: run Tavily searches (advanced depth)
  • If Tavily is available but discovery yielded 0 queries: use 9 hardcoded fallback queries covering all eras (including two specifically for Thrash’s tripartite model and Inspiration Scale)
  • If Tavily is unavailable: fall back to an LLM-generated research brief
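The three-way fallback above can be factored into one small router. The mode names and signature here are illustrative:

```python
def choose_search_mode(tavily_available: bool, discovery_queries: list[str]) -> str:
    """Pick the Pass-2 strategy described above."""
    if not tavily_available:
        return "llm_brief"          # no Tavily: LLM-generated research brief
    if not discovery_queries:
        return "tavily_fallback"    # Tavily up, discovery empty: 9 hardcoded queries
    return "tavily_discovery"       # normal path: run the discovered queries
```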

Parallel Drafters

Three drafters run concurrently via LangGraph fan-out:

  1. Claude Sonnet — Ancient/Medieval sections:
    • I. The Divine Breath (Greek muses, Plato’s Ion, divine madness, Aristotle, Longinus)
    • II. Sacred Fire (Augustine, Aquinas, Renaissance genius, Ficino, Vasari)
  2. GPT-5.2 — Enlightenment/Secularization sections:
    • III. The Secularization of Genius (Kant, Schelling, Romantic poets, Coleridge, Shelley)
    • IV. The Disenchantment of Inspiration (Wundt, James, Nietzsche, Freud, Galton)
  3. Gemini 3 Pro — Modern Psychology sections:
    • V. Measuring the Muse (Thrash & Elliot’s tripartite model, Inspiration Scale, empirical findings)
    • VI. Bridging the Gap (Thrash’s findings vs. ancient accounts, criticisms, neighboring constructs like flow)

Each drafter receives research data + optional editor feedback for revisions. Style: formal academic prose, Chicago author-date citations, 1,500–2,500 words per section.

State Management

  • Use a `PipelineState` TypedDict with `Annotated[dict[str, str], _merge_dicts]` for `manuscript_sections`
  • The `_merge_dicts` reducer handles concurrent writes from parallel drafters (LangGraph fan-out pattern)
  • Each drafter returns only its own section key; the reducer merges automatically

Assembly

Stitch sections in order: ancient_medieval → enlightenment → modern_psychology, separated by --- dividers.
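A minimal assembly helper, assuming the section keys above and skipping any section whose drafter failed:

```python
SECTION_ORDER = ["ancient_medieval", "enlightenment", "modern_psychology"]

def assemble(sections: dict[str, str]) -> str:
    """Stitch drafted sections in fixed order, separated by --- dividers."""
    present = [sections[k].strip() for k in SECTION_ORDER if k in sections]
    return "\n\n---\n\n".join(present)
```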

Lead Editor (Two-Phase)

Phase 1: Assessment (short JSON)

  • Claude Opus reads the manuscript and returns a small JSON verdict: verdict (FINALIZE/REVISE), word_count_estimate, tone_issues, continuity_issues, expansion_needed, revision_target, detailed_feedback
  • Do NOT ask the editor to include the manuscript text in the JSON — this causes truncation
  • Conditional routing: if REVISE → route to the specific drafter for revision, then re-assemble, then back to editor
  • Max 3 revision rounds (safety cap)
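The conditional edge can be sketched as a plain routing function; node names and the default target are illustrative:

```python
def route_after_editor(state: dict) -> str:
    """Conditional edge: FINALIZE -> polish; REVISE -> target drafter, capped at 3 rounds."""
    verdict = state.get("verdict", "FINALIZE")
    if verdict == "REVISE" and state.get("revision_round", 0) < 3:
        return state.get("revision_target", "ancient_medieval_drafter")
    return "polish"   # FINALIZE, or safety cap reached
```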

Phase 2: Editorial Polish (only on FINALIZE)

  • A separate Opus call that takes the raw manuscript and returns only the polished text
  • Sanity check: if polished output is less than 90% of the original word count, it’s truncated — keep the original instead
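The truncation guard might look like:

```python
def accept_polish(original: str, polished: str, threshold: float = 0.9) -> str:
    """Keep the polished text only if it retains >= 90% of the original word count;
    a big drop usually means the model truncated its output."""
    if len(polished.split()) < threshold * len(original.split()):
        return original   # likely truncated; fall back to the unpolished text
    return polished
```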

JSON Parsing

LLM JSON output is unreliable. Implement robust parsing:

  1. `_extract_json()`: Strip markdown fences, then use brace-depth counting to find the complete JSON object. Handle escaped characters inside strings correctly (skip `i += 2` on backslash).
  2. `_repair_json()` with three strategies:
    • `json.JSONDecoder(strict=False)` — allows control characters (unescaped newlines)
    • Trailing comma removal via regex, then re-parse
    • Regex field extraction as nuclear fallback — pull out just `verdict`, `revision_target`, `detailed_feedback` with regex patterns (this is all the editor routing needs)
  3. Include debug logging: print the first 200–300 chars on failure so you can diagnose issues

Response Normalization

Implement `_get_text(response)` to normalize all LLM outputs:

  • langchain-google-genai (Gemini) returns `response.content` as a list of dicts `[{"type": "text", "text": "..."}]` instead of a plain string
  • Handle: strings, lists of dicts, lists of strings
  • Apply everywhere `response.content` is accessed

Error Handling

  • All drafters wrapped in try/except — one API failure doesn’t crash the pipeline
  • Tavily connection validated with a test search on init
  • If editor JSON parsing fails completely, force FINALIZE with the current manuscript
  • If the polish pass fails or produces truncated output, keep the unpolished original
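The per-drafter guard can be a small decorator; the name `safe_node` is illustrative:

```python
import functools

def safe_node(fn):
    """Wrap a drafter node: on failure, log and return an empty state update
    so the pipeline keeps running with the remaining sections."""
    @functools.wraps(fn)
    def wrapper(state):
        try:
            return fn(state)
        except Exception as exc:   # deliberate catch-all: one failure must not crash the run
            print(f"[{fn.__name__}] failed: {exc}")
            return {}
    return wrapper
```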

Output

Final markdown file `from_the_muses_to_the_mind.md` with:

  • YAML front-matter (title, authors, date, word count, abstract, keywords)
  • Full article body
  • Generation credits footer (which models drafted/edited, date, word count)
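A front-matter builder sketch, assuming the fields listed above (the authors string and exact field names are illustrative):

```python
from datetime import date

def front_matter(title: str, word_count: int, abstract: str, keywords: list[str]) -> str:
    """Build the YAML front-matter block for the output file."""
    kw = ", ".join(keywords)
    return (
        "---\n"
        f'title: "{title}"\n'
        'authors: "Claude Sonnet 4-6, GPT-5.2, Gemini 3 Pro; ed. Claude Opus 4-6"\n'
        f"date: {date.today().isoformat()}\n"
        f"word_count: {word_count}\n"
        f'abstract: "{abstract}"\n'
        f"keywords: [{kw}]\n"
        "---\n"
    )
```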

Dependencies

langgraph>=0.2.0
langchain>=0.3.0
langchain-openai>=0.3.0
langchain-anthropic>=0.3.0
langchain-google-genai>=4.0.0
tavily-python>=0.5.0
python-dotenv>=1.0.0

Colophon

After I finally got the script to work, I asked Claude to read the prompt back to me.