# Multi-LLM Academic Document Pipeline Prompt

Build a LangGraph pipeline in Python that generates a 6,000–10,000-word academic article titled “From the Muses to the Mind: A Genealogy of Inspiration”, tracing the concept of inspiration from ancient philosophy through modern empirical psychology (with particular emphasis on Todd Thrash’s work).
## Architecture

The pipeline follows this flow:
Two-Pass Search → Parallel Drafters → Assembly → Lead Editor (with conditional revision loop, max 3 rounds) → Editorial Polish → Finalize
## Models

- **Claude Sonnet 4-6** (`claude-sonnet-4-6`): Discovery round + Ancient/Medieval drafter. max_tokens=16384, temperature=0.4
- **GPT-5.2** (`gpt-5.2`): Enlightenment/Secularization drafter. max_tokens=16384, temperature=0.4
- **Gemini 3 Pro** (`gemini-3-pro-preview`): Modern Psychology drafter (Thrash). max_tokens=16384, temperature=0.4
- **Claude Opus 4-6** (`claude-opus-4-6`): Lead editor + polish pass. max_tokens=32768, temperature=0.3
## API Key Management

- Use a `.env` file with `python-dotenv` (auto-loads from the script directory, then the cwd)
- Required keys: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`
- Optional: `TAVILY_API_KEY` (free tier, 1,000 searches/month) — the pipeline falls back to LLM-generated research briefs if unavailable
- Fail fast on startup with clear error messages showing missing keys + URLs to obtain them
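A minimal sketch of the fail-fast startup check, assuming `python-dotenv`. The `load_keys` helper and the key-to-URL mapping are illustrative, not part of the spec:

```python
import os
import sys
from pathlib import Path

try:
    from dotenv import load_dotenv
except ImportError:  # keep the sketch importable without python-dotenv
    def load_dotenv(*args, **kwargs):
        return False

# Illustrative help URLs; the real script should point at each provider's console.
REQUIRED = {
    "ANTHROPIC_API_KEY": "https://console.anthropic.com/",
    "OPENAI_API_KEY": "https://platform.openai.com/api-keys",
    "GOOGLE_API_KEY": "https://aistudio.google.com/apikey",
}

def load_keys() -> None:
    # Script directory first, then python-dotenv's default cwd search.
    load_dotenv(Path(sys.argv[0]).resolve().parent / ".env")
    load_dotenv()
    missing = [k for k in REQUIRED if not os.environ.get(k)]
    if missing:
        msg = "\n".join(f"  {k}: obtain at {REQUIRED[k]}" for k in missing)
        raise SystemExit(f"Missing required API keys:\n{msg}")
```

Raising `SystemExit` at import/startup time satisfies the fail-fast requirement before any node runs.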
## Search Node (Two-Pass)

### Pass 1: Discovery
- Claude Sonnet identifies major figures, works, concepts, and search queries across 8 research vectors: ancient_greek_roman, medieval_early_christian, renaissance_early_modern, enlightenment, romanticism, nineteenth_century_naturalism, modern_psychology_cognitive_science, cross_cutting_concepts
- No hard-coded names — the discovery round surfaces relevant scholars/texts dynamically
- Use a system message to force JSON-only output (“You are a JSON-only response API”)
- Do NOT use assistant message prefill — Claude Sonnet 4-6 does not support it
### Pass 2: Deep Dive
- If Tavily is available and discovery yielded queries: run Tavily searches (advanced depth)
- If Tavily is available but discovery yielded 0 queries: use 9 hardcoded fallback queries covering all eras (including two specifically for Thrash’s tripartite model and Inspiration Scale)
- If Tavily is unavailable: fall back to an LLM-generated research brief
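A hedged sketch of the Pass 2 branching. The `deep_dive` helper, the injected client, and the two sample fallback queries are illustrative; `search(query, search_depth="advanced")` is the real `tavily-python` call signature:

```python
# Only two fallback queries are shown; the spec calls for nine spanning all eras.
FALLBACK_QUERIES = [
    "Thrash Elliot tripartite model of inspiration",
    "Inspiration Scale psychometric validation Thrash",
    # ...seven more covering ancient through modern eras
]

def deep_dive(queries: list[str], tavily_client=None, llm_brief=None) -> dict:
    if tavily_client is None:
        # Tavily unavailable: fall back to an LLM-generated research brief
        return {"research_brief": llm_brief() if llm_brief else ""}
    if not queries:
        # Discovery yielded zero queries: use the hardcoded fallbacks
        queries = FALLBACK_QUERIES
    # With the real client this is TavilyClient(api_key=...).search(q, search_depth="advanced")
    results = [tavily_client.search(q, search_depth="advanced") for q in queries]
    return {"search_results": results}
```

Injecting the client keeps the branching testable without network access.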
## Parallel Drafters

Three drafters run concurrently via LangGraph fan-out:

- **Claude Sonnet** — Ancient/Medieval sections:
  - I. The Divine Breath (Greek muses, Plato’s Ion, divine madness, Aristotle, Longinus)
  - II. Sacred Fire (Augustine, Aquinas, Renaissance genius, Ficino, Vasari)
- **GPT-5.2** — Enlightenment/Secularization sections:
  - III. The Secularization of Genius (Kant, Schelling, Romantic poets, Coleridge, Shelley)
  - IV. The Disenchantment of Inspiration (Wundt, James, Nietzsche, Freud, Galton)
- **Gemini 3 Pro** — Modern Psychology sections:
  - V. Measuring the Muse (Thrash & Elliot’s tripartite model, Inspiration Scale, empirical findings)
  - VI. Bridging the Gap (Thrash’s findings vs. ancient accounts, criticisms, neighboring constructs like flow)
Each drafter receives research data + optional editor feedback for revisions. Style: formal academic prose, Chicago author-date citations, 1,500–2,500 words per section.
## State Management

- Use a `PipelineState` TypedDict with `Annotated[dict[str, str], _merge_dicts]` for `manuscript_sections`
- The `_merge_dicts` reducer handles concurrent writes from parallel drafters (LangGraph fan-out pattern)
- Each drafter returns only its own section key; the reducer merges automatically
## Assembly

Stitch sections in order: ancient_medieval → enlightenment → modern_psychology, separated by `---` dividers.
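A minimal assembly helper under these assumptions (function name illustrative):

```python
SECTION_ORDER = ["ancient_medieval", "enlightenment", "modern_psychology"]

def assemble(sections: dict[str, str]) -> str:
    # Join drafted sections in fixed order with markdown dividers;
    # skip any section a failed drafter never produced.
    return "\n\n---\n\n".join(sections[k] for k in SECTION_ORDER if k in sections)
```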
## Lead Editor (Two-Phase)

### Phase 1: Assessment (short JSON)
- Claude Opus reads the manuscript and returns a small JSON verdict with fields: `verdict` (FINALIZE/REVISE), `word_count_estimate`, `tone_issues`, `continuity_issues`, `expansion_needed`, `revision_target`, `detailed_feedback`
- Do NOT ask the editor to include the manuscript text in the JSON — this causes truncation
- Conditional routing: if REVISE → route to the specific drafter for revision, then re-assemble, then back to editor
- Max 3 revision rounds (safety cap)
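The routing decision might be sketched like this; the node names (`"polish"`, the drafter targets) are illustrative, and the function would be wired via LangGraph's `add_conditional_edges`:

```python
MAX_REVISION_ROUNDS = 3  # safety cap from the spec

def route_after_editor(state: dict) -> str:
    # Returns the next node's name. With LangGraph this would be wired as:
    #   graph.add_conditional_edges("editor", route_after_editor)
    if (
        state.get("verdict") == "REVISE"
        and state.get("revision_round", 0) < MAX_REVISION_ROUNDS
    ):
        # revision_target names the drafter to re-run (e.g. "draft_enlightenment");
        # that drafter then flows back through assembly to the editor.
        return state.get("revision_target", "polish")
    # FINALIZE verdict, or the 3-round cap was hit: proceed to the polish pass.
    return "polish"
```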
### Phase 2: Editorial Polish (only on FINALIZE)
- A separate Opus call that takes the raw manuscript and returns only the polished text
- Sanity check: if polished output is less than 90% of the original word count, it’s truncated — keep the original instead
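The truncation guard reduces to a small helper (name illustrative):

```python
def accept_polish(original: str, polished: str, threshold: float = 0.9) -> str:
    # A polished draft under 90% of the original word count is assumed
    # truncated; keep the unpolished original in that case.
    if len(polished.split()) < threshold * len(original.split()):
        return original
    return polished
```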
## JSON Parsing

LLM JSON output is unreliable. Implement robust parsing:

- `_extract_json()`: Strip markdown fences, then use brace-depth counting to find the complete JSON object. Handle escaped characters inside strings correctly (skip `i += 2` on a backslash).
- `_repair_json()` with three strategies:
  - `json.JSONDecoder(strict=False)` — allows control characters (unescaped newlines)
  - Trailing-comma removal via regex, then re-parse
  - Regex field extraction as a nuclear fallback — pull out just `verdict`, `revision_target`, and `detailed_feedback` with regex patterns (this is all the editor routing needs)
- Include debug logging: print the first 200–300 chars on failure so you can diagnose issues
## Response Normalization

Implement `_get_text(response)` to normalize all LLM outputs:

- `langchain-google-genai` (Gemini) returns `response.content` as a list of dicts `[{"type": "text", "text": "..."}]` instead of a plain string
- Handle: strings, lists of dicts, lists of strings
- Apply it everywhere `response.content` is accessed
## Error Handling
- All drafters wrapped in try/except — one API failure doesn’t crash the pipeline
- Tavily connection validated with a test search on init
- If editor JSON parsing fails completely, force FINALIZE with the current manuscript
- If the polish pass fails or produces truncated output, keep the unpolished original
## Output

Final markdown file `from_the_muses_to_the_mind.md` with:
- YAML front-matter (title, authors, date, word count, abstract, keywords)
- Full article body
- Generation credits footer (which models drafted/edited, date, word count)
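For example, the front-matter could look like this (all field values illustrative):

```yaml
---
title: "From the Muses to the Mind: A Genealogy of Inspiration"
authors: ["Multi-LLM Pipeline"]
date: 2026-01-15
word_count: 8432
abstract: >
  A genealogy of the concept of inspiration from the Greek muses
  to Thrash and Elliot's tripartite model.
keywords: [inspiration, muses, Plato, Thrash, tripartite model]
---
```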
## Dependencies

- `langgraph>=0.2.0`
- `langchain>=0.3.0`
- `langchain-openai>=0.3.0`
- `langchain-anthropic>=0.3.0`
- `langchain-google-genai>=4.0.0`
- `tavily-python>=0.5.0`
- `python-dotenv>=1.0.0`