Ziwei Doushu AI Platform — From a Take-Home Test to a Full Multi-Agent Product

An account-based AI fortune-telling platform pairing a deterministic Ziwei Doushu chart engine (iztro) with a genuinely parallel multi-agent LangGraph pipeline, a RAG knowledge base, and a full subscription/token system, containerized with Docker and deployed on Render.

TL;DR

An AI platform that turns a Ziwei Doushu (Chinese astrology) reading into something you can actually trust and inspect: the chart is computed by a real astrology engine — never guessed by an LLM — and the analysis is produced by three independently-personaed AI agents working in parallel, then merged by a fourth. Around that core sits a full product shell: accounts, saved charts, a streaming chat with a fortune-teller persona, and a subscription system with usage-based tokens.

Why I Built This

This project began as a take-home test before an internship: a simple linear pipeline where an LLM both "drew" and interpreted a chart. Two things bothered me about that approach. First, LLMs hallucinate charts: submit the same birth data twice and you can get two different star placements, which is a non-starter for anything claiming to be an analysis tool. Second, most "multi-agent" demos I'd seen were really just one model switching personas in a loop, which is convincing in a demo but not an actual multi-agent system. I kept rebuilding this project, partly to fix both problems properly, and partly to push it from a weekend script into something with the shape of a real product: accounts, persistence, billing logic, and deployment.

System at a Glance

[Figure: Ziwei Doushu Multi-Agent AI]

The frontend computes the chart with the official iztro engine directly in the browser. The same JSON is used to render the chart board and is sent to the backend, so what the user sees and what the AI reasons over are guaranteed to match.
The chart is POSTed to a FastAPI backend, where a LangGraph graph runs: a researcher node retrieves shared context from a ChromaDB RAG knowledge base, then fans out to three parallel Gemini agents — a reasoning analyst, a domain expert, and a creative interpreter — and fans back in to a coordinator that synthesizes the final report.
Around that core sits an account layer (PostgreSQL + JWT + Google OAuth), a sidebar app shell (saved charts, career/love fortune tools, a real-time streaming chat with the persona "玄機子"), and a subscription system that meters usage with "star tokens."
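The fan-out/fan-in shape described above can be sketched with nothing but asyncio. This is a minimal illustration, not the project's LangGraph code: the function names, the persona identifiers, and the chart structure are all assumptions, and asyncio.sleep stands in for the Gemini calls.

```python
import asyncio
import time

# Hypothetical persona identifiers; the real prompts and LLM calls are not shown here.
PERSONAS = ["reasoning_analyst", "domain_expert", "creative_interpreter"]


async def researcher(chart: dict) -> str:
    """Stand-in for the RAG lookup that produces context shared by all agents."""
    return f"context for a chart with {len(chart['palaces'])} palaces"


async def run_agent(persona: str, chart: dict, context: str) -> dict:
    """Stand-in for one independent LLM call; the sleep simulates network latency."""
    started = time.monotonic()
    await asyncio.sleep(0.05)
    return {
        "persona": persona,
        "analysis": f"{persona} reading based on {context}",
        "started": started,
        "finished": time.monotonic(),
    }


def coordinator(outputs: list[dict]) -> str:
    """Fan-in step: merge the independent analyses into one report."""
    return "\n".join(f"[{o['persona']}] {o['analysis']}" for o in outputs)


async def analyze(chart: dict) -> tuple[str, list[dict]]:
    context = await researcher(chart)
    # Fan-out: all three coroutines are scheduled before any of them finishes.
    outputs = await asyncio.gather(
        *(run_agent(persona, chart, context) for persona in PERSONAS)
    )
    return coordinator(list(outputs)), list(outputs)


if __name__ == "__main__":
    report, outputs = asyncio.run(analyze({"palaces": list(range(12))}))
    print(report)
    # Overlap check: every call started before the earliest one finished.
    print(max(o["started"] for o in outputs) < min(o["finished"] for o in outputs))
```

The recorded start and finish times play the role the LangSmith timeline plays in the real system: if the calls ran one after another, the last start would come after the first finish and the overlap check would fail.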

Why These Technology Choices

iztro over LLM-generated charts — moving chart computation to a deterministic, open-source engine eliminates hallucination at the source; the LLM's job is reduced to interpretation, which is what it's actually good at.
LangGraph over a simple prompt chain — I needed explicit control over branching, parallel execution, and shared state, plus first-class tracing (LangSmith) to prove the agents run independently rather than just claiming they do.
Single-vendor Gemini (LLM + embeddings) — the original version mixed Claude, OpenAI, and Tavily; consolidating onto one provider with one API key cut both cost and operational surface area without sacrificing quality.
ChromaDB — a lightweight, file-backed vector store that indexes the knowledge base idempotently on startup; no separate service to operate for a project this size.
PostgreSQL + SQLAlchemy + Alembic — once the project grew from a stateless analysis tool into an account-based product, I needed real relations (users → chart profiles → chat sessions → messages) and reproducible schema migrations.
Next.js 14 (App Router) + Tailwind — fast iteration on a content-and-interaction-heavy UI, with a route-group sidebar shell that keeps URLs clean after login.
Docker + Render — containerized backend, frontend, and Postgres so the whole stack can be reproduced locally and deployed to a free-tier cloud target with one Blueprint file.
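The idempotent startup indexing mentioned in the ChromaDB item can be pictured as an upsert keyed by a content-derived ID. The sketch below is an assumption about the general technique, not the project's code: a plain dict stands in for the vector collection, and chunk_id and index_knowledge_base are hypothetical helper names.

```python
import hashlib


def chunk_id(text: str) -> str:
    """Deterministic ID: the same chunk text always maps to the same key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def index_knowledge_base(store: dict[str, str], chunks: list[str]) -> int:
    """Upsert every chunk by its content-derived ID and return how many were new."""
    added = 0
    for chunk in chunks:
        key = chunk_id(chunk)
        if key not in store:
            added += 1
        store[key] = chunk  # upsert: re-running writes identical content to the same key
    return added


if __name__ == "__main__":
    store: dict[str, str] = {}  # stand-in for a persistent vector collection
    chunks = ["placeholder chunk about palaces", "placeholder chunk about stars"]
    print(index_knowledge_base(store, chunks))  # first startup
    print(index_knowledge_base(store, chunks))  # restart with the same knowledge base
    print(len(store))
```

Because the key depends only on the chunk text, restarting the service with an unchanged knowledge base adds nothing, which is what makes it safe to run the indexing step on every startup.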

What I'm Most Proud Of

1. A genuinely parallel multi-agent pipeline, not role-play. The graph fans out from researcher to three independently-personaed Gemini calls running concurrently, and fans back in to a coordinator. The tricky part was the shared state: three branches writing to the same agent_outputs field in the same step would conflict rather than accumulate, so I attached an operator.add reducer to that field so LangGraph merges the concurrent writes safely. The frontend then exposes each agent's individual analysis in an expandable "multi-agent process" view, and LangSmith traces show the three calls overlapping on the timeline, which is the actual proof that this is parallel execution rather than sequential persona-switching dressed up as parallelism.

2. The chart as a single source of truth. The same iztro-computed JSON renders the on-screen chart board and is what the backend reasons over — so the chart a user sees is, by construction, identical to the chart the AI is interpreting. This single design decision removed an entire class of "the AI made up a star that isn't even on my chart" complaints.

3. A real subscription/token economy. Three plans (free / basic / premium) gate features (career tools, love tools, compatibility analysis) and meter usage through "star tokens" — every chat message, analysis, and profile creation has a token cost, backed by an atomic balance-and-ledger system with monthly refresh. It's a small thing to describe, but it's the difference between a demo and something that could actually run as a product.

Challenges & How I Solved Them

"Multi-agent" that wasn't. The earlier version used one LLM cycling through personas in a ReAct loop — it looked multi-agent in the output but wasn't in execution. I rewrote the graph around LangGraph's fan-out/fan-in primitives so each persona is an independent call, and validated it externally with LangSmith rather than trusting my own assumptions about the architecture.
Encoding hell from web-scraped charts. The original design scraped a third-party chart site and fought constant cp950/big5-hkscs garbling. Replacing that entirely with client-side iztro computation didn't just fix the encoding bugs — it removed the whole failure category.
Concurrent writes to shared graph state. Three parallel agents writing to one list field in the same step conflict, because without a reducer each write is treated as a replacement for the whole value. I solved it with an operator.add reducer so LangGraph concatenates the writes instead of replacing them.
Streaming chat scroll jitter. The SSE token-by-token chat kept fighting the user's scroll position; removing an over-eager scrollIntoView call fixed it.
"It works on my machine," Postgres edition. When Docker Compose's Postgres runs alongside a native Postgres already bound to port 5432, the stack looks healthy (docker compose ps says so), but the app silently connects to the wrong database with mismatched credentials. I diagnosed it with netstat and documented the fix (remapping to an alternate host port) so it doesn't cost the next person an hour.
Google OAuth's three-way config trap. Sign-in only works when the Google Cloud Console authorized origin, the frontend's public client ID, and the backend's verification client ID all match exactly — get one wrong and you get an opaque "invalid credential" error. I documented the checklist so this is a five-minute fix instead of an afternoon of guessing.
Shipping to a free-tier cloud host. Render needed an async-driver database URL, automatic migrations on container start, and a Blueprint that wires backend, frontend, and Postgres together — all things that don't show up until you actually try to deploy.
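The shared-state fix from the list above comes down to what happens when several branches return a partial update for the same key. The sketch below imitates that merge step in plain Python; it is not LangGraph's implementation, and apply_updates is a hypothetical helper. In LangGraph itself the reducer is declared on the state field, for example as Annotated[list, operator.add], and the framework may reject conflicting writes to a field without a reducer rather than silently keeping the last one, as this simplified version does.

```python
import operator
from typing import Any, Callable

Reducer = Callable[[Any, Any], Any]


def apply_updates(state: dict, updates: list[dict], reducers: dict[str, Reducer]) -> dict:
    """Apply partial state updates returned by parallel branches.

    Keys with a reducer are combined with the existing value;
    keys without one simply keep the last write (a simplification).
    """
    merged = dict(state)
    for update in updates:
        for key, value in update.items():
            if key in reducers and key in merged:
                merged[key] = reducers[key](merged[key], value)
            else:
                merged[key] = value
    return merged


if __name__ == "__main__":
    start = {"agent_outputs": []}
    # Each branch returns only its own contribution, wrapped in a list.
    branch_updates = [
        {"agent_outputs": ["analyst: ..."]},
        {"agent_outputs": ["expert: ..."]},
        {"agent_outputs": ["interpreter: ..."]},
    ]
    print(apply_updates(start, branch_updates, reducers={}))
    print(apply_updates(start, branch_updates, reducers={"agent_outputs": operator.add}))
```

Without a reducer only one branch's output survives; with operator.add, list concatenation keeps all three, which is why each agent returns a one-element list rather than a bare string.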

Results

Took the project from a single-purpose take-home test to a deployable product: accounts, persisted chart profiles, real-time streaming chat, a working subscription/token model, and a containerized stack running on Docker + Render.
Collapsed a three-vendor LLM stack (Claude + OpenAI + Tavily) into a single Gemini API key for both reasoning and embeddings — simpler config, lower cost, fewer moving parts to monitor.
Replaced a fragile web-scraping chart pipeline (and its encoding bugs) with a deterministic, client-side engine — eliminating an entire class of correctness complaints at the source.
Independently verified, via LangSmith trace timelines, that the three analysis agents genuinely execute in parallel rather than merely appearing to.

Looking Back, Looking Forward

The biggest lesson was learning to tell the difference between "an LLM playing multiple roles" and "an actual multi-agent system" — and, more importantly, how to prove which one you've built using tracing tools rather than taking the architecture diagram's word for it. The second was learning to anchor an LLM product on a deterministic ground-truth data source whenever one exists, instead of asking the model to do something it's structurally bad at.

Looking ahead: wiring the subscription system to a real payment provider (it currently runs on manual plan assignment), adding more fortune domains beyond career/love/compatibility, exploring automated quality scoring across the agents' outputs, and tightening the mobile experience.