TSAI_CHENG-HUNG
LOG_ENTRY · Jun 12, 2026 · ⊙ 7 MIN READ

Why I Built This Site (and How It Works)

The story behind this site — a home for my résumé, projects, and a running log of what I'm learning — plus a quick look at the stack, the data model, and why I chose this architecture.

#Meta #Architecture #Notes

Why I built this

I wanted one place that's truly mine — not just a static résumé, but somewhere I can keep adding to: projects I ship, things I learn, and the occasional reflection. This blog is where I'll record the day-to-day — what I'm building, what broke, and what I'd do differently. Think of it as a running log rather than a finished portfolio.

What it runs on

The site is split into a frontend and a backend:

The data model (a peek)

Everything you see is data, not hardcoded:

Meet the assistant

There's now a small chat widget on the site, built with LangGraph and Gemini. Ask it something about me, a project, or a post, and it streams back a grounded answer with links to the right page, in whichever language (EN / 中) you ask in.

Under the hood it's a tiny 3-node graph:

  1. Condense — rewrite a follow-up question into a standalone search query, resolving references like “that one” using the conversation so far.
  2. Retrieve — embed the query and pull the closest chunks from doc_chunks via pgvector's cosine search, scoped to the current language.
  3. Generate — stream an answer grounded only in those chunks, citing sources as Markdown links. If the context doesn't cover the question, it says so instead of guessing.
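The three steps above can be sketched in plain Python, without LangGraph or Gemini. This is only an illustration, not the site's actual code: the `condense`, `retrieve`, and `generate` functions, the `Chunk` type, the toy two-dimensional embeddings, and the URLs are all hypothetical, and the in-memory cosine ranking merely stands in for the search pgvector would run in the database.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Chunk:
    text: str
    url: str                # hypothetical page path
    lang: str               # "en" or "zh"
    embedding: list[float]  # toy vector; a real setup would call an embedding model


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def condense(history: list[str], question: str) -> str:
    # Stand-in for the LLM rewrite: prepend the previous turn so the
    # follow-up carries its own context.
    return f"{history[-1]} {question}" if history else question


def retrieve(query_vec: list[float], chunks: list[Chunk], lang: str, k: int = 2) -> list[Chunk]:
    # Scope to the current language first, then rank by cosine similarity.
    scoped = [c for c in chunks if c.lang == lang]
    ranked = sorted(
        scoped,
        key=lambda c: cosine_similarity(query_vec, c.embedding),
        reverse=True,
    )
    return ranked[:k]


def generate(question: str, context: list[Chunk]) -> str:
    # Stand-in for the streamed LLM answer: say so when nothing was retrieved.
    if not context:
        return "I don't have enough context to answer that."
    links = ", ".join(f"[{c.text}]({c.url})" for c in context)
    return f"Sources for '{question}': {links}"


CHUNKS = [
    Chunk("Site architecture post", "/blog/why-i-built-this", "en", [1.0, 0.0]),
    Chunk("Résumé summary", "/resume", "en", [0.0, 1.0]),
    Chunk("網站架構文章", "/zh/blog/why-i-built-this", "zh", [1.0, 0.0]),
]

query = condense(["Tell me about the site."], "How does that one work?")
top = retrieve([0.9, 0.1], CHUNKS, lang="en", k=1)
answer = generate(query, top)
print(answer)
```

Filtering by language before ranking mirrors the "scoped to the current language" step: the Chinese chunk has the same toy embedding as the English one, yet it never reaches an English answer.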

Keeping the index in sync

The trickiest part of any RAG setup is keeping the vector index honest as content changes. doc_chunks now syncs incrementally: every source (each project, post, and the profile blurb, in each language) is hashed, and only sources whose hash changed get re-chunked and re-embedded — unchanged ones are left alone. Anything removed or unpublished has its chunks deleted too. This runs after seeding and as a background task when the API starts, so the assistant's knowledge stays current without me remembering to run anything by hand. A --full flag is still there if I ever need to force a complete rebuild (say, after changing the chunker or embedding model).
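The hash comparison described above might look roughly like the following. It is a minimal sketch under assumptions: `plan_sync`, `content_hash`, the source keys, and the choice of SHA-256 are all hypothetical, and the real sync would go on to chunk, embed, and write to doc_chunks, which is left out here.

```python
import hashlib


def content_hash(text: str) -> str:
    # SHA-256 is an assumption; any stable content hash would do.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(sources: dict[str, str], indexed: dict[str, str], full: bool = False):
    """Decide which sources to re-embed and which to delete.

    sources: source key -> current text (published content only)
    indexed: source key -> hash recorded at the previous sync
    full:    when True, re-embed everything regardless of hashes
    """
    current = {key: content_hash(text) for key, text in sources.items()}
    # New or changed sources need re-chunking and re-embedding.
    to_embed = sorted(k for k, h in current.items() if full or indexed.get(k) != h)
    # Sources that were indexed before but are gone now lose their chunks.
    to_delete = sorted(k for k in indexed if k not in current)
    return to_embed, to_delete, current


# Previous sync state: one post that will change, one unchanged project,
# and one post that has since been unpublished.
indexed = {
    "post:why:en": content_hash("old text"),
    "project:a:en": content_hash("same"),
    "post:gone:en": content_hash("x"),
}
sources = {
    "post:why:en": "new text",
    "project:a:en": "same",
    "profile:zh": "你好",
}

to_embed, to_delete, new_hashes = plan_sync(sources, indexed)
print(to_embed)   # changed and new sources
print(to_delete)  # removed or unpublished sources
```

Passing `full=True` plays the role of the --full flag: every current source lands in `to_embed`, while deletions are computed the same way.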

Why this architecture

A few deliberate choices, holding up well so far:

What's next

With the assistant live, the next focus is tuning retrieval quality as more projects and posts land, and writing more of these notes as I go.
