Content creators don't have a writing problem. They have a structure problem. Most people know what they want to say — they just can't figure out how to say it in a way that holds attention from start to finish. StoryFlow was our answer to that. It's an open-source AI tool we built at V12 Labs that helps creators structure, draft, and refine narrative-driven content. Here's the full build story.
Table of Contents
- The Problem We Kept Seeing
- What StoryFlow Actually Does
- The Tech Stack Decision
- What the AI Actually Does Under the Hood
- What Was Unexpectedly Hard
- What We Shipped vs. What We Cut
- What We'd Do Differently
- The Open-Source Repo
- Ready to Build?
The Problem We Kept Seeing
We work with a lot of early-stage founders at V12 Labs. Almost all of them have content goals — blog posts, LinkedIn articles, pitch narratives, onboarding docs. And almost all of them hit the same wall: they can write sentences but they can't structure an argument.
The typical workflow looked like this: open a blank document, write a few paragraphs, realize it's going nowhere, start over, give up. The problem wasn't a lack of ideas or writing ability. It was that narrative structure — how to open, how to build tension, how to land a point — is a skill most people were never taught explicitly.
Existing AI writing tools made this worse in a specific way. They generated fluent text with no structure. You'd get paragraphs that sounded good individually but didn't build toward anything. The creator still had to do the hard structural work themselves.
We wanted to flip the workflow. Instead of AI generating text that creators then had to structure, we wanted AI doing the structural scaffolding first so creators could fill in what only they know — the specific experiences, examples, and opinions that make content worth reading.
What StoryFlow Actually Does
StoryFlow is built around three core functions:
1. Narrative arc generation — You give it a topic and a target audience. It returns 3–5 structural arc options (problem-agitation-solution, hero's journey, before/after/bridge, listicle-with-spine, etc.) with a brief explanation of when each works best.
2. Section scaffolding — Once you pick an arc, it breaks the piece into sections with a purpose statement for each ("this section needs to establish the reader's pain before introducing your solution") and suggested transition logic between sections.
3. Draft review — You paste in a draft section and it tells you what's working narratively, what's missing, and what to cut — without rewriting your words. It's advisory, not generative. The goal is to make your writing better, not replace it.
What it deliberately doesn't do: write your content for you. That was a product decision we made early and held to. If the AI writes it, it doesn't sound like you. Content that doesn't sound like you doesn't build trust with your audience. StoryFlow is a structural coach, not a ghostwriter.
The Tech Stack Decision
We built StoryFlow on:
- Next.js + TypeScript — our default frontend stack across almost everything at V12 Labs
- Anthropic Claude (claude-3-5-haiku) — for the narrative intelligence layer
- Supabase — for user state, saved arcs, and draft history
- Vercel — deployment, edge functions for the AI calls
Why Claude over GPT-4o for this specific product? Two reasons. First, Claude's instruction-following is tighter for constrained outputs — when we asked it to return structured arc options in a specific JSON format, it was more consistent than GPT-4o in our testing. Second, Claude tends to be more conservative about over-generating. For a tool where we explicitly didn't want the AI writing paragraphs of content, that mattered.
We considered building this on an open-source model (Llama 3, Mistral) to avoid per-token API costs, but the structural reasoning quality wasn't there at the time we built it. The gap in narrative coherence was significant enough that we stuck with Claude.
What the AI Actually Does Under the Hood
The core intelligence in StoryFlow isn't a single large prompt — it's a chain of smaller, constrained calls.
Arc generation call: Takes topic + audience + content type (blog post, LinkedIn article, video script, pitch narrative) and returns a structured JSON array of arc options with names, descriptions, and ideal-use-case notes. We constrain the output format tightly — no freeform text, pure structured data. The UI renders it.
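The exact prompt and schema live in the repo; as a rough sketch, the arc-option contract and the defensive parsing around it might look like this (field names here are illustrative, not the repo's actual schema):

```typescript
// Illustrative shape of one arc option from the arc-generation call.
// Field names are assumptions, not the repo's exact schema.
interface ArcOption {
  name: string;        // e.g. "problem-agitation-solution"
  description: string; // one-paragraph summary of the structure
  bestFor: string;     // when this arc works best
}

// Defensively parse the model's output: even well-constrained LLM calls
// occasionally wrap JSON in prose or code fences, so extract only the
// outermost array before parsing, and drop malformed entries.
function parseArcOptions(raw: string): ArcOption[] {
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end === -1) {
    throw new Error("no JSON array found in model output");
  }
  const parsed: unknown[] = JSON.parse(raw.slice(start, end + 1));
  return parsed.filter(
    (o: any): o is ArcOption =>
      typeof o?.name === "string" &&
      typeof o?.description === "string" &&
      typeof o?.bestFor === "string"
  );
}
```

Keeping the output "pure structured data" means the UI never has to guess where an arc name ends and its description begins; the render layer just maps over typed objects.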
Section scaffolding call: Takes the selected arc and the specific topic/angle and returns a section-by-section breakdown. Each section object has: name, word-count target, purpose (what this section must accomplish narratively), transition cue (the last sentence of this section should set up...), and common mistakes to avoid in this section type.
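Each section object maps naturally onto a typed structure that the editor can render directly. A hypothetical version (names are illustrative, based on the description above rather than the repo's exact schema):

```typescript
// Illustrative section object from the scaffolding call; field names are
// assumptions drawn from the description above, not the repo's schema.
interface SectionScaffold {
  name: string;
  wordTarget: number;      // suggested word count for the section
  purpose: string;         // what this section must accomplish narratively
  transitionCue: string;   // what the section's last sentence should set up
  commonMistakes: string[];
}

// Render a scaffold as a markdown outline the editor UI could display
// or export alongside the draft.
function renderOutline(sections: SectionScaffold[]): string {
  return sections
    .map(
      (s, i) =>
        `## ${i + 1}. ${s.name} (~${s.wordTarget} words)\n` +
        `Purpose: ${s.purpose}\n` +
        `Transition: ${s.transitionCue}`
    )
    .join("\n\n");
}
```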
Draft review call: This is the most complex. It takes a section, its stated purpose from the scaffolding, and the full arc context, then returns an assessment across four dimensions: narrative function (does it do what it's supposed to?), clarity (can a non-expert follow this?), momentum (does it move the reader forward?), and specificity (is it concrete or generic?). Each dimension gets a 1–3 score and a one-line note. No rewrites, no generated replacement text.
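The four-dimension, 1–3 scoring scheme lends itself to a small typed structure on the client. A sketch of how the UI might consume it, surfacing the weakest dimensions first (the types and helper are illustrative, not from the repo):

```typescript
// The four review dimensions described above, each scored 1-3 with a
// one-line advisory note. Names are illustrative, not the repo's schema.
type Dimension = "narrativeFunction" | "clarity" | "momentum" | "specificity";

interface DimensionScore {
  score: 1 | 2 | 3;
  note: string; // advisory only -- never generated replacement text
}

type Review = Record<Dimension, DimensionScore>;

// Sort dimensions weakest-first so the UI leads with the most
// actionable feedback instead of burying it under praise.
function weakestFirst(review: Review): [Dimension, DimensionScore][] {
  return (Object.entries(review) as [Dimension, DimensionScore][]).sort(
    (a, b) => a[1].score - b[1].score
  );
}
```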
The entire system is built around keeping the human in control of the words. The AI handles the structural meta-layer only.
What Was Unexpectedly Hard
Getting the arc generation to be genuinely useful, not just named differently. Our first version returned five arc options that were basically the same structure with different labels. The differentiation felt cosmetic. We had to spend a lot of time on the prompting and the arc taxonomy to make sure each option was actually a meaningfully different approach that would produce a structurally different piece of content.
The draft review calibration. Getting the AI to be critical without being discouraging, and to be specific without being prescriptive, took a lot of prompt iteration. Early versions either gave vague positive feedback ("this section has good energy") or rewrote entire paragraphs against our explicit instructions. The final prompt is about 400 tokens of careful instruction on what advisory-without-rewriting means in practice.
Supabase real-time state. We wanted users to see their arc and scaffolding update live as they edited their topic or changed their audience. The Supabase real-time subscriptions worked but added latency we didn't anticipate at the edge. We ended up with a hybrid approach — optimistic local state with Supabase sync on blur rather than on keypress.
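Stripped of the React and Supabase specifics, the blur-sync pattern reduces to a small piece of state logic: keypresses mutate local state only, and the blur handler pushes at most one write if anything changed. A minimal sketch, with the Supabase upsert stubbed out as a `persist` callback:

```typescript
// Sketch of the hybrid approach: optimistic local state, with a single
// backend sync on blur rather than on every keypress. The `persist`
// callback stands in for the actual Supabase upsert.
class BlurSyncedField {
  private value: string;
  private dirty = false;

  constructor(
    initial: string,
    private persist: (value: string) => Promise<void>
  ) {
    this.value = initial;
  }

  // Called on every keypress: local only, zero network round-trips.
  onChange(next: string): void {
    this.value = next;
    this.dirty = true;
  }

  // Called on blur: sync the latest value once, and only if it changed.
  async onBlur(): Promise<void> {
    if (!this.dirty) return;
    this.dirty = false;
    await this.persist(this.value);
  }

  get current(): string {
    return this.value;
  }
}
```

The `dirty` flag is what makes the hybrid cheap: tabbing through fields without editing costs nothing, and a burst of keystrokes collapses into one write.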
What We Shipped vs. What We Cut
Shipped:
- Arc generation and selection
- Full section scaffolding with purpose statements
- Draft review for individual sections
- Save/load functionality for arcs and drafts
- Export to markdown and plain text
Cut:
- Cross-section coherence review (checking if section 3 sets up section 4 properly) — the prompt complexity was too high and output quality too variable
- LinkedIn-specific post formatting mode — we built it, tested it, decided it was out of scope for v1
- Team collaboration (multiple users on one piece) — Supabase makes this possible, but we didn't have a real use case to validate against
What We'd Do Differently
The decision to build draft review as advisory-only was right. But we underestimated how much explanation users need to understand why that constraint exists. A lot of early testers tried to use it as a rewriter and were frustrated when it wouldn't rewrite anything. We'd add more upfront framing around the product's philosophy — "we help you think, not think for you" — before anyone touches the UI.
We'd also build the LinkedIn post mode from day one. Content creators on LinkedIn are the highest-intent audience for a tool like this, and we left that segment underserved in v1.
The Open-Source Repo
StoryFlow is fully open-source at github.com/v12labs-engineering/storyflow. The repo includes the full Next.js codebase, the prompt library with all three core calls documented, a setup guide for running locally with your own Claude API key, and notes on the arc taxonomy we developed.
If you're a developer building tools for content creators, the prompt architecture for constrained AI advisory feedback — scoring without rewriting — might be the most useful thing to borrow from the repo.
Ready to Build?
At V12 Labs, we build production-ready AI tools like StoryFlow for founders and early-stage companies. $6K flat fee, 15-day delivery, full source code ownership from Day 1.
If you have an AI tool idea you want to ship, book a discovery call at v12labs.io. We'll scope it honestly and tell you what's possible in 15 days.