Woordjes — Dutch Learning App

The Problem

I'm an American living in the Netherlands, taking Dutch lessons from a private tutor. Each week she'd email PDFs with a topic title and Dutch-English word pairs — "At the Doctor's Office," "Kitchen Vocabulary," that kind of thing. I needed flashcards, but existing apps are generic: they don't understand Dutch grammar (de/het articles, diminutives, plural forms), can't help with pronunciation, and definitely can't ingest a tutor's PDF and turn it into enriched study cards.

The initial need was simple: take these emailed PDFs, extract the word pairs, and enrich each one with grammar details, pronunciation hints based on part of speech, example sentences, and conjugation tables. What it became was bigger.

My Role

Solo project. Designed, built, and deployed in 4 days. Now shared with other students from the same tutor — each with their own login, their own word list assignments, and their own study progress.

The Approach

Three distinct AI integration points, each solving a different problem:

1. PDF Ingestion (the original use case): Upload a PDF from the tutor and Claude extracts the vocabulary, maps Dutch-English pairs to the card schema, and creates study cards. Handles messy formatting, tables, and mixed-language content. This is what started the whole project.

2. Ad Hoc AI Word List Generation: A chat interface where users describe what they want to study ("give me 20 words for navigating the train system" or "formal vs informal greetings"). Claude proposes a structured word list, and the user can review, refine, add, or remove words in a human-in-the-loop conversation before confirming. Only then does the system generate and enrich the flashcards — no wasted AI enrichment on words the user didn't want.

3. Batch Enrichment: Every confirmed card is automatically enriched with grammar details (article, plural, diminutive), a pronunciation guide tailored to the word's part of speech, conjugation tables (for verbs), and contextual example sentences. Enrichment runs in a background queue so users aren't left waiting.
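As a rough illustration of the part-of-speech-aware enrichment in step 3, the sketch below maps a part of speech to the fields that would be requested for a card. It is a minimal sketch under stated assumptions, not the app's actual code: the Card class, the field names, and fields_to_enrich are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical mapping: which enrichment fields to request per part of speech.
COMMON_FIELDS = ["pronunciation", "examples"]
FIELDS_BY_POS = {
    "noun": ["article", "plural", "diminutive"],
    "verb": ["conjugation"],
}


@dataclass
class Card:
    """Hypothetical card shape; the real schema is not shown in this write-up."""
    dutch: str
    english: str
    part_of_speech: str
    enrichment: dict = field(default_factory=dict)


def fields_to_enrich(card: Card) -> list[str]:
    """Return the enrichment fields this card still lacks."""
    wanted = FIELDS_BY_POS.get(card.part_of_speech, []) + COMMON_FIELDS
    return [name for name in wanted if name not in card.enrichment]


card = Card("huis", "house", "noun", {"article": "het"})
print(fields_to_enrich(card))
# → ['plural', 'diminutive', 'pronunciation', 'examples']
```

Computing only the missing fields would let a background job skip cards that are already fully enriched, which fits the goal of not spending AI calls unnecessarily.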

What I Built

PDF ingestion pipeline: upload the tutor's PDFs; Claude extracts the Dutch-English pairs, maps them to the card schema, and creates flashcards automatically.
Human-in-the-loop word list builder: a chat interface where users request ad hoc vocabulary, Claude proposes a list, and the user refines it through conversation, then confirms to trigger enrichment.
Part-of-speech-aware enrichment: each word gets grammar details matched to its part of speech (verbs get conjugation tables; nouns get articles, plurals, and diminutives), plus pronunciation hints and example sentences.
Multi-user with roles: a teacher/student model with invite tokens. Other students from the same tutor have their own logins, word list assignments, and study progress.
Dual LLM backend: abstracts over the Claude API and OpenAI-compatible local LLMs. Model selection is per task (Sonnet for generation, Haiku for parsing) and configured in Django admin.
Background processing queue: a daemon thread with priority ordering (PDFs first, enrichment second), progress tracking, and HTMX polling for near-real-time status.
Guardrail rejection system: configurable rate limiting, prompt-level guardrails, rejection detection, IP logging, and automatic lockout.

The Result

What started as a personal tool for my tutor's PDFs is now shared with other students. Going from "I need vocabulary for my doctor's appointment" to enriched flashcards with grammar, pronunciation, and examples takes 30 seconds; doing the same by hand would take an hour. The human-in-the-loop chat for ad hoc lists means users get exactly the vocabulary they want, refined through conversation before any enrichment runs.

The multi-user expansion happened naturally: other students from the same tutor wanted it, so I added logins, role-based access, and per-user word list assignments. The dual LLM backend lets me switch between Claude (quality) and local models (free, private) per task.
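
Per-task switching between backends could look roughly like the sketch below. It is a sketch under assumptions, not the project's code: the backend classes are fakes that only echo their input, "sonnet", "haiku", and "local-model" are placeholder labels rather than real model identifiers, and routing the "enrich" task to a local model is an invented example. In the app itself, this routing is configured in Django admin.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Minimal interface both backends are assumed to share."""

    def complete(self, model: str, prompt: str) -> str: ...


class FakeClaudeBackend:
    """Stand-in for a hosted API client; a real one would make a network call."""

    def complete(self, model: str, prompt: str) -> str:
        return f"[claude:{model}] {prompt}"


class FakeLocalBackend:
    """Stand-in for an OpenAI-compatible local server client."""

    def complete(self, model: str, prompt: str) -> str:
        return f"[local:{model}] {prompt}"


BACKENDS: dict[str, LLMBackend] = {
    "claude": FakeClaudeBackend(),
    "local": FakeLocalBackend(),
}

# Hypothetical routing table: task name -> (backend, model label).
TASK_CONFIG = {
    "generate": ("claude", "sonnet"),
    "parse": ("claude", "haiku"),
    "enrich": ("local", "local-model"),  # assumption for illustration only
}


def run_task(task: str, prompt: str) -> str:
    """Look up the backend and model for a task, then delegate the call."""
    backend_name, model = TASK_CONFIG[task]
    return BACKENDS[backend_name].complete(model, prompt)


print(run_task("parse", "huis - house"))
# → [claude:haiku] huis - house
```

Because callers only name a task, moving a task from a hosted model to a local one is a configuration change rather than a code change.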

Tech Stack

Backend: Django 5.x, SQLite
Frontend: HTMX, Tailwind CSS
AI: Claude API (Sonnet/Haiku), OpenAI-compatible local LLMs
Processing: Background daemon queue, PDF parsing
Analytics: PostHog
Deployment: Appliku, Docker