Full fine-tuning (FFT)
Placeholder — write up FFT recipe details
What FFT is
FFT is a three-stage pipeline:
- Tokenizer replacement: swap Whisper's multilingual BPE for a per-language BPE trained on monolingual text
- Multitask fine-tuning: interleave ASR and text-only batches in a single training stage
- Save best-by-criterion: checkpoint selection follows the Group A/B/C criterion for each language (see Save criteria)
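
The interleaving in the multitask stage can be pictured with a small scheduler. This is a minimal sketch, not the recipe's actual data loader: the function name `interleave_batches`, the `asr_per_text` ratio, and the stop-when-either-source-runs-out policy are all assumptions made for illustration, since the mixing schedule is not specified here.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def interleave_batches(
    asr_batches: Iterable[T],
    text_batches: Iterable[T],
    asr_per_text: int = 1,  # hypothetical mixing ratio, not taken from the recipe
) -> Iterator[Tuple[str, T]]:
    """Yield (task, batch) pairs: `asr_per_text` ASR batches, then one text-only batch.

    Iteration stops as soon as either source cannot supply its share,
    so a trailing partial group of ASR batches is dropped.
    """
    if asr_per_text < 1:
        raise ValueError("asr_per_text must be >= 1")
    asr_iter = iter(asr_batches)
    text_iter = iter(text_batches)
    while True:
        # Pull the next group of ASR batches.
        group = list(islice(asr_iter, asr_per_text))
        if len(group) < asr_per_text:
            return
        for batch in group:
            yield ("asr", batch)
        # Follow the group with a single text-only batch.
        try:
            text_batch = next(text_iter)
        except StopIteration:
            return
        yield ("text", text_batch)
```

In a training loop, the task tag would select which loss to compute for the batch (for example an ASR loss for `"asr"` and a text-only decoder loss for `"text"`); that dispatch is left out because the losses are not described in this section.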
When to use FFT
- Non-Latin scripts where Whisper's BPE fragments text badly (fewer than 2 characters per token)
- Low-resource languages where text-only data is more abundant than ASR data
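
The chars/token check can be run on a sample of text before committing to FFT. The sketch below assumes the ratio is total characters divided by total tokens over the sample, with characters counted as Unicode code points; the helper names `chars_per_token` and `fragments_badly` are hypothetical, and `encode` stands in for whatever tokenizer is being evaluated (for instance the `encode` method of Whisper's tokenizer).

```python
from typing import Callable, Iterable, Sequence


def chars_per_token(
    texts: Iterable[str],
    encode: Callable[[str], Sequence],
) -> float:
    """Average number of characters (code points) per token over `texts`."""
    total_chars = 0
    total_tokens = 0
    for text in texts:
        total_chars += len(text)          # code points, not grapheme clusters
        total_tokens += len(encode(text))
    if total_tokens == 0:
        raise ValueError("encode produced no tokens for the given texts")
    return total_chars / total_tokens


def fragments_badly(
    texts: Iterable[str],
    encode: Callable[[str], Sequence],
    threshold: float = 2.0,  # the "chars/token < 2" rule of thumb above
) -> bool:
    """True when the tokenizer averages fewer than `threshold` chars per token."""
    return chars_per_token(texts, encode) < threshold
```

A tokenizer that falls back to raw UTF-8 bytes for a script will score below 1 character per token on that script, since most non-Latin code points take two or three bytes each.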