Watch videos — not just to understand
but to truly remember the words
Behind LingoTube is a core vocabulary of 5,943 words, each cross-tagged against three internationally recognized word-list systems. As you watch, the system knows how hard each word is, how common it is, and whether you should focus on it, so it prompts you to learn only the right word at the right time.
Every word tagged by three systems simultaneously
Difficulty, frequency, and academic relevance are three different questions. A word can be both high-frequency and hard; the two are not the same thing. Tagging each word on all three dimensions is what lets LingoTube recommend the right words for your goals.
NGSL
How common is this word in everyday English? Mastering high-frequency words covers ~92% of general text.
Answers: "Should I learn it first?"
CEFR
What level does mastering this word require? LingoTube uses levels A1 (beginner) through C1 (advanced) of this internationally recognized scale.
Answers: "How hard is this word?"
AWL / NAWL
Is this word key in academic writing? These lists are core for IELTS/TOEFL essays and deliberately exclude everyday high-frequency words.
Answers: "Key for exams or study abroad?"
Simpler = more frequent? Mostly yes — with exceptions
The real distribution of 5,943 words across difficulty × frequency. Darker = more words. Notice the striking dark block at the right of the C1 row.
| Difficulty↓ / Frequency→ | NGSL-1 | NGSL-2 | NGSL-3 | Non-NGSL |
|---|---|---|---|---|
| A1 | 724 | 178 | 50 | 124 |
| A2 | 462 | 300 | 109 | 119 |
| B1 | 259 | 335 | 167 | 141 |
| B2 | 180 | 476 | 411 | 504 |
| C1 | 31 | 48 | 168 | 1157 |
Beginner words are mostly high-frequency: about 88% of A1 words (952 of 1,076) appear on the NGSL. At C1 the pattern reverses: 1,157 of 1,404 words (about 82%) fall outside the frequency list, so they're hard AND uncommon. This means memorizing only high-frequency words won't get you to an advanced level. That is why LingoTube switches to difficulty- and academic-based recommendations at advanced stages, rather than piling on more high-frequency words.
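The arithmetic behind the table can be checked with a few lines of Python. The counts are copied from the table above; the variable and function names are illustrative only.

```python
# Word counts copied from the difficulty x frequency table above.
# Column order: NGSL-1, NGSL-2, NGSL-3, Non-NGSL.
counts = {
    "A1": [724, 178, 50, 124],
    "A2": [462, 300, 109, 119],
    "B1": [259, 335, 167, 141],
    "B2": [180, 476, 411, 504],
    "C1": [31, 48, 168, 1157],
}


def level_summary(row):
    """Return (total words, share of words outside the NGSL) for one level."""
    total = sum(row)
    return total, row[-1] / total


summary = {level: level_summary(row) for level, row in counts.items()}
grand_total = sum(total for total, _ in summary.values())

for level, (total, outside) in summary.items():
    print(f"{level}: {total:>5} words, {outside:.0%} outside the NGSL")
print(f"Total: {grand_total}")
```

The row totals add up to the 5,943 words quoted above. The share of words outside the NGSL stays low through B1, reaches roughly a third at B2, and dominates at C1.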
Five difficulty levels — matching what videos to watch
From beginner to advanced, how many words each level needs and what content fits. Bar length = word count at that level.
From one subtitle line to "should I highlight this word?"
Every word in every subtitle runs through this pipeline before the system decides whether to highlight it, show a definition, and add it to your review queue.
Tokenize
Split subtitle into words: running, abandoned…
Lemmatize
running → run, normalize to base form
Lookup
Match against the vocabulary → CEFR level? Frequency band? Academic list?
Compare goal
Within your target level and not yet mastered?
Decide
Highlight + definition + add to review queue
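The five steps above can be sketched in Python as follows. This is a minimal illustration, not LingoTube's actual implementation: the `VOCAB` and `LEMMAS` tables, the `Entry` fields, and every function name are hypothetical, the tags on the sample words are made up, and "within your target level" is assumed to mean "at or below the target CEFR level".

```python
import re
from dataclasses import dataclass
from typing import Optional

CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1"]


@dataclass(frozen=True)
class Entry:
    cefr: str                 # difficulty level
    ngsl_band: Optional[int]  # 1-3, or None if outside the NGSL
    academic: bool            # on the AWL / NAWL


# Hypothetical toy vocabulary; the tags are illustrative, not real data.
VOCAB = {
    "run": Entry("A1", 1, False),
    "abandon": Entry("B2", 2, False),
    "hypothesis": Entry("C1", None, True),
}

# Toy lemma table standing in for a real lemmatizer.
LEMMAS = {"running": "run", "ran": "run", "abandoned": "abandon"}


def tokenize(line):
    """Step 1: split a subtitle line into lowercase word tokens."""
    return re.findall(r"[a-z']+", line.lower())


def lemmatize(token):
    """Step 2: normalize a token to its base form."""
    return LEMMAS.get(token, token)


def words_to_highlight(line, target_level, mastered):
    """Steps 3-5: look up each lemma, compare with the goal, decide."""
    max_rank = CEFR_ORDER.index(target_level)
    picked = []
    for token in tokenize(line):
        lemma = lemmatize(token)
        entry = VOCAB.get(lemma)          # Step 3: lookup
        if entry is None:
            continue                      # not in the core vocabulary
        within_goal = CEFR_ORDER.index(entry.cefr) <= max_rank  # Step 4
        if within_goal and lemma not in mastered and lemma not in picked:
            picked.append(lemma)          # Step 5: highlight and queue
    return picked
```

For example, `words_to_highlight("She was running from the abandoned house", "B2", mastered={"run"})` returns `["abandon"]`: "run" is skipped because it is already mastered, and the remaining words are not in the toy vocabulary.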
The full annotation every word carries
Filter by level to see how words at different difficulties are cross-tagged by three systems.
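One way to picture the per-word annotation is a small record with one field per tagging system. The sketch below is an assumption about the shape of the data, not LingoTube's real schema; the `Annotation` class, the helper names, and the sample words with their tags are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    word: str
    cefr: str                 # CEFR: how hard is this word?
    ngsl_band: Optional[int]  # NGSL: 1-3, or None if outside the list
    academic: bool            # AWL / NAWL: key for academic writing?


# Hypothetical sample entries; the tags are illustrative, not real data.
ANNOTATIONS = [
    Annotation("water", "A1", 1, False),
    Annotation("analyze", "B2", None, True),
    Annotation("meticulous", "C1", None, False),
]


def filter_by_level(annotations, level):
    """Keep only the annotations tagged with the given CEFR level."""
    return [a for a in annotations if a.cefr == level]


def describe(a):
    """Render one annotation as a single readable line."""
    freq = f"NGSL-{a.ngsl_band}" if a.ngsl_band else "Non-NGSL"
    kind = "academic" if a.academic else "general"
    return f"{a.word}: {a.cefr}, {freq}, {kind}"


for a in filter_by_level(ANNOTATIONS, "C1"):
    print(describe(a))
```

Keeping the three tags as independent fields mirrors the point made earlier: difficulty, frequency, and academic relevance are separate questions, so none of them is derived from the others.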
Try it with a video right now
Watch a video, click a word, remember it — that's LingoTube.
