What is a core vocabulary?

It’s very difficult to say exactly how many words there are in the English language, because it depends on how you count them and, of course, the language is changing and growing all the time. But even at a conservative estimate, there are well over a quarter of a million distinct English words. That makes the task of teaching vocabulary to learners of English seem rather daunting.

Thankfully, Zipf’s Law comes to our rescue. It describes how word frequency falls away steeply with rank, so that a handful of the most frequent words in the language account for a disproportionately large chunk of any text, written or spoken. The 2000 most frequent words, in particular, make up somewhere around 80% of most texts. That makes frequency a good rule-of-thumb indicator of the words we should probably focus on teaching first.
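To make the idea of coverage concrete, here is a minimal Python sketch that measures what share of a text's running words is accounted for by its N most frequent words. The function names `tokenize` and `coverage`, the sample sentence, and the very simple regular-expression tokeniser are all illustrative assumptions. Real frequency lists are built from large corpora and usually group related word forms together, so treat this as a toy illustration rather than the method behind any published list.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a rough assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage(text, top_n):
    """Return the share of running words covered by the top_n most frequent words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Add up the occurrences of the top_n most frequent word types.
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(tokens)


# Hypothetical sample text, far too short to be representative.
sample = "the cat sat on the mat and the dog sat on the rug"
print(f"Top 3 words cover {coverage(sample, 3):.0%} of the sample")
```

Run on a long text with `top_n` set to 2000, the same function would give a rough sense of how much of that text a frequency-based core vocabulary covers.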

The Oxford 3000™: then and now

With this aim in mind, the Oxford 3000 word list was first put together back in 2005. Since then, the list has been widely used by learners, teachers, syllabus designers and materials writers to help them choose which vocabulary is worth spending the most time on. Fourteen years on, however, it was time for an update. The new Oxford 3000 has been thoroughly revised, with a new look at the criteria for inclusion and new frequency data based on a much larger and more up-to-date corpus.

Frequency vs. relevance

Whilst frequency is the guiding principle behind choosing which words to include on the list, it doesn’t quite work as a basis for selection on its own. That’s partly because a surprising number of words describe basic things in the world around us, and learners would expect to learn them quite early on, yet they wouldn’t qualify for a top 3000 on frequency alone. Words like apple and passport, for example, probably wouldn’t make the cut.

Thus, the new Oxford 3000 balances frequency with relevance to the average learner. As well as how common words are, the list compilers took into account whether they are typically used to talk about the kinds of themes and functional areas common in an ELT syllabus, and the types of tasks and topics needed in English exams.

A core vocabulary as a starting point

It would be wrong, however, to assume that 3000 words will be enough on their own for a learner to read and communicate successfully in English. The Oxford 3000 aims to provide a core vocabulary, that is, a solid basis that students can build around.

At the lowest levels, words on the list are likely to make up the bulk of the learner’s repertoire. So, for an A1 learner, for example, 90% of their vocabulary might consist of basic core words. As learners progress and want to read about and express a wider range of ideas, though, while they will still rely heavily on that core, they will also need to supplement it with vocabulary from other sources. The Oxford 3000 aims to provide a core vocabulary for learners up to roughly B2 level. By this stage, more and more of the vocabulary they acquire will reflect the unique interests and needs of each individual learner.

Click here to access the Oxford 3000, Oxford 5000 and Oxford Phrase List.


Julie Moore is a freelance ELT writer, lexicographer and corpus researcher. She’s written a wide range of ELT materials, but has a particular passion for words and always gets drawn back to vocabulary teaching. She’s worked on a range of learner’s dictionaries and other vocabulary resources, including the Oxford Academic Vocabulary Practice titles.

Watch Julie’s webinar to find out more about how the Oxford 3000 and Oxford 5000 were compiled, how the words have been aligned to the CEFR to guide learners, and how you can use the word lists in your teaching.

Why do we need EAP word lists?

The EAP vocabulary challenge

If you are like me, and your English for Academic Purposes (EAP) teaching typically involves mixed groups of students from a variety of language backgrounds and academic disciplines, then you know how difficult it can be to satisfy everyone’s needs. The pre-sessional PhD student who is going on to study cosmic black holes may get frustrated if the teacher spends a lot of time on the specialist terminology of medicine for another student in the class. It is far more straightforward if you are teaching English for Specific Purposes (ESP), the specialist language needed by groups who share the same discipline, for example a class of marine biologists or a group of town planners.

Given the size of the vocabulary of all our academic disciplines put together, with a total specialist terminology that probably runs into tens of thousands of words, we are faced with what would seem to be an impossible task. However, thanks to the power of corpora (computer-searchable databases of written and spoken texts), we are able to establish a common core of vocabulary which is used across a wide range of disciplines, one that we can use in teaching. You may well already be aware of general English word lists for EAP that are freely available online or which have been incorporated into some of the textbooks you and your students use. Nonetheless, a general English word list only tells us part of the story, and we need to do more to arrive at something which will genuinely be usable and useful for our EAP students.

A common core?

Let’s consider what a common core vocabulary for EAP might look like. There are different options for exploiting corpora, and each one has PROS and CONS:

  • A straightforward frequency list going from the most frequent to the least frequent words that are shared across many or all disciplines.

PROS: Easy to produce at the click of a mouse if you have lots of academic texts stored in a computer. We can focus on different segments of the list for students at different proficiency levels.

CONS: The list will still be very long, and much of it will be common, everyday words your students already know from general English.

  • A keyword list: this tells you which words are significant and distinct in academic English, when compared with any other type of English.

PROS: More powerful and targeted than a frequency list. We can concentrate on the ‘fingerprint’ or ‘DNA’ of academic English.

CONS: It’s not immediately obvious why a word might score so highly as a keyword. ‘Terms’ is an academic keyword. Is it because universities and colleges break the year up into teaching terms, or is it something else?

  • A list of chunks: chunks are recurring patterns of words. Most corpus software can produce lists of the most frequent 2-word, 3-word, 4-word, etc. chunks in a corpus of texts.

PROS: Chunks are extremely common in all kinds of texts and are fundamental in creating meaning, for example, structuring academic arguments, linking parts of texts, etc. They take us way beyond single words.

CONS: The computer often finds chunks that are incomplete or not easy to understand out of context (e.g. in the sense that).
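The three options above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the two tiny "corpora" are invented sentences, the function names are hypothetical, and the keyword score uses log-likelihood, which is one common keyness measure in corpus linguistics and not necessarily the one behind any particular published list.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a rough assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_list(tokens):
    """Words ordered from most to least frequent."""
    return Counter(tokens).most_common()


def keyness(word, target, reference):
    """Log-likelihood score for how differently `word` behaves in two token lists.

    A high score means the word's frequency differs sharply between the two;
    it does not say in which direction, so compare relative frequencies too.
    """
    a, b = target.count(word), reference.count(word)
    c, d = len(target), len(reference)
    # Expected counts if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll


def chunks(tokens, n):
    """Frequency list of n-word chunks (contiguous word sequences)."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams).most_common()


# Invented mini-corpora, for illustration only.
academic = tokenize(
    "In terms of the data, the results suggest that in terms of scale the analysis holds."
)
general = tokenize("The kids went to the park and the dog ran to the lake.")

print(frequency_list(academic)[:3])          # straightforward frequency list
print(keyness("terms", academic, general))   # keyword score for one word
print(chunks(academic, 3)[:3])               # most frequent 3-word chunks
```

In this contrived sample, the top of the frequency list is dominated by everyday words such as "the", while the keyword score picks out "terms" and the chunk list shows the pattern it occurs in. The chunk list also includes incomplete sequences, which reflects the drawback noted for chunks above.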

Is one set of lists enough?

All these different ways of approaching a common core for EAP have pros and cons, as we have seen, and in most cases, it’s true to say that the pros outweigh the cons. But there is another factor, too. Much of a student’s experience of academic life will come through speaking and listening. The students I teach typically must write essays, dissertations and reports, but they also have to attend lectures, take part in seminars and discussions and give presentations. So good academic word lists will consist of different lists for spoken and written EAP, taken from different corpora. Spoken EAP often overlaps in surprising ways with conversational English and yet is still first and foremost concerned with transmitting, creating and sharing academic knowledge. How is that achieved? The big question is: what do we learn from separating spoken and written EAP lists?

Then what?

Even if we build an ideal set of lists, the question remains as to how we can use them. Simply drilling and learning lists is not enough; the real challenge is how to harness the words, keywords and chunks to create continuous texts in speaking and writing. First comes the problem of meaning, so it will be necessary to experience and to practise the common core words and chunks in context; we may find that a particular word or chunk has developed a special meaning in one or more disciplines but not in a wide range of disciplines. It will also be important to exploit technological resources, such as links between word lists and online dictionaries. No single, simple approach will deliver the results we hope to get from word lists, and an integrated approach will serve us best.

Click here for a collection of four different word lists that together provide an essential guide to the most important words to know in the field of English for Academic Purposes (EAP): OPAL (the Oxford Phrasal Academic Lexicon).


Michael McCarthy is Emeritus Professor of Applied Linguistics at the University of Nottingham. He is author/co-author/editor of 53 books, including Touchstone, Viewpoint, the Cambridge Grammar of English, English Grammar Today, Academic Vocabulary in Use, From Corpus to Classroom, and titles in the English Vocabulary in Use series. He is author/co-author of 113 academic papers. He has co-directed major corpus projects in spoken English. He has lectured in English and English teaching in 46 countries.

Watch Michael’s webinar to find out more about the power of corpora to create EAP word lists. See some examples from OPAL, and get some practical ideas for using the word lists in your teaching.