A concordancer is a computer program which is used to search through a corpus (plural form corpora), in other words a collection of texts. These texts can be spoken or written texts, and can be collected according to many different principles. For example, there are corpora for newspaper articles, fiction, web pages, as well as for academic English (spoken and written). Some corpora have thousands of words, while others have millions or even billions of words.
For another look at the same content, check out the video on YouTube (also available on Youku). There is a worksheet (with answers and teacher's notes) for this video.
Concordancers function like search engines, providing a list of sentences containing the search term or other information such as frequency. This allows the user to look for patterns, to see how common a word or phrase is, or to understand how the word is used, for example whether it commonly combines with particular words or phrases (i.e. collocations).
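To make the idea concrete, here is a minimal sketch of how a keyword-in-context search might work on a small piece of text. It is not how any of the concordancers reviewed here are actually implemented; the function name `concordance`, the `width` parameter and the sample sentences are all invented for illustration.

```python
import re

def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for each occurrence of keyword."""
    lines = []
    # \b limits the match to whole words; IGNORECASE also catches
    # sentence-initial uses such as "Effect".
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for m in re.finditer(pattern, text, re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group()}] {right}")
    return lines

# Invented sample text, not taken from any corpus.
sample = ("The policy had a negative effect on growth. "
          "The net effect was small. Effect sizes were reported.")
kwic_lines = concordance(sample, "effect")
for line in kwic_lines:
    print(line)
```

Counting the returned lines gives the raw frequency of the search term, which is the other kind of information a concordancer typically reports.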
There are two main types of concordancer: software installed on a local machine, and online concordancers. This review looks only at the latter, and considers the following four concordancers, all of which are free to use: Lextutor, the BNC, MICUSP and SKELL.
The review not only covers their features, but also gives some guidance on how to use them to study and improve academic English.
The Lextutor website has many useful tools, including a vocabulary highlighter, reviewed on the AWL highlighters page, as well as the concordancer.
The concordancer defaults to the BAWE (British Academic Written English) corpus containing 8 million words. However, you can select from many different corpora, including BASE (the spoken equivalent of BAWE), COCA (Corpus of Contemporary American English), academic abstracts, and even a corpus of Shakespeare and one of Jane Austen.
Like other tools on the Lextutor site, the concordancer is powerful but not very user-friendly. It takes some getting used to, and students using it may need some training. It does not help that the search interface (above) changes after the first search. The concordance lines, however, are very clear and can be sorted in many different ways: by keyword; by the word to the left or right of the keyword; or by the sub-corpus, i.e. Business, Biology, Economics and so on. This can be useful in identifying common collocations or relative frequency in different academic disciplines, although one negative is that frequencies are usually displayed in a rather ugly box at the foot (rather than the top) of the page.
Below is a partial screenshot for the word effect sorted by words to the left, showing that this word collocates frequently with negative and net and less frequently with negligible.
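Sorting by the word to the left can be sketched in a few lines of code. The example below is an illustration only, assuming a very simple tokeniser that ignores sentence boundaries; the function name `kwic_sorted_left` and the sample sentences are invented and do not come from Lextutor or BAWE.

```python
import re

def kwic_sorted_left(text, keyword):
    """Return (left_word, keyword, right_word) triples sorted by the left word."""
    # Crude tokeniser: lowercase words only, punctuation dropped,
    # so sentence boundaries are ignored.
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[i - 1] if i > 0 else ""
            right = tokens[i + 1] if i + 1 < len(tokens) else ""
            hits.append((left, tok, right))
    # Tuples sort on their first element first, i.e. the left-hand word.
    return sorted(hits)

# Invented sample text for illustration.
sample = ("The net effect was small. A negligible effect was found. "
          "It had a negative effect on sales. "
          "The negative effect on trade persisted.")
sorted_hits = kwic_sorted_left(sample, "effect")
for left, word, right in sorted_hits:
    print(f"{left:>12} {word} {right}")
```

Because identical left-hand words end up next to each other after sorting, repeated collocations such as negative effect become easy to spot by eye.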
The search can be set for an exact match, or to find words which start with, end with or contain the search term (essentially a wildcard search), or set to search for the word family, which makes it very powerful indeed. It is also possible to search for associated words (either left or right). Below are some of the concordance lines for negative associated on the right with effect. This shows very clearly that this phrase is used with the verb to have and the preposition on, i.e. to have a negative effect on.
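The difference between the exact, starts-with, ends-with and contains options can be illustrated with regular expressions. This is a rough sketch under the assumption that each option maps onto a simple pattern; the function `build_pattern` and the mode names are hypothetical and are not Lextutor's own. The word family option is left out, since it would need a list of family members rather than a pattern.

```python
import re

def build_pattern(term, mode="exact"):
    """Compile a regex for one of four hypothetical match modes."""
    core = re.escape(term)
    patterns = {
        "exact": rf"\b{core}\b",        # the term as a whole word
        "starts": rf"\b{core}\w*",      # words beginning with the term
        "ends": rf"\w*{core}\b",        # words ending with the term
        "contains": rf"\w*{core}\w*",   # words containing the term anywhere
    }
    return re.compile(patterns[mode], re.IGNORECASE)

# Invented sample sentence for illustration.
sample = "The effects were effective, but the aftereffect had no effect."
matches = {mode: build_pattern("effect", mode).findall(sample)
           for mode in ("exact", "starts", "ends", "contains")}
for mode, found in matches.items():
    print(mode, found)
```

The exact mode is the narrowest and the contains mode the broadest, which is why the wildcard-style options are useful for catching plurals and derived forms in one search.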
The keyword itself shows as a hyperlink, and clicking on this gives an extended example of the word in use, rather than just the short extract shown on the main search page (see below).
In addition to English, the site has concordancers for French, German and Spanish, which might be helpful not only for students learning those languages, but also for native speakers of those languages, who can experiment with the tool in their own language to understand how it works.
One problem with the site is that it can sometimes be a bit slow, especially when compared to other concordancers reviewed here. That is probably a reflection of how high-powered it is, but it can sometimes be a bit frustrating.
The British National Corpus (BNC), created by Oxford University Press in the late 1980s, is a corpus of 100 million words in a wide range of genres, both spoken and written, covering fiction, magazines, newspapers and academic texts.
The site has a very clear interface (above). When using it for academic purposes, however, you should set the Section to Academic (see below), to ensure the search is only conducted in academic texts (unless you want results for all types of texts).
Like Lextutor, the site will show concordance lines. Some examples are shown below, again for the word effect.
The concordance lines, however, are not the most useful feature of the BNC site. There are several others which make it stand out from other concordancers reviewed here. One of these is the very visual and clear way it shows collocations, using the Collocates tab. Below is the frequency of different words which precede the word effect. The setting used is for words which occur one word to the left, though it is possible to select left and/or right, and up to four words away from the search word. In addition to some simple words (its, no) there are some useful adjective collocations shown in this search: direct effect, significant effect, adverse effect. The word negative appears much lower down in the list (not shown here, but 35th, frequency of 18), in contrast to adverse, which is 8th in the list, frequency of 50, and the collocation adverse effect therefore seems to be more common in the BNC than negative effect.
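The kind of count behind a collocates list can be sketched as follows. This is only an illustration of counting words within a set distance to the left of the search word; the function name `left_collocates`, the `span` parameter and the sample sentences are invented, and the numbers produced have nothing to do with the BNC figures quoted above.

```python
import re
from collections import Counter

def left_collocates(text, node, span=1):
    """Count the words found up to `span` positions to the left of `node`."""
    # Crude tokeniser: lowercase words only, sentence boundaries ignored.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - span):i])
    return counts

# Invented sample text for illustration.
sample = ("An adverse effect was noted. The adverse effect grew. "
          "A negative effect followed. There was no direct effect.")
collocates = left_collocates(sample, "effect", span=1)
for word, freq in collocates.most_common():
    print(word, freq)
```

Widening `span` to 2, 3 or 4 corresponds to looking further away from the search word, at the cost of pulling in more function words such as the and a.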
Using the Compare tab, it is possible to compare how words are used across different sections (spoken, fiction, magazine, newspaper, academic and miscellaneous). Below is a comparison chart for the word bad, which most writers would agree is not a very academic word, though many beginning academic writers overuse it. The chart shows that this word has the lowest frequency in the academic section, meaning it is used much less frequently in academic writing compared to other forms of writing.
Below, on the other hand, is a chart for the word adverse, which was shown above to be a collocate for the word effect. Here it can be seen that the word occurs in academic writing more frequently than in any other section. The word adverse, as well as the collocation adverse effect, therefore seem very appropriate for use in academic English.
There are a few drawbacks to the BNC concordancer, however. The first is that you need to register to use it. This does not take long, but is not straightforward, and I found it rejected some email addresses. Additionally, for the free account, there are certain limitations. The main one is that the free account allows only 50 searches per day (compared to 200 for the paid account) and a limited number of days (3 days per week, compared to no restriction for the paid account). These should be fine for most users, however. There is also an advert for the paid version which interrupts searches from time to time, slowing it down.
MICUSP (Michigan Corpus of Upper-level Student Papers), as the name suggests, uses a special corpus, namely high-level student papers submitted at the University of Michigan. This makes it a unique corpus, since it comprises precisely the kind of writing that students using the concordancers listed here might be producing. Despite the limited range of texts, it is a large corpus, comprising 2.6 million words from 829 student papers.
The results for searches display the relative frequency in different disciplines, as well as how often they appear in different kinds of writing. The initial result is for raw frequencies; however, some disciplines contain more words in the corpus than others, and it is better to select the per 10,000 words option instead. Above is an example for the word effect. This occurs most commonly in Economics (11.89 times per 10,000 words), though it is also frequent in Physics (10.49 times per 10,000 words). Additionally, it occurs more frequently in research papers (39%) and reports (37%) than in other genres such as argumentative essays (12%).
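The reason the per 10,000 words figure is fairer than the raw count can be shown with a short calculation. The sketch below assumes the usual normalisation (hits divided by the size of the sub-corpus, multiplied by 10,000); the function name `per_10k` is invented, and the counts are made-up numbers, not MICUSP data.

```python
def per_10k(hits, total_words):
    """Convert a raw hit count into a frequency per 10,000 words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return hits / total_words * 10_000

# Hypothetical (hits, sub-corpus size) pairs, NOT real MICUSP figures.
raw = {
    "Discipline A": (120, 300_000),
    "Discipline B": (90, 100_000),
}
rates = {name: round(per_10k(hits, size), 2)
         for name, (hits, size) in raw.items()}
for name, rate in rates.items():
    print(name, rate)
```

In this invented example Discipline A has more raw hits, but Discipline B uses the word more than twice as often once the difference in sub-corpus size is taken into account.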
Rather than show concordance lines, the concordancer shows extended extracts containing the word. Two examples are shown below, also for the word effect.
There are filters for the concordancer, so that it can be limited to certain grade levels (undergraduate vs. graduate), disciplines, paper types, part of text (abstracts, literature reviews, etc.), or even native vs. non-native speaker use.
One drawback of MICUSP is that it shows extended examples, rather than concordance lines, which makes it more difficult to get a quick, broad overview of collocations or other features of the search term.
The final concordancer in this list is SKELL (Sketch Engine for Language Learning). SKELL is a free, simplified interface to the more advanced (subscription-only) corpus tool, Sketch Engine.
The SKELL corpus contains texts from a range of genres, including news, books, blogs, Wikipedia (39%), other web pages (31%), and academic texts (namely the BNC, comprising 9% of the corpus). It is also available in other languages: Italian, German, Russian, Estonian and Czech.
One of the advantages of SKELL is that it is extremely easy to use, and also extremely fast. As part of its simplicity, it shows only a limited range of concordance lines, 40 in total. These are all short, simple sentences rather than sentence fragments as with the other concordancers. Below are some of the lines for the word effect, from the Examples part of the concordancer.
Additionally, SKELL can be used to find common collocations or word combinations, using the Word sketch part. Below are some of the results for effect. These are grouped according to frequency and type, e.g. verbs with the word as subject (effect occurs), verbs with the word as object (have an effect), adjectives which frequently occur with the word (effect is temporary), and so on. The fourth group (modifiers of the word, bottom left) shows that side effects and adverse effects are common modifier + noun collocations for the word effect.
There is also a Similar words tab. However, this is not a true thesaurus, and seems to have limited use for language study.
It should be noted that although SKELL contains the BNC as part of its corpus, it is not an academic concordancer. Most of the examples are from everyday use. That said, it is extremely fast and easy to use, and provides good examples of word use and collocation.
Below is a summary of the main features of the concordancers, and which ones have these features. They are all positive features, meaning a tick (check mark) is good, while a cross is not. The best concordancer overall is undoubtedly Lextutor (although it does not have the most ticks). However, all of the concordancers have something special which makes them worth checking out.
Features | Lextutor | BNC | MICUSP | SKELL
Free | | | |
Fast | | | |
No registration required | | | |
User-friendly website | | | |
Offers more than one corpus | | | |
Uses academic corpus | | | |
Shows different disciplines | | | |
Shows different genres (report etc.) | | | |
Shows unlimited examples | | | |
Shows extended textual examples | | | |
Useful for collocation study | | | |
Author: Sheldon Smith ‖ Last modified: 21 June 2021.
Sheldon Smith is the founder and editor of EAPFoundation.com. He has been teaching English for Academic Purposes since 2004. Find out more about him in the about section and connect with him on Twitter, Facebook and LinkedIn.