Text Analysis Software

Tutorials and Reviews  

Laurence Anthony's freeware concordance program with an easy-to-use graphical interface, suitable for use in language classrooms or for quick research projects. The program can generate KWIC concordance lines, plots showing the distribution of search hits in a document, and word lists and keyword lists sorted alphabetically or by frequency. All searches can be made using wildcards or full regular expressions (regex). (Note: scroll down to Software to download the program.)

Conc produces concordances of texts. A concordance consists of a list of the words in the text with a short section of the context that precedes and follows each word. Conc also produces an index, consisting of a list of the distinct words in the text, each with the number of times it occurs and a list of the places where it occurs. Conc displays the original text, the concordance, and the index each in its own window. Clicking on a word in any one of the three windows causes the other two windows to display the entries for the same word. Version 1.76 is the standard one, but also try the beta version 1.8, which has several added features. One useful feature is the added ability to import a file of search words. Also, in the beta version you can arrange the index according to frequency of occurrence. The maximum size of your corpus depends on your RAM; an old 68040 Mac with 24 MB of RAM can handle half a million words. Free download.
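
The concordance and index described above can be illustrated with a minimal Python sketch. This is not how Conc itself is implemented; the function names (tokenize, build_index, kwic), the simple letters-and-apostrophes word pattern, and the use of character offsets as "places" are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Return (lowercased word, start offset) pairs for each word in the text."""
    # Assumption: a "word" is a run of ASCII letters and apostrophes.
    return [(m.group().lower(), m.start()) for m in re.finditer(r"[A-Za-z']+", text)]

def build_index(text):
    """Map each distinct word to the list of character offsets where it occurs."""
    index = defaultdict(list)
    for word, pos in tokenize(text):
        index[word].append(pos)
    return dict(index)

def kwic(text, keyword, width=20):
    """Return one concordance line per hit: left context, keyword, right context."""
    lines = []
    for pos in build_index(text).get(keyword.lower(), []):
        end = pos + len(keyword)
        left = text[max(0, pos - width):pos]
        right = text[end:end + width]
        lines.append(f"{left:>{width}} [{text[pos:end]}] {right}")
    return lines

sample = "The cat sat on the mat. The dog saw the cat."
index = build_index(sample)

# The index gives a count and a list of places for every distinct word.
for word in sorted(index):
    print(word, len(index[word]), index[word])

# The concordance shows each hit with its surrounding context.
for line in kwic(sample, "cat", width=12):
    print(line)
```

Sorting the index by the length of each offset list instead of alphabetically gives the frequency-ordered view mentioned for the beta version.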

Concordance 3.0 was released 19 January 2002. With Concordance, you can make indexes, count word frequencies, compare different usages of a word, generate KWIC displays (key words in context), analyse keywords, find phrases and idioms, and even publish to the web. Version 3.0 under Windows 2000 and XP can create concordances using e-texts that are encoded in double-byte, East Asian encoding systems.

Concordancer for Windows, Version 3.0. Concordancer for Windows enables one to search for words in text files and display them in the form of concordances. Concordances can be used for text analysis including the investigation of style, grammar usage, vocabulary usage and teaching. The program has been designed so as to be very easy to use for those who have had little or no previous experience with computers. Freeware.

An on-line resource from edic.com's Virtual Language Centre. "Word frequency text profiling can be used in many ways to support teaching, learning and research. The Word Profiler compares all the words in a text with two word frequency lists, it provides a visual profile of the distribution of these words in a text by printing the different frequency bands in different colours. Words which are contained in the first list of most frequent words are left in the default text colour. Words which are found in the second word list (see below) are printed in red and words which are not in either of the lists are printed in blue. The off-list words are listed separately, and this list will contain new or unfamiliar words, as well as genre-specific words."
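
The banding logic quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name profile, the tiny FIRST and SECOND word lists, and the word pattern are hypothetical, and the real Word Profiler's lists and tokenization are not described in the source.

```python
import re

def profile(text, first_list, second_list):
    """Tag each word with a colour band and collect the off-list words."""
    tagged, off_list = [], []
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key in first_list:
            colour = "default"   # in the first list of most frequent words
        elif key in second_list:
            colour = "red"       # in the second word list
        else:
            colour = "blue"      # in neither list
            if key not in off_list:
                off_list.append(key)
        tagged.append((word, colour))
    return tagged, off_list

# Hypothetical miniature word lists, for illustration only.
FIRST = {"the", "a", "of", "is", "in"}
SECOND = {"language", "text"}

tagged, off_list = profile(
    "The corpus is a text in the language of concordances", FIRST, SECOND
)
for word, colour in tagged:
    print(f"{word}: {colour}")
print("Off-list words:", off_list)
```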

Lexical analysis software featuring full regular expression searches and context searches, KWIC (keywords in context) displays and frequency information. Reviews of MonoConc Pro praise its ease of use. The downloadable demo version is fully functional but displays only 20 hits for each search.

WordSmith Tools v. 3 is a suite of lexical analysis software programs for data-driven learning and research. Concord generates concordances showing a search-word in context, collocates, and common word clusters. WordList generates alphabetical and frequency-order lists. KeyWords compares the frequency of words in a text with a reference corpus and identifies key words in the text. There are several other utility programs included. Mike Scott's web site includes screenshots, links for downloading from his website or from Oxford University Press and links to extras (such as word lists and lemma lists). WordSmith Tools v. 4 is under development and will eventually handle Unicode (double-byte) encoded text for Windows 95 and later.
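
The idea behind a keyword comparison of this kind can be sketched as follows. The listing above does not say which statistic KeyWords applies, so this example assumes a log-likelihood score, a common choice for comparing a word's frequency in a study text against a reference corpus. The function names and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Count lowercased words (runs of letters and apostrophes)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def log_likelihood(a, b, c, d):
    """Log-likelihood for a word seen a times in c study words and b times in d reference words."""
    e1 = c * (a + b) / (c + d)  # expected count in the study text
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a:
        ll += a * math.log(a / e1)
    if b:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(text, reference, top=5):
    """Return the words most over-represented in text relative to reference."""
    study, ref = word_counts(text), word_counts(reference)
    c, d = sum(study.values()), sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # keep only words that are relatively more frequent in the study text
            scored.append((log_likelihood(a, b, c, d), word))
    scored.sort(reverse=True)
    return [word for _, word in scored[:top]]

study_text = "corpus corpus corpus the the of"
reference_text = "the the the of of a a in in is"
print(keywords(study_text, reference_text))
```

A word that occurs at the same relative frequency in both texts scores zero, so only words whose frequency differs markedly from the reference rise to the top of the list.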