Corpus Lab 2
Frequency lists & Lexical richness
Assignment Overview
This assignment aims to help you practice the following skills:
- Creating a word frequency list based on a corpus of Japanese (as an example of a non-English language)
- Computing and interpreting lexical diversity scores for English text samples
- Computing and interpreting lexical sophistication indices for English text samples
Assignment Details
Complete the following three tasks. Submit a single Word file containing the write-up for each task, with appendices in the specified file formats.
Submit the finished assignment through Google Classroom.
Task 1: A Japanese word frequency list
Goals
The goal of this task is to:
- Construct a Japanese word frequency list.
Instructions
- Use the Aozora 500 corpus.
- Create a frequency list using TagAnt and AntConc.
- Understand frequency distributions using the simple text analyzer.
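To make the target output concrete, here is a minimal Python sketch of how a ranked frequency list is built. It is not the TagAnt/AntConc workflow required above. It assumes the text has already been segmented into space-separated tokens, and the function names and the two sample lines are hypothetical illustrations, not taken from the corpus.

```python
from collections import Counter


def build_frequency_list(tokenized_lines):
    """Count tokens in lines whose words are already separated by spaces."""
    counts = Counter()
    for line in tokenized_lines:
        counts.update(line.split())
    # Sort by descending frequency, then by token for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_tsv(frequency_list):
    """Format (word, count) pairs as rank<TAB>word<TAB>freq rows."""
    rows = ["rank\tword\tfreq"]
    for rank, (word, freq) in enumerate(frequency_list, start=1):
        rows.append(f"{rank}\t{word}\t{freq}")
    return "\n".join(rows)


# Hypothetical pre-segmented sample (assumption: one sentence per line).
sample = ["吾輩 は 猫 で ある", "名前 は まだ 無い"]
freq = build_frequency_list(sample)
print(to_tsv(freq))
```

The tab-separated layout matches the .tsv option for submission. A list exported from AntConc may use different column names or ordering.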
Submission:
- A Japanese word frequency list (.txt or .tsv format)
- Descriptive paragraphs explaining the frequency distributions of the Japanese language.
Success Criteria
Your submission …
Task 2: Replication of Figure 4.19 from Durrant with two more recent lexical diversity indices
Goals
The goals of this task are:
- to compute more recent, robust alternatives to classical indices using TAALED
- to replicate Durrant’s analysis with two more recent lexical diversity indices
Instructions
- Complete the hand calculation of lexical diversity indices on the spreadsheet
- Compute recommended lexical diversity indices — MATTR and MTLD Original — using TAALED.
- Replicate Figure 4.19 in Durrant (2023, p. 72) with the two indices (i.e., MATTR and MTLD Original)
- Discuss the implications of the findings.
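As a check on the hand calculation, the following Python sketch computes a plain type-token ratio (TTR) and a moving-average TTR (MATTR). It is an illustration only, not TAALED's implementation. The function names, the short-text fallback, and the example sentence are assumptions. The default window of 50 tokens is a commonly used setting, so confirm it against the TAALED settings you actually use.

```python
def ttr(tokens):
    """Type-token ratio: number of unique tokens divided by total tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def mattr(tokens, window=50):
    """Moving-average TTR: mean TTR over every window of `window` tokens."""
    if len(tokens) < window:
        # Assumption: fall back to plain TTR when the text is shorter
        # than one window; other tools may handle this case differently.
        return ttr(tokens)
    scores = [
        ttr(tokens[start:start + window])
        for start in range(len(tokens) - window + 1)
    ]
    return sum(scores) / len(scores)


# Hypothetical example sentence with a small window for readability.
tokens = "the cat sat on the mat and the dog sat too".split()
print(round(ttr(tokens), 3))
print(round(mattr(tokens, window=5), 3))
```

Because each window has the same length, MATTR is less sensitive to text length than plain TTR, which is the reason it is recommended as a more robust index.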
Submission:
- Spreadsheet file containing hand-calculated lexical diversity scores.
- Descriptive paragraphs explaining the replication of Durrant’s analysis (300 words).
- Research question
- Your hypothesis regarding the replication
- Plots (one for MTLD; the other for MATTR)
- Results and interpretation
Success Criteria
Your submission …
Task 3: Qualitative analysis of lexical sophistication
Goals
The goals of this task are to:
- compute several important lexical sophistication indices
- compare and contrast two texts using the selected indices
- describe the use of vocabulary in the two texts based on the quantitative and qualitative information
Instructions
- Use the two texts from the example used in class.
- Using the simple text analyzer, compare the two texts based on two indices that you select.
- Select one frequency-based index and another type of index.
- Interpret the results of the analysis and describe the difference in a (few) paragraph(s).
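To illustrate what a frequency-based index measures, here is a minimal Python sketch that averages the log10 reference-corpus frequency of the tokens in a text. It is not the simple text analyzer's implementation. The function name, the reference counts, and the two tiny texts are hypothetical, and skipping tokens missing from the reference list is an assumption.

```python
import math


def mean_log_frequency(tokens, reference_freq):
    """Average log10 reference frequency over tokens found in the list."""
    # Assumption: tokens absent from the reference list are skipped.
    logs = [math.log10(reference_freq[t]) for t in tokens if t in reference_freq]
    if not logs:
        raise ValueError("no token was found in the reference list")
    return sum(logs) / len(logs)


# Hypothetical reference counts, not taken from any real corpus.
reference_freq = {"the": 1_000_000, "use": 10_000, "utilize": 100}

text_a = ["the", "use"]
text_b = ["the", "utilize"]

score_a = mean_log_frequency(text_a, reference_freq)
score_b = mean_log_frequency(text_b, reference_freq)
print(score_a, score_b)
```

A lower mean log frequency indicates that a text relies on rarer words, which is usually read as higher lexical sophistication. In this toy example, text_b scores lower than text_a because "utilize" is rarer than "use" in the hypothetical reference list.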
Submission:
- Plots that contain the results of the lexical sophistication analysis.
- Descriptive paragraph(s) contrasting the two texts based on lexical sophistication (200-300 words).
Success Criteria
Your submission …