Session 7
One-liner
We will go over foundational concepts for measuring multiword units.
Learning Objectives
By the end of this session, students will be able to:
- Explain different types of multiword units: collocations, n-grams, and lexical bundles
- Demonstrate, using examples, how major association strength measures (t-score, Mutual Information, and LogDice) are calculated
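As a rough illustration of the second objective, the sketch below computes the three measures from raw frequency counts. It assumes the commonly cited formulations: expected frequency E = f(node) × f(collocate) / N, t-score = (O − E) / √O, MI = log2(O / E), and LogDice = 14 + log2(2O / (f(node) + f(collocate))). The function names and the frequencies in the example are hypothetical, chosen only for illustration. Check the exact formulas against the required readings before relying on them.

```python
import math


def expected_frequency(f_node, f_collocate, corpus_size):
    """Co-occurrence frequency expected by chance (simple, window-free version)."""
    return f_node * f_collocate / corpus_size


def t_score(observed, f_node, f_collocate, corpus_size):
    """t-score: (O - E) / sqrt(O)."""
    expected = expected_frequency(f_node, f_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


def mutual_information(observed, f_node, f_collocate, corpus_size):
    """Mutual Information: log2(O / E)."""
    expected = expected_frequency(f_node, f_collocate, corpus_size)
    return math.log2(observed / expected)


def log_dice(observed, f_node, f_collocate):
    """LogDice: 14 + log2(2 * O / (f_node + f_collocate)); corpus size is not used."""
    return 14 + math.log2(2 * observed / (f_node + f_collocate))


# Hypothetical counts (not from any real corpus):
# the pair co-occurs 100 times, the node occurs 1,000 times,
# the collocate 2,000 times, in a corpus of 1,000,000 tokens.
O, F_NODE, F_COLL, N = 100, 1_000, 2_000, 1_000_000

print(round(t_score(O, F_NODE, F_COLL, N), 2))             # about 9.8
print(round(mutual_information(O, F_NODE, F_COLL, N), 2))  # about 5.64
print(round(log_dice(O, F_NODE, F_COLL), 2))               # about 10.09
```

Note that LogDice does not depend on corpus size, whereas the t-score and MI do through the expected frequency. This is one reason the measures can rank the same word pair differently.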
Key Concepts
- Types of multiword units
- Association strength
- Approaches to identifying collocations:
  - Context window
  - Dependency bigram
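The context-window and dependency-bigram approaches listed above can be contrasted with a small sketch. It is a toy example under stated assumptions: the sentence is invented, the dependency triples are hand-annotated rather than produced by a parser, and the function names and relation labels (`dobj`, `amod`) are hypothetical choices for illustration.

```python
from collections import Counter


def window_cooccurrences(tokens, node, span=4):
    """Count words occurring within +/- `span` tokens of each occurrence of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = max(0, i - span)
        right = min(len(tokens), i + span + 1)
        for j in range(left, right):
            if j != i:  # skip the node itself
                counts[tokens[j]] += 1
    return counts


def dependency_bigrams(triples, relation):
    """Count (head, dependent) pairs linked by `relation` in pre-parsed triples."""
    return Counter((head, dep) for head, rel, dep in triples if rel == relation)


# Invented example text.
tokens = "she made a very difficult decision and he made a quick decision".split()

# Hand-annotated (head, relation, dependent) triples; a real pipeline
# would obtain these from a dependency parser.
triples = [
    ("made", "dobj", "decision"),
    ("decision", "amod", "difficult"),
    ("made", "dobj", "decision"),
    ("decision", "amod", "quick"),
]

window = window_cooccurrences(tokens, "decision", span=2)
verb_object = dependency_bigrams(triples, "dobj")

print(window["made"])                     # 0: "made" falls outside the 2-token window
print(verb_object[("made", "decision")])  # 2: the syntactic link is still captured
```

With a narrow window, the verb "made" is missed because other words intervene, while the dependency-based count recovers the verb-object pair regardless of distance. A wider window would catch "made" but would also pull in unrelated words such as "and" and "he".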
Required Readings
Durrant (2023) Ch. 7
Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in Corpus-Based Language Learning Research: Identifying, Comparing, and Interpreting the Evidence. Language Learning, 67(S1), 155–179. https://doi.org/10.1111/lang.12225
Dive Deeper - Recommended Readings
Paquot, M. (2019). The phraseological dimension in interlanguage complexity research. Second Language Research, 35(1), 121–145. https://doi.org/10.1177/0267658317694221
Stephanie Evert's website on collocation measures
- This webpage provides formulas for calculating various strength-of-association (SOA) measures.
Materials
Slides for the session
Reflection
- You can now:
  - Describe major types of multiword units and how they differ from each other
    - N-grams, lexical collocations, colligations, lexical bundles
  - Describe benefits and drawbacks of major strength-of-association (SOA) measures
  - Discuss two approaches to identifying collocations in text: the window-based approach and the dependency-based approach