Session 7

Author

Masaki EGUCHI, Ph.D.

Modified

August 5, 2025

One-liner

We will go over foundational concepts for measuring multiword units.

🎯 Learning Objectives

By the end of this session, students will be able to:

  • Explain different types of multiword units: collocations, n-grams, and lexical bundles
  • Demonstrate how major association strength measures (t-score, Mutual Information, and LogDice) are calculated using examples
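As a hedged illustration of the second objective, the sketch below computes the three measures from raw counts using the formulations commonly given in the collocation literature: MI = log2(O / E), t-score = (O − E) / √O, and LogDice = 14 + log2(2·O / (f_node + f_collocate)), where E = f_node × f_collocate / N. The function names and the counts are hypothetical, chosen so the arithmetic is easy to check by hand. The sketch assumes the simplest version of E with no adjustment for window size, which some tools apply.

```python
import math


def expected_frequency(f_node, f_collocate, corpus_size):
    """Expected co-occurrence frequency if the two words were independent.

    Assumption: no window-size correction is applied here.
    """
    return f_node * f_collocate / corpus_size


def mutual_information(observed, f_node, f_collocate, corpus_size):
    """MI = log2(O / E)."""
    expected = expected_frequency(f_node, f_collocate, corpus_size)
    return math.log2(observed / expected)


def t_score(observed, f_node, f_collocate, corpus_size):
    """t-score = (O - E) / sqrt(O)."""
    expected = expected_frequency(f_node, f_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)


def log_dice(observed, f_node, f_collocate):
    """LogDice = 14 + log2(2 * O / (f_node + f_collocate)).

    Note that corpus size does not enter the formula.
    """
    return 14 + math.log2(2 * observed / (f_node + f_collocate))


# Hypothetical counts (not taken from any real corpus).
O = 64                 # observed co-occurrences of node and collocate
F_NODE = 1024          # frequency of the node word
F_COLLOCATE = 1024     # frequency of the collocate
N = 1_048_576          # corpus size in tokens

print(f"MI      = {mutual_information(O, F_NODE, F_COLLOCATE, N):.2f}")  # 6.00
print(f"t-score = {t_score(O, F_NODE, F_COLLOCATE, N):.3f}")             # 7.875
print(f"LogDice = {log_dice(O, F_NODE, F_COLLOCATE):.2f}")               # 10.00
```

With these counts E = 1, so MI reduces to log2(64) = 6 and the t-score to 63 / 8. LogDice depends only on the three word-level frequencies, so its value would stay the same if the corpus size changed.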

🔑 Key Concepts

  • Types of multiword units
  • Association strength
  • Two approaches to identifying collocations:
    • Context window
    • Dependency bigram
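The two identification approaches listed above can be contrasted with a minimal sketch. The window-based function counts every word within a fixed span of the node word. The dependency-based function counts only word pairs linked by a chosen grammatical relation. Everything here is an assumption made for illustration: the function names, the toy sentence, and the hand-written parse (in practice the parse would come from a dependency parser, and the labels follow Universal Dependencies conventions).

```python
from collections import Counter


def window_cooccurrences(tokens, node, span=4):
    """Count words occurring within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = max(0, i - span)
        right = min(len(tokens), i + span + 1)
        for j in range(left, right):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts


def dependency_bigrams(parsed, relation):
    """Count (dependent, head) pairs linked by `relation`.

    `parsed` is a list of (word, head_index, relation_label) tuples,
    one per token, in sentence order.
    """
    pairs = Counter()
    for word, head_index, label in parsed:
        if label == relation:
            pairs[(word, parsed[head_index][0])] += 1
    return pairs


# Toy sentence (hypothetical).
tokens = "she made a strong cup of tea and he made strong coffee".split()
print(window_cooccurrences(tokens, "strong", span=2))

# Hand-written parse of "he made strong coffee" (illustrative only).
parsed = [
    ("he", 1, "nsubj"),
    ("made", None, "root"),
    ("strong", 3, "amod"),
    ("coffee", 1, "obj"),
]
print(dependency_bigrams(parsed, "amod"))
```

In the window-based output, "strong" co-occurs with function words such as "a" and "of" simply because they are nearby. The dependency-based output keeps only the adjective-noun pair ("strong", "coffee").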

📚 Required Readings

  • Durrant (2023) Ch. 7

  • Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in Corpus‐Based Language Learning Research: Identifying, Comparing, and Interpreting the Evidence. Language Learning, 67(S1), 155–179. https://doi.org/10.1111/lang.12225

Materials

Slides for the session

Reflection

  • You can now:
    • Describe major types of multiword units and how they differ from each other
      • N-grams, lexical collocations, colligations, lexical bundles
    • Describe benefits and drawbacks of major strength of association (SOA) measures
    • Discuss two approaches to identifying collocations in text: the window-based approach and the dependency-based approach.
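To make the difference between n-grams and lexical bundles in the first reflection point concrete, here is a minimal sketch. It treats an n-gram as any contiguous sequence of n tokens, and a lexical bundle candidate as an n-gram that recurs often enough and in enough different texts. The function names, toy texts, and thresholds are hypothetical. Published studies set their own frequency and range cut-offs, usually normalized per million words.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences in `tokens` as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_bundles(texts, n=3, min_freq=2, min_range=2):
    """Keep n-grams with total frequency >= min_freq found in >= min_range texts.

    Assumption: raw counts are used; real studies normalize by corpus size.
    """
    freq = Counter()        # total occurrences across all texts
    text_range = Counter()  # number of texts containing the n-gram
    for tokens in texts:
        grams = ngrams(tokens, n)
        freq.update(grams)
        text_range.update(set(grams))
    return {
        gram: count
        for gram, count in freq.items()
        if count >= min_freq and text_range[gram] >= min_range
    }


# Three toy "texts" (hypothetical).
texts = [
    "on the other hand it works".split(),
    "on the other hand it fails".split(),
    "it works on the table".split(),
]

for gram, count in candidate_bundles(texts).items():
    print(" ".join(gram), count)
```

Every three-word sequence in the toy texts is a trigram, but only the three that recur across two texts ("on the other", "the other hand", "other hand it") pass the thresholds.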