By the end of this session, students will be able to:
- Describe, in general terms, how LLMs are trained and how they produce language.
- Explain the benefits and drawbacks of using LLMs for linguistic annotation.
- Demonstrate and discuss the potential impact of prompts on an LLM's performance on linguistic annotation.
- Design an experiment to investigate an LLM's output accuracy on a given annotation task.
What word would be suitable in the context?
Fill in the blank
Predicting the next word
The picture is taken from the following NVIDIA blog.
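To make "predicting the next word" concrete, here is a minimal sketch using the `transformers` and `torch` packages; GPT-2 stands in for a modern LLM, and the context sentence is an arbitrary example, not one from the slides.

```python
# A minimal sketch of next-word prediction with a small open model.
# Assumes the `transformers` and `torch` packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "The capital of France is"
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits          # shape: (1, seq_len, vocab_size)

# Probability distribution over the vocabulary for the NEXT token
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)      # the 5 most likely continuations
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```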
Why is it so intelligent…?
A prompt (the written instruction) conditions the LLM to perform certain tasks. → An effective prompting approach may help the LLM achieve higher performance.
In-context learning refers to the use of examples embedded in the prompt as a way to augment the responses of the LLM.
→ In-context learning, or few-shot prompting, can help us achieve better results without further "training" of the model.
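For illustration, here is a minimal sketch of few-shot prompting for a toy annotation task, assuming the `openai` Python package (v1+) with an API key set in the `OPENAI_API_KEY` environment variable; the model name and the ERROR/NO_ERROR labels are illustrative assumptions, not part of any study discussed here.

```python
# A minimal sketch of few-shot prompting: the two labeled examples
# embedded in the prompt are the "in-context" demonstrations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Label each sentence as ERROR or NO_ERROR.\n\n"
    "Sentence: She have two brothers.\n"
    "Label: ERROR\n\n"
    "Sentence: The results are shown in Table 1.\n"
    "Label: NO_ERROR\n\n"
    "Sentence: He go to school every day.\n"
    "Label:"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",   # assumed model name
    temperature=0,         # deterministic output suits annotation tasks
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # expected: "ERROR"
```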
Kim & Lu (2024) examined the use of GPT to annotate discourse moves.
What are discourse moves and steps?
| Move | Description |
|---|---|
| Move 1 | Establishing a research territory |
| Move 2 | Establishing a niche |
| Move 3 | Presenting the present work via some steps |
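For illustration, a move-annotation prompt might look like the sketch below; the wording is an assumption in the spirit of Kim & Lu (2024), not the prompt used in their study. It could be sent to the model exactly as in the earlier few-shot sketch.

```python
# A hypothetical move-annotation prompt; send it to the model as in
# the earlier API sketch. The example sentence is illustrative.
prompt = (
    "Classify the following sentence from a research article "
    "introduction as Move 1 (establishing a research territory), "
    "Move 2 (establishing a niche), or Move 3 (presenting the "
    "present work).\n\n"
    "Sentence: Little attention has been paid to learner errors "
    "in collocation.\n"
    "Move:"
)
print(prompt)
```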
Validation process
Remember COVID test…
True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN):

| | \(Yes_{pred}\) | \(No_{pred}\) |
|---|---|---|
| \(Yes_{true}\) | TP | FN |
| \(No_{true}\) | FP | TN |
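These four counts give the standard validation metrics. Below is a minimal sketch in Python; the counts passed in at the bottom are made-up values for illustration.

```python
# Standard validation metrics derived from the confusion matrix above.
def validation_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp)   # of predicted Yes, how many were truly Yes
    recall = tp / (tp + fn)      # of true Yes, how many were found
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts for illustration only
print(validation_metrics(tp=40, tn=45, fp=5, fn=10))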
Almost perfect when fine-tuned
Step analysis can be difficult
RQ1: To what extent is the accuracy evaluation by ChatGPT comparable to that conducted by human evaluators?
RQ2: How does the accuracy evaluation by ChatGPT compare to that by Grammarly?
| No. | Code | Error Type |
|---|---|---|
| 6 | <AGV> | Verb agreement |
| 7 | <AS> | Argument structure |
| 10 | <CL> | Collocation |
| 11 | <CN> | Noun countability |
| 28 | <FV> | Verb form |
| 31 | <ID> | Idiom |
| 37 | <L> | Register |
| 64 | <TV> | Verb tense |
| 76 | <W> | Word order |
| 61 | <S> | Spelling (non-word) |
Dataset view

The dataset looked as follows:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<learner>
  <head sortkey="TR160*0100*2000*01">
    <candidate>
      <personnel>
        <language>Chinese</language>
        <age>21-25</age>
      </personnel>
      <score>17.0</score>
    </candidate>
    <text>
      <answer1>
        <question_number>1</question_number>
        <exam_score>2.2</exam_score>
        <coded_answer>
          <p>Dear <NS type="RN"><i>Madam</i><c>Ms</c></NS> Helen Ryan,</p>
          ... (More text here)
        </coded_answer>
      </answer1>
    </text>
  </head>
</learner>
```
```text
Reply with a corrected version of the input sentence with all grammatical, spelling, and punctuation errors fixed. Be strict about the possible errors. If there are no errors, reply with a copy of the original sentence. Then, provide the number of errors.
Input sentence: {each line of sentence}
Corrected sentence:
Number of errors:
```
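Here is a minimal sketch of how this prompt could be run over a file of learner sentences, one per line, assuming the `openai` Python package (v1+) with an API key in the environment; the model name `gpt-4o-mini` and the file name `learner_sentences.txt` are illustrative assumptions, not details from the study.

```python
# A minimal sketch: apply the prompt template line by line.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_TEMPLATE = (
    "Reply with a corrected version of the input sentence with all "
    "grammatical, spelling, and punctuation errors fixed. Be strict "
    "about the possible errors. If there are no errors, reply with a "
    "copy of the original sentence. Then, provide the number of errors.\n"
    "Input sentence: {sentence}\n"
    "Corrected sentence:\n"
    "Number of errors:"
)

with open("learner_sentences.txt", encoding="utf-8") as f:  # assumed file name
    for line in f:
        sentence = line.strip()
        if not sentence:
            continue
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model name
            temperature=0,
            messages=[{"role": "user",
                       "content": PROMPT_TEMPLATE.format(sentence=sentence)}],
        )
        print(response.choices[0].message.content)
```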
Errors per 100 words
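Since raw error counts depend on text length, they are usually normalized to a fixed length; stating the standard definition for clarity:

\[
\text{errors per 100 words} = \frac{\text{number of errors}}{\text{number of words}} \times 100
\]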
Correlations
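One way to compare ChatGPT's error counts with human (or Grammarly) counts is to correlate them across texts. Below is a minimal sketch assuming `scipy` is installed; the counts are made-up illustration values, not results from the study.

```python
# Correlation between two sets of per-text error counts.
from scipy.stats import pearsonr

human_counts = [3, 1, 0, 4, 2, 5]    # errors per essay, human annotator (made up)
chatgpt_counts = [2, 1, 0, 5, 2, 4]  # errors per essay, ChatGPT (made up)

r, p = pearsonr(human_counts, chatgpt_counts)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")
```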