Trying to Stack Critical Literacies to Better Understand LLMs
Submitter: John J Silvestro, Slippery Rock U
——————————————————
The experiment:
I taught a lesson to help students grasp how Large Language Models (LLMs) operate and the role that their training data plays in their outputs. Specifically, I wanted students to understand how the writing generated by LLMs is shaped by their data sets and by the protocols that turn that data into writing. To help students think critically about this combination of data and protocols and its effects on writing, I guided them to build on their existing critical literacies for spell checkers and text predictors.
First, I had students use text recommenders (spell checkers). In Word, students began writing with words they knew to be correct but that the recommenders would likely flag. I then lectured on how text recommenders use data sets (collections of words) and statistics about the relationships between those words to flag and recommend words. Next, students used Word’s text predictor, writing in generic sentence structures to try to prompt the predictor into making recommendations. I then lectured on how text predictors expand the recommenders’ data sets and word-relationship statistics in order to predict upcoming words. Finally, students used ChatGPT to generate text to add to their ongoing writing, and I lectured on how LLMs work: the massive collections of writing gathered as data, the use of neural networks to identify patterns in that data, text generation that extends text prediction by drawing on those patterns, and reinforcement training from human feedback to refine the probabilities and improve outputs.
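The core idea behind the text-predictor step of the lesson can be shown with a toy sketch (this is an illustration of the counting-and-predicting concept only, not the algorithm Word or any real product uses; the corpus and function names here are invented for the example):

```python
from collections import defaultdict, Counter

# Toy corpus standing in for a large collection of writing.
corpus = (
    "the students wrote a draft and the students revised the draft "
    "before the students submitted the draft"
).split()

# Build bigram counts: for each word, tally which words follow it and how often.
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

def predict_next(word):
    """Suggest the word that most frequently follows `word` in the corpus."""
    counts = followers.get(word)
    if not counts:
        return None  # the predictor has no data for unseen words
    return counts.most_common(1)[0][0]

# "students" follows "the" more often than "draft" does in this corpus,
# so the predictor suggests it.
print(predict_next("the"))
```

A spell checker's word list, a text predictor's follower statistics, and an LLM's learned patterns all scale this same move: the bigger and richer the data, the longer and more fluent the predicted continuations.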
Results:
Students did seem to possess critical literacies for text recommenders, particularly around how the technologies flag certain words. However, they struggled to connect those literacies to the other systems. While somewhat familiar with text predictors, students struggled to get the predictors to behave as the lesson intended. Given those struggles, many students were then unsure how to use ChatGPT to add to their writing. And I, in turn, struggled to connect students’ writing to my explanations of Common Crawl-style data collection, neural networks and pattern identification, probability-based text generation, and human-driven reinforcement learning.
Ultimately, I covered too much for a single class period. In the future, I will spread the lesson across two classes: examining text recommenders in one class, having students experiment with text predictors for homework, and exploring LLMs in the following class. Spreading out the work will let students examine each system individually and then develop a critical, “stacked” understanding of how the systems build on one another. Additionally, I will have students perform the lessons in conjunction with a draft of a text they are already working on, as the abstract writing I asked students to do in my first attempt made the lesson more confusing.