CASE STUDY: Effects of Language Context Cues on Word Segmentation in Reading

Unlike alphabetic languages such as English, Chinese texts do not use spaces to delineate word boundaries. This characteristic means that Chinese readers must rely on other cues and contextual information to group continuous characters into meaningful words, a process known as word segmentation. The research paper “The effects of lexical- and sentence-level contextual cues on Chinese word segmentation” by Huang and Li (2024) addresses a fundamental question: how do Chinese readers segment words?
Previous studies have identified word frequency, relative word position, and sentence context as important factors in Chinese word segmentation. However, the role of lexical-level contextual cues (semantic relationships between a target word and a word in the prior context) and the interplay between lexical- and sentence-level cues are less clear.
The study aimed to investigate whether Chinese readers utilize lexical-level contextual cues for word segmentation and to explore the relative timing of when lexical-level and sentence-level cues are employed during reading.
Eye Tracking Method for Reading Ambiguous Sentences
The study employed an SR Research EyeLink 1000 eye-tracking system, recording participants’ eye movements at a sampling rate of 1,000 Hz while they read sentences in Chinese that contained overlapping ambiguous strings (OASs). OASs typically contain three characters: the first two form one word, and the last two form another.
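To make the structure of an OAS concrete, the short Python sketch below lists the two competing segmentations of a three-character string. The function name `oas_segmentations` and the example string 花生长 (which can be read as 花生 "peanut" + 长, or 花 + 生长 "grow") are illustrative assumptions and are not taken from the study's materials.

```python
from typing import List, Tuple


def oas_segmentations(oas: str) -> List[Tuple[str, str]]:
    """Return the two candidate segmentations of a three-character OAS.

    Hypothetical helper for illustration only: characters 1-2 form one
    candidate word, and characters 2-3 form the other.
    """
    if len(oas) != 3:
        raise ValueError("this sketch only handles three-character strings")
    left_word_parse = (oas[:2], oas[2])   # AB | C
    right_word_parse = (oas[0], oas[1:])  # A | BC
    return [left_word_parse, right_word_parse]


# Illustrative string, not one of the study's stimuli.
for parse in oas_segmentations("花生长"):
    print(" | ".join(parse))
```

Which of the two parses a reader settles on is exactly what contextual cues are expected to influence.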
Several key eye-movement measures were analyzed:
- First Fixation Duration (FFD): The duration of the first fixation on a specific region, indicating early processing.
- First-Pass Reading Time (FP): The summed duration of all fixations on a region from first entering it until the eyes leave it in either direction, reflecting initial processing and comprehension.
- Go-Past Time (GP): The total time from first entering a region until the eyes move past its right boundary, including any regressions to earlier text, indicating the effort required to process the information in that region and move past it.
- Regression-Out Probability (RO): The likelihood of the eyes moving back to an earlier part of the text from the current region, often signaling processing difficulties or re-evaluation.
- Second-Pass Reading Time (SP): The cumulative duration of fixations on a region after the initial pass, reflecting re-reading or deeper processing.
- Regression-In Probability (RI): The chance of the eyes returning to a previously read region, also indicating processing challenges or clarification needs.
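The measures listed above can be sketched as computations over a single trial's fixation sequence. The Python example below is a minimal illustration under stated assumptions, not the analysis pipeline used by Huang and Li: fixations are assumed to be `(region_index, duration_ms)` pairs in temporal order, regions are numbered left to right, and the names `RegionMeasures` and `region_measures` are hypothetical. Exact definitions (for example, how skipped regions are treated) vary between labs and software packages.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Fixation = Tuple[int, int]  # (region index, fixation duration in ms)


@dataclass
class RegionMeasures:
    ffd: Optional[int]   # first fixation duration
    fp: Optional[int]    # first-pass reading time
    gp: Optional[int]    # go-past time
    ro: Optional[bool]   # did first pass end with a regression out?
    sp: int              # second-pass reading time
    ri: bool             # did the region receive a regression in?


def region_measures(fixations: List[Fixation], target: int) -> RegionMeasures:
    # Regression in: a fixation on the target directly after a later region.
    ri = any(prev[0] > target and cur[0] == target
             for prev, cur in zip(fixations, fixations[1:]))
    total = sum(d for r, d in fixations if r == target)

    first = next((i for i, (r, _) in enumerate(fixations) if r == target), None)
    # Assumption: if a later region was fixated before the target, the target
    # was skipped in first pass and its first-pass measures are missing.
    if first is None or any(r > target for r, _ in fixations[:first]):
        return RegionMeasures(None, None, None, None, total, ri)

    # First pass: consecutive fixations on the target after first entry.
    end = first
    while end < len(fixations) and fixations[end][0] == target:
        end += 1
    fp = sum(d for _, d in fixations[first:end])
    ro = end < len(fixations) and fixations[end][0] < target

    # Go-past: everything from first entry until a region to the right is fixated.
    gp = 0
    for r, d in fixations[first:]:
        if r > target:
            break
        gp += d

    return RegionMeasures(fixations[first][1], fp, gp, ro, total - fp, ri)


# Made-up trial for illustration; region 1 is the target.
trial = [(0, 200), (1, 220), (1, 180), (0, 150),
         (1, 210), (2, 240), (1, 190), (3, 200)]
print(region_measures(trial, target=1))
# ffd=220, fp=400, gp=760, ro=True, sp=400, ri=True
```

RO and RI are computed here as per-trial yes/no outcomes; the probabilities reported in a study would be the proportion of trials on which each outcome occurs.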
Context Cues Affect Word Segmentation in Ambiguous Sentences
These measures allowed Huang and Li to observe how lexical- and sentence-level contextual cues affected readers’ immediate and subsequent processing of OASs in Chinese sentences. For instance, the finding of longer Go-Past times and higher Regression-Out probabilities in certain conditions indicated increased processing difficulty or the need for re-segmentation, providing empirical support for the relative timing of cue utilization.
Without the precision and detail afforded by eye tracking, it would be exceptionally difficult to discern the dynamic interplay between these contextual cues during the rapid process of reading. Traditional behavioral measures, such as reaction times to comprehension questions, would provide only a coarse view of the end product of comprehension, not the moment-by-moment cognitive operations involved in word segmentation. The eye-tracking data were therefore essential in revealing that Chinese readers do use lexical-level contextual cues, even when these are not strictly necessary for correct segmentation, and that these cues are processed earlier than, or concurrently with, sentence-level contextual cues. The study illustrates the value of eye tracking in advancing our understanding of complex reading processes.