How can I measure eye tracking data quality?
What is Eye Tracking Data Quality?
Understanding how to maximize, measure, and report eye tracking data quality is a critical skill for eye tracking researchers. This post provides a brief overview of some of the key concepts. For a more detailed discussion please see the following Webinar:

Webinar: Eye Tracking Data Quality

When researchers talk about eye-tracking data quality, they are typically referring to a number of features of the sample-level data. The most critical and widely reported measures of eye tracking data quality are accuracy, precision, and data loss. Other features of sample-level data, such as temporal precision and delays, may also be important, particularly for gaze-contingent research.

What factors impact data quality?
Eye tracking data quality (however it is quantified) is typically the result of a combination of several factors. These factors can include properties of the eye-tracking hardware and software (e.g., the tracker's sampling rate), as well as "participant-specific" factors (e.g., whether the participant is wearing glasses) and even "operator-specific" factors (e.g., level of training / whether the equipment has been set up optimally). 

By far the most important factors (that can be controlled by the researcher) are the physical setup (the geometry between the camera, participant, and screen), the camera setup (e.g., focus and Pupil / CR thresholds), and the calibration process. Calibration is the key determinant of spatial accuracy.

To check that your equipment has been set up optimally, please see our Setup and Usage Video Tutorials:

EyeLink 1000 Plus Setup and Usage Training Videos
EyeLink Portable Duo Setup and Usage Training Videos

The following resources provide information about the calibration process:

Why is calibration so important?
Which calibration model should I choose?


Data quality measures:
The following sections provide brief definitions of some critical terms and concepts. For more detailed coverage of how to calculate metrics such as accuracy and precision, please see the webinar linked above.

Accuracy and Precision:
The concepts of accuracy and precision are critical to understanding eye tracking data quality, and are illustrated below. The center of the target represents the location the participant is (hopefully) looking at, and the blue crosses represent the eye tracker gaze data (samples). Ideally your eye tracking data will be both accurate and precise.

[Illustration: gaze samples (blue crosses) around a fixation target, showing accurate / inaccurate and precise / imprecise data]

Spatial Accuracy
One seemingly sensible definition of spatial accuracy would be "the difference between the actual location of gaze and the location of gaze reported by the eye tracker". The problem with such a definition is that we have no independent means by which to know where the actual location of gaze is. So in practice, spatial accuracy is generally defined as "the difference between the location of a target (that we think / hope the participant is looking at) and the location of gaze reported by the eye tracker". In the illustration above, this would be operationalized as the average distance of each of the blue crosses (eye tracker samples) from the center of the target (the location we hope the participant is looking at).

Accuracy is usually reported in degrees of visual angle (see this blog post for details). It is important to note that any spatial inaccuracy measured in this way may reflect an issue with the eye tracker, the participant's inability to fixate the target accurately, or some combination of the two.
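As a concrete illustration, accuracy can be computed by converting each sample's pixel offset from the target into degrees of visual angle and averaging. The sketch below assumes a simple geometry (gaze near the screen center, square pixels, known screen width and viewing distance); the function names and parameters are hypothetical, not part of any EyeLink API.

```python
import math

def pixels_to_degrees(dx_px, dy_px, screen_w_px, screen_w_cm, viewing_dist_cm):
    """Convert a pixel offset into degrees of visual angle.

    Simplifying assumptions: gaze is near the screen center and pixels
    are square, so one pixel-to-cm scale factor applies in both axes.
    """
    cm_per_px = screen_w_cm / screen_w_px
    dist_cm = math.hypot(dx_px, dy_px) * cm_per_px
    return math.degrees(math.atan2(dist_cm, viewing_dist_cm))

def mean_accuracy_deg(samples_px, target_px, screen_w_px, screen_w_cm, dist_cm):
    """Mean angular distance of gaze samples from a fixation target."""
    errors = [
        pixels_to_degrees(x - target_px[0], y - target_px[1],
                          screen_w_px, screen_w_cm, dist_cm)
        for (x, y) in samples_px
    ]
    return sum(errors) / len(errors)
```

For example, on a 1920-pixel-wide, 53 cm wide display viewed from 70 cm, a sample 100 pixels from the target corresponds to roughly 2.3 degrees of error.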

Spatial accuracy may differ across the screen, so it is typically measured at multiple screen locations, not just the center. The calibration / validation procedure gives feedback about spatial accuracy at each of the validation target locations. We generally advise that users aim for an average accuracy of 0.5 degrees across all the validation points, with a maximum error at any single point of < 1 degree.

Changes in accuracy that occur over time are sometimes referred to as "drift". Regular drift checks (typically between each trial) can be used to monitor and correct (via a recalibration) any spatial inaccuracy that may develop during the experiment, thus ensuring that data remains spatially accurate during the actual trial recordings.

Precision (noise)
Precision refers to the repeatability, or reproducibility, of a set of measurements, regardless of how accurate those measurements are. Precision is sometimes referred to as "noise", and eye-tracking researchers often discuss potential sources of noise, such as whether the participant is wearing glasses or eye makeup.

Precision can be measured both in real participants and with artificial eyes, which (unlike human eyes) can be kept perfectly motionless, allowing the precision of the eye tracker itself to be isolated. Precision is generally reported either as a standard deviation (SD) - the average distance of a set of sample locations from their mean location - or as "RMS-S2S", which stands for Root Mean Square - Sample to Sample, and equates to the average spatial distance between consecutive samples. The two measures are illustrated below - in each case precision can be conceptualized as the average of the length of the red lines. The SD and RMS-S2S measures are sensitive to slightly different types of noise. Again, these measures are typically reported in degrees of visual angle. Details on how to compute precision metrics are provided in the Eye Tracking Data Quality Webinar.
[Illustration: SD precision (red lines from each sample to the mean location) vs. RMS-S2S precision (red lines between consecutive samples)]
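The two precision metrics can be sketched as follows, assuming the gaze samples have already been converted to degrees of visual angle. The function names are hypothetical, and the SD measure is computed here in its root-mean-square form (the standard-deviation formulation of "distance from the mean location").

```python
import math

def sd_precision(samples):
    """SD precision: RMS distance of each sample from the mean gaze location."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    return math.sqrt(sum((x - mx) ** 2 + (y - my) ** 2 for x, y in samples) / n)

def rms_s2s_precision(samples):
    """RMS-S2S precision: root mean square of consecutive
    sample-to-sample distances."""
    dists_sq = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(samples, samples[1:])
    ]
    return math.sqrt(sum(dists_sq) / len(dists_sq))
```

A set of samples that alternates between two locations illustrates how the two measures weight noise differently: the sample-to-sample distances (RMS-S2S) are twice the distances from the mean (SD).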

Data Loss
Missing gaze data is sometimes referred to as data loss or tracker loss. It can happen for a variety of reasons (for example, if the participant turns their head away from the eye tracker, or rotates their eyes beyond the trackable range of the system). Data loss is inevitable during blinks (as illustrated below). Data loss can be reported as the percentage of samples (across all trials) for which no gaze location data is available. Researchers can minimize data loss by ensuring that their equipment and participant are set up optimally. It is particularly important that the screen is placed at the correct distance, so that participants are not forced to rotate their eyes beyond the trackable range of the system (see the Setup and Usage Video Tutorials).

[Illustration: gaze trace with missing samples during a blink]
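A minimal sketch of the data-loss calculation, assuming missing samples are represented as None (a hypothetical convention - real data files flag missing samples in their own way):

```python
def data_loss_percent(trials):
    """Percentage of samples, pooled across all trials, with no gaze data.

    `trials` is a list of trials, each a list of gaze samples; a missing
    sample is represented as None (hypothetical representation).
    """
    total = sum(len(trial) for trial in trials)
    missing = sum(1 for trial in trials for s in trial if s is None)
    return 100.0 * missing / total
```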

Temporal Resolution (Sampling Rate)
The temporal resolution or sampling rate of an eye tracker refers to the number of times per second (Hz) it is able to sample the position of the eye. Recent EyeLink models are capable of sampling at up to 2000 Hz. Temporal resolution affects data quality in a number of ways. For example, with high sampling rates, it is possible to identify fixation and saccade durations more accurately. High sampling rates also enable faster recovery from blinks and other causes of data loss, and are critical for implementing gaze-contingent tasks. Finally, high sampling rates can be helpful in reducing velocity noise.

A related and equally important concept is temporal stability. Temporal stability refers to the consistency of the sampling - for example, a 1000 Hz eye tracker with excellent temporal stability will sample the eye every 1 ms (as opposed to 2 ms, then 0.5 ms, etc.). High temporal stability also means no missed or "skipped" samples.
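Temporal stability can be checked by inspecting the inter-sample intervals in a recording's timestamps. A minimal sketch, assuming timestamps in milliseconds (the function name and tolerance parameter are hypothetical):

```python
def check_temporal_stability(timestamps_ms, expected_interval_ms, tol_ms=0.5):
    """Flag inter-sample intervals that deviate from the nominal interval,
    e.g. skipped samples in a nominally 1000 Hz (1 ms interval) recording.

    Returns the list of intervals and a list of (index, interval) pairs
    for intervals outside the tolerance.
    """
    gaps = [t2 - t1 for t1, t2 in zip(timestamps_ms, timestamps_ms[1:])]
    bad = [(i, g) for i, g in enumerate(gaps)
           if abs(g - expected_interval_ms) > tol_ms]
    return gaps, bad
```

In a stable 1000 Hz recording every interval should be 1 ms; a 2 ms gap indicates one skipped sample.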

End-to-End Delay / Latency
Delays are particularly critical for gaze-contingent research, in which the screen is updated based on the location of gaze. In the context of eye tracking, the "end-to-end" delay typically refers to the time between gaze being at a certain location, and the eye tracker sampling the eye, computing the location of gaze, and making that location available to the display (stimulus) computer so that the screen can be updated based on this information. The end-to-end delay is inevitably linked to the temporal resolution (sampling speed) of the eye tracker, with faster eye trackers having lower delays. The EyeLink systems have an end-to-end delay of 2 ms when sampling at 1000 Hz.
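As rough arithmetic, a worst-case gaze-contingent update time can be bounded by adding one sample interval, the tracker's end-to-end delay, and one full display refresh. This is a hypothetical simplification (real pipelines also add rendering and transmission time), and the function name is illustrative only:

```python
def worst_case_update_ms(sampling_hz, end_to_end_ms, refresh_hz):
    """Rough upper bound on gaze-contingent display update latency:
    one sample interval + tracker end-to-end delay + one display refresh.
    Deliberately simplified; rendering and transport time are ignored.
    """
    return 1000.0 / sampling_hz + end_to_end_ms + 1000.0 / refresh_hz
```

For example, at 1000 Hz sampling, a 2 ms end-to-end delay, and a 144 Hz display, this bound is roughly 10 ms.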