Eye-tracking Scanpath Trend Analysis For Autism Detection

Sukru Eraslan, Middle East Technical University Northern Cyprus Campus, seraslan@metu.edu.tr
Yeliz Yesilada, Middle East Technical University Northern Cyprus Campus, yyeliz@metu.edu.tr
Victoria Yaneva, University of Wolverhampton, v.yaneva@wlv.ac.uk
Simon Harper, University of Manchester, simon.harper@manchester.ac.uk

Abstract

Eye-tracking research enables us to understand how users interact with visual stimuli. It has widely been used to understand user interaction with web pages for various purposes, such as assessing the usability and accessibility of web pages. However, individual eye movement sequences (i.e. scanpaths) tend to be complicated and differ from each other, and therefore the analysis of those sequences is challenging. In our previous work, we proposed an algorithm called Scanpath Trend Analysis (STA) that analyses the individual scanpaths of multiple users on a particular web page and identifies their trending path. We also presented how STA could be used for autism detection and demonstrated that it works with approximately 60% accuracy regardless of web pages and tasks. In this article, we first summarise STA with its application in autism detection, and then discuss future directions for this research.

Introduction

Eye tracking provides valuable insights into user interaction with visual stimuli. It has been commonly used to understand how users interact with web pages for different purposes, such as identifying the usability problems of web pages [1], improving the accessibility of web pages with different techniques such as re-ordering the web-page elements according to their usage based on eye tracking [2], and increasing the effectiveness of internet advertising [3]. Eye tracking enables us to identify which elements are visited and which paths are followed in terms of those elements. Figure 1 shows an eye movement sequence (i.e. scanpath) of a single user on the home page of the Apple website. The circles illustrate the fixations where the eyes of the user become relatively stable, and the numbers in the circles present the order of the fixations.

The manual analysis of scanpaths may be feasible in cases where an eye-tracking study is simple, involves a small number of participants, and produces only a few individual scanpaths. However, this is not the case when the number of participants and the number of individual scanpaths increase, because an increase in the number of scanpaths leads to higher variance and complexity. There are algorithms available for detecting patterns within multiple scanpaths (e.g., SPAM [4]) or identifying a representative path for a group of scanpaths (e.g., Hierarchical Clustering with the Dot plot algorithm [5]). However, these algorithms are not very tolerant of the differences between individual scanpaths and may lose some popular elements during processing, thus decreasing the representativeness of their resultant paths [6].

Figure 1 – An eye movement sequence of a particular user. The figure shows an eye movement sequence of a particular user on the home page of the Apple website, where the fixations are represented with circles and the order of the fixations is represented by the numbers inside the circles.

In our previous work, we presented an algorithm called Scanpath Trend Analysis (STA), which discovers a trending path followed by multiple users on a particular web page in terms of its visual elements [7]. The trending path can be defined as the most popular scanpath among users on a particular page: it shows the general direction followed, is built incrementally, and can change when the contents of the elements change [7]. We evaluated STA with an eye-tracking dataset collected with 40 participants on six web pages by using two different kinds of tasks, called searching and browsing tasks. The difference between them is that searching tasks require finding a specific piece of information whereas browsing tasks are free-viewing tasks. Our evaluation demonstrates that the STA algorithm performs better than other related algorithms in terms of providing the most similar path to the individual scanpaths.

The STA algorithm has subsequently been used in different studies, such as comparing the behaviour of web users with autism and neurotypical web users (web users without autism) [9, 10, 11], identifying common code-reading patterns [12], experiential transcoding of web pages [13], correlating cognitive characteristics with interaction and visual behaviour patterns [14], and even daily activity recognition [15]. In the rest of this article, we focus on our recent work that investigates whether the STA algorithm could also be used for autism detection. We mainly give an overview of our experimental results, which show that it is possible to detect autism with approximately 60% accuracy across different web pages and tasks [16]. Finally, we discuss our future directions in this line of research and potential applications of STA.

Scanpath Trend Analysis (STA)

Scanpath Trend Analysis aims to identify a representative path of multiple individual paths as a trending path, and it comprises three stages: (1) Preliminary Stage, (2) First Pass, and (3) Second Pass. Figure 2 shows the overview of the STA algorithm with its three stages. In the preliminary stage, the algorithm takes a series of fixations for each user on a web page and the elements of the page. After that, it finds the corresponding element for each fixation to represent the individual scanpaths in terms of the elements. In the first pass, the algorithm starts the analysis of the individual scanpaths to discover the trending elements by selecting the elements shared by all users and the elements that get at least the same attention as the shared elements in terms of the total fixation durations and total fixation counts. In the second pass, the algorithm places the trending elements in the trending scanpath based on their overall positions in the individual scanpaths. The full description of STA can be found in [7], and its open-source Python implementation is available on GitHub.
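The three stages can be illustrated with a deliberately simplified sketch. It is not the open-source implementation mentioned above: the rectangle-based element lookup, the treatment of each element as a single unit, and the ordering of trending elements by their mean relative position are assumptions made for illustration, and all function and variable names are hypothetical. The full algorithm is described in [7].

```python
from collections import defaultdict


def to_scanpath(fixations, elements):
    """Preliminary stage: map each fixation (x, y, duration) to its element."""
    path = []
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in elements.items():
            if left <= x < right and top <= y < bottom:
                path.append((name, duration))
                break  # fixations outside every element are ignored
    return path


def first_pass(scanpaths):
    """First pass: shared elements plus elements with at least as much attention."""
    total_duration = defaultdict(float)
    total_count = defaultdict(int)
    for path in scanpaths:
        for name, duration in path:
            total_duration[name] += duration
            total_count[name] += 1
    shared = set.intersection(*({name for name, _ in path} for path in scanpaths))
    if not shared:
        return set()
    min_duration = min(total_duration[name] for name in shared)
    min_count = min(total_count[name] for name in shared)
    return {name for name in total_duration
            if total_duration[name] >= min_duration
            and total_count[name] >= min_count}


def second_pass(scanpaths, trending):
    """Second pass (simplified): order trending elements by mean relative position."""
    positions = defaultdict(list)
    for path in scanpaths:
        names = [name for name, _ in path if name in trending]
        seen = set()
        for index, name in enumerate(names):
            if name not in seen:  # use the first visit only
                seen.add(name)
                positions[name].append(index / max(len(names) - 1, 1))
    return sorted(positions, key=lambda n: sum(positions[n]) / len(positions[n]))


# Hypothetical page with three rectangular elements (left, top, right, bottom).
elements = {"A": (0, 0, 100, 100), "B": (100, 0, 200, 100), "C": (0, 100, 200, 200)}
users = [
    [(10, 10, 200), (150, 20, 300), (50, 150, 100)],  # A, B, C
    [(20, 20, 250), (120, 50, 150)],                  # A, B
    [(30, 30, 100), (60, 160, 50), (130, 30, 400)],   # A, C, B
]
scanpaths = [to_scanpath(fixations, elements) for fixations in users]
trending_elements = first_pass(scanpaths)
trending_path = second_pass(scanpaths, trending_elements)
print(trending_path)  # ['A', 'B']
```

In this toy data, element C is visited by only two of the three users and receives less attention than the shared elements A and B, so it is left out of the trending path.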

Figure 2 – The stages of Scanpath Trend Analysis (This figure is adapted from [8]). The figure shows the three main stages of Scanpath Trend Analysis. The first stage is Preliminary Stage where the individual scanpaths are represented in terms of the visual elements of a web page. The second stage is First Pass where the trending elements are identified. The third stage is Second Pass where the trending scanpath is constructed by combining the trending elements based on their overall positions in the individual scanpaths.

To segment web pages into their visual elements, we usually use the extended and improved version of the Vision-based Page Segmentation (VIPS) algorithm because it automatically identifies visual elements based on both the source code and visual representation of web pages and correlates the visual elements with the underlying source code so that these elements can then be used for further processing [17]. The VIPS algorithm generates a tree of elements where there are more and smaller elements in the deeper levels. We mainly use the fifth level as it is found to be the most preferred level by users [17]. Figure 1 shows how the home page of the Apple website is segmented into its elements by using the VIPS algorithm.

To investigate whether or not the success of the STA is limited to a particular web page segmentation algorithm or a specific group of users, we collected an additional eye-tracking dataset by using the same methodology but with 41 different users [18]. We then re-evaluated the algorithm with the additional dataset by using a different web page segmentation algorithm, called Block-o-Matic! [19]. Our results show that the STA algorithm also performs well with a different web page segmentation algorithm and its success is not limited to a particular group of users [18].

Since the number of users required for user studies, including eye-tracking studies, has been a controversial issue over the years, we also investigated the effects of the number of users on the results of the STA algorithm. During our experiments with the combination of the initial and additional datasets, we observed that the STA algorithm could achieve almost the same results with fewer users [20, 21]. In particular, we observed the possibility of achieving 75% similarity to the results of 65 users with 27 users for searching tasks and 34 users for browsing tasks.

Our further experiments show that the STA algorithm may experience some problems in providing a trending path when there is a high variance between individual scanpaths. As explained above, to identify the trending elements, the original STA algorithm first finds the elements shared by all users and then takes the elements with at least the same attention as the shared elements based on total fixation duration and total fixation count. To deal with the problem caused by the high variance between individual scanpaths, we introduced a new parameter to the STA algorithm, called the tolerance level, which allows us to find the trending elements based on the elements shared by a subset of the users instead of all users [22]. This parameter can be configured manually or determined automatically to achieve the highest similarity to the individual scanpaths.
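The effect of relaxing the "shared by all users" requirement can be sketched as follows. The sketch assumes, purely for illustration, that the tolerance level is expressed as the minimum fraction of users who must have visited an element; the exact definition and the automatic selection procedure are given in [22], and the function name is hypothetical.

```python
import math


def shared_elements(scanpaths, tolerance=1.0):
    """Return the elements visited by at least `tolerance` (a fraction) of the users."""
    required = math.ceil(tolerance * len(scanpaths))
    counts = {}
    for path in scanpaths:
        for element in set(path):  # count each user at most once per element
            counts[element] = counts.get(element, 0) + 1
    return {element for element, count in counts.items() if count >= required}


# Four hypothetical scanpaths, each character standing for one visual element.
scanpaths = ["ABCD", "ABD", "ACD", "BD"]
strict = shared_elements(scanpaths)           # elements shared by all users
relaxed = shared_elements(scanpaths, 0.75)    # elements shared by at least 3 of 4 users
print(sorted(strict), sorted(relaxed))  # ['D'] ['A', 'B', 'D']
```

With high variance between scanpaths, the strict setting leaves a single shared element, whereas the relaxed setting recovers elements that most, but not all, users visited.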

Overall, the main strength of the STA algorithm is its success in providing the most similar path to individual scanpaths compared to other existing algorithms [7].

Autism Detection with STA

The diagnosis of Autism Spectrum Disorder (ASD) requires a highly subjective, elaborate, and expensive procedure, and it relies on behavioural, historical, and parental report information [23]. Several machine-learning-based models have already been investigated as potential autism screening tools or for use in conjunction with other diagnostic methods of autism. These machine learning (ML) models achieve promising accuracy by using different kinds of behavioural data including functional Magnetic Resonance Imaging (fMRI) [24, 25], electroencephalograms (EEG) [26] and speech [27]. However, these models have some drawbacks. In particular, the fMRI data collection requires an expensive and obstructive procedure. Besides, the proposed ML models with EEG and speech tend to be overoptimistic due to the methodology used for their validation. Specifically, the researchers divided the dataset into training and testing sets in a way that allowed the two sets to contain different data segments coming from the same participants. This could make these two sets artificially more homogeneous and similar, thus increasing the accuracy of the proposed models.

In our prior work, we also investigated autism detection with ML models trained on an eye-tracking dataset containing web pages as stimuli [28]. Eye-tracking data is easier to collect than fMRI data, and the Web is a platform frequently used for daily tasks by people with and without autism. We tested various ML models and found that logistic regression achieved the highest accuracy. The model uses a set of non-sequential eye-tracking features (including fixation duration, fixation count, time-to-first fixation, etc.) and other web-page features for each element of each page for each participant. When an eye-tracking dataset from a particular person is retrieved, the dataset is transformed into an appropriate format to be used by the model, where each record represents a set of features for a specific element on a particular page. Each record is then associated with either participants with autism or neurotypical participants. If the majority of the records are associated with participants with autism, then the person is classified as a person with autism.
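The final majority-vote step can be sketched as below. The record-level labels are taken as given here (in the study they are produced by the trained model); the label strings, the function name, and the handling of ties are assumptions made for illustration.

```python
from collections import Counter


def classify_person(record_labels):
    """Classify a person from the labels predicted for their individual records."""
    votes = Counter(record_labels)
    # A strict majority is required for "autism"; ties fall to "neurotypical"
    # (an assumption, since the tie-breaking rule is not stated in the text).
    if votes["autism"] > len(record_labels) / 2:
        return "autism"
    return "neurotypical"


# Hypothetical predictions for five (page, element) records of one person.
labels = ["autism", "neurotypical", "autism", "autism", "neurotypical"]
result = classify_person(labels)
print(result)  # autism
```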

We evaluated our machine-learning model with an eye-tracking dataset from 15 people with autism and 15 neurotypical people, collected on six web pages for both searching and browsing tasks, and we achieved a highest accuracy of 75% with a combination of specific web pages for a specific task. However, when we analysed the accuracy for the individual web pages, we recognised that the variance between them is high. Specifically, the standard deviations of the accuracy values were 6.2 and 10.6 for the browsing and searching tasks respectively, and the minimum accuracy values were 45% and 39% for the browsing and searching tasks respectively.

This work then led us to investigate whether the STA algorithm could be used for autism detection and provide consistent accuracy among individual web pages [16]. This time, we use a sequential feature of eye-tracking data instead of non-sequential features. Specifically, we consider the behaviour of the participants over time. As the STA algorithm aims to identify a representative path of multiple users as their trending path, we use the STA algorithm to generate a representative path for both people with autism and neurotypical people. These paths could then help us classify people as people with autism or neurotypical people. Specifically, when we know an eye-movement sequence of a particular person on a specific web page, we could compute its similarity to the trending paths of people with and without autism based on Levenshtein distance¹ and then associate the person with the corresponding user group. This process is illustrated in Figure 3.

Figure 3 – Autism Detection with Scanpath Trend Analysis. The figure shows autism detection with Scanpath Trend Analysis. The trending paths of both people with and without autism are identified by using Scanpath Trend Analysis. When the scanpath of an unknown person is available, its similarities to the trending paths of people with and without autism are computed. If the scanpath is more similar to the trending path of people without autism, then the person is classified as a person without autism. Otherwise, the person is classified as a person with autism.
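The classification rule described for Figure 3 can be sketched as follows, with scanpaths written as strings in which each character stands for one visual element. The normalisation of the distance by the length of the longer string is an assumption made for illustration (see [6] for the similarity calculation used in the studies), and the example paths and function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def similarity(a, b):
    """Similarity as a percentage, normalised by the longer sequence (an assumption)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def classify(scanpath, trending_autism, trending_neurotypical):
    """Follow the rule in Figure 3: neurotypical only if strictly more similar."""
    if similarity(scanpath, trending_neurotypical) > similarity(scanpath, trending_autism):
        return "neurotypical"
    return "autism"


# Hypothetical trending paths and an unseen scanpath.
label = classify("ABED", trending_autism="ACBD", trending_neurotypical="ABD")
print(similarity("ABED", "ABD"), similarity("ABED", "ACBD"), label)
# 75.0 50.0 neurotypical
```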

We evaluated our STA-based autism detection approach with the dataset that was previously used for the evaluation of our initial ML-based model and demonstrated that it performed with approximately 60% accuracy across different web pages and tasks. The standard deviations of the accuracy values were 3.2 and 3.1 for the browsing and searching tasks respectively, and the minimum accuracy value was 55.1% for each task. The advantage of this approach over our initial study, which did not take into account the sequential nature of the data, is that, in spite of the slightly lower overall accuracy, the results are more consistent across the different web pages used as stimuli.

¹Levenshtein distance is a measure that represents the minimum number of editing operations (insertion, deletion, substitution) needed to transform one string into another. It can be used to compute the similarity between two strings as a percentage. As individual scanpaths are represented in terms of visual elements, they are in the string format. Please see [6] for further details about the similarity calculation between individual scanpaths represented in terms of visual elements.

Discussion and Future Work

The work presented here paves the way for an alternative, supportive, cost-efficient autism detection approach. It has the potential to be used as a pre-screening tool that can be followed by the formal clinical approach for diagnosis. It does not aim to replace the formal diagnosis methods but rather to be a supportive tool for early detection of people at risk.

While we were working on the STA-based autism detection approach, we also collected a new eye-tracking dataset with 19 people with autism and 19 neurotypical people and re-evaluated our ML-based autism detection approach [29]. The new dataset was collected on eight different web pages, with different participants and tasks. Even though we used the browsing task again, we imposed a 30-second time limit this time. Besides this, we used synthesis tasks instead of searching tasks. The difference between synthesis and searching tasks is that synthesis tasks require combining different pieces of information to come up with a new piece of information, whereas searching tasks require finding a specific piece of information already given on the page. The results of our evaluation with the new dataset are consistent with the results of our previous evaluation, showing the robustness of the approach across participants, tasks, and stimuli. We are currently planning to re-evaluate the STA-based autism detection approach with the new dataset and compare its results with the results of our ML-based autism detection approach.

Our autism detection studies focused on adults with high-functioning autism whose symptoms may be more subtle and, therefore, more difficult to detect. We plan to evaluate our approaches with children with autism and also use different kinds of visual stimuli apart from web pages, such as natural images. This would allow us to further evaluate the suitability of the approach as a potential autism screening tool. However, there are a number of open questions here. For example, we also need to consider and investigate different ways of segmenting the stimuli, especially if it is a natural image.

With STA, we are currently focussing on eye-movement sequences but we are also interested in how we can combine the results of the STA approach with other non-sequential eye-tracking features (such as fixation duration, time-to-first fixation) to increase the accuracy of autism detection.

The automatic autism detection approaches developed in this work have benefits stretching beyond the clinical setting. Insight into the specific ways participants with autism may interact with web pages can be used to improve the accessibility of web pages where necessary. Specifically, the eye movements of people could be tracked and, where they suggest a possibility of autism, web pages could be adapted accordingly. For example, it has been shown that less visually complex web pages and more white space between web-page elements would make web pages more accessible for people with autism [2]. Therefore, web pages can be automatically re-engineered by taking the necessary actions to decrease their visual complexity and increase the white space between web-page elements.

As mentioned above, STA has been used for different purposes, and autism detection is one of its applications. Although it is proposed here as a binary classification approach, it has considerable potential to be used in studies on the diagnosis of different disabilities. The algorithm could also work with more than two groups. For example, if there are multiple groups that can be modelled with sequential behaviour, then the STA-based approach can be used to model the sequential behaviour of these groups, thus allowing us to associate an unknown item with one of those groups. That is why the STA algorithm has also been used for daily activity recognition [15]. Activity recognition can be used for different purposes, including health-related and security-related issues. In [15], the STA algorithm was investigated for activity recognition to monitor the daily activities of elderly people living alone. It was used to model the daily activities based on the sequence of binary sensors that become active during the activities at home. A number of instances of a particular daily activity are used to generate a sequential model for the activity, which can then be used for activity recognition. In this case, the output of the STA algorithm is considered a sequential model of a given activity. For an unknown activity at a particular home, its sequence is created from the binary sensors that become active, and its similarity to each of the models of the daily activities is computed based on Levenshtein distance. Finally, the activity is associated with the model with the highest similarity.
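The extension to more than two groups amounts to picking the group model with the highest similarity. The sketch below illustrates this with sensor sequences; to stay short it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure rather than the Levenshtein-based similarity used in [15], and the activity names, sensor names, and model sequences are invented for illustration.

```python
from difflib import SequenceMatcher


def most_similar_group(sequence, models):
    """Return the group whose model sequence is most similar to `sequence`."""
    def score(group):
        # ratio() is a similarity in [0, 1]; a stand-in for the Levenshtein-based
        # similarity described in the text.
        return SequenceMatcher(None, sequence, models[group]).ratio()
    return max(models, key=score)


# Hypothetical sequential models, one per daily activity.
models = {
    "cooking": ["kitchen_door", "fridge", "stove"],
    "sleeping": ["bedroom_door", "bed"],
    "showering": ["bathroom_door", "shower"],
}
unknown = ["kitchen_door", "stove"]
activity = most_similar_group(unknown, models)
print(activity)  # cooking
```

The same function covers the binary autism detection case when `models` holds only the two trending paths.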

References

[1] Ehmke, Claudia and Wilson, Stephanie. Identifying Web Usability Problems from Eye-Tracking Data. In Proceedings of the 21st British HCI Group Annual Conference on People and Computers: HCI...but Not as We Know It - Volume 1 (Swindon 2007), BCS Learning & Development Ltd., 119-128.
[2] Eraslan, Sukru, Yesilada, Yeliz, Yaneva, Victoria, and others. "Keep it simple!": an eye-tracking study for exploring complexity and distinguishability of web pages for people with autism. Universal Access in the Information Society (2020).
[3] Resnick, Marc and Albert, William. The Impact of Advertising Location and User Task on the Emergence of Banner Ad Blindness: An Eye-Tracking Study. International Journal of Human-Computer Interaction, 30 (2014), 206-219.
[4] Hejmady, Prateek and Narayanan, N. Hari. Visual Attention Patterns during Program Debugging with an IDE. In Proceedings of the Symposium on Eye Tracking Research and Applications (New York, NY, USA 2012), Association for Computing Machinery, 197-200.
[5] Goldberg, Joseph H. and Helfman, Jonathan I. Scanpath Clustering and Aggregation. In Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications (New York, NY, USA 2010), Association for Computing Machinery, 227-234.
[6] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Eye Tracking Scanpath Analysis Techniques on Web Pages: A Survey, Evaluation and Comparison. Journal of Eye Movement Research, 9 (2016), 1-19.
[7] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Scanpath Trend Analysis on Web Pages: Clustering Eye Tracking Scanpaths. ACM Trans. Web, 10 (Nov. 2016), 20:1-20:35.
[8] Eraslan, Sukru, Yesilada, Yeliz, Harper, Simon, and Davies, Alan. What is Trending in Eye Tracking Scanpaths on Web Pages? In Proceedings of the Measuring Behavior 2016 (May 2016), Dublin City University, 341-343.
[9] Eraslan, Sukru, Yaneva, Victoria, Yesilada, Yeliz, and Harper, Simon. Do Web Users with Autism Experience Barriers When Searching for Information Within Web Pages? In Proceedings of the 14th Web for All Conference on The Future of Accessible Work (New York, NY, USA 2017), ACM, 20:1-20:4.
[10] Eraslan, Sukru, Yaneva, Victoria, Yesilada, Yeliz, and Harper, Simon. Web users with autism: eye tracking evidence for differences. Behaviour & Information Technology (2018), 1-23.
[11] Yaneva, Victoria, Ha, Le An, Eraslan, Sukru, and Yesilada, Yeliz. Adults with High-Functioning Autism Process Web Pages With Similar Accuracy but Higher Cognitive Effort Compared to Controls. In Proceedings of the 16th Web For All 2019 Personalization - Personalizing the Web (New York, NY, USA 2019), Association for Computing Machinery.
[12] Tablatin, Christine Lourrine and Rodrigo, Ma. Mercedes. Identifying Common Code Reading Patterns using Scanpath Trend Analysis with a Tolerance. In Proceedings of the 26th International Conference for Computers in Education (Philippines, Metro 2018), 286-291.
[13] Harper, Simon, Eraslan, Sukru, and Yesilada, Yeliz. It's All About the Message: Visual Experience is a Precursor to Accurate Auditory Interaction. In Proceedings of the 16th Web For All 2019 Personalization - Personalizing the Web (New York, NY, USA 2019), ACM, 20:1-20:8.
[14] Raptis, George E., Fidas, Christos, Katsini, Christina, and Avouris, Nikolaos. A cognition-centered personalization framework for cultural-heritage content. User Modeling and User-Adapted Interaction, 29 (Mar. 01, 2019), 9-65.
[15] Yatbaz, H. Y., Eraslan, S., Yesilada, Y., and Ever, E. Activity Recognition Using Binary Sensors for Elderly People Living Alone: Scanpath Trend Analysis Approach. IEEE Sensors Journal, 19 (2019), 7575-7582.
[16] Eraslan, Sukru, Yesilada, Yeliz, Yaneva, Victoria, and Harper, Simon. Autism Detection Based on Eye Movement Sequences on the Web: A Scanpath Trend Analysis Approach. In Proceedings of the 17th International Web for All Conference (New York, NY, USA 2020), Association for Computing Machinery.
[17] Akpinar, M. Elgin and Yesilada, Yeliz. Vision Based Page Segmentation Algorithm: Extended and Perceived Success. In Current Trends in Web Engineering (Cham 2013), Springer International Publishing, 238-252.
[18] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Trends in Eye Tracking Scanpaths: Segmentation Effect? In Proceedings of the 27th ACM Conference on Hypertext and Social Media (New York, NY, USA 2016), Association for Computing Machinery, 15-25.
[19] Sanoja, Andrés and Gançarski, Stéphane. Block-o-Matic: A web page segmentation framework. In International Conference on Multimedia Computing and Systems (ICMCS) (Marrakesh Apr. 1, 2014), 595-600.
[20] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Eye Tracking Scanpath Analysis on Web Pages: How Many Users? In Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications (New York, NY, USA 2016), Association for Computing Machinery, 103-110.
[21] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Less users more confidence: How AOIs don’t affect scanpath trend analysis. Journal of Eye Movement Research, 10 (Nov. 2017).
[22] Eraslan, Sukru, Yesilada, Yeliz, and Harper, Simon. Engineering Web-based Interactive Systems: Trend Analysis in Eye Tracking Scanpaths with a Tolerance. In Proceedings of the ACM SIGCHI Symposium on Engineering Interactive Computing Systems (New York, NY, USA 2017), ACM, 3-8.
[23] American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. American Psychiatric Publishing, Arlington, 2013.
[24] Anderson, Jeffrey S., Nielsen, Jared A., Froehlich, Alyson L. et al. Functional connectivity magnetic resonance imaging classification of autism. Brain, 134 (2011), 3742-3754.
[25] Bernas, Antoine, Aldenkamp, Albert P., and Zinger, Svitlana. Wavelet coherence-based classifier: A resting-state functional MRI study on neurodynamics in adolescents with high-functioning autism. Computer Methods and Programs in Biomedicine, 154 (2018), 143-151.
[26] Ibrahim, Sutrisno, Djemal, Ridha, and Alsuwailem, Abdullah. Electroencephalography (EEG) signal processing for epilepsy and autism spectrum disorder diagnosis. Biocybernetics and Biomedical Engineering, 38 (2018), 16-26.
[27] Arlot, Sylvain and Celisse, Alain. A survey of cross-validation procedures for model selection. Statist. Surv., 4 (2010), 40-79.
[28] Yaneva, Victoria, Ha, Le An, Eraslan, Sukru, Yesilada, Yeliz, and Mitkov, Ruslan. Detecting Autism Based on Eye-Tracking Data from Web Searching Tasks. In Proceedings of the Internet of Accessible Things (New York, NY, USA 2018), ACM, 16:1-16:10.
[29] Yaneva, V., Ha, L. A., Eraslan, S., Yesilada, Y., and Mitkov, R. Detecting High-Functioning Autism in Adults Using Eye Tracking and Machine Learning. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 28 (2020), 1254-1261.

About the Authors

Sukru Eraslan is an instructor in the Computer Engineering Program at Middle East Technical University Northern Cyprus Campus (METU NCC). He completed his Ph.D. in Computer Science at the University of Manchester and worked as a post-doctoral researcher at METU NCC for two years and at the University of Manchester for one year. His primary research interest is centred around human-web interaction. Further information about Dr. Eraslan can be found at: http://users.metu.edu.tr/seraslan/index.html

Yeliz Yesilada is an Associate Professor at Middle East Technical University Northern Cyprus Campus (METU NCC) and honorary research fellow in the School of Computer Science at the University of Manchester. She received her Ph.D. in Computer Science at the University of Manchester. Her primary research interest is centred around Human Computer Interaction; in particular user behaviour analysis and modelling, the mobile web and eye-tracking. Further information about Dr. Yesilada can be found at: http://www.yelizyesilada.info

Victoria Yaneva is a data scientist at the National Board of Medical Examiners, USA and a part-time lecturer at the University of Wolverhampton, UK. Her research uses approaches from the fields of machine learning, natural language processing, and eye tracking to improve text and web accessibility for people with autism, as well as develop educational applications such as technology-assisted high-stakes exams. Further information about Dr. Yaneva can be found at: http://www.victoriayaneva.info/

Simon Harper is a Professor of Computer Science at the University of Manchester. His primary research interest is centred around Human Computer Interaction; in particular understanding user behaviour via digital phenotyping and its particular application in Accessibility, Parkinson's Disease, and Type 1 Diabetes & Congenital Hyperinsulinism. Further information about Prof. Harper can be found at: https://www.sharpic.eu/