Helping Students with Cerebral Palsy Program via Voice-Enabled Block-based Programming

Obianuju Okafor, University of North Texas, obianujuokafor@my.unt.edu

Abstract

Students with motor disabilities like cerebral palsy face challenges when learning how to program, because most programming platforms require a level of dexterity that these students do not possess. Block-based programming environments are one such platform. These environments are useful for teaching programming concepts to beginners; however, their drag-and-drop nature makes them inaccessible to students with cerebral palsy. This limitation deprives students with cerebral palsy of experiences that may allow them to explore careers in computing. As a solution, I propose the use of speech as an alternative form of input in the block-based programming environment Blockly. This voice-enabled version of Blockly will reduce the need for a mouse or keyboard and thereby increase accessibility for students with cerebral palsy. As part of my approach, I incorporate a speech recognition API into the Blockly environment and create a custom function that uses voice commands to simulate keyboard and mouse actions. An exploratory study is currently being planned, and a usability study and A/B test will be conducted after the prototype has been fully implemented.

Introduction

Cerebral palsy (CP) is one of the most common childhood disabilities, affecting approximately three in every thousand live births in the United States [1]. The condition affects movement of the hands, arms, and legs, as well as muscle coordination, making mundane tasks like typing on a keyboard or moving a mouse difficult. This limitation is particularly apparent in classrooms today, where hands-on courses like 'Introduction to Computer Science' are increasingly being incorporated into K-12 curricula across the United States. These curricula often make use of block-based programming environments (BBPEs) such as MIT's Scratch [2] or Google's Blockly [3] as a way to make learning programming concepts more appealing to children.

In BBPEs, programs are created by dragging and dropping colorful blocks of code using a mouse (see Figure 1). Their simplicity, owing to the lack of emphasis on language syntax, has proven advantageous in teaching children how to program; however, their heavy reliance on physical manipulation of the mouse and keyboard poses a barrier for students with motor disabilities such as CP. This limitation excludes a significant portion of students from participating in activities that may allow them to pursue careers in computing, which goes against the ACM Code of Ethics and Professional Conduct [4].

This research addresses the obstacles that students with CP face in BBPEs like Blockly and seeks to develop an effective solution. In my approach, I explore speech as an alternative input modality to increase Blockly's accessibility for students with CP. The goal is not to replace the mouse or keyboard as a form of input in Blockly, but to provide an alternative option for those who cannot use them.

Figure 1 – User interface of the BBPE Blockly (via Blockly's website): a category menu on the left, a workspace with combined blocks in the middle, and the generated JavaScript code of the program on the right.

This research aims to answer the following research questions:

  • RQ 1: How does the voice-enabled version of Blockly compare to the original Blockly for students with CP?
  • RQ 2: How usable is the voice-enabled Blockly prototype for students with CP?
  • RQ 3: How feasible is speech-based input for people with CP?
  • RQ 4: How appropriate are the proposed voice commands for people with CP?

The rest of the document is structured as follows: Section 2 covers the existing research most related to my research problem; Section 3 presents the proposed solution in more detail; Section 4 discusses the current status of the project and my next steps; the last section concludes.

Related Work

The research most relevant to mine concerns the use of speech to assist programmers with mobility disabilities in programming environments. In [5], [6], Wagner et al. address the challenges people with upper-body motor disabilities face in BBPEs by creating Myna, a voice-driven Java tool that enables programming by voice within the BBPE Scratch. The tool processes voice commands from users, interprets those commands according to a pre-set grammar, and simulates the corresponding mouse and keyboard actions within Scratch. Similarly, the authors in [7] synthesized the results of a Wizard-of-Oz-based design process to develop VocalIDE, a voice-based IDE prototype that allows people with limited dexterity to write and edit code using a set of vocal commands. They evaluated the usefulness of VocalIDE with 8 participants who have upper-limb motor impairments. Their results showed that VocalIDE improved the participants' ability to make navigational edits and select text while programming.

Désilets et al. created several voice-based programming tools, one of which is VoiceCode [8]. VoiceCode was proposed as a solution for people suffering from repetitive strain injury (RSI). The tool allows developers to write, navigate, and modify code using naturally spoken syntax, which is converted to actual code in real time. The system was found to be useful, but it could not be effectively used to teach programming to novices or students with visual impairments. VoiceGrip is another tool created by Désilets et al. [9]. Akin to VoiceCode, VoiceGrip enables programmers to dictate code using an easy-to-utter pseudo-syntax that is then automatically translated into native code in the appropriate programming language. VoiceGrip was created to address some of the usability problems associated with programming by voice.

My research is most similar to the work by Wagner et al. [5], [6], in the sense that I also incorporate speech into a BBPE; however, the block-based platform into which I incorporate speech recognition is Blockly, not Scratch. Also, I plan to add the speech input modality directly to the original Blockly platform rather than as a stand-alone tool, as in the case of Myna. Additionally, because Myna was not evaluated with enough users, I plan to conduct user studies on the voice-enabled Blockly prototype with a larger number of participants.

Figure 2 – System overview: the speech recognition API converts speech input to text; the text is sent to the custom function, which retrieves and executes the corresponding action, causing a change in the Blockly application.

Proposed Solution: Voice-enabled Blockly

Rather than creating a voice-based plugin tool that runs in parallel to Blockly, as in the case of Myna [5], [6], I decided to add speech as a form of input within the Blockly application itself. Originally, actions in Blockly are performed using a mouse or keyboard; I made adjustments so that the same actions can also be performed using speech. I achieve this using the following components:

  1. Speech Recognition API: A speech recognition API is a pre-built library that records speech in real time, converts it to text, and returns the text. As with most APIs, it can be added to a website or application with a few lines of code. The speech recognition API I chose is the Web Speech API [10], because it is JavaScript-based like Blockly, open source, and compatible with most web browsers.
  2. Voice Commands: In the system, the voice commands are limited and pre-defined, which helps prevent the ambiguity associated with free-form voice input. These voice commands can be broken down into five categories:
    • Navigation Commands: These commands are used to navigate through the user interface, menus, drop-down menus, and between blocks in a stack on the workspace. Using these commands, the user can move the cursor from one point to another, e.g., "Move up", "Move down".
    • Placement Commands: These commands are used to select blocks from the menu and place them on the workspace, as well as to move blocks around on the workspace, e.g., "Select Block", "Move Block Up".
    • Control Commands: This set of commands is responsible for opening and closing menus and deleting items on the workspace, e.g., "Open Menu", "Close Menu", "Delete Block".
    • Edit Commands: These commands are used to edit block text or comments, e.g., "Enter text hello world".
    • Mode Commands: These commands toggle between the system's modes: keyboard, edit, and speech, e.g., "Speech Mode", "Keyboard Mode", "Edit Mode".
  3. Custom Function: This custom function maps voice commands to actions. It consists of a switch statement in which each command is paired with the action function it triggers; when triggered, the action function simulates the corresponding keyboard press or mouse click (a sketch of such a function is shown below).
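To make the mapping concrete, the following is a minimal JavaScript sketch of what such a custom function could look like. The function name executeCommand, the specific commands handled, and the keys dispatched are illustrative assumptions, not the actual implementation; the real system may bind commands to different keyboard or mouse actions.

```javascript
// Minimal sketch (assumption: runs in a browser page where Blockly is loaded).
// Each recognized voice command is mapped, via a switch statement, to an
// action function that simulates the equivalent keyboard event on the page.

// Simulate a key press by dispatching synthetic keydown/keyup events.
function simulateKeyPress(key) {
  document.dispatchEvent(new KeyboardEvent('keydown', { key, bubbles: true }));
  document.dispatchEvent(new KeyboardEvent('keyup', { key, bubbles: true }));
}

// Hypothetical dispatch function: returns true if the command was recognized.
function executeCommand(commandText) {
  const command = commandText.trim().toLowerCase();
  switch (command) {
    // Navigation commands
    case 'move up':
      simulateKeyPress('ArrowUp');
      return true;
    case 'move down':
      simulateKeyPress('ArrowDown');
      return true;
    // Control commands
    case 'open menu':
      simulateKeyPress('Enter');
      return true;
    case 'delete block':
      simulateKeyPress('Delete');
      return true;
    // Unknown command: let the caller report the error to the user.
    default:
      return false;
  }
}
```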

As Figure 2 shows, the system works in the following way: the speech recognition API receives speech input through the device's microphone, processes the audio, converts it to text, and returns the text. The generated text is passed to the custom function, where it is compared against the switch-statement cases; when there is a match, the associated action is executed, triggering a change to the Blockly user interface. If there is no match, the message "command doesn't exist" is read back to the user through a screen reader.
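As an illustration of this pipeline, the sketch below wires the Web Speech API [10] to the hypothetical executeCommand function from the previous listing and announces unrecognized commands through an ARIA live region, which screen readers read aloud. The live-region element and the exact feedback wording are assumptions for illustration, not the system's actual implementation.

```javascript
// Minimal sketch of the recognition pipeline (assumption: a browser that
// exposes SpeechRecognition or webkitSpeechRecognition, and the
// executeCommand dispatch sketch above is in scope).
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = 'en-US';
recognition.continuous = true;      // keep listening for successive commands
recognition.interimResults = false; // act only on finalized transcripts

// Hypothetical ARIA live region; screen readers announce text placed in it.
const feedback = document.createElement('div');
feedback.setAttribute('aria-live', 'assertive');
document.body.appendChild(feedback);

recognition.onresult = (event) => {
  // Take the transcript of the most recent finalized result.
  const latest = event.results[event.results.length - 1][0];
  if (!executeCommand(latest.transcript)) {
    feedback.textContent = "Command doesn't exist";
  }
};

recognition.start();
```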

Current and Future Work

The work being presented is still in its early stages. I have proposed this topic to my committee as my dissertation topic, and it has been approved. Currently, I am finalizing the details of an exploratory study, which I will conduct at the beginning of July. I am in contact with an organization called Ability Connection [11], which caters to people with disabilities, and some of their members with CP have agreed to participate in the study. In the study, the participants will be given a list of commands, which they will vocalize to the speech recognition system. This study addresses RQ 3 and RQ 4 (Section 1). It will help determine the feasibility of a speech-based system for people with cerebral palsy, as well as the appropriateness of the commands to be used in the system. I will synthesize the results of this study to carefully re-design the commands for the speech recognition system.

In parallel with planning the exploratory study, I have been developing a prototype of the voice-enabled Blockly system. Once this prototype is ready, I plan to conduct a usability study and an A/B test with people with CP, some of whom will be the same participants from the exploratory study. In the usability study, participants will be given tasks to perform using the prototype and a survey at the end; this study will answer RQ 2 (Section 1). In the A/B test, some participants will use the prototype to perform tasks while the others will use the original Blockly application, and both groups will be given a survey at the end; this study will address RQ 1 (Section 1). I will use feedback from both studies to inform design decisions in the voice-enabled Blockly system.

Conclusion

In this paper I present my research problem, which concerns the challenges that students with CP face in block-based programming environments such as Blockly. As a solution, I propose the use of speech as an alternative form of input in the Blockly platform. Prior research has incorporated speech recognition into the BBPE Scratch; however, it was implemented as a stand-alone tool and was not adequately evaluated. In my approach, I integrate a speech recognition API into the Blockly environment to enable actions to be performed using voice commands, not just mouse clicks or keyboard presses. The end goal is to create an accessible environment for students with motor disabilities, thus encouraging them to explore careers in computing.

References

  1. "Data and Statistics for Cerebral Palsy | CDC." https://www.cdc.gov/ncbddd/cp/data.html (accessed Jan. 18, 2022).
  2. "Scratch - Imagine, Program, Share." https://scratch.mit.edu/ (accessed Jan. 18, 2022).
  3. "Blockly | Google Developers." Accessed: Apr. 27, 2020. [Online]. Available: https://developers.google.com/blockly.
  4. "Code of Ethics." https://www.acm.org/code-of-ethics (accessed Jan. 18, 2022).
  5. A. Wagner and J. Gray, "An empirical evaluation of a vocal user interface for programming by voice," Artif. Intell. Concepts, Methodol. Tools, Appl., vol. 1, no. 2011, pp. 307–324, 2016, doi: 10.4018/978-1-5225-1759-7.ch012.
  6. A. Wagner, R. Rudraraju, S. Datla, A. Banerjee, M. Sudame, and J. Gray, "Programming by voice," pp. 2087–2092, May 2012, doi: 10.1145/2212776.2223757.
  7. L. Rosenblatt, P. Carrington, K. Hara, and J. P. Bigham, "Vocal programming for people with upper-body motor impairments," Proc. 15th Web for All Conf. (W4A 2018), Apr. 2018, doi: 10.1145/3192714.3192821.
  8. A. Désilets, D. C. Fox, and S. Norton, "VoiceCode: An innovative speech interface for programming-by-voice," Conf. Hum. Factors Comput. Syst. - Proc., pp. 239–242, 2006, doi: 10.1145/1125451.1125502.
  9. A. Désilets, "VoiceGrip: A tool for programming-by-voice," Int. J. Speech Technol., vol. 4, no. 2, pp. 103–116, Jun. 2001, doi: 10.1023/A:1011323308477.
  10. "Web Speech API - Web APIs | MDN." https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API (accessed Jan. 18, 2022).
  11. Ability Connection, "Home | Ability Connection." https://abilityconnection.org/ (accessed Jan. 13, 2022).
About the Author

I am a 4th-year Computer Science PhD candidate. I have a bachelor's degree in software engineering and a master's degree in computer science. My research interests include human-computer interaction, software engineering, and accessibility. For my dissertation project, I am making the block-based programming environment Blockly accessible to people with motor impairments using speech.