Designing Innovative Science Assessments That Are Accessible for Students Who Are Blind

Eric G. Hansen; Lei Liu; Aaron Rogat; Mark T. Hakkinen; Marjorie Darrah

Eric G. Hansen and Lei Liu are research scientists at Educational Testing Service.

Aaron Rogat is a post-doctoral research associate at Purdue University.*

Mark T. Hakkinen is a managing senior research developer at Educational Testing Service.

Marjorie Darrah is an associate professor at West Virginia University.

* This work was performed while employed at Educational Testing Service.

Abstract

This study was an initial effort to design and develop an accessible simulation-based science assessment task, involving evaporation of water, and to examine its usability via a largely qualitative case study of the experiences of three grade 8-9 students at a residential school for students who are blind. The task had three simulations of water particles in various stages of evaporation. Students used the science task in four conditions: (a) screen reader and supplementary non-speech audio, (b) game-controller-based haptics, (c) tablet-based vibrotactile haptics, and (d) tactile graphics. We found that students were able to use the task in all conditions with some success, though they experienced a range of usability issues. These issues could be addressed through improvements in one or more of (a) student familiarity with the access technologies, (b) design of the task content, and (c) the access technologies themselves and their applications in assessment settings.

Keywords

Computer-based assessment, simulations, haptics, tactile graphics, assessment design

Author Note

This research was supported by Research Allocation funds through the Cognitively-Based Assessment of, for, and as Learning (CBAL) initiative at Educational Testing Service (ETS). We gratefully acknowledge: Israel Solon for science content development; Jennifer Grant for programming the task; staff at eTouchSciences for their work on developing the Novint Falcon app; Carlos Cavalie for data collection and preparation; Cary Supalo for advising on the usability of the science task for individuals with visual disabilities; Lauren Kotloff for her work on the literature review; and Jim Allan, staff, and students at the Texas School for the Blind and Visually Impaired for assistance in field testing.

There is great interest in using innovative science assessments, including simulations and other interactive elements, to assess what students know and can do in science (Pellegrino, 2013; Quellmalz & Pellegrino, 2009). In fact, both national and international assessments of science understanding, such as the National Assessment of Educational Progress (NAEP) (National Assessment of Educational Progress, n.d.; National Center for Education Statistics, 2012) and the Programme for International Student Assessment (PISA) (Organisation for Economic Co-operation and Development, 2010), have recently developed computer-based simulations to collect evidence about student knowledge and skills in science. Simulations are interactive elements that model or represent scientific phenomena that are too small or large, too slow or fast, or too complex or hazardous to present in real life. Simulations may be immediately repeatable and allow closer attention to specific components or processes. Importantly, simulations and other interactive task elements seem well suited to assessment and the learning of science “practices,” such as developing and using models to conduct investigations (Liu, Rogat, & Bertling, 2013), which may otherwise be difficult to carry out. Considering the importance of simulations in science education, it is critical that simulation-based science assessments be made accessible to students with visual disabilities. Achieving that goal will involve developing design principles for simulation-based science assessment tasks, evaluating the prototypes with individuals with visual disabilities, and iteratively improving and documenting design principles and best practices. This study represents a beginning effort of this type.

Research Literature

There has been extensive research on simulations in science education during the past two decades. A meta-analysis by D’Angelo et al. (2013) concluded that “simulations can have a significant positive impact on student learning and are promising tools for improving student achievement in science” (p. 4). Furthermore, an additional review by Scalise et al. (2011) identified best practices and design features for simulations for learning. However, these reviews did not focus on disability access or on assessment. Currently, there is a scarcity of research literature specifically on simulation-based science assessments for students with visual or other disabilities. This section describes some of the key relevant research literature.

Technologies to Support Access to Simulations by Students With Visual Disabilities

The highly visual nature of simulations can present accessibility barriers to individuals with visual disabilities, and it is important that accessibility for this population be considered from the earliest stages of assessment design, rather than retrofitting an existing task for accessibility (Thompson, Johnstone, & Thurlow, 2002). Visually disabled individuals may benefit from screen readers (which read text aloud to users), tactile graphics (also called raised-line drawings), and hard copy (e.g., paper) braille. Several other newer technologies may also provide benefits. For example, a refreshable braille device has a flat panel through which small pins (representing braille dots) raise and lower under the control of a computer, thus allowing a braille reader to read the content. Tasks that need to present geometric or complex objects may make use of 3-D printing. Haptics (involving touch, described below) and sonification (Levy & Lahav, 2011) may also be useful in improving access to simulation tasks by students who are blind. Students with low vision may also need simulations presented with visual enhancements, such as magnification and color modification. Text-to-speech (TTS) is increasingly being implemented in test delivery systems, though the quality of the rendering may differ among (a) voices (some machine-like, others more human-like), (b) screen readers (e.g., JAWS versus Window-Eyes versus NVDA), and (c) browsers (e.g., Internet Explorer versus Chrome versus Safari). The use of tablet-based computers with built-in screen reader technology (e.g., Apple iPad with its built-in VoiceOver screen reader) may also prove to be a viable interface for test takers with visual impairments. The use of touch-based devices, such as the iPhone, iPad, or Android tablets, is becoming commonplace among individuals who are blind. A prime example is the increased use of iPads as an alternative to braille notetakers. The braille notetaker is essentially a dedicated personal digital assistant with a built-in refreshable braille display, costing upwards of $4,000. By contrast, a student who is blind can use a $500 iPad in conjunction with a $1,500 Bluetooth braille display, for a combined cost of about $2,000, or roughly half as much.

Haptics

Among the new technologies mentioned above is haptics. Haptics (involving the sense of touch) can be implemented using a variety of techniques, including vibrotactile touch surfaces, wearable haptics, and force-feedback game controllers. These technologies are beginning to undergo small-scale evaluations for individuals with blindness. Sjostrom (2001) explored using force-feedback game controllers for computer interfaces, mainly in games, and Darrah and colleagues (Darrah, 2014; Darrah, Murphy, Speranski, & DeRoos, 2014) used game-controller-based haptics for math and science lesson applications. Vibrotactile haptics using Android tablets may provide cost savings due to the devices’ low cost and wide availability. One example of Android-based haptics applied to a STEM task is a simulated “sweetness” meter for math (involving ratios of scoops of water to scoops of fruit beverage powder) that employs vibration haptics on an Android tablet. Though not formally evaluated, it is intended to be accessible by students who are blind (Cayton-Hodges et al., 2012; Hakkinen, Rice, Liimatainen, & Supalo, 2013). Hakkinen has also developed a prototype vibrotactile haptic approach that makes lines and other geometrical shapes accessible through vibrations in an Android tablet computer (Hakkinen, Rice, Liimatainen, & Supalo, 2013; Liimatainen, Sahasrabudhe, & Hakkinen, 2014). Similar approaches are being explored by Goncu and Marriott (2011) and Giudice, Palani, Brenner, and Kramer (2012). Haptics may therefore be a fruitful method for resolving key accessibility issues with simulations.

Resources for Designing Accessible Simulation-based Science Tasks

Based on the research literature and our own experience, a variety of principles are relevant for developing an accessible simulation-based science assessment. The following are several design resources that bear specifically on providing accessibility. First, Evidence Centered Design (ECD) (Mislevy, Steinberg, & Almond, 2003), including its accessibility extensions (Hansen, Mislevy, Steinberg, Lee, & Forer, 2005; Haertel et al., 2010; National Research Council, 2004), can help remove accessibility barriers, such as those faced by individuals with visual disabilities, while still allowing us to make valid claims about what students know and can do. Second, standards and guidelines for accessibility, such as the Web Content Accessibility Guidelines (WCAG) of the World Wide Web Consortium (W3C) (Caldwell, Cooper, Reid, & Vanderheiden, 2008), can reduce the need for reinventing accessibility solutions. Third, Universal Design for Learning (UDL) (Rose & Meyer, 2002) emphasizes that designers of instruction and assessments should provide: (a) multiple means of representation, (b) multiple means of action and expression, and (c) multiple means of engagement. Quellmalz, Silberglitt, and Timms (2011) reported using UDL in the design of SimScientists simulation-based assessments. Finally, the Next Generation Science Standards (2013) can provide an up-to-date framework for developing science assessments with (and without) simulations.

Purpose

The purpose of the project was to design and develop an accessible simulation-based science assessment task and then evaluate its usability for students who are blind. Specifically, we examined the experiences of three students in grades 8 and 9 at a residential school for students who are blind. Our approach was largely qualitative, supplemented by basic quantitative results. In keeping with the UDL principle of using multiple representations, we sought to explore and compare several different ways of representing the simulations found within the science assessment task. The task had multiple paragraphs of text, a range of questions (multiple choice, short numeric, extended text response), and three simulations of water particles in various stages of evaporation. Students used the task in four conditions. First, the base condition included all task content, both simulation and non-simulation, and was accessed via screen reader (text-to-speech). This was followed by three other conditions (also called access methods), which involved only the simulations and were presented in the same order to every student. These other conditions were: (1) the Falcon condition, using an innovative game-controller-based haptic device (Novint Falcon); (2) the Android condition, making innovative use of the vibrotactile haptic capabilities of Android tablets; and (3) the tactile graphics condition, a “business as usual” condition. It was thought that the tactile graphics condition would provide a useful comparison to the innovative Falcon and Android conditions. The Android approach was pioneered by one of the co-authors (Hakkinen) and was, therefore, of special interest. The research questions were as follows:

1. What can students who are blind glean about the simulations from the base condition?
2. Are students able to use the simulations in the Falcon, Android, and tactile graphics conditions as intended?
3. Are there instances in which a student’s incomplete understanding of the particle model interferes with their ability to perform well on the task?
4. Is there evidence of student engagement and learning?
5. How usable are the non-simulation parts of the task for students using JAWS screen reader?

Method

Our approach was to collect usability-related data on three cases (Yin, 2003), that is, three students who are blind, by means of interviews and researcher observations, with particular focus on the simulation portions of the task. The results were mostly qualitative in nature.

Participant Characteristics

Key criteria for selecting participants were that they: (a) be in grade 8 or higher; (b) be visually disabled with little or no sight, relying primarily on audio and/or touch (e.g., braille, tactile graphics); (c) have no severe cognitive or English language limitations (e.g., limited English proficiency); (d) have a near-average or better reading level on the state assessment; and (e) be familiar with accessing computer-based materials.

All three students were female, in grade 8 or 9 (aged 13 to 14 years), and attended a residential school for students who are blind. All were native speakers of English. All three had minimal vision but some light perception. Two of the three described themselves as low vision and the other as blind. All three were actively learning how to use text-to-speech technology (e.g., VoiceOver on the Apple iPad; and JAWS), tactile graphics, and braille at the school. Student 3 (S3) had been at the school for several years, and the other two had been there no more than a year. All three used Apple iPad tablet computers with the VoiceOver screen reader several times a day. Frequency of use of the JAWS screen reader was lower.

With regard to latest mathematics and science grades, S1 and S2 had received A’s and B’s in both science and mathematics. S3’s latest science and mathematics grades were “pass” rather than letter grades. As indicated by their agreement to the statement, “I was already familiar with the access method before the study,” as well as by other observations: (a) all three students were familiar with tactile graphics, with S2 having a relatively low level of familiarity; (b) only S2 was familiar with the Falcon, having used the device (though not the particular application created for and used in this study) on a weekly basis in science class; and (c) none of the students reported familiarity with Android haptics.

Instruments

Key instruments were the background interview (demographic information, prior use of assistive technology, and educational background), the science task (described below in detail), an observation form (used by the researchers), and a post-session interview (PSI). This section focuses on the science task. The four conditions for the task are detailed in a later section.

The Task

The task was based on a longer task from another ETS CBAL project. Our task targeted the core idea of matter and the practice of developing and using models and was intended to be appropriate for grades 7 to 9 (Liu et al., 2013).

Scenario

The task presents a scenario that asks the student to imagine herself in the following situation. Screen 1 establishes the scenario, which is that the student who is taking the assessment (and who is also a character in the scenario) is trying to help her virtual friend Pat make sense of something strange that she observed. Pat recounts that while in a cool room, she put 1/8th of a teaspoon of water into a container and put a tight lid on it. Pat then moved the container to a warm room. She left the container in the room and returned two hours later. She was surprised to find that the water had disappeared from the bottom of the container. Screen 1 also invites the student to write an explanation of what happened to the water in the container in the warm room. Screen 2 explains how the student goes to her neighbor, Armand, a science teacher, to help her understand the situation. Armand asks the student questions about the liquid and gas forms of water, which the student answers by typing into an extended text entry box. On screen 3, Armand explains how scientists use models. Screens 4, 5, and 6 present the student with simulations. Simulations 1, 2, and 3 represent the containers at three points in time, 0, 30, and 120 minutes, respectively, after being moved to a warm room. (See the screen for simulation 1 in Figure 1.) On the screen for simulation 1 there is an optional practice exercise (activating it plays a “ping” sound each time two particles collide). Each simulation can be played by clicking on the play button or by using the space bar or enter key. Once play is initiated for a simulation, it plays for about five seconds and then stops.

Figure 1. Screen for simulation 1 of the base task (after having run the simulation)

For all simulations, there are ten water particles in the container. Simulation 1 (liquid) shows the water particles bunched together at the bottom of the container. When played, the pings occur at a high frequency. Simulation 2 (liquid and gas) shows some water particles in liquid form at the bottom of the container and some in gas form in the upper part of the container. Pings occur with reduced frequency. Finally, simulation 3 (gas) shows particles distributed more evenly throughout the container and pings occur with a further reduced frequency. Each of the simulations displays the mass of the contents of the cylinder (in mass units). This display is constant across simulations, always showing “1.2”.

Following each simulation is a brief explanation of what the student should have learned to that point. For example, here is the explanation following simulation 1:

“Now that you have explored the model, answer the following questions for the container containing liquid water at the bottom. Keep in mind that there is only liquid water at the bottom of the container.”

Similar explanations follow simulations 2 and 3 as well. Following this explanation, the student is asked to respond to several questions (as shown in Figure 1). For the purposes of this study, the two most important questions concern the spatial arrangement of particles: (1) “How are the particles spaced?” (choices: [a] “All are spaced close together,” [b] “Some are spaced close together,” or [c] “None are spaced close together”). (2) “Are the particles evenly distributed throughout the container?” (choices: “Yes” or “No”). The keys (correct answers) for the three simulations are shown in Table 1 below.

Table 1. Key for Spatial Arrangement Questions Posed to Students After the Simulations

Simulation	Particles spaced close together	Evenly distributed?
1 – Liquid	All	No
2 – Part liquid and part gas	Some	No
3 – Gas	None	Yes
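
As an illustration only, the key in Table 1 can be expressed as a small lookup structure against which a student's two spatial arrangement responses are scored. This is a minimal sketch, not the scoring logic of the actual task; the names `ANSWER_KEY` and `score_responses` are hypothetical, and the sample response is invented rather than taken from the study data.

```python
# Answer key transcribed from Table 1:
# simulation number -> (particles spaced close together, evenly distributed?)
ANSWER_KEY = {
    1: ("All", "No"),    # liquid
    2: ("Some", "No"),   # part liquid and part gas
    3: ("None", "Yes"),  # gas
}


def score_responses(simulation, spacing, evenly_distributed):
    """Return how many of the two spatial arrangement questions are correct (0 to 2)."""
    key_spacing, key_even = ANSWER_KEY[simulation]
    return int(spacing == key_spacing) + int(evenly_distributed == key_even)


# Hypothetical response for simulation 2 (not data from the study):
# the spacing answer matches the key, the distribution answer does not.
print(score_responses(2, "Some", "Yes"))
```

Keeping the key as data separate from the scoring function mirrors the way Table 1 is presented: one row per simulation, one column per question.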

On screen 7 are two tables, accessed via “tabs” on the screen. In one table are the student’s answers to the questions that followed each simulation. In the other table are answers given by Armand. This is the end of the base task presented to the student. (There were actually two additional screens but these were not presented due to time limitations.)

In terms of media, the task (including simulations) presented content visually (as a typical computer-based assessment), supplemented by a small amount of nonspeech audio (pings) to signify collisions between simulated water particles.

Development of the Task

The task was designed to be formative (i.e., to facilitate learning). The task was developed based on a competency model that is aligned with the new Next Generation Science Standards (Next Generation Science Standards, 2013). ECD was used to identify focal “knowledge, skills, and other attributes” (KSAs) that were part of the construct to be assessed, in this instance, to “develop and use models in the context of the particle model of matter for the process of evaporation in a closed container.” ECD was also used to identify nonfocal KSAs (i.e., not part of the construct) that were still required (demanded) of users in order to perform well in an assessment situation (e.g., hear, feel; see Appendix B). In principle, any nonfocal, required KSA not possessed by the student may result in an accessibility barrier.

The team developed a script for the science task, consisting of a word processor document with text and simple visuals, plus additional notes pertaining to the access features. After review by the project team, a storyboard was developed in PowerPoint slides by an ETS assessment developer. Finally, the task was implemented in ETS’s web-based “C4” research assessment delivery platform, a platform that supports a wide range of innovative computer-based assessment research projects. The team sought to follow the W3C Web Content Accessibility Guidelines (Caldwell et al., 2008) to be keyboard navigable with the targeted web browser (Firefox) and screen reader (JAWS version 14). Materials and procedures were developed for the various conditions.

The assessment task (including all four conditions) underwent an initial tryout with a college freshman who is blind, after which a few minor adjustments were made to the task prior to evaluation with the three participants.

The Four Conditions

The four conditions consisted of the base condition (involving screen reader access to the full task) and three other conditions, sometimes referred to as the three “access methods,” which pertained only to the simulations.

Base Condition

This condition involved use of the JAWS screen reader program to access the full task and included not only simulations 1, 2, and 3, but also preparatory and follow-up materials. Because of the highly visual and dynamic nature of the simulations within the task, the base condition was thought to fall short in terms of the experience for students who were blind, thus creating the need to explore other access methods in the three other conditions. This condition was chosen to represent a minimal and less-than-adequate way of presenting content to students who have visual disabilities.

Falcon Condition

This condition used the Novint Falcon game-controller-based haptics hardware, costing about $250, which provided a platform for an app developed by eTouchSciencesⁱ for this project for the delivery of simulations 1, 2, and 3. The app for the Falcon condition (see Figure 2) ran on a separate laptop from the application that presented the screens described earlier. Besides the three simulations, the Falcon app had a familiarization simulation, which involved one particle (rather than multiple particles). For both the familiarization and regular simulations, the app provided both a static mode (where the particles are not moving) and a dynamic mode (where the particles are moving and emitting sounds to represent collisions of particles). While using the app in either static or dynamic mode, the student used the controller grip to navigate a “sensor,” or 3D cursor, throughout the inside of the cylinder (container) and could feel resistance whenever the sensor contacted a particle or interior wall of the container. The physical grip can move the arms of the device a maximum of about ten centimeters in each of the x, y, and z axes, though for the application in this study the maximum movement in each direction was about five centimeters.

Figure 2. Novint Falcon game controller with the round grip held by a user

Photo of the Novint Falcon game controller

Android Condition

This condition used vibrotactile haptics on an Android tablet and consisted of one vibrotactile static graphic for each of the three simulations (1, 2, and 3). The cost of this was basically the cost of an Android tablet and the development of the software to drive the vibrations based on the student’s touch. Vibrations were produced by a piezoelectric motor that is a standard part of Android tablets. The Android haptics were developed by utilizing scalable vector graphics and a prototype of haptic style sheets to create the tablet-based tactile display. The screen had dimensions of 13.4 centimeters by 21.8 centimeters, with a diagonal measure of 25.5 centimeters. On the screen the height (along the side) of the cylinder was 5.7 centimeters and each particle had a diameter (height) of 0.9 centimeters. See Figure 3.
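
The touch-to-vibration behavior of a static vibrotactile graphic can be sketched as a simple hit test. The sketch below is not the authors' implementation (which used scalable vector graphics and prototype haptic style sheets); it only illustrates the idea that touching a particle yields one kind of feedback and touching the container boundary another. The particle diameter and cylinder height come from the text, while the cylinder width, the edge tolerance, the particle coordinates, and the function name `feedback_at` are assumptions made for illustration.

```python
import math

PARTICLE_RADIUS_CM = 0.9 / 2  # particle diameter reported in the text
CYLINDER_HEIGHT_CM = 5.7      # cylinder height reported in the text
CYLINDER_WIDTH_CM = 4.0       # assumption: the width is not reported
EDGE_TOLERANCE_CM = 0.2       # assumption: how close counts as touching the boundary


def feedback_at(x, y, particles):
    """Classify a touch at (x, y), measured in cm from the cylinder's lower-left corner.

    Returns "particle", "boundary", or "none".
    """
    # A touch inside any particle's circle takes priority.
    for px, py in particles:
        if math.hypot(x - px, y - py) <= PARTICLE_RADIUS_CM:
            return "particle"

    tol = EDGE_TOLERANCE_CM
    within_x = -tol <= x <= CYLINDER_WIDTH_CM + tol
    within_y = -tol <= y <= CYLINDER_HEIGHT_CM + tol
    # Left or right wall of the container outline.
    near_side = within_y and (abs(x) <= tol or abs(x - CYLINDER_WIDTH_CM) <= tol)
    # Bottom or top edge of the container outline.
    near_end = within_x and (abs(y) <= tol or abs(y - CYLINDER_HEIGHT_CM) <= tol)
    if near_side or near_end:
        return "boundary"
    return "none"


# Hypothetical layout: two particles resting near the bottom of the container.
particles = [(1.0, 0.6), (2.0, 0.6)]
print(feedback_at(1.2, 0.7, particles))
print(feedback_at(0.0, 3.0, particles))
print(feedback_at(2.0, 3.0, particles))
```

In a device setting, each returned label would be mapped to a vibration pattern, for example a heavier vibration for "particle" and a subtler one for "boundary".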

Figure 3. Simulation 1 for Android (haptic interface)

Android haptics image for simulation 1 (liquid)

Tactile Graphics Condition

This consisted of one tactile graphic (embossed on thick paper) for each of the simulations (2, 3, and 1). Note that the tactile graphics were presented in this nonstandard order (2, 3, 1) as a matching exercise to see if the student could match each tactile graphic to its respective simulation. The cost of this was essentially that of authoring the graphic, the braille paper, and the one-time cost of the braille embosser (printer). The images portrayed in the tactile graphics were essentially identical (except for scale) to the images used in the Android condition. Embossed on an 8 1/2 by 11 inch tactile graphic sheet, the height along the side of the cylinder was 8.5 centimeters, and each particle had a diameter (height) of 1.4 centimeters (see Figures 4 and 5).
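
To make the phrase “except for scale” concrete, the following short sketch computes the scale factor between the Android and tactile graphics renderings from the dimensions reported in the text. The dictionary and function names are hypothetical, and the reported dimensions are rounded, so the two factors need not agree exactly.

```python
# Dimensions reported in the text, in centimeters
android = {"cylinder_height_cm": 5.7, "particle_diameter_cm": 0.9}
tactile = {"cylinder_height_cm": 8.5, "particle_diameter_cm": 1.4}


def scale_factors(small, large):
    """Return the ratio large/small for every dimension named in `small`."""
    return {name: large[name] / small[name] for name in small}


factors = scale_factors(android, tactile)
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x")
```

Both ratios come out at roughly 1.5, which suggests that the tactile graphic was about one and a half times the size of the on-screen Android image.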

Figure 4. Simulation 2 for tactile graphics (dashed boundary represents an 8 1/2 by 11 inch sheet of braille paper)

Tactile graphic for simulation 2 (liquid and gas)

Figure 5. Simulation 3 for tactile graphics

Tactile graphic for simulation 3 (gas)

Summary

Table 2 summarizes, for each of the four conditions: (a) what the student is expected to do; (b) modes for experiencing the simulation (e.g., static versus dynamic); (c) the use of sound, vibration, and force-feedback; and (d) KSAs needed to perform well.

Table 2. Summary of the Four Conditions

Condition	What the Student is Expected to Do	Modes for Experiencing the Simulation (e.g., static versus dynamic)	Use of Sound, Vibration, and Force-Feedback	KSAs Expected of Student (Who is Blind) to Perform Well in the Condition
1. Base condition (regular presentation plus text-to-speech [TTS])	Use the JAWS screen reader program to navigate through the content―both non-simulation and simulation―running on a laptop with a Firefox browser.	Static and dynamic. Once the dynamic simulation is invoked (via “play” button), it continues without interruption to completion.	Text-to-speech (synthesized), plus non-speech sounds (“ping”, aka “click”) to signal a collision between particles when running the simulation (dynamic mode)	Hear the pings of particle-to-particle collisions.
2. Falcon (access method)	Access each of the three simulations via a special application running on a different laptop from that which is running the main task.	Static and dynamic. Although individuals tried both modes, counting of particles occurred using static mode.	Audio instructions and student-invokable auditory where-am-I hint (“You are in the bottom half of the container”). The controller knob for this application had a maximum movement of about five centimeters in each of three dimensions.	Feel and move the Novint Falcon controller knob and feel force feedback.
3. Android (access method)	Access a static image for each of the three simulations via an Android tablet using vibrotactile haptics. This image is the same used for the corresponding tactile graphic.	Static only	Vibration (and sound of the piezoelectric motor) when touching a container boundary (subtle sound/vibration) or particle (heavier sound/vibration)	Feel vibrations on one finger when touching an on-screen object.
4. Tactile graphics (access method)	Access a static image of each of the three simulations using tactile graphics produced by a ViewPlus SpotDot embosser. This used the same image as used for Android.	Static only	No sound/vibration	Feel tactile graphics.

Study Procedure

Each of the three students experienced the conditions in the same order―base, Falcon, Android, and tactile graphics. The session for each student, which lasted about 90 minutes, had two researchers present. One researcher administered the base and tactile graphics conditions. The other observed and administered the Falcon and Android conditions. The session began with a background interview. This was followed by the base condition using the JAWS screen reader.

Then the Falcon condition began with the researcher briefly explaining to the student how to place their hand on the grip of the game controller device. A tutorial that was part of the app helped guide students in the commands. Students used the familiarization item (with 1 particle) in both static and dynamic modes of operation. The researcher provided brief guidance and feedback as necessary as the student examined all three simulations in the Falcon condition.

Next was the Android condition, for which the researcher presented a familiarization item (1 particle) and provided basic orientation on interpreting the vibrations. The student then examined all three simulations in the Android condition.

Finally, the tactile graphics condition for all three simulations was presented with essentially no familiarization or guidance. As will be explained more in the results section, due to early difficulties in accessing the intended spatial arrangement information from simulations using the three access methods, we adjusted our data collection away from inquiries about spatial arrangement of particles within the container and instead focused more narrowly on asking students to count the particles in the simulated container. Following the session, the researcher administered the post-session interview.

Results

Key Results

1. What can students who are blind glean about the simulations from the base condition? Students were able to access some key information about the simulations. For example, two of the students were able to access the mass of the particles in the container (1.2 mass units). One student missed viewing the display of mass units so, when answering the question, was directed by the researcher back to the display. The students were also able to correctly confirm, for each simulation, that the particles were moving. However, they generally gave incorrect answers to the two spatial arrangement (distribution) questions that followed each simulation in the base task. Specifically, for simulation 1 all three answered both questions incorrectly. For simulation 2, all three answered both questions incorrectly, except that S3 correctly indicated that “some” particles were spaced close together. For simulation 3, S1 gave all incorrect responses, for S2 data were not collected, and S3 gave one correct and one incorrect response.

We were not surprised that students seemed mostly unable to glean much correct information from the base task; indeed, we expected that students would feel a need for additional access methods. These expectations were generally supported by students’ answers to questions about particles’ spatial arrangement in the post-session interviews. All three students disagreed or strongly disagreed that they “could have answered the questions after the simulation[s] by relying only on the audio without any of the three access methods: Novint Falcon, Android tablet (vibrotactile haptics), and tactile graphics” (PSI-3, PSI-4, PSI-5). These responses suggest that students believed, at least retrospectively, that in order to answer the items within the task, they needed information beyond that which was available to them in the base condition.

2. Are students able to use the simulations in the Falcon, Android, and tactile graphics conditions as intended?
Notwithstanding some usability successes, students had difficulties accessing the intended information from simulations using the three conditions. Students were slower than we expected in finding the particles, particularly with the Falcon and Android conditions, which caused us to reduce our inquiries about spatial arrangement of particles within the container and, instead, focus more narrowly on asking students to count the particles in the simulated container. We realized that the ability to count particles was not necessarily part of the construct to be assessed. Nevertheless, we felt that the task of counting would provide a simple, objectively ascertainable indicator of the ability to obtain basic information from the simulations.

Researchers made sure that students using the Falcon experienced each simulation in static mode (in addition to elective use of dynamic mode), because the process of counting would be difficult in dynamic mode. As noted earlier, the images were only static for both the Android and tactile graphics conditions. Our focus on counting particles was consistent with the idea that, unless students could detect or count particles rather well, they could not be expected to provide accurate answers about spatial arrangement. Thus, a count of particles would provide a fairly objective task that would be indicative of basic student access to the simulations.

Table 3 shows the counts obtained with the three access methods. After discussing these counts we will describe some of the challenges and successes students experienced in using the access methods. It should be noted that at no time did we provide corrective feedback or otherwise disclose the actual number of particles in the simulations (i.e., 10 particles).

Table 3. Student Count of Particles for Three Access Methods and the Three Simulations

Condition	Simulation	Actual No. of Particles	S1	S2	S3
Falcon	1	10	DNC	4	5
Falcon	2	10	40^a	4	25
Falcon	3	10	4 or 5	6	10
Android	1	10	1^b	7	14 to 15
Android	2	10	3 or 4	5	11 to 12 without vision; 12 with vision.^c
Android	3	10	5	8	12 without vision; 8 with vision.^c
Tactile graphics	1	10	9 or 10	10	9
Tactile graphics	2	10	10	8	8
Tactile graphics	3	10	10	9	10

Legend:
Simulation 1: all liquid; simulation 2: part gas, part liquid; simulation 3: all gas.
S1: student 1; S2: student 2; S3: student 3.
DNC = “Did Not Collect” (i.e., no data collected)

To the best of our knowledge, the students made their counts without relying on residual vision, except where indicated in note c.

Notes:

^a The count of 40 particles is extremely high. We see three possible explanations. First, the student may have been influenced by the pings (auditory cues) encountered in the previous (base) condition (TTS). The student may have assumed that each ping represented a particle whereas it actually indicated a collision between two particles. Second, the student may have been influenced by the jostling of particles felt in dynamic mode of the Falcon. And third, the student may have inadvertently counted the same particles repeatedly. (There was no mechanism in the Falcon for marking a particle as having been counted.) Based on the observed behavior of the student during the session, we think the third explanation less likely than the first two.

^b We are not sure why this was so low. A count of 1 would have been correct for the sample item but not for simulation 1.

^c S3 counted first without using vision then again using residual vision.
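
The counts in Table 3 can be tallied against the actual number of particles (10) with a short script. The following is a minimal sketch rather than part of the study's analysis: the data structure and the function name `tally` are illustrative, a response such as "4 or 5" is encoded as the pair of values offered, and for S3 in the Android condition the "without vision" counts are assumed (the "with vision" counts, 12 and 8, would give the same tallies).

```python
# Counts transcribed from Table 3. Each response is a tuple of the values the
# student offered ("4 or 5" -> (4, 5)); None marks DNC (no data collected).
# Assumption: for S3 in the Android condition, the "without vision" counts are used.
ACTUAL = 10

TABLE_3 = {
    "Falcon": [
        [None, (4,), (5,)],        # simulation 1: S1, S2, S3
        [(40,), (4,), (25,)],      # simulation 2
        [(4, 5), (6,), (10,)],     # simulation 3
    ],
    "Android": [
        [(1,), (7,), (14, 15)],
        [(3, 4), (5,), (11, 12)],
        [(5,), (8,), (12,)],
    ],
    "Tactile graphics": [
        [(9, 10), (10,), (9,)],
        [(10,), (8,), (8,)],
        [(10,), (9,), (10,)],
    ],
}


def tally(rows, actual=ACTUAL):
    """Return (strict, lenient) numbers of accurate answers.

    strict:  the student gave exactly one value and it equals `actual`.
    lenient: `actual` is among the values the student offered.
    """
    responses = [r for row in rows for r in row if r is not None]
    strict = sum(1 for r in responses if r == (actual,))
    lenient = sum(1 for r in responses if actual in r)
    return strict, lenient


for condition, rows in TABLE_3.items():
    strict, lenient = tally(rows)
    print(f"{condition}: {strict} exact; {lenient} if ranges including {ACTUAL} count")
```

The strict tally treats "9 or 10" as inaccurate, while the lenient tally accepts it.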

In brief, one can see from Table 3 that: for Falcon, there was one accurate answer (i.e., 10) (S3 for simulation 3); for Android, there were no accurate answers; and for tactile graphics, there were four accurate answers (or five if one counts an answer of “9 or 10” as correct). Of course, it is important to note that tactile graphics was the last condition in the sequence, and the accuracy of this method could have been influenced by earlier exposure to the other conditions. Additional details about the usability of the three access methods for the participants are described below.

Falcon

S1 and S3 appeared to learn quickly how to operate the Novint Falcon device, which they had never used before. (S2 had used it weekly in science class.) All students had some difficulty finding the particles in the Falcon condition. Students were observed “missing” the particles by navigating the sensor in front or back of the particles (as well as by navigating too high or low, or too far to the left or right). Thus, the 3-D nature of the Falcon simulation may have provided more opportunities to miss particles than would be encountered in flat simulations like the Android and tactile graphics conditions. Notwithstanding the difficulties encountered, the Falcon condition had the highest level of student agreement with the statement, “I would recommend the access method for science simulations.” Both S2 and S3 indicated “strongly agree.”

Android

As is evident from the particle counts, students had some difficulty in finding the particles using Android haptics. One reason may be the smaller size of the Android particles: 0.9 centimeters in diameter (height), in contrast to 1.4 centimeters for tactile graphic particles. An additional issue was the presence of a low-level background vibration that occurred in the region of the graphic even when the student was not touching either a particle or a line. This background vibration feature had been implemented (based on feedback on an earlier prototype) in order to help students know when they were touching the actual graph area. However, this feature may have limited the clarity with which objects were sensed via haptics. Finally, it must be noted that the Android condition was the only condition with which none of our participants had prior experience, which may have contributed to usability difficulties.

Tactile graphics

Tactile graphics showed notable signs of relatively good usability. Consider, for example, the results of the several Likert scale interview questions regarding the access methods. Tactile graphics garnered the highest level of student agreement with the following statements (the highest level being determined arithmetically where comparisons are based on sums, i.e., where SD equals 1 point, D equals 2 points, N equals 3 points, A equals 4 points, and SA equals 5 points):

“The access method was easy to use.”
“I would like to use this access method frequently.”
“I was confident about using the access method.”
“I was already familiar with the access method before the study.”

Furthermore, tactile graphics had the lowest level of agreement with the following statements:

“I think that the system was cumbersome to use.”
“I would need someone else to help me use this access method.”

It should be noted that tactile graphics was the only one of the three access methods with which all three students had prior experience.
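
The sum-based comparison of Likert responses described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the responses for "Method A", "Method B", and "Method C" are made-up placeholders, not data from the study.

```python
# Point values as described in the text: SD=1, D=2, N=3, A=4, SA=5.
POINTS = {"SD": 1, "D": 2, "N": 3, "A": 4, "SA": 5}


def agreement_sum(responses):
    """Sum the point values of the responses given to one statement."""
    return sum(POINTS[r] for r in responses)


def highest_agreement(responses_by_method):
    """Return the method(s) with the largest summed agreement, plus all sums."""
    sums = {m: agreement_sum(r) for m, r in responses_by_method.items()}
    top = max(sums.values())
    return sorted(m for m, s in sums.items() if s == top), sums


# Made-up responses from three students to one statement (NOT study data).
example = {
    "Method A": ["A", "N", "A"],
    "Method B": ["N", "D", "N"],
    "Method C": ["SA", "A", "A"],
}
winners, sums = highest_agreement(example)
print(winners, sums)
```

Returning every method tied for the top sum, rather than a single winner, avoids silently breaking ties, which matters with only three respondents.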

3. Are there instances in which a student’s incomplete understanding of the particle model interferes with their ability to perform well on the task?
Challenges encountered by students in finding particles appeared to limit students’ ability to understand the spatial arrangement of particles. This limits our ability to make inferences about how students’ reasoning regarding the particle model relates to their understanding of the simulations. Below we report an instance in which a student’s apparently incomplete understanding of the particle model seems to have interfered with her performance on the task. Specifically, S3 appeared to have an incomplete understanding of the particle model of matter, which seems to have interfered with her ability to appropriately interpret the phenomenon of evaporation as portrayed in the base task. For example, for simulation 1, she responded “yes” to the question “Are the particles evenly distributed throughout the container?” The correct answer for that item was “no,” because the particles were all located at the bottom of the container, rather than distributed evenly through the container. When asked to elaborate, she appeared to have based her “yes” response on an interpretation of what she had been taught, which she stated as “Because [liquid] water takes the shape of the container.” Thus, in the absence of a haptic or tactile representation of the container and particles, she may have thought that the particles were dispersed to conform to the shape of the entire container (top as well as bottom).
Also, for simulation 3, which she correctly understood to be a gas state, she incorrectly indicated that “no,” the particles were not “distributed evenly throughout the container.” When asked about her “no” response to this question, she stated, perhaps based on an interpretation of what she had been taught, that “Gas doesn’t form a shape.” We, thus, hypothesized that: (1) she believed she had been taught that a liquid “takes the shape of the container,” but did not realize that (under gravity, in a sufficiently large container) a liquid takes the shape of only the bottom of the container, not the whole container; and (2) she believed she had been taught that a gas “doesn’t form a shape” (i.e., is formless), but did not realize that a gas ordinarily takes the exact shape of the container it is in. Thus, it appears that her performance on the base task may have been adversely impacted by an incomplete understanding of the particle model. In other words, these instances implied that the student was responding to the question based on her incomplete prior knowledge instead of evidence collected from the simulation. Limitations in time and in the usability of the three access methods did not permit us to further confirm those hypotheses or to determine whether the apparent misconceptions were later resolved.

4. Is there evidence of student engagement and learning?
In response to the base task, S3 made several statements that suggest a sense of interest and engagement. When she found that she could listen to the sound of a single collision between particles, she said, “That is cool!” When she played and listened to the simulation (with the pings signaling particle collisions), she said “That’s so fun!” Just prior to using simulation 2, she asked: “Did you make all this? It is fun!”

After experiencing all conditions, in response to the statement “The task was interesting to take,” both S1 and S2 indicated “agree.” In response to the statement “The task was fun to take,” S1 indicated “agree” and S2 indicated “neither agree nor disagree.” In response to the statement “The story about Pat and the disappearing water was interesting,” S1 indicated “strongly agree” and S2 indicated “neither agree nor disagree.” With respect to learning from the task, the two students (S1 and S2) who were asked indicated that they “agreed” with the statement “I learned about science by taking this task.” (S3 was not asked any of these questions due to time limitations.) When S2 was asked what she learned, she said that she learned that “Evaporating water is a gradual process.”

5. How usable are the non-simulation parts of the base task using a JAWS screen reader?
The base condition was mostly usable to the students. Students were able to navigate through most of the task using the TTS capability of JAWS (version 14) with almost no help. This included navigating from screen to screen and from paragraph to paragraph within a screen. Students used either the down arrow or tab to move from one paragraph to the next, and either the enter key or the space bar to select an answer. All were able to type the extended text responses and numeric entry items within the task. Two of the three participants were able to respond to the multiple choice items; the other student (S2) indicated that she had not yet learned how to respond to multiple choice items but quickly caught on when told how to do it (e.g., press the space bar to select). All students were able to use the laptop keyboard to type alpha or numeric responses. All three played the sample “ping” sound prior to simulation 1; the ping signified a collision between two particles. All were able to play the base-task simulations, but not see them. When students reached the table that showed data aggregated from the three simulations, one student (S1) was able to begin to navigate through it, but was unable to access the other table (showing Armand’s data). The other two students indicated that they did not know how to access tables in JAWS at all. (As described in note 1 of Table B1, students arguably faced an accessibility barrier due to the difficulty of accessing the tables.) As noted earlier, students were not asked to access the two screens after the summary tables.

There was room for improvement in usability. For example, with regard to the statement, “It was easy to use the screen reader to navigate through the task” (PSI-1), two students (S1 and S2) indicated “neither agree nor disagree” and S3 indicated “strongly disagree.” With regard to the statement, “It was easy to understand the synthesized speech of the screen reader” (PSI-2), S1 indicated “strongly agree,” S3 indicated “agree,” and S2 indicated “neither agree nor disagree.”

Discussion

It appears from this study that, notwithstanding some usability successes (e.g., screen-reader navigation and use of much of the task in the base condition, some success in using other access methods), visually disabled individuals needed greater levels of support in order to make this simulation-based assessment task accessible and usable. For example, students had difficulty understanding the spatial arrangement of particles and were unable to navigate through the tables that aggregated responses to questions about simulations. These challenges contributed to difficulties experienced by students in integrating their content knowledge and their skill of modeling practice (Liu et al., 2013); there were instances where students ignored evidence from the simulations that conflicted with their incomplete prior knowledge. We found in this study that the more traditional accommodation (tactile graphics) appeared to have some advantages relative to newer technologies for exploring the static representations and successfully counting particles in the simulations. Haptic-based technologies, such as the Falcon, introduce the ability to improve access to dynamic aspects of a simulation, which are harder to convey using tactile graphics; therefore, such haptics may be better suited to tasks where students are asked specifically about the movement of the particles.

A variety of strategies should be pursued to address the usability issues identified. Some usability challenges might be addressed through improved student familiarity with the access technologies. For example, students would need additional proficiency with JAWS in order to access the tables. Additional familiarity would also be needed with the specific haptic technologies (Falcon and Android). Other usability challenges might be addressed through improved design of the task content. For example, the simulated particles might be made larger or otherwise more distinctive to better enable detection. Or, in the Falcon app, attractive forces might be implemented to pull users toward particles, better enabling their detection. Finally, other usability challenges might best be addressed by changes to the access technologies themselves. Ideally, for example, Android haptics would allow multi-touch (i.e., not just one finger). Multi-touch might offer an exploration strategy that is consistent with physical tactiles; yet, current tablet-based vibrotactile technology cannot support multi-touch, and there is no widely available technical solution. Wearable haptics, as explored by Goncu and Marriott (2011) and Sullivan, Sahasrabudhe, Liimatainen, and Hakkinen (2014), are one possible solution. Determining best practices for addressing such issues will involve new research and development.
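
One way the suggested attractive forces might work is sketched below. This is a hypothetical illustration, not the Falcon app's implementation or any device API: the function name, the spring-like force law, and the parameters `capture_radius` and `stiffness` are all assumptions chosen for simplicity.

```python
import math


def attraction_force(cursor, particles, capture_radius=1.0, stiffness=0.5):
    """Return a 3-D force vector pulling the cursor toward the nearest particle.

    Hypothetical force law: spring-like (stiffness * displacement) inside the
    capture radius and zero outside it, so distant particles do not tug the user.
    """
    if not particles:
        return (0.0, 0.0, 0.0)
    nearest = min(particles, key=lambda p: math.dist(cursor, p))
    if math.dist(cursor, nearest) > capture_radius:
        return (0.0, 0.0, 0.0)
    return tuple(stiffness * (p - c) for p, c in zip(nearest, cursor))


# The cursor at the origin is within reach of the first particle only.
print(attraction_force((0, 0, 0), [(0.5, 0, 0), (3, 3, 3)]))
```

Limiting the pull to a capture radius is a design choice in this sketch: it is meant to help a student land on a nearby particle without dragging the device across empty regions of the container, which could misrepresent the spatial arrangement.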

Also, it is important to note that any innovative access method may impose a need for ways to familiarize students with it, including use in the curriculum as well as through practice and familiarization materials. Furthermore, caution is needed for interfaces involving small displays and input devices, which may increase demands upon the student for sight, fine motor control, sense of touch, hearing, etc. Also, it is crucial to support the development of standards-based solutions to the accessibility challenges of simulations and other innovative assessment tasks, which can help assessment developers avoid the need to reinvent solutions.

Limitations

A few of the limitations of the study include: (a) few participants, (b) little data collection time, and (c) not addressing possible effects of the order of conditions (e.g., there was no counter-balancing of order of conditions). Also, the Falcon simulations were designed with the idea that students would be asked about the spatial arrangement of the particles. If the designers had known that emphasis would be given to counting particles, they might have implemented additional design features to make counting easier, for example, having each particle, as it is touched, announce its status as either previously touched or newly touched, thus making it easier for the student both to count particles and to understand their spatial arrangement. It also seems appropriate to note a general limitation or challenge of computer-based simulations; that is, that they may inadvertently move the student one step further away from the real experience, where the “real” science experience may be a truly hands-on experience, such as the use of real sensors and probes that are accessible to students with visual disabilities. Perhaps such sensors or probes could be made accessible and integrated into assessment delivery environments.
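
The touched-status feature mentioned above could be sketched roughly as follows. The class `TouchTracker` and its method names are hypothetical and were not part of the Falcon app; the sketch assumes each particle has a stable identifier and that the returned status string would be spoken or otherwise announced to the student.

```python
class TouchTracker:
    """Track which particles a student has already touched (hypothetical)."""

    def __init__(self):
        self._touched = set()

    def touch(self, particle_id):
        """Record a touch and return the status to announce."""
        if particle_id in self._touched:
            return "previously touched"
        self._touched.add(particle_id)
        return "newly touched"

    @property
    def count(self):
        """Number of distinct particles touched so far."""
        return len(self._touched)


tracker = TouchTracker()
for pid in [3, 7, 3, 1, 7]:
    print(pid, tracker.touch(pid))
print("distinct particles:", tracker.count)
```

Because repeated touches do not increase the count, a student who revisits the same particle would not inflate the total, which addresses the repeated-counting explanation raised in note a of Table 3.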

Recommendations

Based on this study and other experience in the domains of accessibility and science education, we offer the following recommendations for designing accessible innovative science assessment tasks.

Use a methodology such as Evidence Centered Design to guide the design and development process (Hansen & Mislevy, 2006; Mislevy et al., 2003).
Provide students with visual disabilities with extensive opportunities to become familiar with assistive technologies in the context of the targeted science content, such as through classroom instruction and materials for practice and familiarization.
Don’t overlook the advantages of traditional technologies (e.g., the multi-touch capabilities of tactile graphics) when exploring promising, innovative access methods (e.g., haptic devices).
Exercise caution with devices that have small displays or input mechanisms, which may cause accessibility barriers due to excessive demands for skills that are not targeted for assessment (e.g., sight, dexterity).
Continue to support the development of standards-based solutions to the accessibility challenges of simulations and other innovative science assessment task features.

References

Caldwell, B., Cooper, M., Reid, L. G., & Vanderheiden, G. (Eds.). (2008). Web content accessibility guidelines 2.0. Cambridge, MA: World Wide Web Consortium. Retrieved from http://www.w3.org/TR/WCAG20/

Cayton-Hodges, G., Marquez, E., van Rijn, P., Keehner, M., Laitusis, C., Zapata-Rivera, D., & Hakkinen, M. (2012, May). Technology enhanced assessments in mathematics and beyond: Strengths, challenges, and future directions. Paper presented at the Invitational Research Symposium on Technology Enhanced Assessments, Washington, DC. Retrieved from http://www.ets.org/Media/Research/pdf/session1-cayton-hodges-keehner-laitusis-marquez-paper-tea2012.pdf

D’Angelo, C., Rutstein, D., Harris, C., Bernard, R., Borokhovski, E., & Haertel, G. (2013). Simulations for STEM Learning: Systematic Review and Meta-analysis (executive summary). Menlo Park, CA: SRI International. Retrieved from https://www.sri.com/sites/default/files/brochures/simulations-for-stem-learning-exec-summ.pdf

Darrah, M. (2014). Computer haptics: A new way of increasing access and understanding of math and science for students who are blind and visually impaired. Journal of Blindness Innovation and Research, 3(2). http://dx.doi.org/10.5241/3-47

Darrah, M., Murphy, K., Speranski, K., & DeRoos, B. (2014). Framework for K-12 education: Haptic applications. In M. O’Malley, S. Choi, & K. J. Kuchenbecker (Eds.), Proceedings of IEEE Haptics Symposium 2014 (pp. 409-414).

Goncu, C., & Marriott, K. (2011). GraVVITAS: Generic multi-touch presentation of accessible graphics. In P. Campos et al. (Eds.), Human-computer interaction–INTERACT 2011: 13th IFIP TC 13 International Conference, Lisbon, Portugal, September 5-9, 2011, Proceedings, Part I, 30-48, Berlin: Springer. doi: 10.1007/978-3-642-23774-4_5

Giudice, N. A., Palani, H. P., Brenner, E., & Kramer, K. M. (2012). Learning non-visual graphical information using a touch-based vibro-audio interface. In M. Huenerfauth (Chair), Proceedings of the 14th international ACM SIGACCESS conference on Computers and accessibility (pp. 103-110). New York, NY: ACM. Retrieved from http://dl.acm.org/citation.cfm?id=2384916

Haertel, G., DeBarger, A. H., Cheng, B., Blackorby, J., Javitz, H., Ructtinger, L. . . . Hansen, E. G. (2010). Using evidence-centered design and universal design for learning to design science assessment tasks for students with disabilities (Assessment for Students with Disabilities Technical Report 1). Menlo Park, CA: SRI International. Retrieved from http://padi-se.sri.com/downloads/TR1_UsingECDandUDL.pdf

Hakkinen, M., Rice, J., Liimatainen, J., & Supalo, C. (2013, March). Tablet-Based Haptic Feedback for STEM Content. Presentation at the International Technology & Disabilities Conference 2013, San Diego, CA.

Hansen, E. G., & Mislevy, R. J. (2006). Accessibility of computer-based testing for individuals with disabilities and English language learners within a validity framework. In M. Hricko & S. Howell (Eds.), Online assessment and measurement: Foundation and challenges (pp. 214-262). Hershey, PA: Information Science Publishing.

Hansen, E. G., Mislevy, R. J., Steinberg, L. S., Lee, M. J., & Forer, D. C. (2005). Accessibility of tests for individuals with disabilities within a validity framework. System: An International Journal of Educational Technology and Applied Linguistics, 33(1), 107-133. doi:10.1016/j.system.2004.11.002

Levy, S. T., & Lahav, O. (2011). Enabling people who are blind to experience science inquiry learning through sound-based mediation. Journal of Computer Assisted Learning, 28(6), 499-513. doi: 10.1111/j.1365-2729.2011.00457.x

Liimatainen, J., Sahasrabudhe, S., & Hakkinen, M. (2014). Access to 2D and 3D graphics using vibrotactile feedback and sonification. Presentation at the International Technology & Disabilities Conference 2014, San Diego, CA.

Liu, L., Rogat, A., & Bertling, M. (2013). A CBAL science model of cognition: Developing a competency model and learning progressions to support assessment development. (ETS Research Report No. RR-13-29). Princeton, NJ: Educational Testing Service. Retrieved from http://www.ets.org/Media/Research/pdf/RR-13-29.pdf

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1(1), 3-62. Retrieved from http://www.jakestone.net/wikipics/pdfs/mislevy-ecd.pdf

National Assessment of Educational Progress. (n.d.). Science 2009: Interactive computer tasks and hands-on tasks. Retrieved March 6, 2014 from www.nationsreportcard.gov/science_2009/ict_summary.aspx

National Center for Education Statistics. (2012). Science in action: Hands-on and interactive computer tasks from the 2009 science assessment. Washington, DC: Institute of Education Sciences, U.S. Department of Education. Retrieved from http://nces.ed.gov/nationsreportcard/pdf/main2009/2012468.pdf

National Research Council. (2004). Keeping score for all: The effects of inclusion and accommodation policies on large-scale educational assessment. Washington, DC: National Academies Press.

Next Generation Science Standards. (2013). Next generation science standards. Retrieved from www.nextgenscience.org/next-generation-science-standards

Organisation for Economic Co-operation and Development. (2010). PISA computer-based assessment of student skills in science. Paris: OECD. Retrieved from http://www.oecdbookshop.org/browse.asp?pid=title-detail&lang=en&ds=&ISB=9789264082021

Pellegrino, J. W. (2013). Proficiency in science: Assessment challenges and opportunities. Science, 340(6130), 320-323. doi: 10.1126/science.1232065

Quellmalz, E. S., & Pellegrino, J. W. (2009). Technology and testing. Science, 323(5910), 75-79. doi: 10.1126/science.1168046

Quellmalz, E. S., Silberglitt, M. D., & Timms, M. J. (2011). How can simulations be components of balanced state science assessment systems? [Policy brief]. San Francisco, CA: WestEd. Retrieved from http://simscientist.org/downloads/SimScientistsPolicyBrief.pdf

Rose, D. H., & Meyer, A. (2002). Teaching every student in the digital age: Universal design for learning. Alexandria, VA: Association for Supervision and Curriculum Development. Retrieved from http://www.ascd.org/publications/books/101042.aspx

Scalise, K., Timms, M., Moorjani, A., Clark, L., Holtermann, K., & Irvin, P. S. (2011). Student learning in science simulations: Design features that promote learning gains. Journal of Research in Science Teaching, 48(9), 1050–1078. doi: 10.1002/tea.20437

Sjostrom, C. (2001). Using haptics in computer interfaces for blind people. In M. Tremaine (Chair), CHI '01 Extended Abstracts on Human Factors in Computing Systems (pp. 245-246). doi: 10.1145/634067.634213

Sullivan, H., Sahasrabudhe, S., Liimatainen, J., & Hakkinen, M. (2014). Opportunities and limitations of haptic technologies for non-visual access to 2D and 3D graphics. In K. Miesenberger, D. Fels, D. Archambault, P. Peňáz, & W. Zagler (Eds.), Computers Helping People with Special Needs: 14th International Conference, ICCHP 2014, Paris, France, July 9-11, 2014, Proceedings, Part II (pp. 8-11). Springer International Publishing. doi: 10.1007/978-3-319-08599-9_2

Thompson, S. J., Johnstone, C. J., & Thurlow, M. L. (2002). Universal design applied to large scale assessments (Synthesis Report 44). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes. Retrieved from http://www.cehd.umn.edu/NCEO/onlinepubs/synthesis44.html

Yin, R. K. (2003). Case study research: Design and methods (3rd ed.). Thousand Oaks, CA: Sage.

Appendix A: Image Description for Figure 1

Screen shot for simulation 1 (liquid). At the top of the screen, spanning the width of the screen is a narrow bar with navigation buttons (e.g., to go back and forward). The remainder of the screen is divided into left and right halves. On the left half are story dialogue, directions, and the closed container with 10 water particles. Below the container is a "play" button to start the simulation. On the right half of the screen are instructions and several questions (e.g., "How are the particles spaced?", "Are the particles evenly distributed throughout the container?").

Appendix B: A Fragment of the Assessment Argument

Table B1 shows a fragment of an overall, retrospective, ECD-based assessment argument (Hansen & Mislevy, 2006) for the science task and its conditions for the study participants. It is a “fragment” in that it does not illustrate other valuable information, such as student levels in the various KSAs, or strategies for addressing accessibility barriers.

It is an “overall” argument in several senses. First, it pertains to the task as a whole―including both the non-simulation content and the simulations, including the alternatives for the simulations. Second, for simplicity, it pertains to all three study participants as if they were one person. Third, it focuses on overall requirements of the full set of conditions (for example, it asserts that hearing was required, because the task, as administered, required hearing). And fourth, it focuses on the task’s overall assessment purpose, rather than calling out the learning aspects or phases of the task.

The argument is “retrospective” in the sense that it takes into account additional knowledge gained only during the study; for example, it calls out how the task required the KSA “know how to use JAWS to access tables,” something that we did not realize at design time would be a significant issue.

A key outcome of the argument is the recognition that, where a participant is not able to satisfy a nonfocal requirement of a task situation, an accessibility barrier may exist. For example, our participants in this study were unable to satisfy the nonfocal requirement for “know how to use JAWS to access tables” (row 8). The task’s requirement for this KSA, paired with the students’ lack of capability in this KSA, arguably caused an accessibility barrier that should be addressed by one or both of the following: increasing the student capability in the KSA (such as through instruction in the use of JAWS for accessing information in tables) and reducing the requirement for this KSA (such as by providing some easier, perhaps non-tabular, method for accessing the information in the tables).

Table B1. Fragment of the Assessment Argument: A Portion of a KSA Value Matrix for the Study Participants

Row number	KSA	Focal value (i.e., KSA is part of construct)	Requirement value (i.e., task situation demands the KSA)
1	See	n/a	n/a
2	Hear	n/a	Yes
3	Decode	n/a	n/a
4	Feel	n/a	Yes
5	Know braille codes	n/a	n/a
6	Comprehend English	n/a	Yes
7	Know how to use JAWS to access non-table content	n/a	Yes
8	Know how to use JAWS to access tables	n/a	Yes
9	Use access methods to count particles	n/a	n/a (see note 1)
10	Use the simulations to discover the spatial arrangement of particles in the container	n/a	Yes
11	Develop and use models in the context of particle model of matter for the process of evaporation in a closed container	Yes	Yes

Note 1. This KSA was not a significant part of the design but is shown here because of its importance later on during the execution of the study. The task did not necessarily require (demand) that the student possess this KSA in order to perform well on the task; nevertheless, we did require this nonfocal KSA as part of the study to serve as an indicator of the usability of the three access methods (ability to obtain information via the access methods).
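
The barrier reasoning applied to Table B1 can be sketched as a simple rule: a potential accessibility barrier is flagged where a KSA is nonfocal, is required by the task situation, and is not possessed by the student. The code below is an illustrative sketch, not a tool used in the study. The names `KSA` and `accessibility_barriers` are hypothetical, "n/a" in the table is treated as False, and the student profile is an assumption that marks only the row 8 KSA as lacking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KSA:
    row: int
    name: str
    focal: bool      # KSA is part of the construct
    required: bool   # task situation demands the KSA


# Rows transcribed from Table B1 ("n/a" is treated as False).
TABLE_B1 = [
    KSA(1, "See", False, False),
    KSA(2, "Hear", False, True),
    KSA(3, "Decode", False, False),
    KSA(4, "Feel", False, True),
    KSA(5, "Know braille codes", False, False),
    KSA(6, "Comprehend English", False, True),
    KSA(7, "Know how to use JAWS to access non-table content", False, True),
    KSA(8, "Know how to use JAWS to access tables", False, True),
    KSA(9, "Use access methods to count particles", False, False),
    KSA(10, "Use the simulations to discover the spatial arrangement "
            "of particles in the container", False, True),
    KSA(11, "Develop and use models in the context of particle model of matter "
            "for the process of evaporation in a closed container", True, True),
]


def accessibility_barriers(ksas, capabilities):
    """Return nonfocal, required KSAs that the student does not possess."""
    return [k for k in ksas
            if k.required and not k.focal and not capabilities.get(k.name, False)]


# Assumed profile: every KSA is possessed except table access in JAWS (row 8).
profile = {k.name: True for k in TABLE_B1}
profile["Know how to use JAWS to access tables"] = False

for k in accessibility_barriers(TABLE_B1, profile):
    print(f"Potential barrier at row {k.row}: {k.name}")
```

A lacking focal KSA (row 11) is deliberately not flagged, since low performance on the construct itself is what the assessment is meant to detect rather than a barrier to remove.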

End note:

eTouchSciences. (2015). Get in touch with science and math! Retrieved from www.etouchsciences.com/ets