American Action Fund for Blind Children and Adults
Future Reflections Special Issue: The Individualized Education Plan (IEP)
ACCESS
by Chancey Fleet
Reprinted from Braille Monitor, Volume 60, Number 7, July 2017
From the Editor: Most of us who are blind learn an assortment of nonvisual techniques for handling our everyday tasks. Through our creative uses of hearing, touch, and common sense, we often explore realms that challenge society's expectations. Nevertheless, there are times when information from someone who can see is very helpful. Until recently, our only options were to ask for assistance from a sighted family member, friend, or stranger, or to secure help from a volunteer or paid reader. Technology now offers some promising new possibilities, which Chancey Fleet calls visual interpretation. Chancey is a technology educator with the New York Public Library system.
If you're a blind person with a smartphone or tablet, you can use it to get visual information on demand. This genre of service is relatively new and can go by many names. You might hear it called remote visual assistance or crowdsourced vision. Personally, I prefer the phrase "visual interpretation" because it precisely names the process of turning visual information into something more useful, and because the concept of an interpreter is familiar to people in many walks of life.
Working with a remote visual interpreter can be liberating. You decide what your interpreter can see, when the interaction begins and ends, and whether you need a second (or third, or tenth) opinion. A virtual interpreter can't touch anything in your environment, so you can't be tempted or pressured to abandon a task that is "too visual" to someone else's hands. Remote visual interpretation can be an empowering option when you'd like to limit the extent of your interactions with the public or avoid turning friends and colleagues into de facto describers. It can be a great help when no one is available to give you the information you need.
A variety of apps provide remote visual interpretation. Although they vary in price and functionality, and although there are differences in whose eyes are on the other end of the connection, there are some things you'll want to consider when you use any of them.
A note about headphones: these apps are conversations carried on through your phone, so a comfortable pair of headphones can keep descriptions audible to you and private from everyone around you. Whatever you do, read some online reviews before you buy, or take a friend's headphones for a test drive. People tend to feel deeply about their audio gear, and no one choice is right for everybody.
TapTapSee
How it works: Snap a picture or upload one from your camera roll, and a combination of machine vision and crowdsourced web workers will send you a quick description. Typically, your answer arrives within twenty seconds and is short enough to fit on a fortune cookie.
When it shines: For the simple things. TapTapSee is great at identifying products and describing photos in brief. I use it on a daily basis to sort and label mystery items in my home and office, get real-time feedback about the photos I'm taking, and double-check that my pen has ink and my handwriting is legible. TapTapSee descriptions are text-based messages that can be read with magnification, speech, or Braille.
BeSpecular
How it works: Take one or more pictures, or upload them from your camera roll. Type or record a question, and listen for text and audio replies to come rolling in from sighted volunteers over the course of twenty minutes or so.
When it shines: For rich detail, diverse opinions, and a nuanced understanding of what different people notice when they look at an image. I use BeSpecular to ask for detailed descriptions of clothing and jewelry, ideas about what to wear with what, guidance in picking the "best" photo from a set, and impressions of photos and objects that are important to me. Once I've heard five or six different takes on the same image and question, I can find the patterns of consensus and divergence among the responses and arrive at my own informed understanding of the image. BeSpecular finds a happy medium between the brevity of TapTapSee and the live connection used by other apps.
There's something special about BeSpecular's format of long-form questions and answers. Outside the rhythm of a live conversation, BeSpecular answers feel almost like postcards from a sighted correspondent passing briefly through your life. They're often full of detail, personality, and emotions such as surprise and humor. Once, while delayed on a train at Union Station in Washington, DC, I asked BeSpecular to relieve my boredom by describing the scene outside my window. One respondent sent me an audio reply that explained, in a tone that was equal parts delighted and chagrined, that I had unfortunately sent her the most boring view she had ever seen. It was one train car, an empty John Deere forklift, and a cloudy sky.
BeMyEyes
How it works: Connect to a sighted volunteer who speaks your language and have a conversation about what he or she sees through the lens of your camera.
When it shines: For exploring, sorting, and troubleshooting. Every time I arrive at a new hotel, I check in with BeMyEyes to take the decaf coffee pods out of play, sort the identical little bottles in the bathroom, and learn the thermostat and media controls. I also use it to find out which food trucks are parked on the streets near my office, decipher mystery messages on computer screens, and grab what I need from my local bodega. Since BeMyEyes is powered by volunteers, I try to make the interaction upbeat and fun and let the person I'm working with decide whether they'd like to bow out of a long task after a certain amount of time. There are just over half a million sighted volunteers and about 35,000 blind users currently registered with the service, so you can call as often as you like without fear of bothering the same person over and over. The system will always connect you to someone for whom it is a reasonable hour, so Americans calling late at night or early in the morning will be connected to wide-awake people in Europe and Australia. Since the volunteer base is so large, you're likely to get through to someone quickly, even when lots of other blind users are connecting.
Aira
How to pronounce it: It's a hard I, so pronounce it as "Ira."
How it works: Use your phone's camera or a Google Glass wearable camera to connect with a live agent. Agents can access the view from your camera, your location on Google Maps, the internet at large, and your "Dashboard," which contains any additional information you'd like placed there.
When it shines: For tasks that are long, context-dependent, or complex. An Aira agent can start from any address, use Google Streetview to find a nearby restaurant, glance at online photos to clue you in to whether it's upscale or casual, suggest and explain the best walking directions to get there, read the daily specials when you arrive, and show you where to sign and tip on the check when you're ready to leave. Agents have watched and described completely silent YouTube videos with me so that I could learn origami models, counted heads in my local NFB chapter meeting, described twenty minutes of nothing but socks until I found the perfect sock souvenir, read online guitar tabs for me so I could write them down in my own notation, helped me pick out nail polish, and taken spectacular photos through my camera for my personal and professional social media accounts. Aira agents are great at reading handwriting, diagrams, and product manuals that seem to have as many pictures and icons as words. When I can't read something with OCR (optical character recognition), Aira can almost always help.
Aira agents are paid, trained professionals. Most of them are unflappable, effective describers who are up for any challenge. Since you pay for their time, you should feel comfortable about asking for what you need, being assertive about the type of descriptive language that works for you, and calling whenever the need arises.
Like any new technology, remote visual interpretation solves old problems and creates new ones. To use it well, we need to understand what it requires in terms of power, data, planning, and effective communication. We must employ it with sensitivity to our own privacy and to the legitimate concerns that people sharing space with us may have about cameras. Just as each of us makes different decisions about when and how to use a screen reader, the descriptive track of a movie, or a sighted assistant in daily life, each of us will have our own ideas and preferences about how visual interpreters fit into our lives. Blind and sighted people working together are just beginning to discover how to use language, software, and hardware in ways that employ visual interpretation to our best advantage. Collectively, we still have a lot to learn. The journey is long, but the view is phenomenal.