Today's Artificial Intelligence: Our Future for Reading and Writing STEM?

Braille Monitor December 2024

by Al Maneki

From the Editor: Whether you are fascinated by artificial intelligence, skeptical of the claims made about it, or somewhere in between, this article is for you. If you are one of us who has been frustrated by the inability to get mathematics textbooks produced in Braille and competent helpers to read and write what you need to excel in this field, the article you are about to read will speak directly to your frustration, offer some suggestions, and provide a good bit of hope. If you enjoy reading this magazine because it occasionally stretches your mind and takes you to places you’ve never been before, you have a treat ahead of you.

Al Maneki has one of the finest minds I know. He is generous with his time and gladly offers his significant intellect in helping us move forward when it comes to science, technology, engineering, and math. It is obvious from his contributions and accomplishments that he has made a number of great decisions in his life, but the one I think I admire most is his decision to marry Sharon Kelly. I understand she had a part in that decision as well, but wow, what a dynamic duo! Here is Dr. Maneki’s article:

The author acknowledges the valuable contributions of David Austin, Rob Beezer, Michael Cantino, David Farmer, Karen Herstein, Alexei Kolesnikov, Martha Siegel, and Volker Sorge to this article. They have read several drafts, including the final version of this article. They have offered many valuable comments and suggestions for improvements. As always, the author assumes all responsibility for errors and oversights. For comments and questions, please email me: [email protected]. -Al Maneki

Introduction

The public media have lately devoted much bandwidth to the growing uses of Artificial Intelligence (AI) and its impact on our daily lives. We have heard from the boosters who claim that AI will dramatically reduce the drudgery of human existence by eliminating the most boring tasks. We have also heard from the nay-sayers who claim that this time is different: “Let’s not go there.” They worry that this time AI will eliminate so many jobs, even those requiring intellectual abilities, that our livelihoods will go to ruin, leaving us with little purpose in life. These pessimists maintain that the job growth stimulated by AI will not be sufficient to compensate for the job losses created by it. The jury is still out on this. Along with the benefits that AI has already produced, we have also seen its abuses, e.g., the production of videos, for nefarious purposes, showing perfect replicas of prominent personalities espousing points of view that are contrary to their well-publicized value systems.

AI has been around for many years. I took a course in AI over thirty years ago. The textbook for this course was Artificial Intelligence by Patrick Henry Winston, now available as a free PDF download from https://courses.csail.mit.edu/6.034f/ai3/rest.pdf. The AI of today is much more robust than the AI of thirty years ago. Computing hardware today is cheaper, faster, and smaller. Advances in AI’s computing hardware are a prime example of Moore’s Law, which futurist Ray Kurzweil has discussed with us in any number of speeches he has delivered before our annual NFB conventions. Moore’s Law states that “the number of transistors on an integrated circuit will double every two years with minimal rise in cost.” Since I studied from Winston’s book, AI has benefited from further advances in machine learning. For example, Alexei Kolesnikov, one of the automated Nemeth translation collaborators, used Google’s NotebookLM to produce a podcast based on an earlier draft of this article, available at https://wp.towson.edu/akolesni/files/2024/10/Podcast-about-Al.wav, using only the draft’s Word file and no input from humans. Advances in AI have also been stimulated by spectacular advances in neural networks, that is, models inspired by the structure and function of biological neural networks in animal brains. According to the popular science writer David Berlinski in his book The Advent of the Algorithm (Library of Congress Braille edition, BR13263), the concept of the algorithm, an essential component of AI, has its humble beginnings in the ancient Greek writings of Aristotle.
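To put the doubling rule quoted above into numbers, here is a minimal sketch of the arithmetic. The function name and the thirty-year span are my own illustrative choices, and real hardware has not followed the rule exactly, so treat the result as an idealized figure only.

```python
def moores_law_factor(years: float, doubling_period: float = 2.0) -> float:
    """Idealized growth factor: one doubling every `doubling_period` years."""
    return 2 ** (years / doubling_period)


# Thirty years at one doubling every two years is fifteen doublings.
factor = moores_law_factor(30)
print(f"{factor:,.0f}")  # prints 32,768
```

Under this idealized reading, a chip today would hold roughly thirty thousand times as many transistors as one from thirty years ago.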

In the remainder of this article, I will use the terms “AI”, “AI algorithm” and “AI device” interchangeably. I also want to familiarize readers of NFB’s publications with some of AI's terminology and how AI works generally. Then, based on my limited experiences with AI, I present my views on AI’s possible impact on how blind people can benefit from it in the areas of learning and doing STEM.

A Bit of Background

Although the term “AI” has only recently appeared in the public discourse, the NFB has been involved in AI research before this term came into vogue. In 1975, I attended my first national convention in Chicago. During one of the general sessions, Dr. Kenneth Jernigan introduced us to Ray Kurzweil to talk about his remarkable reading machine. Dr. Jernigan opined that Kurzweil’s reading machine would expand the availability of reading materials to us. The first Kurzweil Reader was a floor model console, too heavy to be moved by one person. Unfortunately, Dr. Jernigan did not live long enough to see the day when the KNFB Reader, much more powerful than the first Kurzweil Reading Machine, was tiny enough to be loaded as an app on our smart phones.

In more recent history, the NFB has sponsored the Blind Driver Challenge, the initiative to develop a vehicle which a blind driver could operate independently. As a result, in 2011, Mark Riccobono independently drove a prototype through an obstacle course on the Daytona International Speedway. This test vehicle was developed by an engineering team from Virginia Tech University. It should be noted here that obstacles on the course were laid down after Mark began to drive this vehicle. These two examples now fit neatly into a branch of AI known as machine vision.

It is easy to imagine other ways in which machine vision can help blind people. For example, if we are in an unfamiliar building, an AI could, using a map of that building, direct us to a specific location. If we must get to a certain floor in that building, the AI could direct us to the elevators, or even direct our hands to the key panel to call for that elevator. Applications of machine vision to serve as visual aids may already be under development. However, this is not the area of AI I want to discuss.

Let us remind ourselves again that many technologies have been previously oversold to us. In this case, AI will be no different. Regardless of the technology, there will always be tasks and functions we can perform more efficiently with alternative techniques. As technology changes, however, we may have to adapt some of these techniques to work with new devices and new modes of thinking. While it is difficult to see exactly what impact AI will have on our lives, we should always examine the offerings of AI carefully. We should not hesitate to call out its flimflams when the promoters claim that what they have to offer is the next wonder drug or the greatest invention since sliced bread. At the same time, we must also recognize the benefits of AI when truly innovative ideas are brought before us for consideration. As in the past, we, the organized blind, will work closely with the innovators and inventors who best understand our needs.

STEM activities may be broken down into two parts: learning the material by reading textbooks, lecture notes, and research papers; and disseminating our work in Braille or print. In this article, I will pay special attention to the uses of AI for the translation of spoken mathematics into Nemeth Braille. There will be an obvious spillover into other STEM subjects because math is involved in virtually every aspect of STEM.

With AAF (American Action Fund for Blind Children and Adults) funding, the work I have been conducting with my academic colleagues has focused on the automated translation of PreTeXt-specified math content into either Nemeth Braille or synthetic speech. An additional NSF grant has allowed us to make improvements in the automated process of translating graphics from print to tactile form.

Moving in the reverse direction, given a document in Nemeth Braille, we may wish to have it produced in a printed format, enabling us to communicate our ideas and results with sighted readers. To date, no research has been done in this direction of information flow. This is where I think AI comes into the picture. More on this later. There is also the challenge of converting a verbal description of a mathematical diagram into embossed or printed formats. This poses an even greater challenge. But if you believe the pronouncements of AI’s most enthusiastic advocates, all things are possible with AI.

The two essential components of AI are the algorithm, a mechanical procedure for arriving at the most probable conclusions or deductions based on a given set of data, and the computing hardware that is necessary to make the calculations required by the AI’s algorithms.

In order to arrive at correct conclusions, i.e. the actions taken by human subjects given that set of data, the best algorithms require enormous quantities of data/response pairs that have been accumulated. We may think that it is an easy task to recognize the voice of a specific individual or to identify the face of a particular person in a huge crowd. But underlying these tasks is an enormous quantity of work performed by our brains to make these correct judgments. To duplicate these human mental tasks electronically requires an enormous amount of computer power. Today, our best computers can barely approach human mental capacity. However, with the development of modern microchips, i.e. the integrated circuits that are packed into minute pieces of silicon, we are better able to process these massive quantities of data/response pairs. We may think of these integrated circuits as direct translations of lines of computer code specified by the AI algorithms. Thus, the millions of calculations required by these algorithms can be computed in microseconds.

But What Can It Do for Us?

When I think of what AI could do for us in STEM, I think of my own experiences using human readers. My entire math career has involved the use of readers one way or another. I’ve had many readers over the years, some better than others. The best readers were with me for longer periods of time. When every symbol (comma, dot, left parenthesis, left bracket, etc.) must be uttered, the transmission from written to spoken math or vice versa can be extremely time-consuming and tedious. But, given time and experience, the rapport that developed between me and my reader could, in most instances, enable us to dispense with all of this mathematical verbiage and communicate the exact context entirely from the manner of speaking. For the most part, we tend to speak quite consistently in terms of pauses and inflections of voice. The direction of speaking went both ways. When a textbook or research article was being read, I was the listener. When I was dictating a homework assignment, course or seminar lecture, or research paper, my reader was the listener. No matter who the speaker was, the listener deduced the exact mathematical context from the consistent manner in which the pauses and inflections were employed.

As an example of inexact verbiage, the phrase “a slash b plus c” could be interpreted as either a/(b+c) or (a/b)+c. With experience, I could understand which was meant (depending on how my reader read it), or my reader could understand it (depending on how I said it). There are numerous examples of this type of ambiguity in spoken math. With sufficiently many samples of how an individual speaks math compared with the correct written expressions of those spoken samples, an AI algorithm could “learn” how to interpret your spoken math.
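The two readings of “a slash b plus c” can be made concrete with a small sketch. The pause rule below is a hypothetical stand-in that I have hard-coded for illustration; an actual AI algorithm would have to learn such cues from samples of an individual's speech, and the function name is likewise invented.

```python
from fractions import Fraction


def read_a_slash_b_plus_c(a: int, b: int, c: int, pause_after_slash: bool) -> Fraction:
    """Toy disambiguation of the spoken phrase "a slash b plus c".

    Assumption (not a learned model): a pause right after "slash" means
    "b plus c" is spoken as one unit, the denominator. No pause there
    means the fraction a/b is complete before "plus c" is added.
    """
    if pause_after_slash:
        return Fraction(a, b + c)  # a/(b+c)
    return Fraction(a, b) + c      # (a/b)+c


grouped = read_a_slash_b_plus_c(6, 2, 1, pause_after_slash=True)
ungrouped = read_a_slash_b_plus_c(6, 2, 1, pause_after_slash=False)
print(grouped, ungrouped)  # prints 2 4
```

With a = 6, b = 2, and c = 1, the same spoken words yield 2 under one reading and 4 under the other, which is why the listener's grasp of pauses and inflections matters so much.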

There are cases where a spoken expression bears absolutely no resemblance to what is written. The instance of this which immediately comes to my mind is that of the binomial coefficient, a staple in many required undergraduate courses. The binomial coefficient is represented by a column of two positive integer variables, n and k, with k no greater than n, in which n is written above k with elongated parentheses surrounding the column formed by these two variables. Here, the binomial coefficient is defined as n!/((n-k)! k!), where n! represents the product of integers from 1 to n, (n-k)! represents the product of integers from 1 to n-k, and k! represents the product of integers from 1 to k. When we refer to the binomial coefficient of n and k in speech, we could say “the binomial coefficient of n and k” or “n choose k” or “n C k”. (The use of the word “choose” here refers to the fact that there are exactly “n choose k” ways in which a subset of k objects can be chosen from a set of n objects.) The AI algorithm should contain instructions to recognize any of these three spoken forms as the column of n and k described above.
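As a rough sketch of what recognizing these spoken forms might look like, the snippet below lists the three phrases, maps them to a single written form, and checks the factorial definition on a small case. The template strings and function names are hypothetical, and LaTeX's \binom command is used here only as one convenient way to write the column of n over k.

```python
from math import comb, factorial

# Hypothetical spoken templates; a real system would learn or configure these.
SPOKEN_FORMS = (
    "the binomial coefficient of {n} and {k}",
    "{n} choose {k}",
    "{n} C {k}",
)


def spoken_variants(n: str, k: str) -> list[str]:
    """All three spoken phrases for the same written symbol."""
    return [form.format(n=n, k=k) for form in SPOKEN_FORMS]


def to_latex(n: str, k: str) -> str:
    """One written target for every spoken variant: n above k in parentheses."""
    return rf"\binom{{{n}}}{{{k}}}"


def binomial(n: int, k: int) -> int:
    """The definition from the text: n!/((n-k)! k!)."""
    return factorial(n) // (factorial(n - k) * factorial(k))


print(spoken_variants("n", "k"))
print(to_latex("n", "k"))          # prints \binom{n}{k}
print(binomial(5, 2), comb(5, 2))  # prints 10 10
```

For example, there are 10 ways to choose 2 objects from a set of 5, and Python's built-in math.comb agrees with the factorial formula.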

Word processors and software editors are equipped with compilers which enable them to recognize spelling and syntax errors. When a compiler detects such an error, it offers the user a range of choices to correct this error. It also enables the user to instruct the compiler to ignore the error in this case. In a similar vein, a UEB or Nemeth compiler could be developed and installed on refreshable Braille displays to aid users with suggestions for correct code usage.

AI software is not needed if all we want is a tool which aids the user in typing correct UEB/Nemeth Code. Here, the UEB/Nemeth compiler is sufficient. However, AI will come into the picture if we are ever to produce UEB/Nemeth Code on our Braille displays directly from speech. In this case, a UEB/Nemeth compiler is an absolute prerequisite if an AI algorithm is to be written to produce UEB/Nemeth Code from human speech. If the produced Braille code is not consistent with what the speaker wants, the compiler will be the means through which the user communicates the corrected code. A UEB/Nemeth compiler could serve as the conduit through which an AI algorithm “learns” the correct interpretation of spoken text.

The typical math document that I, or anyone else, dictates to an AI will consist of a mixture of UEB and Nemeth output. It would be most desirable if the AI were smart enough to know when UEB was to be used and when Nemeth Braille was to be used. Short of this capability, we should have a switch on our Braille displays to set the AI in UEB mode or in Nemeth mode, depending on what is needed.

Just as a word processor still requires the user to have a knowledge of the rules of English grammar, an AI algorithm would still require its users to have a command of the UEB and the Nemeth Code. Without this knowledge a user is totally dependent on what the AI recommends, a most unsatisfactory situation.

When I was doing mathematics in graduate school and at my job, in the interest of saving time, I developed my own shorthand Nemeth, ignoring the rules for exact usage according to context. After all, I knew what I was writing about, so the context was always clear to me. I would then dictate a math document to my human reader who would write my spoken material into perfect printed notation. It seems to me that the ideal Nemeth AI algorithm would work in the same way. As I am reading from my Braille notes, the algorithm would translate it into perfect Nemeth Braille code. Since the rules of UEB and Nemeth Braille are precise and since the printed math notations are precise, AI should not be required to translate from Braille to print, or vice versa. As far as I know, we still do not have software for reverse Braille to print translation.

If you have taken a number of math courses, you have probably endured the professor who has lectured by speaking minimally and writing minimally. He would often say something, then point to this or that item on his blackboard, and simultaneously say something like “From this (pointing) and that (pointing), we conclude that …,” or he would write a conclusion that seemed to have no resemblance to what he had previously written or said. Such antics would often leave even the sighted members of the class befuddled. An AI, possibly installed on our smart phones, could combine spoken and blackboard materials into a more comprehensible form that would benefit everyone.

Also, think of the ways in which a math AI could assist in test-taking. Suppose there wasn’t sufficient time to produce a test in Braille, large print, or spoken form. University accessibility support services hesitate to let us use our own readers. The readers they provide don’t always know how to read the math content. How much simpler it would be for us and for the DSS offices if there were a math AI to read test questions to us and take dictation of our answers if needed.

Perchance to Dream…

The problem with reading math or building OCR software for math is that math is not consistently read linearly. There are times when within a line you must read vertically (think of subscripts and superscripts, limits of integration, or binomial coefficients). In our work on automated Nemeth translation, we have evaded this problem by extending the PreTeXt authoring language to specify items for Nemeth translation and UEB. It may be possible to build a neural network capable of parsing a page of printed math and reconstructing it for tactile or spoken formats.
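To illustrate why vertical layout is a problem, here is a toy sketch in which a definite integral is stored as a tree, with its limits and exponent as separate branches, and then flattened into a single spoken line. This is not PreTeXt and not the method used in our Nemeth translation work; the class names and the spoken wording are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sym:
    """A plain symbol or number, read as written."""
    text: str


@dataclass
class Power:
    """A base with a superscript, which sits above the line in print."""
    base: object
    exponent: object


@dataclass
class Integral:
    """A definite integral; the limits are stacked vertically in print."""
    lower: object
    upper: object
    integrand: object
    variable: str


def speak(node) -> str:
    """Flatten the two-dimensional structure into one linear phrase."""
    if isinstance(node, Sym):
        return node.text
    if isinstance(node, Power):
        return f"{speak(node.base)} to the power {speak(node.exponent)}"
    if isinstance(node, Integral):
        return (f"the integral from {speak(node.lower)} to {speak(node.upper)} "
                f"of {speak(node.integrand)} d {node.variable}")
    raise TypeError(f"unsupported node: {node!r}")


expr = Integral(Sym("0"), Sym("1"), Power(Sym("x"), Sym("2")), "x")
print(speak(expr))  # prints: the integral from 0 to 1 of x to the power 2 d x
```

The hard part for OCR is the step this sketch skips: recovering such a tree from the positions of marks on a printed page. Once the structure is known, producing a linear spoken or tactile form is comparatively mechanical.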

Even if the math AI that I have suggested were to be built, I hope that such an AI would never dispense with our need for human readers. The value of personal contact and working relations should never be discounted. The reason I was so successful in getting classmates to read was that it afforded us time to study together, learn by asking questions of each other, and share what we had learned. The use of readers also gave me experiences that have carried over into daily as well as professional activities. I developed the confidence to sell myself to potential readers by explaining how beneficial it would be for both of us. I learned how to schedule my study time efficiently, how to adjust to the schedules of others, and how to plan the work I needed my readers to do in a limited amount of time.

Given the state of AI algorithms and hardware today, the AI that I have described for translation from human speech to Braille/print is achievable. But it is unreasonable to expect commercial adaptive technology vendors to undertake the massive research and development efforts that are needed to put this kind of AI application together. The number of Nemeth readers is just a small fraction of UEB readers. What is needed here is a massive collaborative effort between the organized blind movement, the universities, the science and math organizations, and the adaptive technology vendors. Before we can begin this collaborative effort, we, the organized blind, must have a clear and unified understanding of the AI products that we want. This article is just the first step in coming to this understanding. Others will have different ideas that need to be considered. Once we know exactly what we want, then we will be in a strong position to promote our ideas, recruit the talent that is needed, and secure the needed funding. This effort will require much more than the generous funding that AAF has previously given us. Obviously, the talent we have gathered around us for automated Nemeth translation will not be sufficient. Professionals with other skill sets will have to be recruited. However, we are off to a strong start with the team we currently have in place. We have established strong working relationships with the academic and government sectors. We need to extend these relationships.

I don’t have the foresight possessed by futurists on the order of Ray Kurzweil. But I remain convinced that given the language recognition possessed by today’s smartphones and smart speakers, what we want is well within the realm of possibility. I’m not suggesting that my ideas for the future of AI in STEM are entirely correct, but I hope that this article will stimulate others into thinking about what AI could do for us and bringing their ideas to the table. Perhaps the brighter souls among us could even take part in writing the code for some of these AI applications. After sixty years of learning and doing math, and watching all of the technological developments, I find myself on the side of the boosters, at least in the area of AI applications to help us with STEM. The broad goals we set now may only be accomplished incrementally. But let us never lose sight of what we are after. Let the future of AI for us begin, now!
