Implications of Digital Talking Books and Beyond

by George Kerscher

From the Editor: George Kerscher is a research fellow with Recording for the Blind and Dyslexic and the product manager for the DAISY Consortium.

Mr. Chong asked me to talk about the problems associated with using publisher files. He also asked me to go through what DAISY is and what it's about, and also tell you about the open-electronic-book technology that is evolving. He said to do this in twenty minutes. I'm going to move really rapidly, and I apologize for my machine-gun presentation, but there is a lot of content, and I think Michael [Gosse] is going to convey the kinds of information that you as leaders in your organizations need to know in order to make strategic decisions.

First of all, talking about the publishers and publisher files, which we have been using to some extent over the past dozen years or so: When you ask a publisher for the files, they usually have PostScript available because their product is print. They're print publishers; that's what they do, that's what they know how to do. That's their area of expertise. In order to get something more useful, you have to go upstream in the publishing process to what created those PostScript files. In the K-twelve arena, for example, it's Quark predominantly--70 percent--and in higher education it begins to diversify, and you get all kinds of different file types.

It's getting better, but the publishers may not even have those files to begin with. They may subcontract that composition out. They may give it to typesetters. They may not think that that is very important to them because they've got the PostScript files, which are their bread and butter. So in many cases they don't have the files at all, but that's getting better because the publishers are beginning to wake up to the fact that these files may be important.

In general, the publishing field is not the computer and technology field. They're not necessarily on the cutting edge of technology every minute of the day and night. They established procedures that they use, and that's exactly what they do. So availability of the files is the problem.

Of course the file types vary even within a particular set. If we get Quark files, we have to keep track of what version of Quark was used to produce them, whether it was Mac or Windows, and the macros and other styles are different from publisher to publisher or even within the same publishing house, from department to department. So there's a wide variety in the files.

The types of files that we get are heavy with visual markup and visual presentation: all kinds of information about where things are positioned on the page, what font it is, whether it is bold--all that kind of information. Rarely is the information marked up with any kind of semantically important information. We don't know what things are. We don't know what's a paragraph. We don't know what's a heading. We know all kinds of visual things, but it's not marked up semantically with meaning. So when we get these files, we have to convert them, and it usually takes a fairly skilled person, a programmer, by and large, to have the first go at the files to do some of the analysis and try to figure out what to do. A lot of work is involved with technical people or trained technicians to convert these files into something useful.
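To make the gap between visual and semantic markup concrete, here is a minimal Python sketch of the kind of first-pass analysis a programmer might attempt. Everything in it is an assumption for illustration: the `Run` record, the size thresholds, and the output tags are hypothetical and do not describe any real publisher file format or any tool we actually use.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One run of text as a layout-oriented file might describe it (hypothetical)."""
    text: str
    font_size: float
    bold: bool


def guess_role(run, body_size=10.0):
    """Guess a semantic role from purely visual properties. This is a heuristic."""
    if run.bold and run.font_size >= body_size * 1.5:
        return "h1"
    if run.bold and run.font_size > body_size:
        return "h2"
    # Everything else is treated as a paragraph, which is often wrong.
    return "p"


def tag_runs(runs):
    """Pair each run's guessed role with its text."""
    return [(guess_role(r), r.text) for r in runs]


runs = [
    Run("Chapter 1", 18.0, True),
    Run("Cell Structure", 12.0, True),
    Run("Cells are the basic unit of life.", 10.0, False),
    Run("Warning:", 10.0, True),  # bold body text: the heuristic cannot tell what this is
]

for role, text in tag_runs(runs):
    print(f"<{role}>{text}</{role}>")
```

The last run shows why this work needs a person: bold body-sized text could be a warning, a defined term, or a run-in heading, and nothing in the visual information says which.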

That's the experience that we've had, and it is time-consuming. These are the same kinds of questions that the publishers have had to deal with when they make files available, for example, to the state of Texas.

I'm going to move on to the DAISY Consortium. The DAISY Consortium is made up of libraries and organizations that provide information to people who are blind and print-disabled throughout the world. The major organizations, the full members, include Recording for the Blind and Dyslexic in the United States and RNIB in the UK, and we have full members in Sweden, Denmark, Germany, Switzerland, Australia, New Zealand, O.N.C.E. in Spain, and in Japan the Japanese Society for Rehabilitation of Persons with Disabilities. In addition, we have associate members all over the world, on every continent, except Antarctica. We have about twenty-four associate members. In the United States and Canada they are CNIB, who's been very active in working on the development of the training materials and the guidelines, a very active group--love them; INLB in Montreal recently joined; also APH; the Hadley School for the Blind; Arkenstone; and at the beginning of the year Minnesota State Services for the Blind.

DAISY stands for Digital Audio-Based Information System. Originally it started to develop a replacement for the existing analog technology. But that's changed a bit, though the name has stayed the same. The mission of the DAISY Consortium is to develop the information technology for the next generation for people with disabilities. It also intends to develop ways of sharing the information with libraries around the world. Four-track has been dominant in North America, but other countries have used six-track, two-track, and half-speed two-track. All kinds of different formats have been used, and that difference in format has prevented inter-library loans between the various organizations. So we're striking out to accomplish that mission.

What we've been developing is actually a multi-media presentation because audio by itself has no structure to it. So we use text to add the structure and navigation points to the information. What we've actually got is marked-up files, tagged files, using HTML. We are also using XML and developing all new technologies in XML. As you may know, HTML--the version 4.0 that we have right now--is the last HTML. From here on out it's going to be XML because it's much easier, much simpler to process. It's a much better system. Actually XML is a simplification of SGML, which has been around for a long time but just too bulky and cumbersome to use. So XML is it; XML is good.

It also uses a DTD, a document type definition, which identifies the components of the document. We have paragraphs and headings and list items, things like that. The other component, in addition to the text marked up in XML, is SMIL (Synchronized Multimedia Integration Language). This is a standard of the W3C [World Wide Web Consortium]. So throughout, what DAISY has been using is international standards based primarily on W3C recommendations in order to develop its own specifications.
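As a rough illustration of how a SMIL file can tie marked-up text to audio, the Python sketch below parses a small SMIL-like fragment and builds a table from text fragment identifiers to audio clip times. The sample is simplified and is an assumption on my part: the file names, the identifiers, and the exact attribute spelling (`clip-begin`, `clip-end`, and the `npt=` time prefix) are illustrative and should not be read as the actual layout of a DAISY book.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical SMIL-like sample: each <par> plays one text
# fragment together with one stretch of an audio file.
SMIL_SAMPLE = """
<smil>
  <body>
    <seq>
      <par>
        <text src="book.html#h1"/>
        <audio src="chapter1.wav" clip-begin="npt=0.0s" clip-end="npt=2.5s"/>
      </par>
      <par>
        <text src="book.html#p1"/>
        <audio src="chapter1.wav" clip-begin="npt=2.5s" clip-end="npt=9.0s"/>
      </par>
    </seq>
  </body>
</smil>
"""


def parse_clock(value):
    """Turn a value such as 'npt=2.5s' into a number of seconds."""
    return float(value.split("=", 1)[-1].rstrip("s"))


def build_sync_map(smil_text):
    """Map each text fragment id to (audio file, start seconds, end seconds)."""
    root = ET.fromstring(smil_text.strip())
    sync = {}
    for par in root.iter("par"):
        text = par.find("text")
        audio = par.find("audio")
        if text is None or audio is None:
            continue  # skip groups that do not pair text with audio
        _, _, fragment = text.get("src").partition("#")
        sync[fragment] = (
            audio.get("src"),
            parse_clock(audio.get("clip-begin")),
            parse_clock(audio.get("clip-end")),
        )
    return sync


sync_map = build_sync_map(SMIL_SAMPLE)
for fragment, (audio_file, start, end) in sync_map.items():
    print(fragment, audio_file, start, end)
```

With a table like this, a player that knows which heading or paragraph the reader selected can look up where in the recording to start playing.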

We have come up with six categories of books. One is very simple with just a title. It is not what most of us would prefer. Probably one of the most common types of books will be a book that has a full table of contents, page numbers, and the full audio recording so that you can navigate through the digital Talking Book or go to any page, chapter, section, or subsection and move around very quickly to get to what you want.
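The navigation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `NAV` table, its titles, page numbers, and audio offsets are invented, and a real Digital Talking Book stores this information in its own files rather than in a list like this.

```python
from bisect import bisect_right

# Hypothetical navigation table, sorted by starting page:
# (start_page, heading_level, title, audio_offset_seconds)
NAV = [
    (1, 1, "Chapter 1", 0.0),
    (4, 2, "Section 1.1", 310.0),
    (9, 2, "Section 1.2", 742.5),
    (15, 1, "Chapter 2", 1290.0),
]


def section_for_page(nav, page):
    """Return the entry for the section that contains the given page."""
    starts = [entry[0] for entry in nav]
    index = bisect_right(starts, page) - 1
    if index < 0:
        raise ValueError(f"page {page} is before the first section")
    return nav[index]


def next_heading(nav, index, level):
    """Return the index of the next entry at this level or above, or None."""
    for j in range(index + 1, len(nav)):
        if nav[j][1] <= level:
            return j
    return None


# Go to page 10, then skip ahead to the next chapter-level heading.
entry = section_for_page(NAV, 10)
print(entry[2], "starts at", entry[3], "seconds")
chapter_index = next_heading(NAV, 0, 1)
print("Next chapter:", NAV[chapter_index][2])
```

The point of the structure is that moving by page, section, or chapter becomes a lookup rather than a search through the tape.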

The other types of books include full text. The ideal, of course, is to have full text and full audio, and that particular multi-media would provide everything that has been recommended by the work groups--Curtis Chong headed up a work group which identified the user requirements. It's the full-text and full-audio category that meets the user requirements laid out by that body and by a similar body in Europe established by the European Blind Union. That's the ideal.

Also, once you've got the full text, the Braille is possible, and Joe is salivating. Joe Sullivan is ready to go, and we're really excited about his efforts and work in taking input of these types of files and outputting great Braille.

The last type of book is just text. There is no audio associated with it. You would be using synthetic speech, refreshable Braille, large print, or a combination of those things to read that type of book.

The standards that DAISY has developed so far--2.0 was a recommendation in September of '98, and that provides for headings, page numbers, and audio; but that's the extent of it. Two point zero one, which was just passed a couple of weeks ago in September, adds additional functionality, including full text in XML, and it includes footnotes, sidebars, and figure descriptions that can be turned off as an option by the reader. Gilles [Pépin] is going to be doing a section later on the players and reading-devices side of things, so I will leave all that for him.

Of course one of the most important things that is going on right now in standards is the National-Information-Standards-Organization (NISO) work that NLS initiated with Digital Talking Books, bringing together the key players in North America who are involved with Digital Talking Books and developing the specifications. DAISY was invited to participate in this work, and there's been an absolutely fantastic hand-in-glove cooperation between all the organizations to make sure that the standards developed by NISO lead to a single world-wide standard. I would expect that the DAISY specifications that come out in the future will actually be NISO specifications and recommendations. So you can rest assured that all the players intend to come up with one worldwide standard and not a divergent group. It's been great working with everybody on that team. Michael Moody in particular has been a fabulous person to work with.

The DAISY Consortium is developing production tools. We will very soon be coming out with our first release. It's in final beta. We're about a month away from releasing something called LpStudio/Pro. It will be a full-fledged production tool, beginning to end, for digital talking books. It's been a year and a half in development, and most of the money contributed by the members has gone toward that software development. We're also working on A to D conversions (analog to digital) in which we digitize existing tapes. The hard part of that is going back and adding the structure so that students have the kind of navigation they want in their textbooks.

The distribution mechanism that we're looking at right now is CD ROM. It's very common, and we think that's going to be around for I don't know how long, but CD ROM distribution is something that is here right now and that we're doing. We know that RFB&D, for example, will have to support the cassette for some time. So the CD ROM and the cassettes will be in parallel for a certain period of time. It's a real pain to have to do that. But it's something that's inevitable when you've got an infrastructure as strong as we've got in the four-track system in North America.

Of course long term the goal is to have Internet distribution, either streaming this information or downloading it to solid-state devices. Everybody wants to get there as quickly as possible, and as the bandwidth improves, we'll move in that direction. Because we're using Internet-based standards and technology, that kind of transition is really built in. Some people are saying, "What are we going to do: are we going to distribute on CD ROM? or wait?" But the digital technology allows for the migration to any type of distribution mechanism that is right for a particular organization.

Moving on to Open Electronic Book: who are the players here? We've got Microsoft, who is very interested in electronic books; manufacturers of Rocket Book and Soft Book, which are just hand-held books with a screen and some functions like page forward; Glass Book, who's very interested in security and digital-rights management. Many publishers are involved. About fifty to seventy-five organizations have been involved in writing the standard that was published on September 21. OEB (Open Electronic Book) 1.0 was published at that time. It was really designed for legacy documents.

They know that we've got this problem with the publishers having PostScript files, some of them having Quark, but they don't have real good files that are marked up in XML. The specification really leans heavily toward making it easy for the publishers to convert to this format, 1.0. In this way books can become available. The devices can grow and prosper. At the meeting Microsoft said that it is going to have Microsoft Reader available in the first quarter of 2000 to support open electronic books. There's a lot of support. This is the first time we've really seen a large effort in the publishing arena to move forward with electronic books.

Open E-book is like an amoeba; there's no formal organization. I was involved with the 1.0 standards, trying to take this train which was moving forward at an incredible pace and make sure that some of the accessibility features get put into it. They worked from a base of HTML 4.0 and wrote a DTD in XML that is based on 4.0. We were able to carry with it all the accessibility features developed by the WAI [Web Accessibility Initiative] by using that strategy. It was nip and tuck at times about being able to do that. But by and large the OEB folks see that the requirements the blind community has are really no different from the requirements that the sighted community has, in file specifications that is.

So OEB now needs to form a real organization that can move this forward. That's underway. There will be a meeting December 14 and 15 in San Francisco.

A big issue with the publishers is digital-rights management. That's the protection; it's a wrapper around the information that prevents the data from being sent willy-nilly wherever anybody pleases. Before there is large-scale use of electronic books, the publishers are going to have to be convinced that their intellectual property is secure, and digital-rights management is the way to do it. That's something that will be evolving over the next two years.

The OEB folks know that the 1.0 specification is really designed for legacy data, and they want to move toward a second specification that is much richer. That's where the work of the DAISY Consortium and NISO and Open E-book in the 2.0 version have a possibility to converge their standards. With the DAISY-NISO work we've got objects identified like footnotes, note references, sidebars, notices, index, lists. These things are not defined in OEB. Only the very basics are part of OEB 1.0. They know they need that; they know they want it; and they plan to put it in 2.0. What I am suggesting is that it's strategically very important for our organizations to work toward convergence of standards with mainstream publishing. I think that's a very important point that we are going to be raising at the next NISO meeting next month in Louisville--a very important opportunity there.

Sooner or later publishers are going to start dual publishing, which means they won't just focus on print, but they'll also focus on the electronic book. They'll want to have both books available at the same time. At this point the use of XML and that technology becomes natural. You have the XML marked up with structure and content, and, through a style for print, you go ahead and print. For a style for the electronic book, you include those kinds of styles. It's also a matter of styling for large print. It's a matter of styling for Braille.

A different style can be applied to the same information; it's just a presentation difference. So the structure and content are all there. We're using the same information, but the way the information is presented is controlled by styles, and of course the reading device that one may use to present that information. We have a huge opportunity at this point, when publishers start dual publishing, to make books accessible essentially right out of the box. There will be a single source file for everything, the whole document in a variety of different types of presentation.
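The single-source idea can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not how any publisher or production tool works: the element names in `SOURCE`, the two style tables, and the `show_notes` option are all hypothetical and chosen only to show one marked-up file being presented in more than one way.

```python
import xml.etree.ElementTree as ET
from html import escape

# One hypothetical source file: structure and content only, no presentation.
SOURCE = """
<book>
  <h1>Chapter 1</h1>
  <p>Cells are the basic unit of life.</p>
  <note>See the appendix for a glossary.</note>
</book>
"""

# Each style maps an element name to a function that presents its text.
PLAIN_STYLE = {
    "h1": lambda t: t.upper(),
    "p": lambda t: t,
    "note": lambda t: "Note: " + t,
}

EBOOK_STYLE = {
    "h1": lambda t: f"<h1>{escape(t)}</h1>",
    "p": lambda t: f"<p>{escape(t)}</p>",
    "note": lambda t: f"<aside>{escape(t)}</aside>",
}


def render(source, style, show_notes=True):
    """Present the same source with a given style; notes can be turned off."""
    root = ET.fromstring(source.strip())
    lines = []
    for element in root:
        if element.tag == "note" and not show_notes:
            continue  # the reader chose to skip notes
        formatter = style.get(element.tag, str)
        lines.append(formatter(element.text or ""))
    return "\n".join(lines)


print(render(SOURCE, PLAIN_STYLE, show_notes=False))
print(render(SOURCE, EBOOK_STYLE))
```

The source never changes; only the style table does. A large-print or Braille presentation would, in this picture, be one more table applied to the same file.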

What can be done by libraries right now? We can start digital mastering at this point. We are ready to start doing that kind of thing pretty much full scale. We want to work on A to D collection conversion to convert what we've got that is worth moving forward into the future and turn that into the new format. We want to test that with consumers to assure that we're providing the kinds of structure that they need. We want to plan for the launch in the United States of Digital Talking Books right now.

What are the possible future outcomes? The single source file could become ubiquitous in publishing so that they use one single source file for mastering their information, and then they go to different formats at will. We could have access to information at the same time as everybody else. We could use the OEB source for producing full-text, full-audio books. The libraries could also add additional information to the OEB books with figure descriptions and other types of things, elaborations that need to be made on these types of books. We could eventually get to practically a full-audio and a full-text rendering of the information. And eventually the libraries can get down to trying to pay some serious attention to the very tough problems of math and science, which we know are sorely neglected at this point.

I didn't touch on any of the international copyright issues in this presentation, but they need to be resolved before we're going to be able to distribute outside borders. And I didn't touch on DAISY's efforts in developing countries to move this technology and bring this information to developing countries, but I want you to know that's a very important item on the strategic plan within the DAISY Consortium. If you want to know more about DAISY, I'm available to talk any time.