Publishing Tools: Converting Obstacles to Opportunities

by Michael Gosse, Ph.D.

From the Editor: At the time of this conference Dr. Gosse was the Systems Engineer at the National Center for the Blind.

Good morning, and thank you for giving me this opportunity to address the U.S./Canada conference on Technology. It is a great privilege to have the opportunity to address you this morning and to begin a dialogue on what I believe is a very important topic that will have significant impact on the lives of blind people throughout the world. My presentation today will focus on issues surrounding the interface between desktop publishing software, publishers, and the Digital Talking Book standard file formats.

But this issue is really larger than the community assembled at this conference. The consumer market for electronic books encompasses the entire population of the earth. If the true capacity for electronic publishing is realized, people who can't read, speak the language, or physically turn a page will have access to books from across the world. The National Federation of the Blind, through its NEWSLINE® and Jobline initiatives, has demonstrated that access to published information can be provided in a timely manner to the population through a simple telephone. The National Library Service for the Blind and Physically Handicapped is leading an effort to develop standards that will provide the basis for electronic books to be available to the blind. This DTB standard will allow books for the blind to be produced more quickly and will provide features that give blind people access comparable to that enjoyed by the sighted.

There was a time when even print books were scarce because they were reproduced by hand. As a result only those who could afford books were able to read. This was dramatically changed with the advent of moveable-type printing. This invention allowed for mass production, greatly reducing the cost of books and other printed material. Since then, reading and writing have become essential for success in almost every facet of modern society. Now, in the information age, it is expected and mostly assumed that all people read. And yet, reading and writing are only a part of what the Information Age represents.

It is no wonder that in this high-tech world software tools have been developed to assist people in all aspects of information generation and retrieval. Word processors correct our spelling and grammar, recognize when we are writing a letter and offer assistance, and allow others to collaborate on and edit our work. Most word processors these days even go so far as to offer templates for various tasks. Readers, on the other hand, can search huge volumes of information for key words and even concepts, but only if the data are published in electronic form and only if the document is properly prepared.

If you have done a search on the Internet lately, you probably have a clue about the effectiveness of the search process. You may find thousands of documents that match your search requirements, but few are relevant to your desired objective. This is not so much the fault of the search engine as of the electronic document. Search engines rely heavily on document structure to add significance to the words. Poorly marked-up documents result in poor search performance. As long as documents are produced with visual appeal as the central focus, the blind and sighted alike will be unable to take full advantage of the information revolution.
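To make the point about structure concrete, here is a minimal sketch in Python of how a search tool might give more weight to a word found in a title than to the same word found in body text. The element names, the weights, and the `score` function are assumptions invented for this illustration; they do not describe how any particular search engine works.

```python
import xml.etree.ElementTree as ET

# Hypothetical weights: a hit in a title counts more than a hit in body text.
WEIGHTS = {"title": 5, "heading": 3, "para": 1}


def score(xml_text, term):
    """Sum the weights of every element whose own text contains the term."""
    root = ET.fromstring(xml_text)
    term = term.lower()
    total = 0
    for element in root.iter():
        text = (element.text or "").lower()
        if term in text:
            # Elements without an assigned weight contribute nothing.
            total += WEIGHTS.get(element.tag, 0)
    return total


about_braille = (
    "<doc><title>Braille Printers</title>"
    "<para>How embossers work.</para></doc>"
)
passing_mention = (
    "<doc><title>Office Supplies</title>"
    "<para>We also sell braille paper.</para></doc>"
)

print(score(about_braille, "braille"))    # 5
print(score(passing_mention, "braille"))  # 1
```

Both documents contain the search word, but only the markup tells the tool which one is actually about the subject. Without the markup, the two would be indistinguishable.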

The National Information Standards Organization (NISO) is assisting the National Library Service for the Blind and Physically Handicapped (NLS) with the development of a digital talking book standard. The DTB standard will consist of file specifications for the delivery of synchronized text and audio as well as other multimedia content. Rather than reinvent information systems for text and multimedia, the NISO DTB committee has wisely chosen to reuse standard technology. A rigorous set of end-user requirements has led to a file specification based on industry standard concepts.

Extensible Markup Language (XML) and a DTB Document Type Definition (DTD) are the core components of the standard. These technologies appear to be the central components of the future of Internet publishing. Most leading Internet publishing software supports both XML and DTDs. This means that Digital Talking Books can be marked up and published using commercial, off-the-shelf products. Experience suggests that future versions of these standards will retain backward compatibility and therefore increase the useful life span of each generation of digital talking books.
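As a rough illustration of what XML with a DTD makes possible, the Python sketch below reads a small structured book and builds a table of contents directly from the markup. The element names (`book`, `chapter`, `heading`, `para`) and the sample text are hypothetical and chosen only for this example; they are not the element set of the actual DTB DTD. Note also that Python's standard-library parser reads the DTD declarations but does not validate the document against them.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names for illustration; not the actual DTB DTD.
BOOK = """
<!DOCTYPE book [
  <!ELEMENT book (title, chapter+)>
  <!ELEMENT chapter (heading, para+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
]>
<book>
  <title>Sample Book</title>
  <chapter>
    <heading>Getting Started</heading>
    <para>First paragraph.</para>
  </chapter>
  <chapter>
    <heading>Going Further</heading>
    <para>Second paragraph.</para>
    <para>Third paragraph.</para>
  </chapter>
</book>
""".strip()


def table_of_contents(xml_text):
    """Return the chapter headings in document order."""
    root = ET.fromstring(xml_text)
    return [chapter.findtext("heading") for chapter in root.findall("chapter")]


print(table_of_contents(BOOK))  # ['Getting Started', 'Going Further']
```

Because the chapters and headings are named in the file itself, a reading device can jump from chapter to chapter without guessing from font sizes or page position.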

In contrast, most desktop publishing software, such as QuarkXPress, produces documents in proprietary file formats. Although many tools are available which translate between file formats, information is often lost during the translation process. Several de facto standards have emerged for portability of files between software packages and computer platforms. Among these are Portable Document Format (PDF), supported by Adobe, and Rich Text Format (RTF), supported by Microsoft. The main objective of these formats is to produce the same appearance of the document across hardware and software packages, platforms, and versions.

In both these representations document structure is essentially ignored. For example, translation of table data into PDF from a desktop publishing tool may be accomplished by positioning the table cell data using tabs, spaces, or pixel coordinates on the page. Visually the document will look the same, but all information relating to the tabular nature of the data is discarded.

Transformations of this nature are called lossy. In situations like this, human intervention and visual inspection are often required to restore the meaning of the original document. Automating the process is nearly impossible, defeating a significant advantage of electronic information exchange. File exchange formats like PDF and RTF may be acceptable for works that will wind up in hard copy, but they are antiquated for electronic-information processing. There is more to a book than simply making it look appealing on the page.
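The table example can be sketched in a few lines of Python. The `flatten` function below is a deliberately simplified stand-in for a purely visual export, assumed here for illustration; it is not how PDF or any real tool lays out text. It shows that two differently structured tables can produce identical output, so no program can tell afterward which table the author intended.

```python
def flatten(rows):
    """Lay out each row as space-separated text, as a purely visual export might."""
    return "\n".join(" ".join(row) for row in rows)


# Two columns: the second row holds a city name and a state name.
two_columns = [["City", "State"], ["New York", "New York"]]

# The same words, but the second row is split into four cells.
four_cells = [["City", "State"], ["New", "York", "New", "York"]]

print(flatten(two_columns))
print(flatten(two_columns) == flatten(four_cells))  # True
```

Once the cell boundaries are gone, recovering them requires a person who knows that "New York" is a single name. That is the human intervention described above.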

In order for electronic information publishing to reach its full potential, data exchange must retain all structure, content, and related information. For document translation to be useful for electronic publication formats such as Open E-Book, DAISY, or Digital Talking Books, developers must retain the visual presentation as well as the content and intent of the author and publisher. Furthermore, additional information can greatly enhance the value of the published document. Pictures, graphs, and charts can be enhanced with detailed text descriptions. The same is true for other multimedia data such as sound and video clips.
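One way a publishing tool could support text descriptions is to check for them before a book is released. The Python sketch below is only an illustration under stated assumptions: the element names (`image`, `audio`, `video`, `description`), the `src` attribute, and the `missing_descriptions` function are hypothetical and are not taken from the DTB, DAISY, or Open E-Book specifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: each media element may carry a <description> child.
PAGE = """
<chapter>
  <image src="sales-chart.png">
    <description>Bar chart: sales rise in each quarter.</description>
  </image>
  <image src="logo.png"/>
  <audio src="interview.wav">
    <description>Interview with the author.</description>
  </audio>
</chapter>
""".strip()


def missing_descriptions(xml_text, tags=("image", "audio", "video")):
    """Return the src of every media element with no non-empty description."""
    root = ET.fromstring(xml_text)
    missing = []
    for element in root.iter():
        if element.tag in tags:
            text = element.findtext("description", default="")
            if not text.strip():
                missing.append(element.get("src"))
    return missing


print(missing_descriptions(PAGE))  # ['logo.png']
```

A check like this is possible only because the description is stored as part of the document's structure rather than as loose text placed near the picture.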

This is where the publisher's task begins. If the software vendors provide the tools to allow accurate electronic representation of published data, then the publishers must take the initiative to use the tools consistently and to their fullest extent. Styles, templates, and procedures must be developed to ensure consistent representation of text and multimedia data. It is essential to keep in mind that a little work up front will save significant effort down the road. Finally, publishers must take the initiative and insist that the software tools provide direct translation into standard electronic book formats including the NISO DTB.

Desktop publishing vendors have produced powerful tools to automate the publishing process. Now they need to enhance those tools to provide a mechanism that allows the author and publisher to do more than just position text and images on the page. Publishers need to be able to define content and structure and publish the result in standard file formats. These modifications to existing software tools will require significant thought, insight, and cooperation--not to mention a little modification to existing code.

The future of books, magazines, and newspapers lies in their reincarnation in electronic form. The blindness community has seen the potential for electronic publication. We have explored the possibility and taken significant steps to embrace it. The Internet has served as the great equalizer that will soon level the playing field for the blind. Without the cooperation and resources of the rest of the world, equal access to printed material for the blind may be delayed. But, in the not-too-distant future, blind and sighted alike will rely on our computers to retrieve, analyze, and read more than any one person can manage today. Publishers and software vendors can decide to make the changes now or later, but the changes will come. It is safe to say that the best products will make billions and the mediocre will be forgotten.