Voice Dream e-reading app: Stellar for text to speech - and promising as a general reader
By David H. Rothman, Published on May 19, 2013
The latest: An update of this post focuses on education-related issues of read-aloud apps. Also, I've just tried a promising Voice Dream beta with paging. Finally, NPR on May 20 ran a segment on developer Winston Chen. - D.R.
A Catch-22 dogs those of us who most often read e-books visually but also want to hear them when we're exercising or driving.
The usual e-bookware doesn't always come with or work with text to speech capabilities. Even if it does, we can't control the aural part as closely as we'd prefer.
I myself like the Moon+ Reader Pro Android app, and I'm in love with the added-on "Amy" voice, a British-accented delight from another developer, Ivona, now an arm of Amazon. But I can't revisit already-viewed text quickly enough while I'm hearing audio by way of the Moon-Ivona combo.
A special read-aloud program isn't the ultimate answer, either, since I'll then be stuck with a weak app for general use. Even based solely on text-to-speech performance, in fact, this category of software can disappoint.
Enter the Voice Dream Reader app for iPads, iPhones and iPod Touches. At $10 it's more expensive than the average app but provides enough value to justify the cost.
Winston Chen (family photo below), a Boston-area man and a middle-aged IBM alum, created Voice Dream during a year's stay on an Arctic island where his wife was teaching. Voice Dream is not a full solution to the above dilemma. But it comes enticingly close, letting me e-mail notes and snippets and enjoy some other important features of a full-strength reading app for general use--while at the same time giving me more precise control over the spoken text than other TTS alternatives do in the iOS world. Significantly, more book-like paging is on the way as an alternative to the existing scrolling. (Update, May 13: Sure enough, a just-released beta has paging--I've tried it and will say more about this and other features in the next day or so.)
A list of Voice Dream's glories is here. The app even includes its own Web browser, as well as the ability to find and download Project Gutenberg books with minimal fuss, and Chen tells me he's open to working with the Digital Public Library of America by way of an API, which could mean similar capabilities. Voice Dream even hooks into Dropbox's search feature. And print-impaired people using Bookshare can also benefit from integration.
The list of positives goes on and on. I still pine for the charming "Amy" to show up in Voice Dream despite her Amazon connection and the risk that the company's monopolistic tendencies will overcome a genuine chance to earn goodwill. Hey, Jeff! You can do the right thing. But meanwhile VD--there, I said it; sorry!--offers a built-in Acapela speech engine and a free "Heather," an American-accented voice. You can still hear the robot in "Heather," but she is almost as good as "Amy" (herself not quite 100 percent human-sounding). At least 60 voices in 20 languages are available for a few dollars each: "English, Mandarin Chinese, Japanese, Spanish, French, German, Italian, Swedish, Danish, Norwegian, Finnish, Dutch, Portuguese, Russian, Czech, Catalan, Polish, Turkish, Greek, and Arabic." More languages and other major enhancements for Voice Dream, as both a visual and audio reader, are on the way, including a mode to enjoy books one page at a time rather than scrolling.
Already Voice Dream is living up to its name for members of the accessibility community, in addition to those without disabilities.
Not everyone likes everything in the app, to go by the reviews of the paid version in the Apple app store, even if the average rating is a respectable four-star plus. Still, compared to other iOS apps that allow aural reading from a wide variety of books, this one shines. vBookz EPub and vBookz PDF, for example, as far as I can determine, will not let you take notes, and Blio won't allow you to export your notes to email, your printer, or other destinations, as Voice Dream does.
Mind you, the other products are far from losers; Blio offers multimedia capabilities, for example. But if you especially value accessibility mixed with annotation- and sharing-related features--"musts" for truly superior software in such areas as the upper grades in K-12--then Voice Dream is the champ. vBookz and Blio can't seamlessly pick up items for reading from Instapaper or the Web (the screenshot shows the Voice Dream library filtered to display only Instapaper items--double-click for a better view). What's more, those rivals lack Voice Dream's rich selection of dozens of optional voices, selling for just a few bucks a throw.
Furthermore, Voice Dream's promo says it can read ePub, PDF, Word, RTF, Apple Pages, PowerPoint, .txt, and HTML. Some users complain that its PDF display isn't true to the appearance of the source PDF. Faithful rendering would be a nice option, if Chen could offer it, even if it meant that TTS wouldn't work while you were in that mode.
Given Voice Dream's obvious merits for nonDRMed books such as public domain titles, librarians and educators should not just try the app (a mere $5 for institutions buying 20 copies or more, and no cost for a demo with very limited read-aloud capabilities) but also provide Chen with detailed feedback. No, I haven't any financial ties with the company, direct or indirect, and if I run across an alternative better than Chen's, I'll talk it up. I'm just eager to see a good and socially useful product succeed.
Chen estimates that 30 percent of the app's users are blind, 50 percent suffer from dyslexia or attention deficit disorders and the like, and 20 percent lack any print-related disabilities. I myself suffer from a minor disability, my difficulty seeing light-weight fonts against a white background; and I resent the persistent indifference of Amazon and many other hardware and software vendors to the needs of people with similar challenges. Voice Dream to the rescue, at least for nonDRMed texts read on iOS devices!
Selections within Voice Dream's rich assortment of fonts should please not only the contrast-challenged but also a much larger group, the millions with dyslexia, in the United States and elsewhere. As for those with attention issues, a focused reading mode--shown here--displays just a small amount of text at a time for greater concentration.
Of more importance for me, Voice Dream provides excellent control over both the written and spoken versions of the text. Want to revisit a previous paragraph on your screen without interrupting the audio reading? Voice Dream makes this a snap in either the full page mode or the focused reading one, and it moves quickly even within long books. In other words, this app excels for people with dyslexia and others whose enjoyment and comprehension of books could benefit from simultaneous visual and audio presentation of text. The yellow in the above screen shot jumps from spoken word to spoken word, so users can associate the actual texts with the spoken sounds. If you are not disabled in the least but want to read Henry James closely, Voice Dream could still be of interest.
An e-mail Q & A with Winston Chen (edited)
Q. Am I missing something, or is there no way to go into a paging rather than scrolling mode?
A. I'm almost finished with this new feature, but still working on perfecting it. It'll work the same as scrolling, except it stops at page boundaries. I really want to avoid doing the left-right page flipping that most book readers want to emulate.
Q. What about the ability to pick up bookmarks and notes and highlights from machine to machine, even across platforms? Via Dropbox you could implement that capability if it isn't there already. Perhaps if there are no legal issues, you could even make the syncing compatible with that of a great Android app, Moon+ Reader.
A. Synching across devices is easily the most requested feature. The problem is that Apple iCloud synching for the kind of database I used is currently unreliable for bidirectional synchronization. Until Apple fixes it--which I'm hoping for in iOS 7--turning this feature on would be a bad idea.
Q. I truly truly loathe DRM but am wondering if you could arrange with Adobe and/or OverDrive and other library vendors for VoiceDream to be usable with library books. OverDrive, of course, is the main show.
A. I have not reached out to Adobe yet, but I plan to.
Q. How many people are now using VoiceDream?
A. Voice Dream now has 20,000 customers for the paid app and 110,000 customers for the free app.
Q. Will you be expanding to platforms beyond iOS? Which ones? When?
A. The next platform is probably Android. But I haven't committed to it.
Q. What would you do if Amazon dangled a nice offer in front of you? I was among the first to discover Stanza, and I pounded the table for it when I owned TeleRead--only to see it vanish down Amazon's maw. Can you assure us that you won't sell your company, at least in the next five years? And not to Amazon? More recently I talked up Ivona on the LibraryCity site and elsewhere, and then sure enough...
A. I'm aware of Stanza, which I still use despite all. I don't have any wish to sell the company. This company gives me a fulfillment that a lump of cash cannot match. My main market and future focus remains building tools for people with learning disabilities, primarily for education. I have no intention to be a general purpose eBook reader, even though some people use my app that way. [Oh, but Voice Dream is so close now to being an acceptable GP reader, especially with paging on the way! - D.R.]
Q. Are there any personal reasons for your interest in Voice Dream? Do you have any reading-related disabilities? Anyone in your family?
A. I knew nothing about reading disabilities until after the first release of the app was out. But I quickly moved the product in that direction when I realized that for these people TTS is not just a nice-to-have but a life-changing tool. Then, it was a matter of keeping close to my customers, listening to customers, and making my product better for my customers.
Q. Is it possible that angels should take more risks on one-person app shops like yours, given your success? Or could it be that the real reason for your success is that you were more interested in serving humanity than in making a buck?
A. Since I started working on this app, I couldn't get away from it. I believe a good profession needs to (1) produce decent income, (2) have enjoyable day-to-day work, and (3) make a positive impact on society. I've always believed that the best jobs in the world have all three and have them in balance. That pretty much sums up my vision for Voice Dream.
I'm working on some very exciting features for the next release which should be out in a few weeks:
-Personal Pronunciation Dictionary
-Adjust default speech rate, pitch and volume for each voice
-Speech rate in the voice settings now overrides the default speech rate for the voice used.
-More navigation options: rewind or fast forward by sentence, paragraph, page, chapter, highlights, bookmarks, 15, 30, and 60 seconds.
-Footer indicates page number, percentage, chapter name, and current navigation unit for rewind and fast forward.
-Additional voices: Chinese, Japanese and Korean.
-Sort by Add Date, Title, Author and Size for articles in the Home screen
-Option to scroll page by page in addition to free scrolling
-Two finger double-tap to play/pause
More on the future of Voice Dream
I asked Chen if he would also add the ability to adjust line-spacing, a more-than-just vexing omission from the current Kindle Fires [update: actually my Fire now has line-spacing, but it didn't work properly when I tested it on a few files]. No luck. In fact, he says I'm the first to request it. Perhaps I shouldn't be surprised since so many users of text to speech programs have low expectations. Anyone else care to speak up, maybe in the comments area of this post? Fortunately Voice Dream's present line-spacing is close to optimal for me. But I truly, truly believe that Voice Dream fans should have that choice.
Another possibility for Voice Dream, as I see it, might be additions within the existing "Advanced settings" menu leading to all kinds of customizations without confusing novices, who could simply ignore them. Chen says that might end up on his list.
Returning the topic to libraries, I asked Chen if he would sell Voice Dream to the DPLA, the innovative Douglas County library system in Colorado, or another noncommercial library organization to distribute to users, maybe even as FLOSS, short for free and open-source software. He is willing to strike library deals, but with a limitation--the product could only be used with content from the respective library collections involved. Same for deals involving publishers' collections. Otherwise, he says, that would be the same as selling the company, which, in the interest of control over the app, he doesn't want to do. Too bad. Whatever Chen has in mind, it is not genuine FLOSS. I myself could envision talented developers like Chen getting large up-front payments for their products, if they beta-tested well, and then additional money could come later on, with the understanding that the developers would hang around to develop new features and oversee user support. Fees might also reach the developers as the user base grew. The advantage of a pure FLOSS approach instead is the possibility of tapping a larger pool of talent, with the main developer still influencing the evolution of the product.
At least, thanks partly to the existence of the DPLA--extremely API-oriented--companies like Voice Dream will have a head start in the library world even if they market primarily to individuals rather than libraries directly, and I'd hope that Douglas County's e-book-tech initiative and any others would take the same open approach. The proprietary DRM issue will remain for public libraries, so that's still a complication. But as noted, Chen is trying to overcome it. In the end, perhaps one scenario would be for libraries to evaluate Voice Dream and other products and promote the acceptable ones on their sites and take small cuts. Maybe they'd be sold under different brand names from the usual offerings and include library-related optimizations. Just a few thoughts. Who knows how this will shake out.
Some good news is that a free "Lite" version of Voice Dream is available from the Apple app store and includes everything in the paid version but the ability to read aloud more than a few hundred characters at a time. So you don't even have to gamble $10 for a meaningful look at Chen's baby in action. If you value accessibility blended with powerful sharing and annotative capabilities, go for it and share your own reactions with us.
In the "first edition" of this post, I asked readers for any corrections. Winston Chen kindly pointed out two typos, and he also gave me his own thoughts on the quality of the voices.
"Bridget is shockingly good and in my view just as good as Amy from Ivona. Paul is the best male American voice from any TTS company. They're my best-selling voices. And, to my knowledge, no other mobile TTS reader offers NeoSpeech voices. They're the same voices used in Kurzweil for $1,500 a seat." I bought "Paul" and will still "voice" my preference for "Amy" over him and the already-purchased "Bridget," but they're both excellent by today's standards of TTS for consumers, and this is strictly subjective anyway. Hello, LibraryCity visitors--your own thoughts on the inflection, general pleasantness, and other traits of the various voices? About the review in general, Chen said: "I really appreciate your deep insights, some of which other reviewers failed to pick up. For example, how I handled the interaction between the visual and the auditory. It took me a long time to work out the complex logic for a satisfactory solution. In other words, the feeling of fluidity was actually very hard to achieve. With the new release, it'll get even better. For example, it'll save visual and speech locations independently."
My Sister The Retired Teacher, meanwhile, weighed in with her own take when I took my iPad with me to lunch at her house. Dorothy taught special-education kids among others during her many years in the classroom, and, in fact, she holds a master's degree in this field. She liked Chen's efforts. Read-aloud books are hardly new, but Dorothy appreciated Voice Dream's ability to work with any title in the common formats it supports, not just with a particular book or collection. Dorothy offered her own suggestions for Voice Dream. First, since the needs of children and other readers vary, users should be able to choose between (1) just words being identified in distinctive colors when spoken, (2) lines being marked instead, and (3) a mix of the two other modes, Voice Dream's current approach. She felt that some readers with special needs might actually find the word-level identification to be too much of a distraction. Anyone else's thoughts on this?
Second, Dorothy wanted default colors used for identification of spoken words to be closer together and for the app to avoid the bright yellow, since stark color clashes might upset some children with attention challenges. While readers can adjust colors, she thinks that a quieter approach would be better to start out with. Here again, I'd welcome others' reactions. Perhaps the solution would be a choice of themes--canned color combos for different kinds of readers--even if I myself very much like Voice Dream's current default mix.
Third, after trying out Voice Dream's dictionary, Dorothy also called for the ability to increase the size of the font with young children's needs in mind. No problem as a future change, I'd hope.
Fourth, Dorothy called for continuous scrolling rather than Voice Dream's jumping ahead by the page. She felt that would be less jarring to young reader-listeners.
And fifth, she was concerned that even with different voices used, Voice Dream sometimes pronounced two words as one.
Inaccurate aural spacing does not bother me too much as a reader-listener if it isn't too common and too serious; but people learn to read in part by listening and processing the information not just word by word but also syllable by syllable. So if Voice Dream and its voices and other TTS products can improve in that respect, it'll increase their value to young children. Yes, even Ivona's Amy has her own pause-related quirks. At the same time, parents should remember that the most accurate TTS voices in standard U.S. or U.K. English or other varieties still can't replace the warmth and interaction of old-fashioned reading to a daughter or son. Parents, not just teachers, ideally will discuss stories with young children and ask questions on such matters as the motives and actions of characters within the material. I'm confident Chen would agree with this standard tenet of early-childhood education.
Despite the concerns above, I see Voice Dream as an incredibly useful way to impart information--on a variety of subjects--to older children who need it or who, like me, simply enjoy a book while exercising.
Encourage text reading constantly? Help students overcome text-related learning disabilities? Yes! But should all other learning cease until students are perfect readers? Some may never be and could more or less live as if in the Jules Verne story where the telephone-delivered news replaced printed newspapers. I hate the possibilities of In the Year 2889, one reason I'm grumpy when librarians are agnostic about the value of text vs. other media, given the efficiencies of words on paper or on the screen. Not always, but so often, they're the best conveyors of facts and emotions. But I'm also a realist about individuals and see Voice Dream-style programs as an essential way for e-libraries to better serve millions of the print-impaired and help keep books alive, however people enjoy them. The current DPLA has given some attention to presentation issues, but not nearly enough, considering all the complexities here, one reason why I favor intertwined but separate public and academic systems online. The former would care more about presentation issues for the masses, including K-12 students and people in the lower socio-economic groups, two frequently overlapping categories, alas. Besides, the separate organizations could still share a common technical services organization in close touch with leading researchers on and off the campus.
Stay tuned. I'll give Winston Chen a chance to reply to Dorothy and me. Meanwhile a big thanks to both of them, especially since the education-related issues raised here about text to speech are generic, not just limited to Voice Dream.
Close to 9 p.m.: A quick response from Chen to Dorothy's reactions
"I just read through your update. Your sister's suggestions are spot-on. I already added the ability to disable word highlighting and line highlighting, so there will be four modes: no highlighting, word highlighting only, line highlighting only, and both word and line highlighting. While she has a point about the word highlight color being too strong, I'm not going to tamper with those defaults. That has a way of infuriating customers like nothing else."
Editor's note - this article was re-published with the author's permission from his blog, Library City.