Tuesday, October 25, 2022

Old Meets New: Feeding ViaVoice speech-to-text into DALL-E-2

Circa 1999, when our boys were in school, we discovered that the iMac could turn speech into text. And the achievement was amazing, in a primitive sort of way. The iMac was a sluggish computer by today's standards, and IBM's ViaVoice program was freestanding, using only the algorithms stored on floppy disks, with no access to online algorithms.

We took turns reading aloud sections from The Three Musketeers, Tom Swift, and The Boxcar Children to see what would happen. 

ViaVoice made a heroic effort. A 1994 IBM factsheet claimed its software could render speech with complete accuracy. Well, no. It did turn out authentic words, and sometimes the sentences were grammatically correct, but the meaning rarely carried through.

Fed a line about somebody liking to eat ice cream, ViaVoice came back with "Islam is a beautiful blue dream."

We liked that phrase, and recently I fed it into DALL-E-2. Here's the result of the computers' unlikely partnership:



Saturday, October 15, 2022

DALL-E-2: Weird and somewhat wonderful

Some thoughts on the AI image generator DALL-E-2 ...  While not a substitute for a professional artist, it's good for working out ideas, and for illustrating books for one's grandkids. 

Results were the most interesting when it struggled with my text prompt.  

When I asked for a giant mech embracing the Statue of Liberty, it substituted the mech for Lady Liberty:

It likes to make sunbeams, even when I don't ask for them, as when rendering a mech in the harbor:

When I asked for a mech in the pose of The Thinker, it plugged in a despairing superhero guy. Was it thinking of Ozymandias?


When I asked for a city park in the evening, it came up with a statue of a floppy sea creature, and cut off the bottom third of the image:

This eerie result came after I asked for a whale pulling a boat under a full moon:


More to come!