Oxide and Friends

Oxide and Friends Twitter Space: August 30th, 2021

A brief history of talking computers
We’ve been holding a Twitter Space weekly on Mondays at 5p for about an hour. Even though it’s not (yet?) a feature of Twitter Spaces, we have been recording them all; here is the recording for our Twitter Space for August 30, 2021.
In addition to Bryan Cantrill and Adam Leventhal, speakers on August 30th included special guest Matt Campbell, as well as MattSci, TVRaman, Jessamyn West and Dan Cross. (Did we miss your name and/or get it wrong? Drop a PR!)
Some of the topics we hit on, in the order that we hit them:
  • Brian Dear’s The Friendly Orange Glow
  • [@2:47](https://youtu.be/b9GVJg0LRX4?t=167) Matt’s intro 
  • [@4:15](https://youtu.be/b9GVJg0LRX4?t=255) The Echo ][ sound sample 
    • WarGames computer: GREETINGS PROFESSOR FALKEN.
      > SHALL WE PLAY A GAME?
      > Love to. How about Global Thermonuclear War?
      > …
      > Is this a game or is it real?
      > WHAT'S THE DIFFERENCE?
      > …
      > What’s it doing?
      > It’s learning…
      > …
      > A STRANGE GAME.
      > THE ONLY WINNING MOVE IS
      > NOT TO PLAY.
  • [@7:46](https://youtu.be/b9GVJg0LRX4?t=466) 
  • [@12:14](https://youtu.be/b9GVJg0LRX4?t=734) Apple to PC 
    • Keynote Gold, Master Touch, ZoomText
  • [@14:53](https://youtu.be/b9GVJg0LRX4?t=893) Keynote Gold sample
  • [@17:17](https://youtu.be/b9GVJg0LRX4?t=1037) GUI screen readers 
  • [@21:58](https://youtu.be/b9GVJg0LRX4?t=1318) Meeting another sight impaired person on a MUD
  • [@26:44](https://youtu.be/b9GVJg0LRX4?t=1604) Early programming experiences 
  • [@28:47](https://youtu.be/b9GVJg0LRX4?t=1727) Emacspeak user base
  • [@31:34](https://youtu.be/b9GVJg0LRX4?t=1894) Things were getting better on the Windows side...
  • [@36:12](https://youtu.be/b9GVJg0LRX4?t=2172) Linux 
  • [@44:53](https://youtu.be/b9GVJg0LRX4?t=2693) Editors for the visually impaired? 
  • [@49:36](https://youtu.be/b9GVJg0LRX4?t=2976) Working on accessibility (a11y) for pay 
  • [@57:46](https://youtu.be/b9GVJg0LRX4?t=3466) 
  • [@1:03:11](https://youtu.be/b9GVJg0LRX4?t=3791) Handheld devices 
  • [@1:08:09](https://youtu.be/b9GVJg0LRX4?t=4089) What should software engineers know about accessibility? 
    • Use a mature UI framework!
    • Microsoft UI Automation is the successor to MSAA.
  • AccessKit by today’s speaker Matt Campbell!
  • [@1:12:34](https://youtu.be/b9GVJg0LRX4?t=4354) DECtalk samples!
  • [@1:15:25](https://youtu.be/b9GVJg0LRX4?t=4525) One of the most important settings a blind person will want to change in their speech synthesizer is how fast it talks. 
  • Alt text image captions
Topical recent conference presentation: Emily Shea (2019), Voice Driven Development (video)
If we got something wrong or missed something, please file a PR! Our next Twitter space will likely be on Monday at 5p Pacific Time; stay tuned to our Twitter feeds for details. We’d love to have you join us, as we always love to hear from new speakers!

Creators & Guests

Host
Adam Leventhal
Host
Bryan Cantrill

What is Oxide and Friends?

Oxide hosts a weekly Discord show where we discuss a wide range of topics: computer history, startups, Oxide hardware bringup, and other topics du jour. These are the recordings in podcast form.
Join us live (usually Mondays at 5pm PT) https://discord.gg/gcQxNHAKCB
Subscribe to our calendar: https://sesh.fyi/api/calendar/v2/iMdFbuFRupMwuTiwvXswNU.ics

Speaker 1:

Alright, let's go ahead and get going here. Matt, thanks so much for joining us, and we have some other folks that are gonna swing by as well. Adam and I are obviously really looking forward to getting your perspective and story on all this.

Speaker 1:

I'll tell you that, for me personally, my eyes were kind of opened on this by reading The Friendly Orange Glow by Brian Dear, a book we've talked about before that I really, really enjoyed. I don't know, Matt, if you've read The Friendly Orange Glow; very well written, super interesting history. And in particular, he tells the story of Brodie Lockard, who was, or is, he's still living, a gymnast at Stanford who had this really debilitating accident that left him a quadriplegic.

Speaker 1:

And he'd already been doing work with PLATO, and he got a PLATO terminal brought into, basically, his hospital room, and he wrote Mah-Jongg for PLATO. And it was this kind of exquisitely designed Mah-Jongg, all of it done using a typing stick that he manipulated with his mouth. He later was part of a program called Homework from Control Data that was deliberately designed to get PLATO terminals to the disabled, which I thought was also super interesting. It left me with a lot of questions that Brian Dear doesn't necessarily answer. And then just the last bit I'll say on this that was really interesting to me is that he got something called the Personics HeadMaster, which allowed him to type much more quickly, by moving it with head movements as opposed to the mouth stick.

Speaker 1:

And he was able to leave the mouth stick behind. And just reading about the truly life-changing difference the technology made for Brodie, your eyes get opened about how deeply, deeply meaningful this is for those folks who are able to do something qualitatively and quantitatively different because the technology has been made accessible to them. And then the kind of footnote was that the Personics HeadMaster was not gonna be manufactured anymore. And you're just like, oh my god.

Speaker 1:

This is a key technology.

Speaker 2:

Oh, man.

Speaker 1:

So, anyway, Matt, that's kind of where I got interested in this, and also just realized how ignorant I am of this aspect of history. So I'm really excited to have you here. And maybe, with that intro, you can give us your perspective and your story on accessibility in computing.

Speaker 2:

So, thank you. I'm going to be at least primarily talking about blindness and low vision, because that is the area that I'm familiar with. Just a little bit of personal background: I've been legally blind from birth. I have enough sight that I can read the computer screen up close, like, way up close, if the fonts are a little bit larger than usual. And my limited sight also helps me move around and know where I'm going, but I do use a white cane when I'm outside of my home, and I can't drive.

Speaker 2:

And I often use a screen reader, which, as I'll explain in more depth, is a program that basically reads aloud what's on the screen. So, I'm told that I don't look this old, but I am about 40 years old; I was born in 1980. And so talking computers were an integral part of my childhood, and I've read a little bit about the very early history of talking computers. I know, for instance, that there was a talking computer terminal called Total Talk in 1981, but the first talking computer that I was exposed to as a child was an Apple IIe running a speech synthesis card called the Echo.

Speaker 2:

And if you'll indulge me, I would like to play a brief audio clip of the Echo. Hopefully, the low-tech solution of pointing my phone mic at my desk speakers will work well enough. Can I do this?

Speaker 1:

Absolutely. Let's do it.

Speaker 3:

Hello, I'm an Echo II speech synthesizer. I can take ordinary text and turn it into speech. [a few sentences unintelligible] I can also pronounce punctuation: period, comma, question mark, semicolon, dollar sign. Now type in what you want me to say, or type END to end this demo.

Speaker 2:

So did that come through well enough?

Speaker 1:

Wow. Yes.

Speaker 2:

Now, that was not the pinnacle of speech synthesis technology even when it came out in 1982, but it was relatively low cost. The hardware, the card that you could put in an Apple II, was basically a sound output card using some kind of low-bit-rate LPC, linear predictive coding. The speech synthesis software was simple enough, primitive enough, to fit in the 16K memory expansion card of an Apple II Plus,

Speaker 3:

Wow.


Speaker 2:

where it would be out of the way of any BASIC programs that you might want to run. And it was, for the most part, only BASIC programs that could work with the primitive screen reader that came with that synthesizer. The screen reader was called Textalker, and it would redirect the I/O routines so that the output would go through the speech synthesizer. And in fact, the speech synthesis would block the output. So if a program was outputting multiple lines, then, visually, you would see one line at a time being displayed on the screen as the lines were spoken.
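A rough sketch of the Textalker arrangement Matt describes, in Python rather than 6502 assembly, with invented names: the screen reader replaces the system's character-output routine, buffers up a line, and blocks until that line has been spoken before output continues. The speak function is a stand-in for the Echo card.

```python
import time

def speak(text: str) -> None:
    """Stand-in for the Echo synthesizer; blocks while 'speaking'."""
    time.sleep(0.01 * len(text))  # pretend each character takes time to say

class TalkingOutput:
    """Replacement output routine: display, then speak, one line at a time."""
    def __init__(self):
        self.line = []

    def putchar(self, ch: str) -> None:
        if ch == "\n":
            line = "".join(self.line)
            print(line)   # the line becomes visible...
            speak(line)   # ...and output blocks until it has been spoken
            self.line = []
        else:
            self.line.append(ch)

out = TalkingOutput()
for ch in "HELLO\nI AM A TALKING APPLE II\n":
    out.putchar(ch)
```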

Speaker 1:

Wow.

Speaker 2:

And...

Speaker 1:

So, can I ask you a question about this, Matt? Sorry.

Speaker 2:

Sure.

Speaker 1:

So, you said that this is 1982?

Speaker 2:

That was when that card came out.

Speaker 1:

Yes. Because, I mean, I just...

Speaker 2:

But I was first exposed to it as, like, a six-year-old, in 1986 or '87.

Speaker 1:

Well, so, the first time I feel like I heard speech synthesis, quote, unquote, was, of course, watching WarGames in 1983.

Speaker 2:

WarGames. I'm told that was fake.

Speaker 1:

It is fake. It is fake, and it makes me feel so much better, because I was blown away by the speech synthesis. I don't know if you know the story on that. To make it sound convincingly fake, do you know what they did?

Speaker 1:

And this is just, like, genius at some level: they read the sentences backwards, and then they reassembled them. Oh, wow. So it's just a person reading it. And I remember the timing, like, wow.

Speaker 1:

That's, like, really... and now, hearing the actual synthesis...

Speaker 2:

They could've just used an Echo, though. Why didn't they just use an Echo?

Speaker 1:

Because, honestly, the, quote, unquote, technology they had was further along than the actual technology.

Speaker 2:

Now, I do happen to have an example on hand of what was probably the pinnacle of speech synthesis technology in 1982. This is like a 15-second clip.

Speaker 1:

Absolutely.

Speaker 4:

That's right.

Speaker 2:

So, what you're going to hear before the actual speech synthesizer is a man named Dennis Klatt, who was the inventor of DECtalk. In 1986, he compiled a bunch of recordings of various snapshots in the history of speech synthesis technology, and he introduces each one. So you'll hear him, and then you'll hear this Prose 2000 system.

Speaker 3:

Thirty-two: the Speech Plus Incorporated Prose 2000 commercial system, 1982. Four hours of steady work faced us. A large size in stockings is hard to sell. The boy was there when the sun rose. [one sentence unintelligible]

Speaker 2:

So: quite a bit more advanced, but also, I have no doubt, quite a bit more expensive. Probably not many public school systems could afford one of those.

Speaker 1:

And was this being designed with accessibility in mind at this time?

Speaker 2:

I don't know if the Prose 2000 was. The Echo: from what I understand, the original manufacturer, Street Electronics, had accessibility in mind as one of the possible applications. Their Textalker screen reader was certainly designed with that in mind. The maintenance of Textalker later in the eighties got taken over by the American Printing House for the Blind.

Speaker 1:

And then, Matt, just to ask you another question, because I didn't wanna gloss over it: you discovered this as a six-year-old in, you said, 1986, I think. So was that the first speech synthesis that you had heard?

Speaker 1:

And, like...

Speaker 2:

Yes, it was.

Speaker 1:

And so take me to that moment. That must have been an amazing, memorable moment for you, I imagine.

Speaker 2:

So, I have a couple of early memories from that time. I was in first grade, and all throughout my elementary school years, and through part of middle school, I always attended whichever school in town had the program for the blind and visually impaired kids, the special ed program. And so I would spend a good deal of time each week in the classroom with the special ed teachers that worked with us. And I think my earliest memory of hearing speech synthesis was when one of the teachers in that room was working with me on my handwriting, because, remember, I do have some vision; I can handwrite with some difficulty, because my head has to be up against the paper to see what I'm writing. But while I was working on my handwriting, one of the other blind students would be working on a talking computer on the other side of the room.

Speaker 2:

And I think I envied them, but my turn came soon enough, because at some point in my first grade year, they started teaching me how to touch type. And there was a program for the Apple II called Talking Text Writer, which was a talking word processor using this Echo speech synthesizer. You couldn't use any of the mainstream word processors, because the screen reader was too primitive; it wouldn't work with anything that wrote directly to screen memory. But there was a talking word processor, and I remember that the Echo seemed to struggle with the hard G sound.

Speaker 2:

You might have noticed that in the clip. And it also didn't pronounce my first name, Matthew, correctly.

Speaker 3:

Really?

Speaker 2:

It was like [imitates the mispronunciation]. And, you know, I asked my teacher why it couldn't say things correctly, and she tried to explain, but I was a 6- or 7-year-old, and the teacher wasn't particularly computer savvy, so the explanation couldn't be very satisfactory.

Speaker 1:

And the 6502, and how much RAM? Yeah.

Speaker 2:

Right. Yeah. And Textalker having to fit in the upper 16K of memory. And the computers that were adapted for us in the public schools were Apple IIs with Echo speech synthesizers, until 1994, when I was in seventh grade.

Speaker 2:

And at that point, they brought in a PC running MS-DOS. And, we have an abbreviation in our industry, AT, which variously stands for access technology, adaptive technology, assistive technology; take your pick. There were two AT programs on this PC.

Speaker 2:

Well, first of all, there was a built-in speech synthesis card called the Keynote Gold. And there was a screen reader called Master Touch, so called because it had a hardware peripheral that came with it: a touch-sensitive tablet that you could run your finger along to review the screen. I don't know that that part ever really caught on. And there was also a screen magnification program called ZoomText, which would display a portion of the screen magnified, and you could pan around, and it would automatically track your cursor, and things like that. And since I had some usable vision, they taught me how to use both the screen reader and the screen magnifier.

Speaker 1:

And that must have been a hell of an upgrade from, I mean...

Speaker 2:

Yes, it definitely was. Well, first of all, the PC had more room, although you were still dealing with the cursed 640K memory limitation, and all of the fun of multiple TSRs coexisting in that. But at least the MS-DOS screen reader was able to read from screen memory, because the PC, as I recall, was a much more interrupt-driven platform than the Apple II, so there were more ways that a DOS screen reader could kind of stick its hooks in and provide access even if the application wasn't fully cooperative.

Speaker 1:

It is truly the glass half full of DOS, the fact that an accessibility application gets its hooks in. Because, of course, there are...

Speaker 2:

There's also a glass half full of Windows,

Speaker 1:

Oh, interesting.

Speaker 2:

As I'll get to.

Speaker 1:

Yeah. Interesting.

Speaker 2:

Now, I have an audio sample of the Keynote Gold synthesizer. This, again, was probably not the pinnacle of speech technology for its time, but it was a definite upgrade from the Echo.

Speaker 3:

Stop button, Alt+S... [the Keynote Gold reading dialog controls in a Windows recording application; largely unintelligible]

Speaker 2:

Okay, that's enough of that. That was just some guy making a recording of the Keynote Gold with a modern Windows screen reader; he must have had some old hardware lying around. That was just one that I happened to find on the net. So, that is...

Speaker 1:

It sounds like the Cylons in the original 1970s Battlestar Galactica.

Speaker 2:

Yeah. Now, with the Keynote Gold, I'm pretty sure that the speech synthesis was being done by some kind of processor on the card, because when we turned on our DOS PC, immediately, before the BIOS could have even finished doing its power-on self-test, you would hear "Keynote Gold."

Speaker 1:

Awesome.

Speaker 4:

And at that time, was hardware required? I ask because, you know, my earliest memory of these kinds of technologies was the Talking Moose, I think, in, like, the late eighties on the Mac, which all of this was very evocative of.

Speaker 2:

So, the Mac could certainly do software synthesis. I first got online in '93, using the 2400 baud modem in my mother's computer. And I remember reading in '94 about a DOS screen reader that could use the software synthesizer that came with the Sound Blaster 16 card. But what I particularly remember reading about it was that this software synthesis option didn't work with some terminal programs, because of the way that the software synthesis would tie up the CPU.

Speaker 1:

I mean, you think about it technically: I'm so impressed with the Echo, but as you're talking, you really have to get your hooks into a lot of the system in order to pull this off. Technically, this is really challenging to pull off.

Speaker 2:

Oh, and I haven't even gotten into GUI screen reading yet.

Speaker 1:

Right. Yeah.

Speaker 2:

So, let's get into that now. The first screen reader for a GUI was a program called outSPOKEN for the Mac, which was released in 1989. If there are any pre-OS X Mac fans in here, you may remember a screensaver called After Dark, from a company called Berkeley Systems. That company's real bread and butter, at least at first, was apparently developing these accessibility tools; from what I read in Wikipedia, it was initially under contract to the National Institutes of Health. But, anyway, outSPOKEN came out in 1989.

Speaker 2:

The first Windows screen reader, called Window Bridge, came out in 1992. And I'm sure you guys are dying to know how these things could have possibly done what they did. So, basically, what outSPOKEN relied on exclusively, as far as I'm aware, was hooking into and intercepting calls to, like, the QuickDraw graphics routines, so it could build up a model of what was being drawn onto the screen. And the term that was coined for this, which basically everybody adopted, was an off-screen model, OSM.

Speaker 2:

So if you had an application that did its own text rendering, rather than using QuickDraw, or, in the case of Windows, the graphics device interface, GDI, then it would not be accessible with one of these screen readers.
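A rough sketch of the off-screen-model idea, assuming we can interpose on a toolkit's text-drawing call the way outSPOKEN did with QuickDraw and Windows screen readers did with GDI; the names here are illustrative, not any real API.

```python
class OffScreenModel:
    """Mirror of what has been drawn: text keyed by screen position."""
    def __init__(self):
        self.items = []  # (x, y, text) tuples

    def record(self, x, y, text):
        self.items.append((x, y, text))

    def line_at(self, y):
        """Reassemble a visual line, left to right, for speech output."""
        row = sorted(item for item in self.items if item[1] == y)
        return " ".join(text for _, _, text in row)

osm = OffScreenModel()

def draw_text(x, y, text):
    """Wrapper standing in for the intercepted QuickDraw/GDI routine."""
    osm.record(x, y, text)        # the screen reader's hook sees the call...
    # real_draw_text(x, y, text)  # ...then the original routine would run

# An app that draws through the toolkit is visible to the screen reader:
draw_text(0, 10, "File")
draw_text(40, 10, "Edit")
print(osm.line_at(10))  # -> "File Edit"

# An app that renders its own glyphs never calls draw_text, so its text
# simply does not exist in the off-screen model.
```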

Speaker 1:

And circa the early nineties, how many applications were using the Windows facility to render text, versus doing it on their own?

Speaker 2:

I think most of them at

Speaker 1:

this point. So it would work with most. And, in terms of the off-screen model, how would it represent those things that are strictly visual? Is it really focused on reading text, or is it trying to...

Speaker 2:

So, by intercepting the calls to these routines, a screen reader could, at least some of the time, detect if an icon was being drawn to the screen. And then, by doing, like, a checksum or a hash or similar of the contents of the icon, it could give that icon an identity, and then it could say something like "graphic 53" or something. And then, if a sighted person was working with you to help you configure the system, they could label the graphics and store those labels in a database.
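A sketch of that graphic-labeling trick under the same assumptions: hash the icon's pixel data to give it a stable identity, announce a generated name until a sighted helper supplies a label, and keep the labels in a database keyed by the hash.

```python
import hashlib

labels = {}  # hash -> human-supplied label (persisted to disk in practice)

def icon_identity(pixels: bytes) -> str:
    return hashlib.md5(pixels).hexdigest()[:8]

def announce(pixels: bytes) -> str:
    key = icon_identity(pixels)
    if key in labels:
        return labels[key]
    return f"graphic {int(key, 16) % 100}"  # fallback name like "graphic 53"

def label_icon(pixels: bytes, name: str) -> None:
    """A sighted helper attaches a meaningful name to this icon."""
    labels[icon_identity(pixels)] = name

floppy = bytes([0, 1, 1, 0, 1, 1, 0, 0] * 8)  # stand-in pixel data
print(announce(floppy))    # "graphic NN": unlabeled so far
label_icon(floppy, "Save")
print(announce(floppy))    # "Save"
```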

Speaker 1:

And did you yourself use this technology as well, the early Windows technology? Or...

Speaker 2:

I did not. And so, again, being online in '94, I followed one or two forums about this technology, and I kind of remember reading about the existence of Windows screen readers, but it wasn't something I pursued at the time. First of all, since I have some usable vision, I was primarily using the computer visually; well, I was completely using the computer that way when I was at home, because I never had access to these hardware speech synthesizers at home. So I didn't get exposure to any of these early GUI screen readers. And in fact, I didn't really start to learn about what was going on in that area until 1998, when I was about 17.

Speaker 2:

And so, by that time, I had been out of the public school system for a few years. In my eighth grade year, my parents moved me to the same private religious school that my siblings were attending. And, you know, I was okay with it at the time, but I did become the first and only visually impaired student at that school. Oh, boy.

Speaker 1:

Yep. Easy.

Speaker 2:

And, I mean, it had its ups and downs. I'm a little embarrassed to admit it now, but the principal of the school kind of decided to make me the school's charity case. He did an all-school fundraiser to buy me a laptop and screen magnification software.

Speaker 1:

Oh, I'm not gonna turn that down. I mean, I guess, like, you know: you wanna give me a laptop? Okay. Yeah.

Speaker 2:

Yeah. So, in 1998, through pure coincidence, I happened to cross paths with a blind person online for the first time in years. And so, as I got to know her, and don't read anything into the gender.

Speaker 2:

We were just friends. But, as I got to know her...

Speaker 1:

Go ahead. Yes, I did. I didn't mean to interrupt. But where did you... because you mentioned this too, that you were, like, hanging out in forums.

Speaker 1:

Where did you meet in 1998? Is this Usenet? Are there particular websites? How did you...

Speaker 5:

I was totally gonna ask: is this Usenet?

Speaker 2:

That's right; actually, this was on a type of MUD.

Speaker 1:

Nice. There you go.

Speaker 2:

Yeah. So, my preferred distraction from the homework that I really should have been doing was hanging out on, in particular, a flavor of MUD called a MOO, for MUD, Object Oriented, and it had its own programming language. And for me, the appeal was that it was a combination of chat room and fun programming environment. And so I happened to cross paths with a blind person on one of these MOOs. And as I got to know her, I learned that she was using a Windows screen reader.

Speaker 2:

Now, unfortunately for her, she had gotten saddled with... and I should mention that at this time, Windows screen readers cost, never mind the speech synthesizer, the software alone would cost, like, $500 or more. At least the rationalization for this was that it was a small market with heavy demands on tech support, etcetera. So, she had unfortunately gotten stuck with a Windows screen reader that wasn't keeping up with the fast-changing world of Windows at that time.

Speaker 2:

And this was also a particularly dark time for access to the web on Windows. Because, as I'm sure you recall, tables and frames were both being heavily used, and these screen readers were still depending heavily on their off-screen models to provide access to the contents of the web browser window. So if you had a web page that used the typical layout table, with navigation links on the left and page content on the right, and you tried to read that with your screen reader, it would just read straight across.

Speaker 1:

Right. Yeah.

Speaker 2:

Now, there was a web browser designed specifically for blind people, called pwWebSpeak. But that was its own web browser, not Netscape or Internet Explorer. And I'm sure that there were a great many websites that weren't compatible with this specialized web browser.

Speaker 1:

I have some follow-up questions just on pwWebSpeak. One, I don't wanna tell them how to do their branding, but it's not exactly catchy. And this is aimed at accessibility explicitly, I...

Speaker 2:

Squarely. Squarely.

Speaker 1:

Yeah. Interesting. Sorry. Yeah. Wow.

Speaker 1:

Okay.

Speaker 2:

Yeah. So, the screen reader that Anne, the blind person that I had met online, had gotten, and I don't know if she bought it herself or if it was bought for her, was developed by one person who apparently was not making enough money from it, and he had to go take a job at another company. So it was basically abandoned. And so, by late 1998, Anne and I were seriously thinking about getting her set up with Linux and Emacspeak. And, as I mentioned when Dan brought up Emacspeak last week, I had made some small contributions to that community.

Speaker 2:

And so that was what started me down that road.

Speaker 1:

So, you're 18. Clearly, I mean, you're a software engineer, you had programmed. Had you decided that was your life's calling at this point?

Speaker 2:

I had known for a long time at that point that programming was my life's calling. I mean, I had started learning to program on my family's Apple IIGS computer when I was 8 years old. And I had an uncle who taught me basically everything he knew about, well, mainly BASIC programming on that platform.

Speaker 1:

Okay, I've gotta ask about the IIGS. I mean, total shout-out to the IIGS; I spent way too much time playing Epyx Summer Games on the IIGS.

Speaker 1:

But the GS, of course, stands for graphics and sound. Was there better speech synthesis on the IIGS?

Speaker 2:

Yes. Unfortunately, I do not have an audio clip of the one text-to-speech engine that I know of for that platform.

Speaker 1:

I don't know, Matt. It's pretty disappointing that you don't have an audio clip from an Apple IIGS.

Speaker 2:

But,

Speaker 1:

So, yeah. But it was better.

Speaker 2:

The thing is, though, as far as I know, nobody ever did anything that could be considered a screen reader using the speech synthesis software that was available for that machine. I had a brief email correspondence in '94 with a blind programmer who was working on a GUI screen reader for the IIGS, but he was doing it using the Echo synthesizer, I think, because he was doing this as some kind of hack on top of the version of Textalker that was available for the IIGS.

Speaker 1:

Well, you do get this problem.

Speaker 1:

This is super technical to develop, and, ultimately, people have to eat. And, you know, you begin to slice these markets smaller and smaller and smaller. And this is why you're kind of leading up to this open source moment, which must have been a real watershed moment, I imagine.

Speaker 2:

Yeah. Although what Anne and I didn't understand, going into our adventure with Linux and Emacspeak, was that the author of Emacspeak, T. V. Raman, was basically building it for himself.

Speaker 2:

And, of course, that is the way with personal open source projects. But he was building it for himself, not as something that was specifically designed to be more generally useful. And when we, and also other newcomers, arrived on the Emacspeak mailing list and started asking basic questions, I think there was an expectation from Raman and the other regulars on the list that users of Emacspeak would already be fluent in Emacs and comfortable with finding things in the documentation, or even the mailing list archives. But I did what I could to help Anne come up to speed, and then I tried to contribute back to the community in general. As I mentioned last week, I made an RPM package of Emacspeak for Red Hat.

Speaker 2:

Now, I went back looking through this period in the Emacspeak mailing list archives the other day. And one of the things that struck me was that, fairly early in her time coming up to speed with Emacspeak, Anne posted a message about how she was using the W3 web browser for Emacs, and now she was surfing the net more than she ever had before, because it was such a better experience to read a web page that way, although W3 had its own problems. And in fact, one of the small Emacs Lisp hacks that I developed, and had forgotten about in the intervening years, was an extension to W3 to convert the tables on a web page into something that you could move through linearly, with your up and down arrow keys, as opposed to actually navigating it as a table. Because, again, layout tables were so common at this time.

Speaker 2:

So, in a way, we had one of the same problems as the Windows screen readers, but now we could hack around it.
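Roughly what that W3 hack did, sketched here in Python instead of Emacs Lisp: take a layout table and emit its cells as a flat sequence of lines you can arrow through, instead of reading each visual row straight across.

```python
# A layout table as a grid of cells: navigation links on the left,
# page content on the right.
table = [
    ["Home",     "Welcome to our site."],
    ["Products", "We sell speech synthesizers."],
    ["Contact",  "Write to us at the address below."],
]

def read_across(grid):
    """What an OSM-based screen reader did: read each visual row straight across."""
    return [" ".join(row) for row in grid]

def linearize(grid):
    """The W3-style hack: one cell per line, column by column."""
    lines = []
    for col in range(len(grid[0])):
        lines.extend(row[col] for row in grid)
    return lines

print(read_across(table)[0])  # "Home Welcome to our site." (interleaved)
for line in linearize(table):
    print(line)               # all the navigation links, then the content
```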

Speaker 1:

Right. And also, it's not $500; you've got kind of the liberation, and you know it's not gonna be end-of-lifed. I mean, there are certain things that you get from open source that you're not gonna get from a proprietary solution.

Speaker 2:

Yeah, although things were getting better on the Windows side; it took a while for me to realize it. If anyone here has heard anything about Windows screen readers in the past couple of decades, you've probably heard of a screen reader called JAWS, which stands for Job Access With Speech. And JAWS came out, I think, for Windows 3.1 in 1995, and for Windows 95 like a year or two later. But in the fall of 1999, according to a friend of mine who was the engineering manager for JAWS at the time, they released an update which introduced what they called the virtual buffer.

Speaker 2:

And it's not a very good name, but I'll explain what it did. So, when you were browsing a web page with Internet Explorer, and they only ever did it for IE, not for Netscape, JAWS would start intercepting your keystrokes for the common cursor... Well, okay, let me back up.

Speaker 2:

First, at least according to my friend Chris, the engineering manager at the time, JAWS would grab the HTML for the page from IE using the object model that IE was exposing through something called COM, the Component Object Model. It was basically the technology that was used by things like Visual Basic for Applications and VBScript.

Speaker 1:

COM in a way. Yeah. Definitely.

Speaker 2:

Yeah. So JAWS would grab the HTML out of IE using this COM object model. And, according to Chris, it would parse the HTML and build up its own representation of the contents of the web page. Later, when IE actually exposed the whole DOM through COM, JAWS could traverse that.

Speaker 2:

And then it would intercept common cursor movement keys, like up and down arrow, home, control-home, and so on, and basically give you a linear, document type of structure that you could move through. And then, when it really started getting good was when they added what they called quick navigation keys. For instance, H to move to the next heading, or shift-H to move to the previous heading; they all had this pattern.

Speaker 2:

F to move to the next form field, B to move to the next button, and so on. So you could more easily jump around the page when you wanted to, or read it linearly if you wanted to.
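A toy version of the virtual-buffer idea, with hypothetical structure: the page is flattened into a linear list of (role, text) nodes, a cursor moves through it, and the quick navigation keys are just searches over the roles.

```python
class VirtualBuffer:
    """A JAWS-style linear representation of a parsed web page."""
    def __init__(self, nodes):
        self.nodes = nodes  # (role, text) pairs in document order
        self.pos = 0

    def current(self) -> str:
        role, text = self.nodes[self.pos]
        return f"{text}, {role}"

    def next_by_role(self, role: str, reverse: bool = False) -> str:
        """Quick navigation: H / shift-H for headings, F for form fields..."""
        step = -1 if reverse else 1
        i = self.pos + step
        while 0 <= i < len(self.nodes):
            if self.nodes[i][0] == role:
                self.pos = i
                return self.current()
            i += step
        return "no more items"

page = [
    ("heading",    "Weather"),
    ("text",       "Sunny today, rain tomorrow."),
    ("heading",    "Search"),
    ("form field", "City name, edit box"),
    ("button",     "Go"),
]

vb = VirtualBuffer(page)
print(vb.next_by_role("heading"))                 # H -> "Search, heading"
print(vb.next_by_role("form field"))              # F -> "City name, edit box, form field"
print(vb.next_by_role("heading", reverse=True))   # shift-H -> "Search, heading"
```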

Speaker 1:

So, I've got two questions. One: Job Access With Speech? I've not heard of JAWS. What is the job, is it, like, a computing job? Where does the name come from?

Speaker 2:

The name is intended to signify that it's for access to employment.

Speaker 1:

Okay. So, like, getting a job. And so was the genesis of this a program to help folks get work? I just have more questions now.

Speaker 2:

Well, I mean, it was a commercial product, developed by a company called, at the time, Henter-Joyce. The head of the company was a blind programmer named Ted Henter. And probably it was what you might call a retro name, or a backronym.

Speaker 1:

I was wondering that too. Okay. Yeah. That makes sense.

Speaker 3:

But

Speaker 1:

And then my other question is, well, though you answered that this was designed with accessibility in mind: some of the keystrokes you're mentioning would be just kind of generally useful, to be able to whip through a web page just using the keyboard and not have to go to a mouse. Do folks use this for reasons other than accessibility?

Speaker 2:

Not as far as I'm aware.

Speaker 1:

Which way?

Speaker 2:

Although they might. I mean, it might be more attractive for people to do that now that that functionality is built into the Narrator screen reader, which is part of Windows.

Speaker 1:

Is that the next chapter here? And where do you come into the picture? Because you worked at Microsoft.

Speaker 2:

I worked at... well, I did, for three years, but I didn't join Microsoft until 2017.

Speaker 1:

Oh, that was okay. Yeah. Yeah.

Speaker 2:

That was fairly late in the story.

Speaker 1:

Got it. Okay.

Speaker 2:

Yeah. So, I was doing my thing with Emacspeak and Linux, and there was another screen reader for Linux that came on the scene around 1999, called Speakup. And this, believe it or not, was implemented as a patch for the Linux kernel. Now, it was, at the time, entirely dependent on these hardware-based speech synthesizers, which were still pretty widely used in the late nineties, although software-based solutions were beginning to take off on Windows.

Speaker 2:

But even then, there was a problem, which was that unless you had a fairly high-end sound card, like the Sound Blaster Live, that had its own built-in mixing of multiple audio streams, you couldn't have speech synthesis and any other sound playing at the same time, until, if I'm not mistaken, Windows 2000 in the NT lineage, or Windows Millennium in the 9x lineage, when Microsoft added software mixing to the OS.
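The piece that was missing is simple to state: software mixing just sums the active streams sample by sample, clipping on overflow, before handing one combined stream to the single hardware channel. A minimal sketch:

```python
def mix(streams, lo=-32768, hi=32767):
    """Sum several 16-bit PCM streams into one, clipping on overflow."""
    length = max(len(s) for s in streams)
    out = []
    for i in range(length):
        total = sum(s[i] for s in streams if i < len(s))
        out.append(max(lo, min(hi, total)))
    return out

speech = [1000, -2000, 3000, -1000]  # synthesizer output
beep   = [500, 500, 500]             # some other program's sound
print(mix([speech, beep]))           # [1500, -1500, 3500, -1000]
```

Without this, whichever program opened the sound device first owned it outright.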

Speaker 1:

Can I ask you an easy question? The challenge of speech synthesis: how much of that is specialized hardware, versus just having enough compute to actually do the work?

Speaker 2:

It was really just having enough compute. Well, and specialized hardware in the sense that the very early computers' built-in sound output wasn't up to the task. Like the Apple II: you had basically one-bit resolution in its built-in speaker. The PC's built-in speaker could do its own tone generation, or, again, one-bit resolution, clicking the speaker on and off.

Speaker 2:

So, lack of sound hardware built into the computer was a factor, and lack of built-in support for mixing multiple sounds in the OS became the limiting factor for Windows in the late nineties. But meanwhile, over on the Linux side: I mentioned Speakup, and Speakup was written by a blind programmer and system administrator. He was working as a sysadmin at a university, and he wanted something that would speak, as far as possible, everything that happened on a Linux box, from boot-up to shutdown. And so he wrote Speakup as a patch for the console driver in the kernel.

Speaker 2:

If I'm not mistaken, he even went so far as to do his own serial port I/O routines, so that his speech synthesis support could be up and running even before the normal serial driver was.

Speaker 1:

I mean, that's a great way to do it, honestly. It is the single source of a certain kind of ground truth in the kernel, so it makes sense.

Speaker 2:

Yeah. And as one person put it to me around that time: try reading a kernel panic with Emacspeak.

Speaker 1:

Right. Oh, in terms of, like, their motivation for this: hey, we can't participate in kernel development, because when this thing panics, we can't...

Speaker 2:

Or independently get your box out of certain situations.

Speaker 1:

That's interesting. Forget even the kernel development side of this. Just, like: my box is in a reboot loop, and I literally have no way of figuring out why. Yeah. Wow.

Speaker 1:

Interesting.

Speaker 2:

Right. So, by this time, in 1999, I had my DoubleTalk speech synthesizer. It was a box that connected to the serial port of my machine. And so I started playing with Speakup, and I realized that, as I said, in some ways it was a more complete solution than Emacspeak.

Speaker 2:

And my first real contribution to, well, trying to make it easier for people to get started with Linux and Speakup was... So, I don't know if any of you guys remember, but Linux had a file system called UMSDOS, which was basically a file system with all of the necessary features for Unix, like long file names and permissions and ownership, etcetera, implemented on top of the MS-DOS file system. That meant that if you had a zip archive of one of these UMSDOS file systems, you could unzip it onto your hard drive and then run it, without having to mess with repartitioning and things like that. Because there was also a DOS utility called loadlin that could boot into Linux.

Speaker 1:

So this is, like, a DOS file system as a backing store? Am I understanding that?

Speaker 2:

Yeah. Basically, a DOS file system is a backing store for a Linux file system. And so the Slackware Linux distribution had a package called ZipSlack, which was basically a Slackware-based system in a zip file that you could just unzip onto your DOS-formatted hard drive and run. And so I took ZipSlack and Speakup and created ZipSpeak.

Speaker 2:

Maybe not the best name, but it did fit the 8-character limitation.

Speaker 1:

There you go. Right. Okay. So, ZipSpeak: I mean, honestly, where is ZipSpeak running?

Speaker 1:

Are we in DOS or in Linux here? I'm honestly confused.

Speaker 2:

Well, I mean, the idea was that you would boot your Windows machine into DOS mode, run this loadlin utility, and it would boot you up into Linux from DOS.

Speaker 1:

And loadlin would be your DOS COM file that acted as a, sorry...

Speaker 2:

As a bootloader for Linux, yeah. And with Speakup at the time, you had to compile a kernel for the specific speech synthesizer that you were using. And Speakup at this time had five different speech synthesis drivers.

Speaker 2:

So I remember writing this elaborate build script; well, elaborate for me at the time.

Speaker 1:

That's elaborate by any standard, honestly. I mean, it seems like we've got a lot of moving parts here. So ZipSpeak is then allowing Speakup to be much more broadly used, by folks who are coming from the Windows and DOS side. Is that a fair statement?

Speaker 2:

Yeah. And so I put it out in, I think, March of 2000. It got slashdotted.

Speaker 1:

There we go. Slashdot. Nice. Hopefully Slashdot had nice things to say. The Hacker News of its day

Speaker 1:

could say some very not-nice things.

Speaker 2:

Well, there was one guy, I think he might have been a troll, who was commenting: well, why are you being so selective about which speech synthesizers you support?

Speaker 1:

Oh, Internet. Never change. Some things just feel like truisms.

Speaker 1:

Interesting. Okay.

Speaker 2:

So, is it... my other... go ahead.

Speaker 1:

No, sir. Go ahead.

Speaker 2:

My other major contribution to Speakup, in early 2001, was that I refactored the synthesizer driver code so that you could compile all the synth drivers into one kernel and specify the one that you wanted on the command line.
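In spirit, that refactoring looks something like this; the names here are illustrative, not Speakup's actual symbols. Every driver registers itself in one table, and a kernel command-line parameter picks a driver at boot instead of a compile-time choice.

```python
DRIVERS = {}  # name -> driver class, standing in for the in-kernel table

def register_synth(name):
    def wrap(cls):
        DRIVERS[name] = cls
        return cls
    return wrap

@register_synth("doubletalk")
class DoubleTalk:
    def say(self, text):
        print(f"[DoubleTalk serial protocol] {text}")

@register_synth("dectalk")
class DECtalk:
    def say(self, text):
        print(f"[DECtalk serial protocol] {text}")

def boot(cmdline: str):
    # e.g. a hypothetical parameter: "root=/dev/hda1 synth=dectalk"
    opts = dict(p.split("=", 1) for p in cmdline.split() if "=" in p)
    synth = DRIVERS[opts.get("synth", "doubletalk")]()
    synth.say("Speakup is up")

boot("root=/dev/hda1 synth=dectalk")
```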

Speaker 1:

And are you connected to folks, I mean, you've met fellow blind folks online, are you connected to folks for whom this is opening up kind of new doors? Because, obviously, you've got a very personal motivation for this stuff.

Speaker 2:

Yeah. So, yeah, I spent a lot of my spare time around that time doing, you know, one-on-one helping people get set up with either Emacspeak or Speakup.

Speaker 1:

That's obviously gotta be very personally rewarding, because you're allowing someone to do something they couldn't previously do, at some level.

Speaker 2:

Yeah. And like I said, by the time I was doing some of this, things had already gotten better on the Windows side, but only if you could afford JAWS.

Speaker 1:

Got it. Right, right. So now you've actually got a lot of people doing this with open source.

Speaker 1:

Yeah. Neat.

Speaker 2:

Yeah. So

Speaker 5:

I'm also curious just what using Linux looks like if you're visually impaired. Like, I assume you're not trying to start X or do things like that.

Speaker 2:

That came later.

Speaker 5:

You know, things like vim are probably a nightmare. You've got Emacs with Emacspeak. Does ed become a reasonable choice of editor?

Speaker 2:

There was one well-known guy in the blind Linux community who seriously advocated that blind Linux users should use ed, because he seriously felt that mastering a line-oriented program like ed was the best option. And in fact, he went on to write a program that he called edbrowse, which was a reimplementation of ed, plus a browser using that same interface style.

Speaker 1:

Oh my. Wow. That's awesome.

Speaker 5:

And then, like, things like curses and tables and some of the more, you know, visual stuff?

Speaker 2:

Yeah. I think my wake-up call, that using interfaces like that with the screen reader was perhaps not the best solution, came when, and this leads into the next phase in my story, at the beginning of 2001, and I had been subscribed to the Speakup mailing list for quite a while at this time, a guy posted to the list asking for help getting a Red Hat Linux machine talking. And I was the one that replied to him and offered to help. And as I'm sure some of you recall, the text-mode Red Hat installer used that whole pseudo-GUI style, not technically using curses, but it amounted to the same thing, with dialogs, and the focused button would be highlighted, and things like that. And I just remember trying to walk him through it, struggling to make sense of what was going on with that program.

Speaker 2:

And I think that was when I began to reconsider whether using Windows with a good screen reader might be better.

Speaker 5:

And so what is it doing? It's just going dash dash dash dash dash, plus, dash dash dash dash?

Speaker 2:

Well, fortunately, the screen readers are a little bit smarter than that. But the problem is, and especially with the pseudo-GUI toolkit that this Red Hat installer was using, the blinking cursor, which might not even be visible, was really the only indication that a screen reader had of where you were on the screen, and it might not always be in a useful place. And if you were arrowing through a menu, and the highlight was moving to indicate where you were... honestly, I don't remember all of the details at this point, but I do remember that it was a challenge to work with. But, I mean, in a way, it was good that you at least had access to it in some form. There are still plenty of GUIs, actual GUIs, even now, that are completely inaccessible.

Speaker 2:

This, at least, being text, you had something. And this guy, Mike, who later became my boss, he was proficient with Windows. And the other thing that kind of clued me into the fact that things had changed while I was off in Linux land: he was exclusively using a software speech synthesizer on his Windows PC, not one of the hardware options.

Speaker 1:

Interesting. So things had shifted to the point where new things were gonna be possible now without specialized hardware.

Speaker 2:

Yeah. And he had JAWS, which meant that he had the best Windows screen reader that was available. And so, for him, trying to work with Linux and Speakup was a downgrade.

Speaker 1:

Interesting. So did you go to Windows at that point, or what did you end up doing?

Speaker 2:

So, it was a while longer before I left Linux behind for a while. But, as I mentioned, Mike became my boss, and this kind of transitions into where I started working on accessibility for pay. Because Mike was working at the time with an offshore programming company in Russia to try to develop a new product that he called FreedomBox. He described it at the time as attempting to be an AOL for the blind: a very easy-to-use way of accessing the Internet, designed specifically for blind people. And what he was trying to do at the time was using not only speech synthesis, but also speech recognition, so you could give the thing voice commands and it would talk back to you. And I initially started working for him as his system administrator, because the product had an online service that went with it.

Speaker 2:

And the first version of the product, which, like I said, was a fully custom interface based on speech synthesis and recognition; I mean, it got some positive responses, but it was not taken seriously by, basically, the establishment: the people that were training blind people to use computers. And this was never intended for access in an employment environment; it was intended for, like, elderly blind people trying to use the Internet at home. But still, in order to reach any of this market, we needed something that would be taken seriously by the people that would make decisions on what to purchase.

Speaker 1:

Yeah, exactly.

Speaker 2:

Yeah. Right. So, what we ended up doing for version 2, in 2003 to 2004, was, instead of a fully custom, very simplistic interface, we did a talking browser based on Mozilla, one that more resembled JAWS in the way that it worked.

Speaker 1:

And that is, I'm trying to remember, that is right when Firebird gets renamed, right? I mean, this is, like...

Speaker 2:

Yep. And in fact, one of the decisions that I had to make at that time was whether to use the existing SeaMonkey code base or the new Firebird fork. And I went with the old SeaMonkey code base, which was, in retrospect, a mistake. But I basically did something akin to the JAWS virtual buffer, implemented in JavaScript.

Speaker 2:

And this talking browser could run on both Windows and Linux. But to make it run on Linux, I had to address the problem of mixing multiple sound streams, which, at that point, amazingly, had not really been addressed, at least not to my satisfaction, on the Linux side.

Speaker 1:

We're on a collision course here with PulseAudio.

Speaker 2:

Yeah. Well, PulseAudio came out, like, a year or so after my implementation, I think.

Speaker 1:

Oh, shit.

Speaker 2:

At the time, you had two implementations: the GNOME camp had esound, and the KDE camp had something called aRts, and I think the RT was for real time. But what I recall about esound in particular was that it used a very naive sample rate conversion algorithm, which meant that when you had a speech synthesizer that was putting out audio at 11 kilohertz, and you were trying to upsample it to 44 kilohertz, it sounded pretty nasty. And so I looked around all over the place for a sample rate converter that I could use, and I ended up finding one that met my requirements in a mod player, of all places.

Speaker 2:

And I pulled out that code, and I used it.
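To hear the difference he's describing: a naive converter upsamples 11 kHz audio to 44 kHz by repeating each sample four times, which leaves harsh stair-steps, while even simple linear interpolation is noticeably smoother (a proper converter also low-pass filters). A sketch, not esound's actual code:

```python
def upsample_naive(samples, factor=4):
    """Repeat each sample: cheap, but sounds harsh."""
    return [s for s in samples for _ in range(factor)]

def upsample_linear(samples, factor=4):
    """Linearly interpolate between neighboring samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(samples[-1])
    return out

voice = [0, 100, 0, -100]      # a few samples of 11 kHz audio
print(upsample_naive(voice))   # 0, 0, 0, 0, 100, 100, ... stair-steps
print(upsample_linear(voice))  # 0, 25, 50, 75, 100, 75, ... a smooth ramp
```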

Speaker 1:

And how much of this, I mean, the sound focus and, I guess, the KDE-versus-GNOME sound wars: I can't imagine that these two camps got along and had a consensus...

Speaker 2:

And I put my implementation out as open source, but nobody else ever used it.

Speaker 1:

How much of their focus was on accessibility, versus other aspects of sound?

Speaker 2:

So, KDE to this day doesn't have a screen reader, as far as I know. But on the GNOME side, Sun Microsystems put together a team to implement accessibility for GNOME. And they actually ended up developing not one, but two screen readers.

Speaker 1:

The one that, you know,

Speaker 2:

The first of which ended up getting killed off.

Speaker 1:

That's great brand perception. That sounds like something... Yeah. Exactly. I believe that.

Speaker 1:

In Texas.

Speaker 2:

Well, so, originally, they had contracted with a company called BAUM, b-a-u-m, which was mainly known for manufacturing braille devices. BAUM had a team in Romania, and Sun somehow engaged with BAUM to develop a screen reader for GNOME called, and this is a mouthful, Gnopernicus. I kid you not: it was called Gnopernicus. And I'm not well versed in the problems with that program, but it was bad enough for my friend at the time, Marc Mulcahey, who was working for Sun; he worked on the GNOME accessibility team, but he wasn't tasked with working on the screen reader.

Speaker 2:

He was tasked with working on, basically, their abstraction layer over the various speech synthesizers, a component called gnome-speech. And it supported both hardware and software speech synthesizers. But Marc, who is blind himself, got so fed up with Gnopernicus that he took matters into his own hands and wrote a screen reader called Orca. And where that name came from: you already know that the leading screen reader for Windows was JAWS. Before that, there had been a screen reader for DOS called Flipper.

Speaker 2:

And, of course,

Speaker 1:

You gotta go to a higher trophic level. That, I should say, is very on brand for Sun.

Speaker 2:

Well, apparently, someone had suggested that progression to Marc when he was previously an intern at Microsoft. So Marc wrote Orca, and, after the fact, got permission from Sun to put it out. And Orca is to this day the GNOME screen reader, but Marc hasn't worked on it for a very long time now.

Speaker 5:

And this works how? By tapping into GTK? Or...

Speaker 2:

So, to explain this part, I need to back up a bit and explain another thing that started happening on Windows in the late nineties. Like I mentioned, very early on, the GUI screen readers intercepted calls to functions in QuickDraw or GDI or whatever, to build an off-screen model. But it didn't take long for people to realize that that wasn't gonna be good enough. And in 1997, Microsoft put out something that they called Microsoft Active Accessibility, of course, because everything at that time was Active-this, Active-that from Microsoft. So, Microsoft Active Accessibility, or MSAA, which I understand also stands for something else, some kind of anti-aliasing thing.

Speaker 2:

But MSAA was an API that an application or GUI toolkit could implement to programmatically expose the content and semantics of the UI. And like the Internet Explorer object model, MSAA was based on COM. Now, one of the first things that the GNOME accessibility team at Sun did was to define an accessibility API for GNOME. My understanding is they did this in two layers. GTK 2 and GTK 3 depended on something called ATK, which was basically the accessibility API for GTK; that was at the in-process, C library level.

Speaker 2:

And then, for the actual inter-process communication between the application and the access technology, screen reader or whatever, they did what they called AT-SPI, the Assistive Technology Service Provider Interface. Thank you, Sun, for that mouthful. In the GNOME 2 era, it was based on CORBA, and then for GNOME 3 they redid it based on D-Bus.

Speaker 2:

So

Speaker 1:

Out of the frying pan into the fire. I'm not sure.

Speaker 2:

Yeah. And AT-SPI, as I'll pronounce it: I'm not deeply familiar with it, but my understanding is that it exacerbates the problem of being an inter-process communication protocol, with the overhead that entails, by being fairly chatty. Like, not being able to bulk-fetch information in one IPC call, but having to do a lot of back and forth.
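
To see why chattiness dominates, here's a back-of-the-envelope sketch in Rust. The 200-microsecond round-trip latency is an assumption for illustration, not a measured AT-SPI figure.

```rust
// Why a chatty accessibility IPC protocol hurts: both functions fetch a
// name, role, and state for n nodes. The "chatty" version pays one
// round trip per property; the "bulk" version pays one round trip total.

const ROUND_TRIP_MICROS: u64 = 200; // assumed IPC latency, illustrative

fn chatty_cost(nodes: u64) -> u64 {
    // name + role + state fetched separately for every node
    nodes * 3 * ROUND_TRIP_MICROS
}

fn bulk_cost(_nodes: u64) -> u64 {
    // one call returns everything at once
    ROUND_TRIP_MICROS
}

fn main() {
    let n = 500; // e.g. rebuilding the tree for a big window
    println!("chatty: {} µs, bulk: {} µs", chatty_cost(n), bulk_cost(n));
    // chatty: 300000 µs, bulk: 200 µs -- the protocol design, not the
    // hardware, becomes the bottleneck.
}
```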

Speaker 2:

But yeah, that's basically it. So, first of all, each of the applications that supported this accessibility stack would register itself with an AT-SPI registry daemon, and then Orca would connect to that. And from there, it would get object references to each of the windows on the desktop. And with all of these accessibility APIs, the screen reader registers event handlers so it can find out about things like when the currently focused control has changed, and things like that.
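
Here's a rough Rust sketch of that event flow, using an in-process channel as a stand-in for the real IPC. The event types and names are hypothetical, not AT-SPI's.

```rust
// Hypothetical sketch: apps report events through a registry, and the
// screen reader's event loop turns each event into speech.

use std::sync::mpsc;

enum A11yEvent {
    FocusChanged { app: String, role: String, name: String },
    WindowOpened { app: String, title: String },
}

fn main() {
    // Stand-in for the connection to the registry daemon's event stream.
    let (registry_tx, events) = mpsc::channel::<A11yEvent>();

    // An application reporting a focus change through the stack:
    registry_tx
        .send(A11yEvent::FocusChanged {
            app: "gedit".into(),
            role: "push button".into(),
            name: "Save".into(),
        })
        .unwrap();
    drop(registry_tx); // end of the demo stream

    // The screen reader's event loop:
    for event in events {
        match event {
            A11yEvent::FocusChanged { role, name, .. } => {
                println!("speak: \"{name}, {role}\"");
            }
            A11yEvent::WindowOpened { title, .. } => {
                println!("speak: \"{title} window\"");
            }
        }
    }
}
```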

Speaker 1:

And so how much programmer awareness does this stuff require? I mean, does this require that the programmer build the program for accessibility, and to what degree do programs kinda comply with that? Or...

Speaker 2:

Well, it depends on what level of the stack they're working with. If they're writing their own GUI toolkit, then it does require total cooperation from them. If they're writing an application based on an existing toolkit like GTK, or pick your Windows toolkit, or a web application, then I guess it depends on how custom they decide to go. Like, if they're just doing straightforward buttons and checkboxes and edits and listboxes, then in a lot of cases with simple UI patterns like that, we can get accessibility for free.

Speaker 1:

Okay. So that answers the question: they're basically hooking into the toolkits to minimize the work, which is, of course, what you want to do. Makes sense.

Speaker 2:

Yeah. Only, if any of you have ever seen me on Hacker News, it might have been because I was posting a comment on an article about some random GUI toolkit which had zero accessibility.

Speaker 1:

Right. Well, so another thing I was gonna ask: when you have this kind of step forward in terms of technology, it must feel like, okay, wait, now there's another community that needs to be educated about what's required for accessibility. Or is that not the case?

Speaker 1:

I mean, in particular, I'm wondering about the movement to the device, to the phone. Has that been, hopefully, a net positive for accessibility? Or...

Speaker 2:

It has been. Although at first, when the iPhone came out, we didn't think it was going to be, because the iPhone, of course, was and is an all-touchscreen device with very few buttons. And when the first iPhone was released in 2007, it didn't have a screen reader built into the OS. And, of course, iOS being as locked down as it is, there wasn't a chance in hell that there would be a third-party screen reader for that platform. But then, luckily, in 2009, Apple introduced the VoiceOver screen reader along with the iPhone 3GS.

Speaker 2:

And Google was quick to follow with TalkBack, although it was a couple of years before TalkBack on Android was anywhere near the same level of usability as VoiceOver on iOS. But the iPhone was really a game changer for us, because it was the first mainstream mobile device that had a screen reader built into it. And with the explosion in apps for that platform, that just opened up a whole new world for us. And then, of course, it wasn't long before there were apps being developed specifically for blind people, in some cases replacing special-purpose hardware devices, because the iPhone had so many useful sensors and things built in.

Speaker 1:

Right. Yeah. I mean, you've now got such power in your pocket. You've got, obviously, all of the hardware support you would need to do any of this stuff, and it's been able to ride the consumer economics, presumably.

Speaker 2:

Yeah. And more recently, Apple is pioneering a new approach to screen reading, which is basically using optical character recognition and other forms of machine learning to look at the pixels on the screen and try to figure out from there what's going on. In iOS 14, they introduced something called Screen Recognition, which does a pretty decent job, at least in some cases, of taking a completely inaccessible application and making it at least kind of accessible through machine learning.
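
Apple hasn't published how Screen Recognition works internally, but the pixels-first approach can be sketched roughly like this; every function and type below is a hypothetical stand-in.

```rust
// Hypothetical sketch of pixels-first screen reading: screenshot the
// display, run OCR/ML over it, and synthesize accessibility nodes for
// whatever the models find. No cooperation from the app is needed.

struct Element {
    label: String,                 // text recovered by OCR
    kind: String,                  // "button", "slider", ... guessed by a classifier
    bounds: (u32, u32, u32, u32),  // on-screen rectangle
}

fn capture_screen() -> Vec<u8> {
    vec![] // stand-in for a real screenshot
}

fn detect_elements(_pixels: &[u8]) -> Vec<Element> {
    vec![] // stand-in for the OCR + classification models
}

fn main() {
    // The only input is rendered pixels.
    let pixels = capture_screen();
    for el in detect_elements(&pixels) {
        // Each detected element becomes a synthetic accessibility node
        // that the screen reader can focus, read, and tap on.
        println!("{} \"{}\" at {:?}", el.kind, el.label, el.bounds);
    }
}
```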

Speaker 4:

That's very impressive.

Speaker 1:

Yeah. I mean, it's...

Speaker 2:

And sometimes, when I see what seems to be the constant uphill battle with the long tail of GUI toolkits, I wonder if things like Apple's Screen Recognition are really gonna be the future of the screen reader on all platforms.

Speaker 1:

Yeah. It does feel like machine learning opens up some possibilities that we obviously didn't have before, where we don't necessarily need programmer compliance or a particular toolkit; we can just look at the pixels that are rendered. So that's interesting. Well, it's a relief to hear. I don't wanna be overly optimistic, but it does sound like things have, broadly speaking, improved with available computation and so on.

Speaker 2:

Yeah.

Speaker 2:

Machine learning, in particular, is a really promising new direction. I haven't yet seen anybody other than Apple doing that. At least the team that I was on at Microsoft wasn't working on that while I was there. But I'm sure they will at some point. Well, I hope they will at some point.

Speaker 5:

At what point did it sort of cross over from the main limitation being just the capabilities of the hardware, to being more of the social problem of how you convince people to do this? Right? Because, like, presumably in the eighties, it's mostly a hardware limitation. And in the early 2000s, it sounds like it's just GNOME and KDE not liking each other.

Speaker 2:

Yeah. I think the early 2000s was when, well, like I said, on the Windows side, the last reason to use specialized hardware for speech synthesis went away in 2000, when Windows itself added the ability to mix multiple audio streams. Linux, of course, lagged behind, but that wasn't a hardware problem. So, yeah, by the mid-2000s, I would say just about everybody was using software on a standard PC, without having to have a special speech synthesizer card or device.

Speaker 1:

By the way, this has just been super educational. I mean, there are so many things I feel that I've learned about, and not just edbrowse, although I'm definitely gonna go check that out. It's amazing. Yeah.

Speaker 1:

It is amazing. So I guess a couple of things. One: what are some of the open problems in accessibility? And maybe, in the same stroke, you were talking about the Hacker News comments where you have to remind people about the need for accessibility. If you could distill that reminder, what should software engineers know about accessibility?

Speaker 2:

Well, first of all, if you have a choice, use an existing, mature UI framework. Now, I know that there are reasons why people don't use the stock UI widgets that are provided by the OS. And kind of ironically, one place where this happens a lot, where custom GUI toolkits are common, is in digital audio workstations. And I say that's ironic because the blind musician is such a stereotype. But obviously, I'm not gonna convince the whole world of programmers to stop trying to develop lean-and-mean alternatives to Electron.

Speaker 2:

So I have actually started an open source project that I call AccessKit, which is an attempt to come up with a cross-platform abstraction over the accessibility APIs that I mentioned earlier. By the way, Microsoft Active Accessibility is dead; the current Windows accessibility API is called UI Automation. I mentioned AT-SPI on the free Unix side, and, of course, Mac and iOS and Android also have their own accessibility APIs. And if, heaven help us, you are bringing your application to the web platform using something like WebGL or canvas, porting your code to the web through Wasm, then what you currently have to do over there is construct a parallel HTML DOM to expose information to the screen reader.

Speaker 2:

So I have started working on this AccessKit project to try to come up with an abstraction over all of these APIs that cross-platform GUI toolkits can use. And I'm doing it in Rust.
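
As a sketch of the shape such an abstraction might take (this is illustrative, not AccessKit's actual API), the toolkit describes its UI as plain data, and per-platform adapters translate that into UI Automation, AT-SPI, NSAccessibility, and so on:

```rust
// Hypothetical sketch of a cross-platform accessibility abstraction:
// the toolkit builds one tree of plain data, and each platform adapter
// maps it onto the native accessibility API.

#[derive(Debug)]
enum Role { Window, Button }

struct Node {
    role: Role,
    name: String,
    children: Vec<Node>,
}

// One adapter per platform accessibility API.
trait PlatformAdapter {
    fn update_tree(&mut self, root: &Node);
}

struct LoggingAdapter; // stand-in for a UIA or AT-SPI backend

impl PlatformAdapter for LoggingAdapter {
    fn update_tree(&mut self, root: &Node) {
        println!(
            "pushing {} node(s) under {:?} \"{}\"",
            root.children.len(),
            root.role,
            root.name
        );
    }
}

fn main() {
    let ui = Node {
        role: Role::Window,
        name: "Demo".into(),
        children: vec![Node {
            role: Role::Button,
            name: "OK".into(),
            children: vec![],
        }],
    };
    // The toolkit only ever talks to the abstraction, never to the
    // platform APIs directly.
    let mut adapter = LoggingAdapter;
    adapter.update_tree(&ui);
}
```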

Speaker 1:

Nice. Nice. We need a crab emoji. You know, I wanna... where do I find it?

Speaker 2:

So, GitHub, yeah: accesskit/accesskit on GitHub. It's still pretty early in development, but I am deeply indebted to Chromium for the design, because Chromium, of course, has to implement all of the platform-native accessibility APIs, except probably for the iOS one, because you can't run the actual Chromium engine on iOS. And I really like the abstraction that they have come up with over those APIs. But, of course, that abstraction is deeply embedded in the massive Chromium C++ code base, so it's not particularly reusable.

Speaker 2:

So

Speaker 1:

So you're taking that abstraction, or you're inspired...

Speaker 2:

by that abstraction? Yep. Uh-huh. Yeah.

Speaker 1:

That sounds great. Well, cool. We'll definitely link that. And this has just been great. Again, thank you so much for taking us down memory lane.

Speaker 1:

I dare say it was certainly educational for me; I think it was educational for a lot of folks. And the samples were terrific, and really...

Speaker 2:

Oh, since DECtalk came up last week, I should probably play a sample of DECtalk for those who haven't heard it.

Speaker 1:

You see, this is where I feel that Twitter Spaces needs, like, a lighter emoji. Like, what is the lighter equivalent? You know what I mean? This is like what you do at a concert.

Speaker 1:

You know, you kinda, like, hold up your lighter when they're playing, you know, Freebird or whatever.

Speaker 2:

Okay. So here's DECtalk again, with an introduction from Dennis Klatt, who, in this case, is introducing his own speech synthesizer.

Speaker 3:

35: Several of the DECtalk voices. I am Perfect Paul, the standard male voice.

Speaker 6:

I am Beautiful Betty, the standard female voice. Some people think I sound a bit like a man.

Speaker 3:

I am Huge Harry, a very large person with a deep voice. I can serve as an authority figure.

Speaker 6:

My name is Kit the Kid, and I am about ten years old. I can sound like a boy or a girl. I am Whispering Wendy, and I have a very breathy voice quality.

Speaker 3:

Alright. Now that was sweet, that Whispering Wendy one.

Speaker 2:

And that came

Speaker 1:

out in that

Speaker 2:

That came out in 1984.

Speaker 1:

Holy cow. But that last one is definitely super creepy. But that's... Yeah.

Speaker 2:

Well, I mean, of course, he was showing off the range of vocal parameters that he could tweak.

Speaker 1:

Yeah. I mean, he developed a three-pack-a-day-smoker voice. It's amazing. I mean, that is ridiculous for the time. Holy mackerel.

Speaker 1:

Yeah.

Speaker 2:

I mean, I was astonished when I learned that DECtalk was already at that point back then.

Speaker 1:

And was DECtalk aimed at accessibility, or was it aimed more broadly at speech synthesis?

Speaker 2:

I think the original DECtalk commercial product was primarily aimed at telephony applications. And then in the early nineties, you had a couple of DECtalk products that were aimed more at accessibility. You had the DECtalk PC, which was an ISA card that you could put in a PC, and then you had the DECtalk Express, a box that you could hook up to your serial port. Now, there were applications of that in the disability field beyond just blind people, because people who couldn't speak, for whatever reason, could combine something like DECtalk with the right software that would let them choose what they wanted to say, using whatever means they did have available to them.

Speaker 2:

And, well, Stephen Hawking, of course.

Speaker 1:

Stephen Hawking, of course. You say "of course," and it's like...

Speaker 5:

Like how people really like to customize a lot of the things about their desktop environments and window managers and color schemes, is there a similar thing for speech synthesis?

Speaker 2:

Well, the choice of which speech synthesizer you use, especially now that it's all software, is a very personal preference. You might be surprised to learn that the newer generation of more natural-sounding speech synthesizers are not universally chosen by blind people. There are a lot of blind people who still use speech synthesizers from the sort of DECtalk generation of more robotic-sounding synthesizers, because they continue to be very intelligible when you crank up the speech rate. And this is another thing: probably one of the most important settings for a blind person to be able to change about their speech synthesizer is how fast it talks. Because once you get at all proficient at listening to these things, you'll want to crank it up to at least something moderately faster than its default rate, so you can be more efficient.
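
For a sense of the numbers, here's a minimal sketch of what a rate multiplier does to words per minute. The ~180 WPM default and the clamping bounds are assumptions for illustration, not any particular synthesizer's values.

```rust
// Illustrative only: mapping a user rate multiplier to words per
// minute. The default and bounds are assumptions, not values from any
// real synthesizer.

const DEFAULT_WPM: f32 = 180.0; // assumed conversational default

fn rate_to_wpm(multiplier: f32) -> f32 {
    (DEFAULT_WPM * multiplier).clamp(80.0, 600.0)
}

fn main() {
    for m in [1.0, 1.5, 2.0, 3.0] {
        println!("{m:.1}x -> {:.0} WPM", rate_to_wpm(m));
    }
}
```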

Speaker 2:

Now, there's another audio clip that I would like to play, if we have about a minute. And this is not so much a demonstration of technology as kind of a glimpse of what you might call blind culture. So in 2000, there was a patch that came out for JAWS that had the unfortunate side effect, on some people's machines, of blowing away all of their JAWS configuration files and custom scripts. Oh, by the way, JAWS had a scripting language. And so, you know, some people these days, well, sighted people these days, make memes.

Speaker 2:

But there was this one blind teenager back then who, to vent about the effects of this JAWS patch, loaded up his audio editor and did a send-up of the talking JAWS installer of that time. Let me just play this. Oh, and by the way, the other reason I'm gonna play this is because it has a bit of singing DECtalk at the end.

Speaker 1:

Oh, man.

Speaker 2:

So, so this

Speaker 1:

Is it Whispering Wendy singing? I don't know.

Speaker 2:

No. No. No.

Speaker 1:

Yeah. I don't think

Speaker 2:

No, it's the default guy voice singing at the end. But no, I'm not aware of anybody who used Whispering Wendy as their default voice.

Speaker 2:

I think that was just a show-off voice for the creator of the synthesizer. Okay. So this starts out with the opening music and sound effects of the JAWS installer, and I'll let you know when the satirical part of it begins. Sorry, this part's a little long.

Speaker 3:

Welcome to the JAWS for Windows patch setup program.

Speaker 2:

"Please relax..." Okay, this was an actual human voiceover from the original installer.

Speaker 3:

...take a couple of minutes. Dr. JAWS, please wait. Now, Dr. JAWS...

Speaker 2:

This synthesizer that you're hearing, a lot of us still use it.

Speaker 3:

For JFW... Dr. JAWS downstairs with JFW. Too bad. Dr. JAWS will not fix them. Dr. JAWS: this program has performed an illegal operation and will be shut down. If the problem persists, contact the program vendor.

Speaker 3:

Close button. Thank you for installing the JFW update patch. No. You're screwed. You are a loser.

Speaker 1:

Yeah. That is hilarious. That is hilarious, and also, like, you've got a dissatisfied user. That is a level-seven dragging right there.

Speaker 1:

I mean, that's the...

Speaker 3:

case, Fred.

Speaker 1:

Actually, you know, one thing, because you called me on this this week and I really appreciated it: one thing I definitely now appreciate is putting alt text on images. Twitter's actually made it super easy to put alt text on things. And you sent me down that whole path of understanding alt text on images; it only assists those who need it, and it doesn't get in anyone's way. So I actually wanna find a way for Twitter to give me a reminder to put alt text on images, because I haven't done that before, and we'll start doing that from now on.

Speaker 2:

Yeah. Now, I also noticed when you started live-tweeting from the Almost Perfect book the other day.

Speaker 1:

Oh, shit. I didn't do it. You're right.

Speaker 2:

Well, fortunately, the NVDA screen reader has a built-in command for using the Windows 10 OCR module to do OCR on a graphic. And it was able to do a pretty good job with most of those images. Although there was one, I think it was the "customer is always right" versus "customer always gripes" passage, where the picture was not good enough for it to really be able to OCR it well. But, yeah, were you taking pictures of an actual book with that?

Speaker 3:

Or...

Speaker 1:

I was taking pictures of an actual book. You know, I came into this promising myself that I wouldn't talk about the WordPerfect book. But now that you brought it up: yes, I was taking pictures of the book. I was fooling myself; the pictures were terrible.

Speaker 2:

Ah, okay.

Speaker 1:

But no, it's a good point. You know, it's funny, because I almost want a way for Twitter to kinda stop me when I'm tweeting an image, because I just need a reminder that it's easy to enter the alt text, and I need to go do it. Although the UI is not that easy.


Speaker 2:

Transcribing those whole passages from the book would be more onerous, I guess.

Speaker 1:

I mean, onerous only because of the length of the passages; they would have been easy to go do. So, yeah, I just need a reminder there. And, again, I really, really appreciate it. Honestly, this has been so great.

Speaker 1:

Thank you for the show and tell too. That has been absolutely awesome.

Speaker 2:

And then when I got it, I was like, oh, this is too long, I won't be able to play it. But then I decided I would at least ask.

Speaker 1:

It's great. I'm glad you did. As you say, it was a great insight into how load-bearing the software is. And, I mean, clearly you've got someone who's pretty upset, because it's like, hey, by the way, I really depend on this thing in my life.

Speaker 1:

And, you know, it's like: I know it's just a bug to you, but it's a really big deal to me. So it's a good window into the importance of the software. Yeah. Well, Matt, thank you again. Really appreciate it.

Speaker 1:

This has been a lot of fun. Very educational. The show notes on this one are gonna have a lot of interesting links, so looking forward to putting that one together.

Speaker 2:

Yep. I can also send you, if you like, the audio files that I played, so you can have clean versions of them.

Speaker 1:

That would be great. I think we'll link to them. Right, Adam?

Speaker 4:

Yeah. Absolutely. That'd be perfect.

Speaker 1:

Awesome. Alright. Well, thank you, Matt, especially. And thank you, everyone. This was a lot of fun.

Speaker 1:

Very educational. I'm, again, looking forward to getting the notes out there.

Speaker 2:

Yeah. It was fun for me to go back down memory lane like that. So...

Speaker 1:

Awesome. Alright. Take care, everybody. See you next week.

Speaker 6:

Alright.

Speaker 1:

Thanks, man. Bye.
