Computing Education Things

About this episode

Danny Yellin worked for IBM for 35 years and managed many large software R&D teams, including IBM Research, Mobile offerings and IBM Cloud. He has a PhD in Computer Science from Columbia University and is currently a Faculty Lecturer at Reichman University in Israel. We talked about his proposed curriculum on LLMs for software engineering, how colleges should approach AI in their curricula, the different views on computing education for the AI era and the use of AI in teaching CS.

Where to find Danny

LinkedIn: https://www.linkedin.com/in/danny-yellin-9878154/

Google Scholar: https://scholar.google.com/citations?user=qfEhHz8AAAAJ&hl=es

References:

Software Engineering and Large Language Models: What university students need to know: https://www.linkedin.com/pulse/software-engineering-large-language-models-what-students-danny-yellin-k0rne/?trackingId=L6%2F1I7QHqhO8TMbzQyZ8xA%3D%3D

Syllabus for "LLMs for Software Engineering": https://www.linkedin.com/feed/update/urn:li:activity:7401911808388923392/?originTrackingId=0yc3qIrmmwx%2FGuhayEFqWg%3D%3D

AI Tools for Software Development (Carnegie Mellon): https://ai-developer-tools.github.io

College of Computing and Artificial Intelligence (UW-Madison): https://cai.wisc.edu/

Learn AI-Assisted Python Programming: https://www.manning.com/books/learn-ai-assisted-python-programming-second-edition

Beyond Vibe Coding with Addy Osmani: https://youtu.be/dHIppEqwi0g?si=a9NvIRRcDLFFG_tY

The Minimum Every Developer Must Know About AI Models (No Excuses!): https://blog.kilo.ai/p/minimum-every-developer-must-know-about-ai-models

About the podcast

You can watch the full episode on YouTube. Or listen to it on Spotify, Apple Podcasts, or your podcast app of choice. Thanks for listening!

If you are interested in these topics, I have a weekly newsletter that you may want to subscribe to: https://computingeducationthings.substack.com/

Creators and Guests

Host
Daniel Prol
Ph.D. Student in the Department of Computer Science at the University of Houston interested in computing education
Guest
Danny Yellin
Faculty Lecturer at Reichman University

What is Computing Education Things?

A podcast about the world of computer science education. If you enjoy teaching people how to become better computer scientists and discovering new ways to learn CS, this podcast is for you!

Speaker 2 (00:00)
Welcome to another episode of Computing Education Things. We are back after the first three episodes from last summer. I know six months is a lot, but yes, no one said starting a PhD would be easy. I'm in a new country, lots of changes going on, but I'm back. And today I'm very excited to have Danny Yellin on the show. Danny has a PhD in CS from Columbia,

worked for 35 years at IBM. That's enough for another episode, I think. And he's currently a faculty lecturer at Reichman University. Thank you for your time, Danny.

Speaker 1 (00:43)
Pleasure, pleasure to be here.

Speaker 2 (00:45)
Yeah, thank you. Thank you so much for being here. You know, this feels like a great moment, Danny, to talk with you, especially since you'll be teaching this course, LLMs for Software Engineering, for the third time next semester, right?

Speaker 1 (01:01)
That's right. That's right. Yeah, I started three years ago. I've been teaching a course in cloud computing for a while. And I went to the dean and said, you know, I would like to teach a course on using LLMs for software engineering. And he said, that's a wonderful idea. And I don't think there are still very many courses in this area, at least not as focused on this topic. There are a few popping up now. But when I started, it was

really one of the only ones, and it's been a real adventure over the years.

Speaker 2 (01:34)
Yeah, so I don't know if there are so many courses about this, but I recently read that, for example, the University of Wisconsin here in the US is launching a new College of Computing and Artificial Intelligence, which will make it one of the first US universities with AI in a college's name. Yeah, I don't know if you agree, but I believe AI is no longer just a discipline of computer science or statistics.

It has become a broader, more complex discipline that draws on many intellectual branches, I think. So before we discuss the syllabus, which I am excited to talk about, I think it's important to talk about how the AI boom is impacting colleges, right? There will probably be many more CS jobs in the near future.

Danny, my first question is, what are your thoughts on how colleges should embrace AI and how important it should be in their curriculum?

Speaker 1 (02:44)
Yeah, you know, usually universities, believe it or not, are somewhat conservative, and they're slow to change curricula and adopt new ideas. But I think with AI, there are a lot of universities, especially certain ones, that are very forward thinking. I don't know if you've seen, I think it's Dr. Ángel Cabrera, the president of Georgia Tech. He's been one of the leaders. He has some videos out there of what he's doing. I know at my own college,

there's Dr. Boaz Ganor, who believes that this can completely change the college. And so I think you're right. It's not just computer science; it's going to affect every single discipline. You know the change is coming. And I think it requires a lot of thought: how is it going to change? How is it going to change how students learn? How is it going to change the curriculum of individual branches, whether it be health care, legal?

Everything's going to be affected, right? And then furthermore, you know, what types of skills do we need to teach students today? There's a new set of skills that I think are going to be needed. And then finally, how do we prepare? The world is changing. There are going to be new sorts of economic models, new sorts of organizations, right? How do universities start preparing for that? How do they experiment with that? So I think

there are a lot of ideas, very interesting ideas out there, but we'll have to see how it unfolds. But yes, it will definitely change, significantly I think, every discipline.

Speaker 2 (04:23)
Yeah, yeah. And I think your syllabus is very timely. You know, this is such a fast-changing field. But I think it's also important to identify the core concepts and the fundamentals that remain essential, right? As you said in one of your posts on LinkedIn, because there are a lot of advances in AI. But I think we can now go through your syllabus, because

we can't go through each of the 15 chapters in the syllabus, but I'll leave it in the description, so feel free to read everything there. But I'd like to take some topics that have been discussed in the computing education community over the past few months, based on your syllabus, and connect them.

Speaker 1 (05:14)
Before we discuss the individual topics, great, yes, I had one point here. And that is, you know, this is a very fast-changing field, obviously. Even in the middle of the semester, I'm updating things, right? There's always new work coming out, both in academic studies and in industry, and we all know about new releases of models. However, one thing that has, maybe, a little bit surprised me is

that things are going to change and they're going to continue to change, but some principles, I feel, are not changing. As time goes on, you see that they become stable. Now, what do I mean by stable? Take something like prompt engineering and specification. That's a very key principle. Yes, it's going to change as the models change, and we have to adapt it to some degree, but as something to be taught, and as a principle of how you deal

with AI and LLMs, that's not going to change. And there are many others. So what I'm trying to do in the syllabus, and it's not completely there because things are still evolving, is to find those things that, hopefully, four years from now, even as the technology really changes, are still going to be very, very important.

Speaker 2 (06:33)
Yeah, you mentioned prompt engineering, right? For me, prompt engineering is just an expression of computational thinking, right? And using LLMs develops this skill. That said, today it's prompt engineering, but in three years' time, who knows, right? Maybe it will change. But yeah, maybe we can start.

Speaker 1 (06:57)
Let

me just say one thing. I agree with you. This idea of computational thinking is very similar, but often computational thinking has a little bit more of the how-you-do-something in it. Whereas I think one of the things with LLMs is really, I would call it more, specification engineering, right? I like the word engineering. It's how do you specify, right? At what level do you specify, and what details need to go in there? And there have been a lot of studies in the literature, right,

on, you know, if you don't specify things exactly, whatever you put in is what you're going to get out, obviously. But there's also an issue sometimes of over-specification with LLMs. So it's a very interesting field, understanding what the essentials are of how we interact. It's not just, I think, computational thinking. It's also how do we interact with these LLMs. It's different,

you know, it's almost legalistic in a certain way, in terms of how you have to specify things very exactly.
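Danny's point about specification engineering can be made concrete with a small sketch. This is an illustrative example, not material from the course, and every function and field name here is made up. It contrasts an underspecified request with a prompt that pins down inputs, outputs, and constraints:

```python
# Toy illustration of "specification engineering": the same request as an
# underspecified prompt versus one that pins down inputs, outputs, and
# constraints. All names below are hypothetical.

def build_spec_prompt(task, inputs, outputs, constraints):
    """Assemble a structured, specification-style prompt."""
    lines = [
        f"Task: {task}",
        f"Inputs: {inputs}",
        f"Outputs: {outputs}",
        "Constraints:",
    ]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)

# Underspecified: the model has to invent the missing details.
vague = "Write a function that sorts stuff."

# Specified: inputs, outputs, and edge-case behavior are all pinned down.
precise = build_spec_prompt(
    task="Write a Python function sort_records(records).",
    inputs="A list of dicts with keys 'name' (str) and 'age' (int).",
    outputs="A new list sorted by 'age' ascending, ties broken by 'name'.",
    constraints=["Do not mutate the input list.",
                 "Raise ValueError if a key is missing."],
)
print(precise)
```

The precise version leaves the model far less room to guess, which is the point Danny makes about under- versus over-specification.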

Speaker 2 (08:02)
Yeah, good points. So yeah, first of all, for context, Danny, can you tell us if you teach this course to advanced students, juniors and seniors in the US system, or is it more for freshmen, beginning students?

Speaker 1 (08:18)
Yeah, so I'll tell you, when I first did it, it was for advanced students and even master's students. This year, the university decided they want to make it part of the core curriculum, so it's a required course. So now it will also usually be third-year students, which here is the final year. So they are advanced students, and they do have a background in computing, right? But I've toyed with the idea, and actually

I've spoken about how the university tries to spread this and teach even students of humanities and other fields how to deal with LLMs, developing a curriculum more broadly, not only for introductory students in CS but for other disciplines as well. But the course that I'm teaching right now is actually for CS students, and really more toward the senior CS students.

Speaker 2 (09:15)
Got it. Got it. So yeah, let's start talking about these topics in the syllabus. The first topics focus on the basics, right? And the fundamentals. And it makes sense, as we just said, because once you acquire these principles, that's going to give you the knowledge and resources to understand each part necessary to achieve your goal in software engineering.

Maybe you can talk about why you focused on these particular topics in the first chapters.

Speaker 1 (09:53)
Yeah, so I'll tell you the first thing I do. I used to have an introductory unit where I actually went through more of, like, a short introduction to neural networks and LLMs and attention.

And it turns out that there's a full course on that offered at the university, really more about how do you build an LLM. And so I decided that was not as important for this course. This is more for people who want to use LLMs. However, they have to understand ideas like temperature. They have to understand the idea of context windows, that sometimes you're limited in how much information you have, and that the amount of variability

introduced through temperature can affect your output. The way that you, as a user, can set parameters in the LLM makes a big, big difference. So I want to get that across to them. That's one of the very first things. And the other very, very important thing, which comes up throughout the course, is non-determinism. And I start off very much with that because, you know, I think that there is a danger sometimes that people,

even software engineers, take whatever the LLM gives them and say, this is the truth. There's been a lot of publicity around that, even in the press, about lawyers having cited precedents that they got from an LLM, and it turned out they were completely fabricated. And we all know cases, right? And even as LLMs get better and better and better, by the very nature of how they work, there are going to be things that are incorrect. So the idea of non-determinism, really hammering that home and showing them examples of non-determinism.

And it even goes further, and it's an interesting discussion, I don't want to get too technical here, but if you set the temperature to zero, you would think it should be deterministic, but it turns out it's not, for various reasons. So trying to hammer those points into students and giving them examples of that sets the tone for the course, because part of the course is how to use LLMs, but also a big part is how to verify the results that you get. So I want to show those two sides of it: how do you specify, how do you use all the

tools around it, whatever. But at the end of the day, you have to be a little bit skeptical about the results. How do you increase the probability that you're going to get good results? And then how do you verify, or at least increase your confidence, that the results are correct?
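The temperature parameter Danny describes can be illustrated with a toy next-token distribution. This is a simplified sketch with made-up scores; real models apply temperature to thousands of logits, and, as Danny notes, even temperature zero is not fully deterministic in practice because of batching and floating-point effects in serving:

```python
import math

def sample_distribution(logits, temperature):
    # Temperature rescales logits before the softmax: low T sharpens the
    # distribution toward the top token, high T flattens it toward uniform.
    if temperature == 0:
        # Greedy limit: all probability mass on the argmax token.
        probs = [0.0] * len(logits)
        probs[max(range(len(logits)), key=logits.__getitem__)] = 1.0
        return probs
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up next-token scores
for t in (0, 0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in sample_distribution(logits, t)])
```

At temperature zero, the output collapses onto a single token; as temperature rises, the probabilities spread out, which is where the variability in responses comes from.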

Speaker 2 (12:20)
Yeah, very interesting. I think, especially for CS students who are going to build these AI tools in the future, they need to know exactly how to build with LLMs, not just how to use them, because when you are just using LLMs, you are not focusing on changing the temperature or the context window, right? You are just using the GUI. But when you're building, when you are interacting with

different LLMs through code, you have to set all this up. You have to adjust the tokens in the context window. You have to adjust the temperature. You have to iterate to see if the model is giving you the right output or not, et cetera. Very interesting. And let's move to the future, because in one of the

first chapters, you also mention the future of software engineering, right? If anyone is interested, I'll leave a link to a video of Addy Osmani from Google on the Pragmatic Engineer show, which I think gives a clear idea of where the software engineer role is heading. But, you know, in this section, what's your approach? Do you try to give students

a real sense of what the market is like, how developers work now in real settings, while also not forgetting how it was before, right? Bringing in, for example, your experience at IBM too.

Speaker 1 (14:02)
Yeah, well, I always interject anecdotes from all my experience at IBM. I can't help it. But let me go back a little bit. A couple of years ago, actually, maybe about the time I started this course, I wrote a kind of op-ed piece in Communications of the ACM, which I called "The Premature Obituary of Programming."

And it was a response to many others who said that within a year or two, you're not going to have programmers anymore; we're just going to have AI that's going to create programs for you. Looking back at what I wrote, I would probably revise it. I was probably a little bit too cautious. I think AI has outstripped what I thought it could do. On the other hand, I still hold by the fact that, at least, I don't have a crystal ball

to look into the future and tell you what it's going to be in 15 years. I can't tell you that. But in the nearer future, like five to seven years, what I want to tell the students is: yes, AI is going to tremendously change the way we do software engineering, but we're still going to have software engineers. And we're going to need software engineers who are proficient in using these tools

and the new paradigms that are going to come about because we have AI. So it's going to be more of a blend, right? Engineers are going to have to embrace these tools, really understand them, and utilize them. But it doesn't mean we're not going to have engineers in the loop.

Speaker 2 (15:46)
Yes, yes, absolutely. I like this concept of computational judgment. You know, I see that toward the end, in Chapter 13 if I'm not wrong, you also dedicate a chapter to verifying LLM responses. So I think, at the motivational level at least, there could also be a contrary effect, where students become demotivated due to a lack of the skills needed to understand these AI-generated responses.

So I think it's very important, what you mentioned now, that humans feel they are part of the process, this human-in-the-loop or human-in-the-middle idea, right? And in their textbook, Porter and Zingaro mention a pedagogical concept that I think is very interesting to talk about now. It's about read, critique, and improve, right? Where students are given code and

an LLM explanation of that code. But there's a call to action to the student, saying: it turns out that the explanation from the LLM is incomplete, as it doesn't describe the code for all the inputs. So determine what explanation is missing and add your answer to the explanation. So I think this type of activity, I don't know if you are also

trying to do these kinds of activities, right? Because they are coherent and consistent with the concept of scaffolding, right? Because, you know, we are not just generating AI-generated code, but also practicing our software engineering skills. So, about this verifying LLM responses chapter,

what's your take here? Can you share some examples of how you also develop this agency for the human in this review process?

Speaker 1 (17:51)
So first of all, there's different techniques, many different techniques that have been developed. So one I like talking about, for instance, is majority voting, which is a very interesting technique, right? Where you might talk to multiple LLMs, get multiple responses, maybe the same LLM multiple times, maybe with different prompts.

And then you see the majority. What do the majority agree on? Do they agree on the same thing? And you go with the majority. There's actually a very interesting paper that I don't think has been published yet, but was on arXiv, where, and I apologize, I don't remember the authors' names offhand, they talk about how you have to be careful with majority voting, because it turns out, according to their experiments, often

LLMs will make the same mistake when LLMs make mistakes. I think they said 40% of the time. So if you have a correct answer and an incorrect answer, and you look at the LLMs that gave incorrect answers, 40% of the time they're making the same incorrect answer. Which doesn't completely say that majority voting isn't good. It just means, once again, majority voting,

let me make it simple, right? It says: I'm going to vote. I'm going to ask the LLM, like we might ask 10 friends for their opinion. And you say, well, if the majority agree that I should do it this way, I'll do it that way. The same sort of idea with LLMs, right? So that's one technique which is very popular. There's another technique called self-consistency, which goes even further. It says: I don't just look at what my friends all say, you know, should I, I don't know, marry this girl or not marry her, right?

I also look at whether there are reasons, right? If they all tell me to do it, but they're doing it for different reasons, it gives me more confidence, because they're all coming at it from a different perspective. That's self-consistency. But, you know, there are many other things. There's LLM-as-a-judge: we ask another LLM to critique. But what happens, right? It's kind of an infinite regression. You have an LLM give you a response, then you ask another LLM to tell you:

is that response good? Right. And then how do you know that LLM is right? You could go on forever, but it increases your confidence, right? That's the idea; some of this verification is not absolute, but it increases your confidence. There are other things, like frameworks such as Guardrails AI, where you can actually put in rules and say: I'm expecting you to give me a response that maybe contains certain words, or is in JSON format, or follows a correct schema, or is in a correct

format, whatever, and it will actually check for you. So it might not check the exact answer, right? But it can make sure that the format is right. And there are other things, right? You want to check that vile language isn't used, right? So you can have a guardrail that will check the response to make sure that it's socially appropriate content. So there are different checks that will make sure the response doesn't violate your rules.

And then there's also the option of going to a trustworthy source for information. So instead of just relying on the LLM, you can ask it for references, right? Now, it happens, and has happened before, that the LLM will give you references and they're wrong, but you can automatically check those references and see that they correspond to what the LLM is telling you. So there are a lot of different techniques, really a tool bag of techniques, depending on your problem,

to help you verify that it's correct. And of course, there's always what's called human-in-the-loop, where you have a human who looks at it and verifies it. But the point I'm trying to make is that in academia and in industry, they're coming up with many different techniques to give you more assurance: when you get an answer, don't just take that answer, but apply certain techniques to help verify it and make sure that it's really correct. And there's a

whole toolkit of such techniques.
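The majority-voting idea Danny walks through can be sketched in a few lines. The responses here are hypothetical stand-ins for real LLM samples, and, as the arXiv result he cites warns, a strong majority raises confidence but does not guarantee correctness when models make correlated mistakes:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among several LLM samples, plus the
    agreement ratio. The ratio is a confidence signal, not a guarantee:
    correlated errors can still produce a confident wrong majority."""
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)

# Hypothetical responses from sampling the same prompt five times:
samples = ["42", "42", "41", "42", "17"]
answer, agreement = majority_vote(samples)
print(answer, agreement)  # -> 42 0.6
```

Self-consistency goes one step further than this sketch: instead of comparing only final answers, it also compares the reasoning paths that led to them.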

Speaker 2 (21:59)
Yeah, I think in academia we have different views on CS education for this new moment we are living in. Of course, there are people who have this, you know, extreme view, that programming, as you mentioned before, that programmers are obsolete and the future is vibe coding. But I don't think that is the general view of the computing education community. I think one approach

I'm seeing is this kind of cautious view that would basically ban AI, because students must learn to do things without AI before they use it, specifically in CS1 and CS2. And then I'd say there is the view of embracing AI tools. They're essential now, and of course

they help students learn to use them well, because it's a golden opportunity to teach AI skills, like you are doing, right? All of these opinions are legitimate, and there are a bunch of papers around that, about how it helps or how it hinders learning. The dilemma here is that we want students to learn skills, right,

like reading, like debugging, like understanding code, to have more self-regulation skills and other metacognitive abilities. But we also want them to develop the necessary skills to prepare them for the job market, right? You spent 35 years at IBM, right? So how can we balance these two goals?

Speaker 1 (23:53)
Well, I'm not sure they're in conflict with one another, necessarily. I think you've got to have the fundamentals, in my opinion. It's like, I used to always hate it when people would say, well, how many languages do you know? You don't have to know every language, right? But, you know, if you understand how to program and you understand the concepts, right, when the

need comes, you can program in another language. Same thing here: you need to have some basic understanding of whatever field you're working in, right? If you're doing networking, or storage, or some other domain, you have to have the fundamentals of how you program and how you do that. When specifying to an LLM, how will you know what to specify? How will you know how to verify?

I'll give you, let me, this is maybe too small an example, but just because it happened to me today. I wanted to create something, and this is using a very high-powered LLM from Anthropic. I gave it a specification to create this program, and I was very impressed, it was really good. And then I started looking at it. Well, I had told it,

it's going to get too technical, but I had told it, in a Mongo database, to use two collections, and it created not two collections but two databases.

Speaker 2 (25:17)
Wow.

Speaker 1 (25:19)
Yeah, and because I know the fundamentals, I just looked at it and said, wait, wait, that's not what I asked for. And I said to it, you know, I said two collections, and of course it says, oh, I'm so sorry, you're right, I'll redo it for you. Right. And there were other things there, other mistakes it made. But if you didn't understand the concepts and the overall picture, not that you have to understand every detail,

you wouldn't be able to catch these problems, and you would end up having the AI create for you something that's not what you want.

Speaker 2 (25:53)
Sometimes students are very focused on the output, right? And they forget about the process, because, as you just said, they can learn a lot from making mistakes, and it's okay to struggle with problems to reach your flow state, I would say. And to be honest, we should normalize this iterative process, which can be

incredibly frustrating and fun at the same time, I would say. But I think that's how learning works, right? Through persistence and approaching problems from different perspectives.

Speaker 1 (26:36)
Yes, absolutely. Absolutely.

Speaker 2 (26:40)
All right, all right. I think, yeah, just to wrap up, I think we can spend the last five

Speaker 1 (26:47)
Before you wrap up, there is one thing I think is important in teaching that I would like to say, which is that it's not only teaching how LLMs work, and not only teaching the different tools we have. I teach things about RAG, I teach different tools like LangChain, just as a tool. And of course I get into agents, as agents are very important and very big now.

It's also about the patterns. How do you build an app? Not just building using an LLM, but more and more, what people are going to do is build LLM-based applications. I mean, you have an application, you want to do something, right? The LLM is going to be one part of it, maybe a very important part, but it's not going to be the entire thing, right? So how do you structure those programs? How do you build them? I'm not sure if it's clear what I'm saying, but

the point is that you're building, I don't know, an agent, and the good example everyone uses is planning a travel trip. It's got to gather information: it has to get your preferences for what countries you want to visit, and maybe the temperature, what time of season, the attractions and things you're interested in. It has to be able to contact airlines. It has to do a lot of different things, and you have to be able to put it all

together into an entire system. So there are emerging patterns for how you do that, how you build these more agentic-like systems. And I think that's also one thing, because it's not going to be just working with an LLM. It's going to be really a broader view, and that's what computing has really been about, right? Computing is about using computers for many, many different domains. And it's integration, right?

It requires a lot of other computing skills as you start doing that.
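The travel-agent example can be sketched as a tiny orchestration step. Everything here is hypothetical stub code: the LLM call and the airline API are replaced with canned returns, and every function and field name is made up. Real agent frameworks add planning loops, memory, and error handling on top of this basic shape:

```python
# Minimal sketch of the agentic pattern described above: an orchestrator
# that routes steps to tools and integrates the results. All stubs below
# are hypothetical stand-ins for real services.

def get_preferences():
    # Stub: in a real app this might come from a dialog with the user.
    return {"country": "Japan", "season": "spring", "interests": ["food"]}

def search_flights(prefs):
    # Stub for an airline API call.
    return [{"to": prefs["country"], "price": 850}]

def suggest_attractions(prefs):
    # Stub for an LLM call that drafts suggestions from gathered data.
    return [f"{prefs['country']} {i} tour" for i in prefs["interests"]]

def plan_trip():
    """Integrate the tools into one result, the step Danny emphasizes."""
    prefs = get_preferences()
    return {
        "flights": search_flights(prefs),
        "attractions": suggest_attractions(prefs),
    }

print(plan_trip())
```

The interesting engineering is in the integration function: the LLM is one tool among several, not the whole system.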

Speaker 2 (28:49)
Yeah, interesting, interesting, Danny. Thank you. And I think we can spend the last five minutes talking about your use of AI in teaching, right? A few months ago, we hypothesized about how AI could help CS educators with different teaching tasks. But right now there are already many tools on the market, and we've been able to test them in the classroom over the last few months. For example, I'm thinking now about generating exam

questions, which could reduce academic dishonesty since it's less biased. So what clear use cases do you see?

Speaker 1 (29:30)
It's funny you mention that; I just gave a quiz for one of my classes on AI. And I found that generating the questions was not very successful, partly because, while I could tell it the topics, it doesn't know exactly what I taught. So that's problem number one. And I don't want to test the students on things that I didn't really go over exactly, or didn't spend enough time on.

But besides that, I found it was too generic. When I teach, I ask about certain things that I feel passionate about, that you need to know; they're really important principles, and that comes through in the questions that I give. But when I asked it to critique my questions, that was a different thing. Then I thought it was good. It caught language issues, found some inconsistencies. It was very good at that. But that's just kind of an aside.

As for me in the classroom, when I teach AI, what I do is, well, I'm a very firm believer in marrying theory and practice. So I don't have a class that's just, you know, get out your computer and now we're going to teach you how to use AI to do something. I want them to understand the principles behind it, understand the concepts. So we go over that.

We go over the concepts, but then we say, okay, let's see it in practice. And then we start opening up the computer. Sometimes I do it live, quite honestly. Sometimes I do it before I get to class and just take screenshots, because I find I can be much more focused and to the point if I do that. But often in class itself, we open up and experiment and see:

are we getting the same results that I just mentioned in the theory? You know, interacting with the LLM.

That's the way I just try to combine the two.

Speaker 2 (31:39)
Yeah, it's interesting to hear both your experiences, right? The good side and the bad side of using AI in the classroom. So this was a great talk, Danny. I'm really grateful for your time. What's the best way to contact you? Please tell the audience how they can reach you.

Speaker 1 (31:57)
I think the best thing is probably LinkedIn. So they can find me on LinkedIn. I can give you my LinkedIn, or you have it, and we can post it. That's probably the best way. And I'm happy to, you know, I love discussions, and hearing what other people are doing. There's so much going on. I'll just close with this: because I work with a lot of students,

I have several students I'm advising on their master's theses, and all of them are working, and they're coming with ideas from work. One is working at a financial institution, had ideas, and just finished his thesis, and did a great job on it, on using LLMs to help create financial strategies. Another is working in cybersecurity, has a great idea, and is in the middle of it, using multimodal analysis of PDF documents

for cybersecurity. So it's amazing to see so many different areas now, and the creativity going around, and how we can leverage LLMs to advance the state of the art.

Speaker 2 (33:11)
Amazing. Thank you so much, Danny. Goodbye, everyone. See you in the next episode of Computing Education Things. Bye.

Speaker 1 (33:18)
Thanks, Dan. Bye bye.