Undercooled: A Materials Education Podcast

Steve and Tim talk about grading. YouTube version can be found here: https://youtu.be/_yIxWQSd_tA?si=ttA0hHUo3ip4gUIZ

Show Notes

This episode is sponsored by the University of Michigan Materials Science and Engineering department (https://mse.engin.umich.edu).

Creators & Guests

Host
Steve Yalisove
Host
Tim Chambers

What is Undercooled: A Materials Education Podcast?

A look into active learning, flipped teaching, team-based/project-based learning, and much more. Everything related to teaching materials science and engineering will be covered. Kindly sponsored by the University of Michigan Materials Science and Engineering Department.

[MUSIC]

Well, hello.

Welcome to another

episode of Undercooled.

Today, Tim and I are

gonna talk about grading.

Why?

Because we just graded our

class from the end of the term,

and it's foremost on our minds.

So why don't we start by talking about

what grading is, why it exists,

and maybe some of the

problems we've identified.

So we'll start with what's wrong with

traditional grading,

and let's start with curving.

So Tim, what do you think about curves?

I think they're very

statistically interesting,

and they reflect a few assumptions and

biases in our grading system.

First is this idea that there should even

be a Gaussian distribution

in our populations

that we're working with.

And like a lot of these assumptions about

grading, there's a nugget of truth to it.

If you looked at the entire nation,

there'd probably be something pretty

close to a normal

distribution of proficiency in

lots of different areas.

But at a university, we're performing

this biased selection

from a certain subset of

the population to try to get the kind of

people that we want to get.

So believing that there should be an

underlying normal distribution

is already quite a flawed model.

We tend to have a much more skewed

distribution with the

students that we have,

because we're selecting who we admit,

some of those selection criteria.

Actually, you know, that's

another episode entirely.

We can talk about admissions another day,

but we don't have a Gaussian

distribution to begin with.

So why would you try to enforce one?

I think curving is

pretty silly in that sense.

Yet we're always told that you should fit

your distribution to a bell curve,

which is a Gaussian

distribution, of course.

And that's the standard way that most

faculty just approach it.

And it's a shame because it's really

flawed thinking like

you just pointed out.

Yeah.

And there's another problem there, which

is to a lot of students,

this conveys a very competitive,

anti-collaborative mindset, right?

If they're going to

be stuck onto a curve,

then they have to out-compete their

classmates in order to get a grade,

which maybe they should or should not

actually care about,

but intrinsically a lot of them do.

And as soon as you throw away the curve,

you can get to a space

where everyone can succeed

and everyone can help each

other achieve at a higher level

to actually meet and hopefully exceed

your expectations for the course.

Yeah. That's so true.

Who ever said that only 10%

of the class should get A's?

What does that mean?

You know, if you take teaching

seriously, like I think we do,

you want everybody to

be able to get an A.

But you have to have the bar pretty high

so that everybody actually earns an A.

But our goal should be

to get everybody there.

Why have an arbitrary number that makes

absolutely no sense?

And another thing that makes absolutely

no sense is why we're

so focused on

grading with a zero to 100 scale,

as if we have the precision to say

anything about that.

So if you think about it, when we grade

on a zero to 100 scale,

we normally look at the standard grading,

which says you need a 93% to get an A, a

90% to get an A minus,

an 87% to get a B plus, et cetera.

Why do we have those numbers?

And when we do it like that, what happens

is the entire class gets compressed

between usually 75 and 100.

And then we look at all the scores,

especially in a large class,

and we look for natural breaks because

we're so omniscient

about what these things mean.

And we know that this

moves people to the right area.

So we sit there and we look at grades and

we look at the boundaries

when we do our final grades,

because we usually never publish these

because then the

students will scream at us.

And we look and see,

well, this person had an 89.95.

And okay, you go, well, that's pretty

close to a 90. Let's give them a 90.

Except you realize in your class of 150

students that you've got 10 students,

four of which have 89.95.

And then just below that, you have 89.89.

And you go, ah, that's a big jump.

But think about that.

That is 89.95 versus 89.99 even.

That is 0.04%.

So the fraction is 0.0004.

How in our wildest dreams do we feel that

that is significant?

If we were to put error bars on this and

think thoughtfully of

what those error bars mean,

probably we can only measure learning

through exams and

grades and stuff like that

to at best plus or minus 10%, which is a

lot bigger than 0.04%.
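
To put numbers on that comparison, here is a minimal Python sketch. It assumes the "plus or minus 10%" is read as an absolute uncertainty of 10 points on the 100-point scale, and the function name `distinguishable` is hypothetical, made up for illustration only.

```python
def distinguishable(score_a, score_b, uncertainty):
    """Return True only if the gap between two scores exceeds the measurement uncertainty."""
    return abs(score_a - score_b) > uncertainty


# Assumption: "plus or minus 10%" is read as 10 points on a 0-100 scale.
UNCERTAINTY_POINTS = 10.0

gap = abs(89.99 - 89.95)          # about 0.04 points
gap_as_fraction = gap / 100.0     # about 0.0004 of the full scale
ratio = UNCERTAINTY_POINTS / gap  # how many times larger the error bar is than the gap

print(f"gap: {gap:.2f} points, fraction of scale: {gap_as_fraction:.4f}")
print(f"error bar is roughly {ratio:.0f}x the gap")
print("distinguishable:", distinguishable(89.95, 89.99, UNCERTAINTY_POINTS))
```

Under that assumption the error bar is about 250 times larger than the gap between the two scores, so the comparison cannot separate them.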

Yeah, there's a great irony there that

especially in a lab context,

I'm always mentioning to students the

importance of significant figures

and thinking about the

precision of your measurement

and how the number of digits you use

conveys the intrinsic precision of the

tools that you're using.

And four significant figures on a grade?

That's borderline meaningless.

I could be convinced we have two sig

figs, as you said, to order of 10% error.

Yeah, that's actually pretty achievable.

And hard to do but possible.

But if we were going to grade ourselves

on that kind of

effort, we'd probably fail.

And yet this is done in just about every

class all over the world.

And it's absurd.

And it's not right.

And so unless we can find a way to

magically improve the precision of our

measurement of learning,

which maybe can be done, I mean, you can

have precise measurements that are good

down to 10 to the minus 7.

Delta E over E or

however you want to measure it.

Look at the LIGO project, measuring

gravitational waves.

That was the hero

project of improving precision.

Unbelievable what they did.

But that took billions of dollars and it

was in an area where you actually could

make those measurements and

could improve the precision.

I'd argue it's a lot harder to measure

learning than it is to measure

gravitational fields.

And so why are we even doing it?

So the simplest thing, of course, to do

is not grade on a zero to 100 scale,

but something like zero to 10 or even

zero, one and two or a five point scale

or something like that,

where the data is a more meaningful

reflection of the actual precision that we're trying to measure.

And so, but it still comes back to how do

we measure learning?

And that's a deep subject.

And I'd argue that is the hardest problem

in education is how do

you measure learning?

And I wish more people

studied it, but they really don't.

They just keep giving tests.

So, and tests are a problem.

Maybe you can talk about

the problem with exams.

Oh, goodness.

I, you know, I first have to come from

the research perspective, having done

education research for enough years,

that there is this gap between measuring

learning in a rigorous

research context versus

what you have to do to survive as a

teacher with hundreds of

students in one semester,

developing a tool to precisely and

reliably and validly measure student

learning in one small topic

area is a years long effort.

And that's not something that any of us

have the luxury of time to do when we're

just trying to get through our day

and make sure, hopefully try to make sure

that our students are learning something

rather than spending all of our time

developing measurement tools.

You know, these measurements can be done,

but I don't think it's practical to

expect most practicing teachers to get

into so much of the minutiae of the

psychometrics and the statistics and the

validation procedures to say,

"Yes, I'm actually

measuring learning with this test."

That's just not

something we can practically do.

So instead, you have to ask, "How should

we assess our students?

What can we do as regular practitioners

of this craft of teaching to say, 'I

think I have a gauge, you know, a certain

level of understanding of which of my

students have learned how much'?"

So the other problem is that almost all

exams are focused on having a student

demonstrate a certain level of knowledge

on the day and time of that exam.

And very little thought usually goes into

the idea of retention.

And this is famously borne out with Father

Guido Sarducci's little comedy skit, The

Five-Minute University.

If any of you out

there have never seen that,

just type into Google, "Five-Minute

University," or in YouTube, and watch it.

And he makes the point that, you know, in

five minutes, he can teach students what

the average college graduate remembers

five years after they leave college.

And sadly, it's kind of true.

And there's a large body of educational

research that bears this out, that by

giving students exams that are largely,

largely factual recall and retrieval,

has caused generations of

students, myself included,

to head to the library two days before the

exam to cram like crazy to remember all

the things that I might be tested on.

Then I go and do a data dump on the test.

I do really well, but they show that even

two hours later, the students couldn't

even do 50% retrieval

of what they just did.

And two months later, it's down to like,

you know, 20% or some crazy low number.

And so that's the whole point of Father

Guido Sarducci's little skit.

And why don't we think about retention?

Retention is what really matters.

Now, in some fields where it's

conceptually based, we

can think about concepts.

And if you can understand a concept, the

chances are you'll retain that

understanding much

longer than just memorization.

And so the physics community has come up

with concept tests, which is really an

excellent way to do that.

But it's not so easy to translate that to

other fields because most of the concepts

in materials science, for instance,

are really chemistry

concepts or physics concepts.

And we don't test those.

We try to test the ability to use those

concepts in the context of materials

science to be able to answer a question.

Like earlier, we were just playing around

with the new voice model for ChatGPT.

And Tim, you asked it about martensite.

What is it?

And ChatGPT was confused.

It just said it's an interesting phase,

but I don't really see it

on any of the phase diagrams.

So I'm not quite sure what it is.

And so ChatGPT just scrubbing the

internet didn't get that

it's a metastable phase.

And you don't see metastable phases on

equilibrium phase diagrams.

And that's the whole point.

So there's an underlying concept there.

And so that's why ChatGPT didn't get it

because it doesn't

really understand anything.

It just repeats things.

So giving exams is a dangerous way, in my

opinion, to assess learning.

And, you know, there is one type of exam

that's fantastic for actually

understanding whether

your students know something.

That's right.

But that's not a written exam, is it?

No.

Yeah.

I think you know

where I'm going with this.

Go ahead.

Oral exams can be so powerful.

Both on the student side

and on the teacher side.

You know, this ability to interact with

someone in real time to ask them

questions, to give them hints,

to see exactly where in the process of

working out a problem in

real time, where they get stuck,

where they have moments of insight.

That can be very powerful for revealing

learning in our students.

And it's the original

adaptive testing, right?

Yeah.

The student is crushing it.

You make the question harder.

You see how far they can get.

The student is struggling.

You back off and you give them support

and you see what you can

help them with collaboratively.

That's right.

We're trying to develop all this

technology to replicate that experience.

But either way, the oral exam, as good as

it is, is not very scalable.

If you've got a class of 100 students,

how are you going to have 100

half-hour-long conversations

with all of the

individual people in your class?

And that's the key point.

Some people say, "Oh, you only have to

give a 10-minute oral exam

and you can learn everything."

I'm not sure I agree with that.

And then they say, "Well, to grade

somebody's paper takes about 10 minutes,

so it should be scalable."

But I think you said the key thing.

It kind of takes more like a

half an hour for every student.

Because all of us who went and got PhDs,

most of us had a PhD

defense, which is an oral exam.

And that takes at least an hour,

sometimes two hours.

But that's a big body of work.

And I'll never forget, I was a math major

as an undergrad, and I

had a really interesting

professor who he was in the National

Academy of Sciences.

And he used to tell us

about thought process.

He brought up this thing,

the Epicurean model of thought.

And so he thought that when you were

trying to solve, prove a

theorem, very abstract, right?

And you would sit there and have this

experience of pounding your

head on the table and getting

very, very frustrated while you tried to

figure out the problem.

And often the best thing to do would be

to work for a few hours

and then go take a break,

do something else, go

to the bar, whatever.

And maybe in the middle of

sleeping that night, you'll

wake up with this aha moment.

And he viewed that as the, he called it

the Epicurean model of thought.

I have no idea if this is

accurate, but it's a good model.

While you're pounding your head on the

table, you are taking

relevant ideas off that pegboard in

your brain and throwing them into a bingo

pot where like the

kinetic theory of gases,

you get random connections of

these ideas coming together.

And when the right idea comes together by

random, the power of the

aesthetics of that beautiful

solution are so powerful, they crash

through to your conscious mind.

So he gave us math exams.

We only had 14 kids in the class because

it was, you know, who was a math major.

And he would give us a piece of chalk and

the chalkboard and he'd

tell us something to prove

that was pretty easy. And we'd start

writing on the board and doing it.

As soon as he realized that we knew what

we were doing, he

said, oh, enough of that.

Let's do this. And you knew you did well

if you could never finish any of the

things he asked you,

because he could tell that you

knew how to do it. So why waste the time?

And he wanted to push you,

and he'd eventually get you

to a place where you had no idea what to

do. And then he'd start to

drop hints and see how we did

with the hints. And when it was all done,

because this was a very

subjective exam, he gave us two

grades and those two grades were averaged

for our final grade. And

those two grades were: how good

was our conscious thinking, and,

the other grade, how

good was our subconscious

thinking going back to this Epicurean

model that he told us about.

And that was where he gave us

hints. How could we synthesize those to

crash through to our

conscious mind to do the technical

aspects of actually doing the proof. And

so it was pretty amazing.

But that took a lot of time.

I mean, I was in there with him for over

a half an hour and I got a

C for my conscious thinking,

but I got an A for my subconscious

thinking. So I got a B for

the exam and I was pretty happy,

but I learned a lot. It was an amazing

experience, but it's just

not scalable to 150 people.

That's just too much time.

You know, that reminds me of one question

from my PhD qualifying exam. And

the question was very simple. You have a

gas of charged particles,

the box that the gas is in

expands, does the temperature go up or

down? And so of course, having done

however many years of

physics at that point, I went straight to

the nuclear option. I'm

like, I'm going to write the

partition function for this thing.

Because if I know that I can calculate

anything. So, you know,

I go through the jumping through the

hoops of writing down the

states of the system. And I

start to write integral of, and then I

have this moment of I'm

never going to be able to do the

math for this. This is completely

intractable what I've set

myself up for. And one of the

people on the committee saw me have that

moment where I had

finally realized that, you know,

integrating this by hand was not the

correct approach. And he

says, Okay, good. So now just

think about it physically. Where is the

energy in the system? And

I'm like, Oh, yeah, huh. And

five minutes later, I was at the answer.

But, you know, it was just

one of those experiences that

sticks with you. Yeah, I wish we could

have more of our students to

have more of those epiphanies

on their exams instead of just grinding

out written calculations.

And speaking of written,

another option is to think more

physically and have them actually write

essays and write papers

about the concepts that underlie it. And

this used to be a pretty good way. Now

that also takes a lot

of time to grade. That's not so easy. But

the whole world has changed

in the last year. First of all,

now ChatGPT will help the students write

those without necessarily

showing that they understood

it, but it will also let the instructor

grade them more

accurately. And maybe there's a way

with the oral exam, if you record it,

that we could transcribe

it and, you know, grade it

automatically. But unfortunately, you

still have to sit there

through the whole oral exam and

participate in the interaction. So until

ChatGPT voice models can

actually do that interrogation

on a curated set of data, so it's

accurate, and be able to do assessment,

this hasn't happened yet.

Maybe it can. Maybe ChatGPT will finally

give us a way to evaluate

the quality of learning.

But I think that's a long way off. But

it's a nice thought.

At any rate, so we've talked a lot about

what's wrong with

grading, what's wrong with exams,

why, you know, oral exams, which are

great, really aren't scalable. But what

do we do? Are there new

methods out there? And when I say new

methods, I've got to put that

in quotes because there are,

but they're really not that new. They've

been around for a while.

And so, for example, you know,

people talk a lot about mastery grading

as one of these new things

that we're going to do. But if

you go back in the literature, the person

who coined the phrase

mastery learning and first wrote

about it was none other than Benjamin

Bloom, who's very famous for his Bloom's

taxonomy back in the

60s. And so, but mastery grading is

starting to gain a lot of traction. So

why don't you tell us

what you think mastery grading means?

Interesting. So this is not

something I use in my classes.

We can get to that. But my understanding

of mastery grading is that

you have your course broken into

many small components. And for each of

those components, you're

assessing the students almost

on a binary scale of have you achieved a

sufficient level of proficiency on this

or have you not? And

then by totaling how many course

components students achieve mastery of

over the course of the

term, the quarter, semester, year, whatever

it is, then you can assign

a final grade based on how

much of the course content they have

mastered. Is that how you

implement it in your classes?
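
The tallying scheme Tim describes can be sketched in a few lines of Python. This is only an illustration: the component names, the cutoff fractions, the letter "E" for falling below every cutoff, and the function name `mastery_grade` are all hypothetical and are not taken from the episode.

```python
def mastery_grade(component_results, cutoffs):
    """Map the fraction of mastered components to a letter grade.

    component_results: dict of component name -> True (mastered) or False (not yet)
    cutoffs: list of (minimum fraction mastered, letter), highest first
    """
    if not component_results:
        raise ValueError("at least one course component is required")
    mastered = sum(1 for ok in component_results.values() if ok)
    fraction = mastered / len(component_results)
    for minimum, letter in cutoffs:
        if fraction >= minimum:
            return letter
    return "E"  # hypothetical label for falling below every cutoff


# Hypothetical cutoffs, chosen only for illustration.
CUTOFFS = [(0.9, "A"), (0.75, "B"), (0.6, "C"), (0.5, "D")]

# Hypothetical course components, each assessed on a binary scale.
results = {
    "phase diagrams": True,
    "diffusion": True,
    "crystal structure": True,
    "mechanical properties": False,
}

print(mastery_grade(results, CUTOFFS))
```

Each component is a yes-or-no judgment, so the final grade depends only on how much of the course was mastered, not on fractions of a point.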

Sort of, but I'd say that the K through

12 community has been

using this for a long time,

especially with the younger students. And

so I know when my kids

were going to school,

they weren't getting grades until like

after fourth grade or

fifth grade. Until then they

were just getting one of three things:

exceeds expectations, meets

expectations, or is approaching

expectations, because the idea wasn't to

stigmatize the student with something

horrible, but rather

talk about where they are relative to

what the teachers wanted

them to do to give them flags so

that they knew who to give more attention

to. Cause you know, our

public school system doesn't have

that much money. Teachers aren't paid

that well. And most teachers find

themselves in classes

that are way too big for them to

adequately teach. So they do

a triage and that's exactly

what that is. You know, except unlike a

triage, the worst group,

they don't just leave to die.

They actually spend more time on that

worst group, a little more

time with the people who are

just starting to meet those expectations.

And then maybe even enlist

the help of those students who

exceed to help the younger students do a

little bit better. That's kind of the

idea of the one room

schoolroom, the schoolhouse, right? Where

all grades used to be mixed

together. The older students

would help the younger students. And, you

know, it's a pretty

powerful technique and I think it

works really well. So why don't we use

that in higher education?

That's kind of my thought on

the whole thing. Well, since you gave me

the perfect launching

point, I'll have to say that's

very close to what I do in my classes. I

am bringing together a few of

the topics we covered today.

Most of my assignments are scored on a

five point scale because

that's coarse enough that I can

actually be reliable about it. I can

distinguish between a four

and a five, a four and a three.

And my criteria are exactly what you've

just stated. I have

meets expectations, exceeds,

does not meet. And then descriptive

rubrics that say, here's what I'm

expecting. If you achieve

this, that's a four. If you exceed that,

that's a five. And there is a lot of

student angst around

that, especially at the start of the

semester. Oh my God, I got a

three out of five. I'm failing

out of school. No, it means you're almost

there. You're not quite

there yet. And it's something

that I find makes the grading go a lot

smoother because I can focus on the

qualitative feedback

that I give to the students about where

they succeeded and where

they still need to develop

more as opposed to spending my effort

trying to turn it into a

number. Right. And of course,

by using rubrics, you're moving towards

another form of mastery

grading. You're probably doing

something closer to specifications

grading. Yes, that's

right. So I'm, I'm giving the

specifications for what the different

levels of achievement are for each

assignment. And then

just comparing what the students produce

to those specifications.

So you've put in a lot of transparency to

your grading, even though

the students may be fearful

because it's new, it's actually a lot

less opaque than just, you

know, getting a percentage on an

exam where then the professor arbitrarily

draws the lines later on. There's no

transparency to that.

So that's pretty cool. And then there's

another movement that

has grown up around mastery

grading and specifications grading, and

there was a book published that talks

about this quite a bit,

and it's called Ungrading. And the

ungrading movement is also

not a new idea. For example,

the president of University of Chicago

decided that grades were

evil back in the 1930s, and he

banned grades, but that only lasted for

about a year before they

threw the guy out because none of

the faculty could deal with it. But the

truth is that if you go

way back, grades didn't exist

until the late 1800s. And they were

started in England at

Cambridge and Oxford,

not grades, but levels, right? You

achieve some level. And do you

know where grades, A, B,

C, D, were first instituted in the United

States, in the whole world?

I don't. I would guess University of

Chicago just for pure irony.

No, for real irony, it started at the

University of Michigan.

Oh, no. Yes. So we were

the leaders there as well.

Yes, we were. And it caught on like

wildfire. And at least that's what the

book Ungrading says. I

will trust them. It's a fascinating book.

And so this whole

ungrading movement is to get rid of

all of those concepts. And if you take it

to its logical conclusion,

the logical conclusion of

ungrading means you don't give grades.

And instead, you just ask the students,

what do you think you

deserve? And it's kind of interesting

because, you know, that

will actually work sometimes,

but not at the University of Michigan.

And I watched a podcast that

Eric Mazur did on ungrading

with two faculty members from the Eastern

Kentucky University, which has a very

different demographic

than Michigan or Harvard. And they

actually talked about their students.

When they let their students

give themselves grades, they almost

always give themselves a

lower grade than they actually

deserved. But that's because they have

mostly non-traditional

students who, you know, are mostly

first-generation college students. They

know why they're in school

and they're pretty tough on

themselves. And a great comment, one of

the faculty members said, but this must

be great at Harvard.

Your students are like so highly evolved.

And Eric just shook his head

and said, no, think about how

they got into Harvard. They all got in by

focusing entirely on

grades. And so there's a lot of

dishonesty in that whole process. And

sadly, University of

Michigan is close to that. So it

just won't work to let students give

themselves grades because they'll just

give themselves A's.

I don't want to be that, you know,

cynical, but I'd say for 90%

of the time, it will be cynical.

So we can't quite go that route. It might

work some places, but it's

that to me is a bridge too far.

Yeah. As we're talking about students

grading themselves and the

sort of student obsession

with the A, it really makes me think

about the fact that there's

a very wide gap between how

important our students believe grades are

and how important grades

actually are not. You know,

that as we're sending students to their

employers or to graduate schools, the

large majority of what

we're writing in our recommendations is

not about the grades they

got. It's about what they did

and the interactions that they had. And

certainly when I'm writing

a letter of recommendation,

I'll say about two sentences of this

student got an A and then two

pages about the actual human

being, not the letter. That's right. And

what's sweeping the country

is this whole idea of holistic

evaluation of admissions. And for the

graduate programs, you

know, it's well known that

all grades are is a predictor of whether

or not students will score

well in exams in graduate

school. And it says nothing about how

well they will do with

research, which is what PhDs are all

about. And for research, we know that

we'd rather have a student who

perseveres, who's fearless,

who embraces failure, learns from it, has

all these other things.

We would much rather have a

student who's overcome adversity in their

undergraduate career

to go from a pretty low

level to a high level that shows an

ability to overcome the kind

of barriers that are actually

important for doing research. Because

that's what research is all

about. Better get used to failing

a lot and embracing it because if we knew

what the answer was, it

wouldn't be called research.

So, you know, with all of that, I, again,

students have this, you

know, how do we educate students

that grades aren't important when all of

society is telling them

that they are. And it's really,

really sad. So anyway, we've talked a lot

about some background

about what you and I believe

is important in grading or not important

in grading. Why don't you

tell me what you actually

do for grading? You started by telling me

you have a multi-point

scale. Can you be a little

more specific about how you do grading in

your class right now? Sure.

Yeah. On the implementation

side for each of these assignments,

whether it's a report or a

lab notebook or a homework or,

no, actually, I guess that's about it.

The rubric will have a description of

what students should be accomplishing in

the assignment. So as a

concrete example, in this

last lab report that my students have

just turned in, I'll have a

line that says, for example,

must compare predictions of a theoretical

model to experimental data

and evaluate the accuracy of

the model. And then at that point, the

different levels of grading

are a five does an excellent

job of providing a convincing explanation

that combines multiple

sources of data. And a four

accurately compares the theoretical model

to the experimental data,

but does not bring in other

sources of information. And there's this

kind of tiered structure of

what will really convince me

that you know what you're doing and then

sort of taking away some of

those pieces of the argument

as you get to lower score levels. And an

important distinction that I

think is worth pointing out

here is that I'm telling students what

they need to have

accomplished at the end of the day. I'm

telling them that here's what will

convince me, but I'm not actually giving

them a list of boxes

to check, right? I'm not saying, okay,

you have to get some error

bars on this thing. And I'm not

telling them, oh, yes, you should run the

model with different model

parameters to see whether

there are numerical issues because that

that's part of what's being

assessed is do they have the

understanding of this process of science

to be able to do a

meaningful comparison. And sometimes

students look at this rubric and they're

like, so you want a meaningful

comparison? What does that

mean? But then that's a great question.

That's something we can have a

conversation about. And

when they're ready to ask that question,

and we can talk about it,

then that's a place where I think

true learning can really happen. So I

would argue that, in an

interactive course,

leaving some things not fully written out

explicitly in these

specifications is good

because it gets students to question the

meaning of the words in the

specification to help them

unpack any gaps in their own

understanding of it.
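
One way to picture a rubric line like the one Tim describes is as a small data structure. The sketch below is an assumption about how it could be written down, not his actual rubric file: the class name `RubricLine` and its layout are hypothetical, and only the level 5 and level 4 descriptors are paraphrased from the episode, since the lower levels are not spelled out.

```python
from dataclasses import dataclass, field


@dataclass
class RubricLine:
    """One requirement plus the descriptor for each score level (hypothetical layout)."""
    requirement: str
    levels: dict = field(default_factory=dict)  # score -> descriptor

    def describe(self, score):
        # Levels that were not described in the episode are left unspecified on purpose.
        return self.levels.get(score, "descriptor not specified in the episode")


comparison = RubricLine(
    requirement=(
        "Must compare predictions of a theoretical model to "
        "experimental data and evaluate the accuracy of the model."
    ),
    levels={
        5: "Provides a convincing explanation that combines multiple sources of data.",
        4: "Accurately compares the model to the data but brings in no other sources of information.",
    },
)

print(comparison.describe(4))
print(comparison.describe(3))
```

Note that the descriptors state what the finished work must show, not a checklist of steps, which matches the distinction drawn above.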

Right. So I think it's important to

mention that you teach a

laboratory class, and that you

typically have, you spend a lot of time

with the students. So

your teaching schedule is

four days a week, every afternoon, you're

in the lab with groups of

students and you typically have

groups of four in each team. Is that

right? Yeah, usually three to four. And

how many teams do you

have because you have them all rotating

on stations around to

utilize the equipment?

Yeah, typically a section will be four

teams and then I'll have however many

sections I have based

on enrollment, usually three. Right. So

you have like up to 16

students at any given time.

So you actually have time to interact

with each student. Because

how long are your labs? Like

two hours? They're four hours, four

hours. Yeah. So you have, you

have a lot of time to interact

with each student. And that just you just

have to spend a lot of time. So in a way,

you're able to do oral exams with all

your students almost every day.

Yeah, I just don't call them that. And

that's right. That takes

away a lot of the stress around,

Oh my God, I'm being examined. No, we're

just having a conversation so I can see

what you get and what you don't get. So I

can help you get what you don't yet get.

Right. But you know,

this approach. Go on.

I was just going to say that this

approach is going to hit a

scaling limit where, in a 100- to

120-person course, there's just no way to

do this unless I had an army of grad

students who were all

trained in pedagogy and able to tease out

students' ideas in the same

way. So that kind of leads to

the question of, in a bigger class,

because, Steve, your classes are

significantly larger than

mine, what are other approaches that you

can employ for grading

that would work well with

a larger number of students? That's

right. So I have 140

students in my class this term.

We had 24 teams of six people per team,

roughly. Some

teams were five people.

It was in a very large room. And so, you

know, I've tried to

structure my course really around

the way that I'm going to do grading. And

so I think one thing that

was really important in what

you do, Tim, is a best practice that lots

of people talk about:

if you're going to give

assignments for a grade, it's better to

give lots of low-stakes

assignments. So instead of giving

three exams, it's better if you gave 14

exams that were all equally

weighted, because that way,

if you do poorly on one of them, it

doesn't crash your whole

grade. And so by working with them

every single week and interacting with

all of them, you know,

you've kind of created a

large number of evaluations with low

stakes for each one that

conclude with your judgment,

where you can use your rubric to grade

them. And so I do try

to do the same thing. So

with as many students as I have, it's

kind of hard to just redo everything. I

have to do it in steps.
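
A quick arithmetic sketch of the low-stakes point above, comparing 3 equally weighted exams with 14. The "usual" score of 90 and the "bad" score of 40 are hypothetical numbers chosen only for illustration; they do not come from the episode.

```python
def course_average(scores):
    """Equal-weight average of a list of percentage scores."""
    return sum(scores) / len(scores)


def average_with_one_bad_score(n_assessments, usual=90.0, bad=40.0):
    """Average when a student scores `usual` on everything except one `bad` day.

    `usual` and `bad` are illustrative assumptions, not figures from the episode.
    """
    scores = [usual] * (n_assessments - 1) + [bad]
    return course_average(scores)


three_exams = average_with_one_bad_score(3)
fourteen_exams = average_with_one_bad_score(14)

print(f"3 exams, one bad day:  {three_exams:.1f}")
print(f"14 exams, one bad day: {fourteen_exams:.1f}")
```

Under these assumed numbers, the same bad day costs far more of the course average with 3 exams than with 14, which is the sense in which one poor result "doesn't crash your whole grade."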

So I looked at what I had been doing. So

I give reading

assignments for every class session.

And we usually meet for two

hours, twice a week. And

so each week has two class

sessions. And so I usually give two

reading assignments for

each class session. Each week,

I have a homework assignment. Each week,

not the same week, but I

have what I call a readiness

assurance activity, which is a more

formative based test, but it's weighted heavily towards the team

participation. I also have outcomes

assessment reflections, where, at the

end of every module,

I have students reflect on what they

actually learned. So

again, my whole course is set up

to focus on retention, because I think

that's really important.

And one way to get retention

is to have students revisit a concept

multiple times over a

relatively long period of time.

So a module starts

with a one-week effort.

And that module then extends

for three weeks before they actually

finish all the things. So while they're working on other

aspects of that module, a new module is

being initiated the next

week. At any rate, I have

roughly 24 reading assignments in the

term. We have about 12 weeks, 14 weeks,

something like that.

So I have, you know, 24 reading

assignments. I have 14 homework

assignments. I have 14 readiness

assurance assignments. I have 14 outcomes

reflection assignments.

And then I also do projects.

And I've been doing three projects, but

the first one's just for practice. I'll

be changing that next

year, mostly because drop/add deadlines

destroy teams in my class for the first three weeks. So

my first project has pretty much always

been a bust because the

teams got disrupted so much.

It wasn't working, but I finally figured

out how to fix that. I can

now go in the day after class

starts and drop the enrollment number so

nobody can join the

class. People can still drop,

but nobody can add my class after the

deadline. And that's

going to let me have three projects

which will be worth two units each. When

it's all said and done, I

add this all up and I have 72

units that make up my course. And what I tell

the students is they have

to get an A on every single

unit. Now, I'm a little flexible there: an A minus is just as good as an A because I can't really

distinguish. So they pretty much have to

score like a 90%, but it's

not a strict 90%. I just do it

as a zero, one, or two, and a two is an A. A

one is approaching expectations and a zero means

they just didn't do it. And for

each of those units, where

I can (I can't always do it),

I give them the ability to

retake them multiple times

until they achieve a two.
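
The unit count and the zero/one/two scale described above can be tallied in a few lines. This is a minimal sketch: the category counts come from the conversation, while the function names and the rule that the best attempt counts after retakes are assumptions made for illustration. How the units turn into a final letter grade is not modeled here.

```python
# Unit counts as stated in the conversation.
UNITS = {
    "reading assignments": 24,
    "homework assignments": 14,
    "readiness assurance activities": 14,
    "outcomes reflections": 14,
    "projects (3 projects x 2 units each)": 3 * 2,
}

total_units = sum(UNITS.values())  # 24 + 14 + 14 + 14 + 6

# The three-level scale described in the episode.
SCORE_MEANING = {
    2: "A-level work",
    1: "approaching expectations",
    0: "not done",
}


def best_score(attempts):
    """Score for one unit after retakes.

    Assumption: the best attempt counts, and a unit with no attempts scores 0.
    """
    return max(attempts, default=0)


def units_mastered(attempts_by_unit):
    """Count units whose best attempt reached a 2."""
    return sum(1 for attempts in attempts_by_unit.values()
               if best_score(attempts) == 2)


# Hypothetical student record: each list holds the scores of successive attempts.
example = {"homework 1": [1, 2], "homework 2": [1], "reading 1": []}
print(total_units, units_mastered(example))
```

The tally confirms that the stated categories add up to 72 units, with the 24 reading assignments making up a third of them.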

All the reading assignments, which are,

you know, 24 out of 72, a big chunk, all of those

are reading assignments on Perusall, and

Perusall gives me the ability

to allow them to continually

add annotations that score highly or to

do high-quality interactions.

So they get credit for being

social because I believe social learning

is a really

powerful tool. Social learning is hard

baked into our DNA.

Writing's only been around for 5,000

years. Social learning has been around

for 3 million years. That's

our primal instinct. That's

how we learn. And so by being social, by

interacting with other students on

Perusall, I really highly

value that. So they can do that up to a

deadline. You know, once the

deadline goes, it's too late.

So that's how, that's how I'm trying to

do mastery grading. I also

have these reflections. So

what I grade is the homework reflection.

I don't grade the

initial homework effort.

I only grade that on effort, not on

accuracy. They have to scan

their work and put it into

Canvas. They have to do it, but I pretty

much just, you know,

virtually weigh the packet of

information to see that they tried. And

that's just a check mark.

They have to do that if they

want to get an A on the assignment or

pass that unit. If they

don't do that, it doesn't matter

what they do later on. They're not going

to pass it. Then they have

a homework activity in class

where they make a better homework

solution with their team. And I hire all

of these instructional

aides, who are senior-level materials students

who were just in that class two years

earlier. And so

I have one undergrad for every two

tables, maybe every three tables, and

they walk around and

provide help. But it turns out the

students don't really need

help. The help they need is that those

students that are introverts and don't

talk need to talk. So I tell the whole

class, we're going to

do this. And I train the instructional

aides to pull those folks

aside and say, Hey, you know,

Johnny, you're not talking enough. How is

anyone on your team going to

realize that you're adding value? And

unless they realize you're

adding value, they won't respect

you. And good teams are teams that

respect each other. So we help the

introverts become less

introverted. We also talk to the

extroverts and explain how important

learning how to listen is.

And it's sort of like audio: we do

compression on the high

end, and, you know,

we boost the levels on the low end, to

try to make teams that

perform better, because ultimately,

they need to teach each other these

concepts, if they want to

have retention. So that's what I

focus on. And does it work? Well, I'm not

there yet. I still have a

lot of problems with it.

But I'm getting there and I'm learning.

Right now, I have way

too many A's. So although I

want to have everyone get an A, I know in

my heart that

a lot of students

are getting A's who really don't know

anything. And how do I know

that? Because when I walk around

the classroom, I will pick out a few

activities that I give them that I know

are very conceptual.

And I go and make them write stuff on the

whiteboard. And I talk

to every single team. It

takes me about half an hour to get

around the whole room. But

I can get a really good feel

for whether they learned anything from

their reading, whether they

learned anything from what

I talk about. And sadly, sometimes I walk

around the room, like when I tried

to tell them, with extrinsic

semiconductivity, where the acceptor

level is, where the donor

level is, where the Fermi level

is for an intrinsic semiconductor, and

where it moves when you have

an extrinsic semiconductor with an acceptor or

donor. And I try to get them to draw

these pictures on the

board. And then two days later,

I give them that exact problem on the

readiness assurance activity.

And on the individual round,

30% get it right. So I failed. I thought

I was giving them one-on-one instruction.

I thought they understood what I was

saying, but other factors

always come in. This was on the

second-to-last day of class. And, I'm

sorry, but students, by the time they hit the

very last week of the term,

are pretty much brain-dead because their

other classes are

weighing so heavily on them.

They're cramming like crazy. And it's a

really tough environment

to learn something new that

last week of class. So, you know, we all

suffer from this. I'm sure you've

observed that phenomenon

as well. Oh, sure. And so those are my

problems. But what I'm

going to try to do next time,

I'm going to go back to something that's

really like an oral exam. You know,

it's called the Feynman technique. It's

what Richard Feynman told all of us.

If we want to know that we actually

understand something, we

should use the litmus test of,

can you teach this to a five-year-old?

And if you can teach

it to a five-year-old,

you probably really deeply understand it.

And if you deeply understand anything,

chances are you're going to retain that

for a long time. So I'm

going to play around with

things like I do for my projects, which

is whiteboard video. And I'm

going to try to get students to

make short whiteboard videos. My

whiteboard videos for my projects are

like two minutes long.

I think for certain concepts that I'm

going to ask them to do, I'm going to

embody the idea of a

YouTube short or a TikTok video. They

have 30 to 45 seconds to teach the

concept. And we'll see how

that goes. I haven't quite figured out

how to do it. But the truth

is they're all pretty skilled

in editing video. They come to college

knowing how to do that. So

let's exploit those skills.

Let's get stuff up. And maybe they can

help each other enough. And

if I build a library of the

good ones, those can even help students

learn small little chunks. But we'll see.

Yeah, I love that idea of producing short

form content. One trouble

that I have in my course with

the specifications that I'm using is

students come in believing

that more is better. And,

maybe I'll just write 30 pages instead of

20. And then all the

right answers will be in there

somewhere. And they will, but that's not

really helping the

underlying problem. So this is

something that I'm always trying to find

ways to do better in my

classes to give assignments that

really have students distill something

down to just the minimum

essence of the concept or the

problem or the calculation. And do it in

a paragraph. Don't even

do it in a page. If I could

really get them to be more concise and to

think about how short can

I make my product instead of

how long can I make it, that would serve

all of us very well. But

we'll get there someday.

We all know it's much harder to write

something that's short

than something that's long.

And the same is true for video or

anything else. The thing I like about

video is that it's way

ahead of where ChatGPT can quite get to

right now. It can start to

make videos, but they're not

going to be very good. And so I don't

have a problem with having

the students use ChatGPT to

make their script, but to make the script

and make it concise

enough for a 30-second spot.

That's a pretty challenging thing that

requires you to actually think

about what ChatGPT is telling

you. And as long as you think about it,

that's a really good

activity in my book. Because once

you've thought about it, you've gone

beyond just memorizing and

you've edited what ChatGPT says.

And so then ChatGPT becomes a tool and

it's a useful tool, but you

must edit it. You must think

about what it's saying in order to make

something very concise. So I

think that's a good strategy.

Evaluate whether it's accurate, decide

how it can be done better, and then

actually execute the

doing it better. And that's a really good

use of time for learning.

And I'm even thinking maybe I

can come up with an activity where they

use their short form video

to teach other people on their

team and have the other people on the

team evaluate the quality of those

videos. So we'll see.

Not sure how I'm going to pull this all

off. But that's what I'm

thinking about for next year.

What are you thinking

about for next year?

Ah,

I'm thinking about the biggest change

that I want to make on

this topic of grading is

doing a better job of mapping this

five-point scale onto a

numerical scale that will make sense

to students who have grown up in zero to

100 land without getting

to this compression problem

that you were bringing up at the

beginning. If I can actually spread out

the individual numbers

over a wider numerical range so that the

values are distinct and have

separate meanings from each

other without getting students into the

panic of "why do I have a C minus?"

territory. That's really

what I'm focused on right now is to help

students distinguish the

score on the assignment

from the grade you're going to get in the

course. Because as

much as I would like to

just tell them that grades don't matter

and have them actually believe it,

that's an impossible challenge. So I have

to work around, work with

what our students believe

instead of what we wish they believed.

Well, that's a good point

to end with. I think we've

been talking for a long time now longer

than we intended. It

sure is hard to be concise.

Yes, it is. Especially when it comes to

something as intractable as grading.

So I guess with that, we'll say goodbye

and look forward to our

next chat. See you later.

Okay. See you next time.