Wednesday, July 13, 2022

Introductory Programming Assessment Must Accommodate Copilot-like Assistants

David Kopec:

There are certain standard problems that we are accustomed to assigning because completing them demonstrates the ability to implement fundamental simple algorithms. A prior strategy to reduce plagiarism has been to provide scaffolding code or put a spin on a problem to make it unique. Unfortunately, Copilot-like assistants are almost as capable in these scenarios as they are at writing generic simple algorithms. In my own preliminary testing on a (what I believe to be) unique scaffolded assignment of my own creation for an introductory class, Copilot was able to contextualize the comments and write most of the smaller functions accurately with just a little bit of my assistance.

[…]

I am suggesting that exams and exam-like evaluation will need to be a greater percentage of the mix. We can be creative. An oral presentation can in some instances demonstrate knowledge as well as an exam. A project coded live in class on machines that do not have AI assistants enabled, can serve as a pseudo exam.

8 Comments

Old Unix Geek

Or... use a non-standard language, say Forth.

I'm not sure I understand the problem - the person demonstrates the ability to solve the task at hand, what difference does it make what tools they use to accomplish the goal?

Is using the compiler error messages an issue? What about "Fix"? What about the docs in the headers? Or Stack Overflow? Where exactly is the line between using resources to accomplish the task and “cheating”?

Yeah. By definition, a lot of the problems developers solve are novel to them (if they've already solved a problem in the past, there's no need to solve it again, that's how software works). So developers' ability to learn new things is an important factor in determining how productive they are. Assessing developers in ways that exclude the mechanisms they use to learn new things means you're not really assessing them.

Old Unix Geek

@Plume: I disagree.

The question is whether a programmer can write a program, or only plagiarize. Since it is easier to write code than to understand it, asking someone to write something all by themselves is really not that onerous.

If you are competent, I should be able to give you a set of atomic operations, and a suitably constrained problem to solve and you should be able to write a program using those atomic operations to solve it. That's what my first job asked me to do during my job interview.

If you can't figure out a constrained task all by yourself, but need hand-holding from an assistant, then I don't know how you can call yourself a programmer... and I don't know many competent programmers who would want to work with you.

It should go without saying that you're not supposed to be learning during a job interview or a university examination. Similarly, a graded university project is when you're supposed to apply what you learned in class. If everyone fails, it indicates that the class didn't cover the material properly. Again, this is something that assistants would mask, reducing the quality of university education. So the presence of assistants is detrimental in such situations.

The thing is that what most developers do most of the time is not "program" in the sense of "implementing basic algorithms using atomic operations." But that's the only thing you can really ask them to do in an interview-type situation if they don't have access to their normal tools.

At any rate, we never do these coding exercises when hiring developers. One thing we do instead is to ask them to bring in code they've written, and then talk us through how it was structured, why they wrote it that way, and so on. This does create its own set of issues, because some devs don't write code in their spare time, and can't show proprietary code from work, though.

Old Unix Geek

@Plume:

I agree that implementing basic algorithms using atomic operations is only a minimal check of competence.

And you're right about bringing one's own code. I've worked for a number of companies that claimed everything I coded, even in my free time, belonged to them, which, of course, put me off making things in my free time.

I've actually been wondering whether one should ask people to refactor, debug and optimize purposefully poorly organized and slightly broken code. This would test whether they actually understand the code, what they value in code, what trade-offs they would make when refactoring, etc. Do they have a sense of what is actually slow, without profiling? Do they have taste? Do they understand that unnecessary complexity is the enemy? Do they have an idea of what the underlying CPU does? And so on. And how do they take feedback?

It would, however, take a couple of hours, with someone either checking in on them or being there while they do it, depending on the candidate: some are more nervous than others, and some do better alone while others do better with company.

"I've actually been wondering whether one should ask people to refactor, debug and optimize purposefully poorly organized and slightly broken code."

That's a great idea. I'll have to bring that up for our own hiring process.

Old Unix Geek

@Plume: If you do end up using it, please let me know how it goes!
