The Education of a Programmer

The other day, someone asked me to teach them programming. I didn't agree to do that, since I didn't know them, I was already at maximum capacity dealing with more important things, and probably couldn't help them anyway. Why did that happen? What's so hard about learning programming that people resort to asking random people like that? What's so hard about teaching it that I didn't think I could convey what knowledge and skills I do have effectively?

There are different levels of teaching. Ultimately it's about helping someone get from one level of skill to another. For example, there are things you can do to lead someone from "complete beginner" to "can write a simple program". You can pick a language, give them a few examples of programs, and put them in front of a programming environment, and they'll probably achieve what you wanted. Assuming the environment is easy enough to set up, you probably don't even need to be with them, and could present this information in a book or a web page, with the same result.

What about going from writing simple programs to whatever comes next? Are there books for that? The answer to this question isn't a definite "yes". It's more complicated, because there are lots of things to do once you can write a simple program. One possible next step is continuing to read the book from which you learned to write your first simple program, and see where that takes you. Let's read along with our hypothetical novice and see how their journey goes.

C Programming: A Modern Approach, by K. N. King

I've seen this book recommended a lot. The first program in the book, excluding a 1990 IOCCC winner included to prove a point, is a "hello world" style one.

#include <stdio.h>

int main(void) {
    printf("To C or not to C, that is the question.\n");
    return 0;
}

It goes on to explain things like the meaning of

int height, length, width = 10;

which I thought was a little weird, since it isn't necessary to introduce any other C concepts, but there's nothing wrong with mentioning it. Reading user input comes next, after a brief detour into arithmetic:

int main(void) {
    int height, length, width, volume, weight;

    printf("Enter height of box: ");
    scanf("%d", &height);
    ...

Unsurprisingly, the author decides to use the scanf function. This is unfortunate, since it's terrible. It's also interesting to note that despite explicitly including warnings about uninitialised variables earlier in the book, he decides not to initialise any of them. Maybe putting everything on one line looks better. This doesn't change the fact that if you don't input a number, scanf won't write to the variable, and the return value is completely ignored, meaning this program won't detect the error. This means the program is needlessly brittle, and likely to exhibit strange behaviour.

C's standard library leaves a lot to be desired, but even without going far beyond it there are better ways to read a number from standard input into an int variable. For example, you can use fgets (or getline, if you have POSIX) and atoi. This still isn't "good", because you can't detect errors, but it's better, because it's easier to understand, and non-numeric input doesn't lead to reading an uninitialised variable and semi-random output, but rather a predictable 0. (The problem is that while all atoi parse failures lead to a 0 return value, not all 0s indicate errors, so you can't use it to detect errors reliably. Out-of-range values are worse: atoi's behaviour is undefined for them, which is one reason to prefer strtol.)

Indeed, the manual page for the scanf family of functions says

The scanf() family of functions scans input like sscanf(3), but read from a FILE. It is very difficult to use these functions correctly, and it is preferable to read entire lines with fgets(3) or getline(3) and parse them later with sscanf(3) or more specialized functions such as strtol(3).

Perfect for beginners! A function described as "very difficult" seems like a terrible idea to use in one of the first programs in the book, especially when a better alternative is readily available.
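As a rough sketch of the alternative the manual page recommends, the following reads a whole line with fgets and parses it with strtol, reporting failure instead of leaving the variable untouched. The names parse_int and read_int are made up for this illustration, and the 64-byte line buffer is an arbitrary assumption.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a whole line as a base-10 int (hypothetical helper).
   Returns false for empty input, trailing junk, or a value that
   does not fit in an int; *out is only written on success. */
bool parse_int(const char *line, int *out)
{
    char *end;
    errno = 0;
    long value = strtol(line, &end, 10);

    if (end == line)                      /* no digits were consumed */
        return false;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false;                     /* out of range for int */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;                            /* allow trailing whitespace */
    if (*end != '\0')                     /* e.g. "12abc" */
        return false;

    *out = (int)value;
    return true;
}

/* Read one line from `in` and parse it (hypothetical helper).
   Lines longer than the buffer are truncated, and the truncated
   part is parsed on its own, so very long input is rejected or misread. */
bool read_int(FILE *in, int *out)
{
    char line[64];
    if (fgets(line, sizeof line, in) == NULL)
        return false;                     /* EOF or read error */
    return parse_int(line, out);
}
```

A caller would write something like `if (!read_int(stdin, &height)) { /* complain and ask again */ }`, which is exactly the check the book's program has no way to make.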

The above program is also fundamentally broken, because printf writes to stdout, which is line-buffered when it's connected to a terminal. This means stdout usually only writes whole lines at once to the terminal. If you try and print a string that doesn't contain a newline, it's not guaranteed to be written to the terminal (and usually won't be), and will be stored in a buffer for later, when a full line can be written. When the buffer is full, or a newline is written to stdout, the contents of the buffer will be written to the terminal.

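Since the "Enter height of box: " prompt is small (it fits entirely in the buffer) and contains no newline, it isn't guaranteed to appear on your screen before the program stops to wait for input. Some C libraries happen to flush stdout before reading from stdin, which hides the problem, but you can't rely on that. You can fix this by adding a call to fflush(stdout) after printf, but this is nearly impossible to figure out on your own as a beginner.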
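The fflush fix mentioned above is small enough to show in full. This is only a sketch, and the helper name prompt is an assumption of mine rather than anything from the book.

```c
#include <stdio.h>

/* Write a prompt that has no trailing newline and push it out
   immediately, so it is visible before the program blocks on input.
   Returns 0 on success and EOF if writing or flushing failed. */
int prompt(FILE *out, const char *text)
{
    if (fputs(text, out) == EOF)
        return EOF;
    return fflush(out);
}
```

In the box program this would replace the bare printf: call `prompt(stdout, "Enter height of box: ")` and then read the input.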

So, I'm going to conclude that no one's getting anywhere by reading this book, since some of the earliest programs don't even work, and the content covered is generally not hugely useful, like a comprehensive treatment of printf format strings and several pages dedicated to scanf.

Flicking through the book also reveals that it contains some ridiculous advice:

Q: What if I want an array with subscripts that go from 1 to 10 instead of 0 to 9?

A: Here's a common trick: declare the array to have 11 elements instead of 10. The subscripts will go from 0 to 10, but you can just ignore element 0.

and that it waits until page 414 to introduce malloc. This comes two sections after "writing large programs", which is strange, since in my experience large programs usually use malloc. I'd venture that one of the most important and useful functions should probably be in the first half of the book. How much of this book would you have to get through to be able to write useful programs? It seems like the answer is "too much". It's notable that the order of the sections in the book is very different to the order I learnt them in. From the perspective of someone wanting to learn C programming, it seems very artificial in that respect, lacking context.

Maybe the book is useful to people who have other goals than "learn to write useful C programs by reading the book". It would probably be somewhat useful as a reference, which makes sense given that the first edition was published in 1996, before the internet got into full swing. Now that it has, it's very easy to find lots of information about C, and it's no longer necessary for a book to address every piece of trivia in the name of being comprehensive. If you only have one book, you want it to be the biggest book that mentions every little detail, but a modern reader will have access to more books than they could ever read in their life, as well as the equivalent in web pages, etc. It's a bad thing that people still recommend this book in 2023, when the problem it solves no longer exists.

Either way, our novice will need to take a different route.

University Course

I did one of these! Let me tell you how it went.

The first time anything happened, they provided us with the following code

double a, b, c;
printf("Enter the values of a, b, and c respectively,"
    " separated by white space then press Enter: ");
scanf("%lf%lf%lf", &a, &b, &c);

which has exactly the same line-buffering problem as above, and exactly the same "using scanf" problem as above. How many people have had to unlearn the idea that scanf is the way to read user input?
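For comparison, here is one way the three values could be read so that bad input is at least noticed: parse an already-read line with sscanf and check how many conversions succeeded. This is a sketch, and parse_coefficients is a name I've invented for it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parse "a b c" from a single line of text (hypothetical helper).
   sscanf returns the number of successful conversions, so anything
   other than 3 means the line was not three numbers. */
bool parse_coefficients(const char *line, double *a, double *b, double *c)
{
    return sscanf(line, "%lf%lf%lf", a, b, c) == 3;
}
```

The line itself would come from fgets, after a prompt that has been flushed.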

I was in a room with other people this time, so I got to watch someone struggle with this not working, after copying it into their editor and compiling it. This code wasn't just an example, it was part of an exercise, so it was supposed to compile and run with no issues, and then you would edit it and add a feature, or something. Since it didn't work, and I already knew why it was broken, I spent my time trying to help someone who'd never used C before deal with this. He was confused, which was understandable. What's less understandable is how this happened in the first place. Did whoever wrote the code never run it? Did it work on their machine? Did they copy it from a book? Even if I could come up with a charitable explanation, I know who was responsible, and can confidently say they wouldn't deserve it.

This same problem came up later in some coursework, but this time it was surrounded by /* DO NOT EDIT */ comments.

The course consisted of language-feature trivia rather than actually programming, things like how to write functions with multidimensional arrays as parameters, or information that you won't need and could easily look up if you ever did, and therefore wasn't worth presenting proactively. I decided it wasn't the best use of my time listening to someone telling me to cast the return value of malloc, and stopped engaging with this course because I knew I could do the coursework without using the course material (thankfully, since it wouldn't have helped anyway).

I probably wrote fewer than 100 lines of code in total as part of that course. Speaking to other students, the way they talked about things made it obvious no one had learnt anything. It was also interesting that we wrote no code from scratch, it was all "implement this function in this program I wrote for you". There was no chance to write the program in a better way, or at least a way you had designed yourself, or to change the interfaces you were using. Conforming to very rigid specifications for toy programs to enable automated testing was a bad experience, and did nothing to make me (or anyone else) a better programmer.

The fact that hundreds of people are reading terrible specs for terrible interfaces as their first experience programming, working with terrible code while being unable to change it or make any decisions at all about what they write due to automated testing, is a bad thing. Their specs sucked because they wrote a program, recorded its behaviour, then removed parts of the code for you to fill in. So you either had to write your code exactly the same way, even if your instinct on how to write it was different, or if you had written it differently, you would have to add ugly hacks to munge the output into a bit-identical format. This included things like explicitly adding trailing whitespace to the program output, to match the original program.

The task of implementing an exact interface is relatively rare, and quite advanced. Beyond using simple file formats, the kind of programs where you need to read specs, like compilers, web servers, networking stacks, and so on, don't seem to be good candidates for beginner programming projects. If you did want to do something like that anyway, it would probably be most educational to write a very hardcoded implementation of, for example, a webserver that just serves one file. Writing a complete and useful implementation of your favourite networking protocol is a monumental task in comparison to serving a single file over HTTP.
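To give an idea of how small the "very hardcoded" version can be, here is a sketch that answers every connection with the same HTTP/1.0 response. It assumes POSIX sockets, listens on the loopback address only, sends a fixed string rather than a file read from disk, and ignores the request entirely. The names build_response and serve_forever are invented for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Format a complete HTTP/1.0 response carrying `body` into `buf`.
   Returns the response length, or -1 if it does not fit in `cap` bytes. */
int build_response(char *buf, size_t cap, const char *body)
{
    int n = snprintf(buf, cap,
                     "HTTP/1.0 200 OK\r\n"
                     "Content-Type: text/plain\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"
                     "%s",
                     strlen(body), body);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Accept connections on 127.0.0.1:`port` forever and send the same
   response to each one. Returns -1 if setup fails. */
int serve_forever(unsigned short port, const char *body)
{
    char response[4096], request[4096];
    int len = build_response(response, sizeof response, body);
    if (len < 0)
        return -1;

    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    int yes = 1;   /* allow quick restarts on the same port */
    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, 8) < 0) {
        close(listener);
        return -1;
    }

    for (;;) {
        int client = accept(listener, NULL, NULL);
        if (client < 0)
            continue;
        /* Read (and ignore) whatever the client sent, then reply.
           Short writes are not retried in this sketch. */
        if (read(client, request, sizeof request) >= 0) {
            ssize_t sent = write(client, response, (size_t)len);
            (void)sent;
        }
        close(client);
    }
}
```

Calling `serve_forever(8080, "hello\n")` from main and pointing a browser at http://127.0.0.1:8080/ should be enough to see the text. Everything a real server does beyond this (parsing the request, reading files, handling slow clients) is a problem you can then add one at a time.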

In short, that course did not make learning programming easier. Maybe other courses are different.

The Hard Way

So if we can't make things easier, why not stop wasting time and just do them the hard way?

You could explain to someone who doesn't know C that if stdout is a terminal then it's line-buffered, as we did above, but that's too easy. They should find out on their own, first by knowing that they're on a POSIX system, or at least something close enough (everyone uses one, right?). Then they should go to https://pubs.opengroup.org/onlinepubs/9699919799/ which they've already bookmarked. Despite not knowing that FILE *stdout even exists, it will be obvious to a beginner that printf writes to stdout, after which it's easy to find out that "the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device". The meaning of the word "stream" needs no explanation, and "fully buffered" clearly implies the existence of other kinds of buffering, which you can read about in the ISO C standard (yours for about £200). We can tell from the above that when the stream does refer to an interactive device, it is never fully buffered.

Now, this doesn't entirely specify the behaviour, as is common in POSIX standards, a fact which everyone is used to by now. The next step is to take a look at the C standard library implementation. They'll open their copy, which they downloaded earlier using git, and observe the following (on musl, at least)

hidden FILE __stdout_FILE = {
    .buf = buf+UNGET,
    .buf_size = sizeof buf-UNGET,
    .fd = 1,
    .flags = F_PERM | F_NORD,
    .lbf = '\n',
    .write = __stdout_write,
    .seek = __stdio_seek,
    .close = __stdio_close,
    .lock = -1,
};
FILE *const stdout = &__stdout_FILE;

at which point it's perfectly clear that lbf determines which character, if any, will cause the buffer to be immediately flushed, and that assigning '\n' to this struct member is what we mentioned earlier about line-buffering. You can verify this by reading other code in the library that accesses this member, which they'll understand because it's written in C, a language they already know how to use.

This doesn't explain everything, since it doesn't show the part of the code that detects whether the output is a terminal, which involves an ioctl call, but I think this is a natural place to stop. Needless to say, this is a completely impractical way to learn things.
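The role of lbf described above can be modelled in a few lines. To be clear, this is a toy written for this article and not musl's code: the struct, its tiny 16-byte buffer, and the toy_ names are all assumptions, and "flushing" just counts how often the buffer would have been written out.

```c
#include <stddef.h>

struct toy_stream {
    char buf[16];
    size_t len;        /* bytes currently waiting in buf */
    int lbf;           /* flush when this character is written; -1 for never */
    unsigned flushes;  /* how many times the buffer was "written out" */
};

/* Pretend to write the buffered bytes to the terminal. */
static void toy_flush(struct toy_stream *s)
{
    if (s->len > 0) {
        s->len = 0;
        s->flushes++;
    }
}

/* Buffer one character, flushing when the buffer is full or when the
   character matches the stream's line-buffering character. */
void toy_putc(struct toy_stream *s, char c)
{
    if (s->len == sizeof s->buf)
        toy_flush(s);
    s->buf[s->len++] = c;
    if ((unsigned char)c == s->lbf)
        toy_flush(s);
}

void toy_puts(struct toy_stream *s, const char *text)
{
    while (*text != '\0')
        toy_putc(s, *text++);
}
```

With lbf set to '\n', writing "Enter: " leaves seven bytes sitting in the buffer and nothing on the screen, which is the prompt problem from earlier in miniature.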

The Slightly Easier Way

In practice, there's no need to do what's written in the last section. However, there is a need for something between very easy things like reading instructions to write a "hello world" program and learning everything from scratch using the specifications and the source code. One useful tool is trial and error. You can change your program, and see what it does. This method could uncover the above buffering problem. The "hello world" program was known to work fine, but something didn't work in the second program. There were a few changes made between them, and applying them one by one and seeing what the program does can help you diagnose the problem. The first change to the program is the addition of a call to scanf, and the second is the different text in the printf call. If you apply these changes separately, you'll find out that the second one is the culprit.

Most programming ends up as a refinement of this trial and error process, combined with miscellaneous knowledge and experience. For example, implementing DNS in C will likely cause you to run into C's unusual operator precedence, a trap you will either fall into if you've never come across it before, or easily avoid if you have. If you write something like byte & 0xc0 == 0xc0 to check that the first two bits of the byte are both 1, then you'll run into a problem because this means the same thing as byte & (0xc0 == 0xc0) which is unlikely to be what you meant. This might take a while to figure out, since it doesn't produce a warning with any compiler I tried, but you'll only ever have to go through this once, and then in the future it will be comparatively easy to spot. Another example of this "laser eyes" effect is for-loop conditions, where <= often means the condition is specified incorrectly, since < is usually correct. This isn't always the case, but this will attract your attention if you've dealt with those kinds of problems before. The kind of problems you've dealt with in the past give you an impression of what kind of problems could potentially be present now.
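The precedence trap described above can be reproduced in a few lines. The function names are made up, and the buggy version is written with the parentheses the compiler effectively inserts, so that what it computes is visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* What `byte & 0xc0 == 0xc0` actually means: == binds tighter than &,
   so the comparison is evaluated first and yields 1. The result is
   byte & 1, a test of the lowest bit rather than the top two. */
bool top_two_bits_set_buggy(uint8_t byte)
{
    return byte & (0xc0 == 0xc0);
}

/* What was intended: mask first, then compare. */
bool top_two_bits_set(uint8_t byte)
{
    return (byte & 0xc0) == 0xc0;
}
```

For the byte 0xc0 the intended check is true and the buggy one is false, while for 0x01 it's the other way round, which is the kind of result that makes the bug confusing when it first turns up.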

Notably, I don't think memorising C's operator precedence rules would help in this situation, because the abstract knowledge that == has a higher precedence than & is disconnected from the task of checking some bits in a byte with an AND operation and an equality check, which is what you're concerned with when you're actually writing the code. In other words, just because you know something, it doesn't mean that knowledge will come to the front of your mind when it's useful. This fact about operator precedence comes up in this particular context, and it's important to associate that knowledge with similar contexts, and gaining that knowledge while actually in that context is as sure a way as any to make sure that association exists.

Unfortunately, deliberate attempts at learning often involve unrealistic contexts. In a general sense, education faces a problem of trying to teach people to do something without actually doing it, which doesn't really work. Trying to reconcile this with the fact that people with expertise in one domain are often able to use their experience to succeed in a new domain is sometimes known as the "problem of transfer". For example, in language classes, it's common that most time is spent doing something other than speaking the language, like memorising vocabulary or word endings for tests, which doesn't "transfer" to actual proficiency in the language. Sometimes this style of teaching is justified as providing some kind of "base" for when you actually start speaking the language in the future, but that's a bad excuse whose only purpose is to sound plausible to someone who doesn't know any better. That's not to say it isn't better than nothing, but it isn't very effective, which is easy to verify because it's easy to judge language proficiency.

Consequently, I think that writing small versions of "real software" is a good idea, because, while you'll only be dealing with a subset of the problems that the real thing would have to solve, those are at least real problems. Coming back to the problem of transfer, if you spend your time actually programming then you'll be practically drowning in transferrable knowledge. It's not necessarily that you have more knowledge, but that you've gained that knowledge from the perspective of a programmer. If you write just enough of a web server to make something appear in your browser, you haven't just learned something about HTTP, you've learned something about how to write that specific program, similar programs, and programs in general. You've learned about HTTP implementations, what kind of problems they need to solve, and how to solve them. Many components of a solution to these problems will exist in other programs as well, and being familiar with them in advance means that you won't have to start from scratch if you see something like that in the future. This is a much better outcome than having memorised some details about the protocol.

As another example, many people are familiar with various examples of cryptographic attacks, without knowing how to implement them. Many people know that the ECB cipher mode won't hide your penguins, but fewer could actually exploit it in the wild. A slightly more obscure example is that the number of people who know that double-DES would be essentially as weak as DES itself is a lot higher than the number of people who know how to break it. Just because someone knows something, that doesn't necessarily mean they can do anything, though this doesn't mean that knowing something without being able to put it into practice is worthless. It's useful to know which crypto is unsafe to use, so you can avoid using it, even if you don't become a cryptographer. Still, going as far as implementing attacks gives you a sense of just how trivial some of them are, which can drive home the importance of not rolling your own crypto, and instead using something safe, reliable, and well-understood. Knowing how something can go wrong can help you make sure you're doing the right thing.
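To make the double-encryption point concrete, here is a sketch of a meet-in-the-middle attack. It does not use DES: the cipher is a deliberately tiny toy I made up (16-bit blocks, 8-bit keys) so the whole thing fits on a page, and every name in it is hypothetical. The idea carries over, though: instead of trying all 65,536 key pairs, encrypt the plaintext under every first key, decrypt the ciphertext under every second key, and look for values that meet in the middle.

```c
#include <stdbool.h>
#include <stdint.h>

static uint16_t rotl16(uint16_t x, unsigned r) { return (uint16_t)((x << r) | (x >> (16 - r))); }
static uint16_t rotr16(uint16_t x, unsigned r) { return (uint16_t)((x >> r) | (x << (16 - r))); }

/* Toy block cipher: NOT secure, only invertible. */
uint16_t toy_encrypt(uint8_t key, uint16_t block)
{
    uint16_t k = (uint16_t)(key * 0x0101u);  /* key byte copied into both halves */
    block = (uint16_t)(block ^ k);
    block = rotl16(block, 5);
    return (uint16_t)(block + k);
}

uint16_t toy_decrypt(uint8_t key, uint16_t block)
{
    uint16_t k = (uint16_t)(key * 0x0101u);
    block = (uint16_t)(block - k);
    block = rotr16(block, 5);
    return (uint16_t)(block ^ k);
}

uint16_t double_encrypt(uint8_t k1, uint8_t k2, uint16_t block)
{
    return toy_encrypt(k2, toy_encrypt(k1, block));
}

/* Recover a key pair consistent with two known plaintext/ciphertext pairs.
   Costs roughly 256 encryptions plus 256 decryptions, not 256 * 256. */
bool recover_keys(const uint16_t plain[2], const uint16_t cipher[2],
                  uint8_t *k1_out, uint8_t *k2_out)
{
    static int16_t head[65536];  /* middle value -> a first key producing it, or -1 */
    int16_t next[256];           /* first key -> next first key with the same middle value */

    for (long i = 0; i < 65536; i++)
        head[i] = -1;

    /* Forward half: record the middle value for every possible first key. */
    for (int k1 = 0; k1 < 256; k1++) {
        uint16_t mid = toy_encrypt((uint8_t)k1, plain[0]);
        next[k1] = head[mid];
        head[mid] = (int16_t)k1;
    }

    /* Backward half: undo the second encryption for every possible second
       key, and confirm any meeting point against the second known pair. */
    for (int k2 = 0; k2 < 256; k2++) {
        uint16_t mid = toy_decrypt((uint8_t)k2, cipher[0]);
        for (int k1 = head[mid]; k1 >= 0; k1 = next[k1]) {
            if (double_encrypt((uint8_t)k1, (uint8_t)k2, plain[1]) == cipher[1]) {
                *k1_out = (uint8_t)k1;
                *k2_out = (uint8_t)k2;
                return true;
            }
        }
    }
    return false;
}
```

The keys returned are only guaranteed to be consistent with the two pairs supplied, so with a cipher this small they may differ from the ones originally used. Against a real cipher the table would need far more memory, which is the main practical cost of this attack.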

Practicing programming like this isn't easy (though it isn't prohibitively difficult) and it's a breadth-first approach in that the goal is to maximise context and understand how what you're doing fits into larger systems rather than to write the absolute best TCP stack in the shortest amount of time, or whatever. Even if you have one specific goal, the fastest way to achieve it would probably start with doing lots of different stuff and learning a wide range of things that will turn out to be useful. That said, you also eventually need to stop doing that, which we haven't covered how to do. In other words, this is an incomplete way to learn programming. It's something useful to do, but it hits its limits around an intermediate level, where you can manage with a decent variety of things, most of your programs are fairly small, you've accumulated a lot more knowledge than you think, and can read and understand and be involved in discussions about programming, but there's still a way to go until you can write useful thousands-of-lines programs or libraries (at least if you're doing systems programming — I'm not sure what qualifies as a similar benchmark for other subfields).

However, I'm not really qualified to say what to do next, because I'm not past that point myself, so I can't say for sure how you end up able to do those things. That's not to say I know nothing, because some things are obvious, and I can also extrapolate from my experiences somewhat, but it's to say I don't know everything for certain. One thing I don't know is how I even got to this point. I can't really describe how I learned C, or went from not knowing how to program to being able to, it just sort of happened. I learned the language and many computing concepts I'm familiar with without much active effort, and I've learned relatively little through deliberate study. This is also why I say I probably can't teach a complete beginner, because I don't even know how I got to this point, and the path someone else should take probably shouldn't involve copying every random and inefficient action I made that led me here over several years, even if it was possible to replicate that. Which it isn't.

Miscellaneous Remarks

Some artists continually start new projects when they get to the difficult part of their previous project. Starting new projects can be comfortable, because they've done it lots of times. Often, the hard part is the most important. Some people in the alternative keyboard layout community change their layout to one that's supposedly 1% better if a particular sequence of keys doesn't feel right, rather than improving at using the bad parts of the layout, until it no longer feels so bad. Some new programmers hop to new languages or frameworks when they get to something they find difficult, rather than learning how to do the hard parts. When weighing up options, give a medium to strong advantage to the thing you're already doing, or the tools you're already using, etc. That doesn't mean you should never change anything like that, but keep in mind that it's a small step backwards, and too many of those mean that you go very slowly indeed.

It's hard to judge how good someone is at programming. Universities are forced to pretend to do this, even if they don't actually have the resources to do that successfully. This means that at best, every student does the same thing despite that being pretty much inherently antithetical to how programming works, and they are assessed in a way that doesn't involve actually looking at the work they produced. This is, however, "objective", and therefore "fair". Interviews for many programming jobs are famously suboptimal, despite a strong financial incentive to judge skill accurately, and success stories of many other hiring processes being relatively widespread. This particular status quo is incomprehensible to me, and probably involves historical factors that I wasn't paying attention to when they happened, or wasn't around to see, but I find it interesting anyway.

The parts of programming that are the most universal are not necessarily the most important to everyone. The size of the mantissa in an IEEE 754 double precision float is certainly universal, and it's very easy to teach, because the answer is a single number, and learning it doesn't require anyone to actually do anything. However, it's not that important because if you're writing normal code, it's unlikely to be relevant to the task at hand, and if you're writing code that interacts with floats at a low level, for example implementing them in software, details like this will quickly become second nature as you refer to them so often. There's the argument that knowing a few basic facts can help get you started, but a better programmer who didn't know that fact would have an easier time. You can look up numbers on the job, but building skills, not so much.

It seems that more people than necessary learn things the hard way. Ideally it would be the responsibility of those who did it the hard way first to pave the way for everyone else to do it more easily, but that doesn't seem to have happened in general. One moderately successful example of this happening that I can think of is that maths PhD theses are usually a good way to learn something because they generally include a lot more detail than academic papers. Reading and understanding the papers would be hard, but reading the thesis is much more useful because the author generally clarifies, explains, and contextualises the papers they work with. It doesn't make it "easy" but it removes a little of the unnecessary difficulty of dealing with academic writing and its unfortunate traditions.

The Easy Way (or: "Helping People")

If you're learning how to do something, it's likely that there's someone better at it than you out there, so you should try and see what that person is doing differently. That doesn't make everything they do automatically right, which can lead to a slightly weird situation where you might know something they don't. For example, I saw someone write char alphabet[65] = "..." in a base64 implementation, presumably to make space for the null terminator, but that's not actually necessary. In some sense, this knowledge is useful, because it lets you write slightly clearer code, but in another sense, it's useless because even though I know it and they don't, they're still accomplishing more than I am.
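For illustration, here is what the array looks like without the extra byte. C (though not C++) lets a string literal initialise a char array that has room for everything except the terminator, in which case the terminator is simply dropped. The table below is the standard base64 alphabet, and the function name is an assumption for this sketch.

```c
#include <stddef.h>

/* Exactly 64 characters in an array of exactly 64 bytes: there is no
   null terminator, and none is needed, because the table is only ever
   indexed and never treated as a string. */
static const char alphabet[64] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode three input bytes as four base64 characters (no padding handled). */
void base64_encode_triple(const unsigned char in[3], char out[4])
{
    unsigned long bits = ((unsigned long)in[0] << 16)
                       | ((unsigned long)in[1] << 8)
                       | (unsigned long)in[2];
    out[0] = alphabet[(bits >> 18) & 63];
    out[1] = alphabet[(bits >> 12) & 63];
    out[2] = alphabet[(bits >> 6) & 63];
    out[3] = alphabet[bits & 63];
}
```

Some newer compilers can warn about the dropped terminator if asked to, which is perhaps one more reason people reach for 65.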

In all kinds of different fields I've seen top performers make basic mistakes. It took a long time for me to learn that my ability to avoid those mistakes didn't make me better than them, because I was also making much bigger and more important mistakes. For example, in a fighting game, ordering your inputs in a slightly unintuitive way would deal damage faster, and I knew how to do this trick. When I saw someone better than me do it the "normal" way, I'd feel smugly superior. Then, I'd lose, because I wasn't standing in the right place, choosing to fight at the correct times, etc. That was the most meaningful part of the game, and I'd been hung up on some small detail.

Programmers certainly aren't above getting hung up on small details either. The concept of "bikeshedding" is one example of that. Learning to ignore things that don't matter is important, but equally, it's hard to tell which things are important and fundamental in advance. If you ask people what the fundamentals of programming are, you'll almost certainly get the wrong answers, because most things that are actually fundamental aren't things that people engage with consciously. If someone is better than you at doing something, and you can tell that by looking at their work, then looking at their work is also a better way to find out what they're doing differently than asking them.

Work is both a noun and a verb, referring to a product and a process respectively. Both of these are important to look at, especially since code as it's written doesn't necessarily reflect the process of writing it. If someone tried three different strategies to implement something before finding one that worked, you'll only see the third one if you just look at the finished code. Version control makes it easy to edit development history, and people do, because a clean history with clear commit messages makes future development easier: operations like bisecting largely rely on it. Random changes made while trying to fix a bug won't be included. However, once that's happened, no one reading the commit log can tell that the way you fixed that bug was by randomly changing things.

That's a pretty crude example, but the point is that it's hard to tell what people actually did, purely by looking at the result. For example, I wouldn't have figured out that someone worked on a Wayland compositor over ssh from their laptop so that they could use breakpoints in their debugger without freezing their session, if no one had told me, unless I'd physically seen them doing it. There are lots of ways that people gather information about their programs, like debuggers and logging, and there are also many tools written for specific programs. Because of this, it's hard to teach this stuff in a way that's detailed enough that it will actually help you, but also general enough that it will be applicable beyond the specific examples used. Once again, the solution is to actually do it, but unlike programming itself, you don't automatically get detailed feedback about what you're doing wrong.

Spending time programming with or around other people can help you pick up useful things, including debugging techniques. For example, I introduced a group of more experienced programmers to clang's -fsanitize=address feature, just after having fixed a bug that would have been trivial to find with it enabled. I didn't read about this feature in a book or hear it from any kind of course, I just heard about it from people who were using it, and I'd guess that they did too. Another advantage of this is that often the things you end up talking about when doing actual work are the things that are relevant to that work, whereas when you ask people for advice in a vacuum, you often get answers which aren't relevant to anything, just because there's nothing meaningful you can really say with no context.

Still, programming isn't just a long list of easily describable tricks like "here is a useful compiler feature", "here's a useful debugger command", and so on. Slightly less easily describable is understanding of programming language idioms, like automatically knowing that writing for (size_t i = length; i--;) to iterate backwards over an array correctly avoids off-by-one errors. While there are lots of constructs like this in any language, that doesn't necessarily convey something like a good way to structure a particular program, for example, which gets further from the realm of "this is correct, this is wrong", to concerns which vary significantly in different circumstances. It takes an amount of time measured in seconds to check that your for-loop is correct, but the time to verify that the big-picture architecture of a program was well-designed can be measured in years.
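Here is the backwards-iteration idiom mentioned above in a complete function, as a small sketch (find_last is a name I've picked for the example). The condition tests i and then decrements it, so the body sees length - 1 down to 0 and the loop stops cleanly, even though size_t can never go below zero.

```c
#include <stddef.h>

/* Return the index of the last element equal to `needle`,
   or `length` if it does not occur. */
size_t find_last(const int *items, size_t length, int needle)
{
    for (size_t i = length; i--;) {
        if (items[i] == needle)
            return i;
    }
    return length;
}
```

The more obvious-looking `for (size_t i = length - 1; i >= 0; i--)` never terminates, because an unsigned i is always >= 0, and it also starts out of bounds when length is 0.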

Beyond looking at what more experienced people are doing, it's also helpful to have them look at what you're doing and give you feedback. I said before that it was a bad thing that no one reads your code as a student. This is because if you don't have reasonable code to use as an example, and you also don't have feedback from someone who knows better, then the code produced can spiral into oblivion as some subset of available tools are used in the worst possible way to do something that technically solves the problem. Usually when you look at this stuff, you'll say something like "why didn't they just ..." and the answer is that they don't know that it exists. I've had evidence of code which manually recreates for-loops sent to me by an engineering student. Imagine how that person is solving more complicated problems than iterating over an array.

(The code also incorrectly assigned the output of fgetc to a char variable, breaking the subsequent EOF comparison, and led to an infinite loop if the char type was unsigned, but that part was written by the instructor.)
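For reference, a correct version of that kind of loop looks something like this sketch (count_bytes is a made-up example, not the code I was sent). The only thing that matters is the type of c.

```c
#include <stdio.h>

/* Count bytes until end of file. `c` must be an int: fgetc returns
   every possible byte as a non-negative value, plus the distinct
   negative value EOF. Squeezing that into a char either makes the
   byte 0xff look like EOF (signed char) or makes EOF impossible to
   match (unsigned char). */
long count_bytes(FILE *in)
{
    long count = 0;
    int c;
    while ((c = fgetc(in)) != EOF)
        count++;
    return count;
}
```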

There are lots of problems which are completely trivial for someone with more knowledge to fix. For example, I've dealt with the above fgetc problem before, which is why I could immediately see it was wrong, and immediately see how to correct it. Lots of problems have already been solved, and you just need someone to tell you about the solution. Lots of people doing things the hard way are doing that because no one told them about some library function, for example. Sharing the overall weight of the problems being dealt with over a group of people means that each of them has to solve fewer of them, as long as they share their solutions. This can come in the form of feedback, as we described above, where someone struggling with a problem asks for help, and finds out that it's already been solved. If people keep asking you the same question about how to do something, or you think some information will be important later, consider writing it down. This rare practice is known as "documentation". That way, people can find the solution to a problem before they run into it.

Having a good relationship, or at least any at all, with people who can trivially fix some subset of your problems, is very useful. There's a compounding effect, where the time you waste trying to fix something trivial could have been spent doing something not only more useful, but also more educational. You can combat this to some extent by having multiple things you're working on at the same time, so you can move to another one if you get stuck, but if you never get unstuck, you're eventually going to run out of projects to switch to. Right now I'm sitting here not really knowing how to fix a particular problem, but luckily, I know someone I can ask who almost certainly knows the one line I need to add to a build script to fix it. Still, I don't want to waste their time, so I'll at least have a go myself in case I can. I've sometimes heard people say something like "always try and fix it on your own for 15 minutes before asking someone else", which might be a good rule of thumb for making sure something is as hard as you think it is before involving others. From there, hopefully enough obstacles will be removed that you're able to exercise your own judgement and focus on what you wanted to do in the first place.

This doesn't say anything about tacit knowledge, or even anything nontrivial, but that's kind of the point. Those are things you learn from other people, rather than things that can be conveyed over text. Pretty much anyone who's good at anything will emphasise the people they learnt from, whether that's a sports coach, a teacher, or a particular team at a job they had. It's possible to find those kinds of people, and learn from them by being around them and working with them. Exactly how to do that will vary so much from person to person, each with a unique set of circumstances and constraints, that there's nothing useful to say without taking all of that into account. The rest is left as an exercise for the reader.