Concurrency. Parallelism. These terms get used a lot in development discussions, and especially in Elixir and Erlang circles. But what do they mean?
Let's talk about them using burritos!
The Burrito Shop 🌯🌯
Suppose I work at a burrito shop. Two customers come in, and each orders a burrito.
How do I get the work done?
Burrito Strategy 1: Serial
To do the work serially, I'd get out one tortilla, fill it with yummy stuff, and close it up. Then I'd get out the next tortilla and do the same.
This is neither concurrent nor parallel.
Burrito Strategy 2: Concurrent
To do the work concurrently, I'd get out one tortilla and spread it out. Then I'd get out another tortilla and spread it out.
Now I am making two burritos concurrently. I have two burritos in a partial state of completion at the same moment.
I can switch between these burritos, doing a bit of work on the first, then the second, then the first, then the second. However, because I only have one set of hands, I can only be actively working on one burrito at any given moment.
This doesn't increase the amount of work I can do per second, but it may increase the number of productive seconds I get. For instance, suppose I can't finish the first burrito until my coworker makes more guacamole. With a concurrent strategy, I'm allowed to put it aside for a minute and work on the second burrito. Without concurrency, I'd just have to stand there, staring awkwardly at the customer until the guacamole was ready.
Burrito Strategy 3: Parallel
To do the work in parallel, I'd get a coworker to help me.
We'd each spread out a tortilla. While I'm filling the first one, my coworker would be filling the second.
If you took a photo at that moment, you'd notice that there are two burritos in a partial state of completion. So the work is concurrent.
But there are also two burritos being actively worked on at that moment. This means the work is being done in parallel. Now we're actually getting more burrito-making done per second.
Notice that in these scenarios, the ability to do work concurrently is a precondition of doing it in parallel. For example, if there isn't enough counter space for two burritos, my coworker can't work at the same time as I do. But if I don't have a coworker, all the counter space in the world won't turn my concurrency into parallelism.
Back To Computers
Computer operating systems generally supported concurrency long before we got true parallelism. If you had one CPU, you could still be running multiple programs, and each one appeared to be running simultaneously. In truth, the computer was switching between tasks. But it did it so quickly that the individual bursts of work seemed, to slow human eyes, to blur into one continuous activity like the pages of a flip book.
Later, when we got multiple CPUs per machine, the computer could actually be doing multiple tasks at the same time. But if it has 4 CPUs and 10 tasks to do, it's still switching between tasks to create the same illusion.
Switching between tasks is called "scheduling" - the computer has to decide which task(s) to work on at any given moment. And how it's done is important.
Two ways of scheduling
There is more than one way for scheduling to be done.
One way is called "cooperative multitasking". This means that each task must be polite, and say "I'm finished, now it's someone else's turn." This is nice if you're designing a single process, because you know it will never be interrupted unexpectedly.
But cooperative scheduling also means that a greedy, slow, or hung process can bring the whole system to a halt. Everyone is waiting on that process to hand over control, and it never does.
"Preemptive multitasking" means having a scheduler involved. The scheduler says, essentially, "you processes have to take turns. Each turn is N microseconds long. If you can finish your work in that much time, great. If not, I'll force you to pause and give someone else the CPU. You'll have to get back in line and continue when it's your turn again."
This introduces some overhead: every time a process has to pause what it's doing, a little time is wasted. So in terms of raw speed, cooperative multitasking is better, because it wastes less time switching tasks.
But preemptive multitasking makes the overall system performance more predictable, since one big (or buggy) task can't delay the smaller tasks very much. You don't want one inefficiently-written program, or one unexpectedly large task, to make the whole system hang.
For this reason, nearly all operating systems today use preemptive multitasking.
The Erlang Advantage
The Erlang VM (known as "The BEAM"), on which all Erlang and Elixir code runs, is sometimes called an operating system for your code. Like a system OS, it allows for parallel execution and uses preemptive multitasking (basically - see the BEAM book for caveats). This is in contrast to, for example, Node.js, which is single-threaded and uses cooperative multitasking. Node.js juggles IO tasks efficiently, but a single slow computation could bring all its work to a halt.
The BEAM loses a bit of raw speed by switching tasks so much, but the payoff is that it performs more consistently. For example, suppose you have a Phoenix application where people upload data for processing, and you respond with the processed data. Or suppose you have to parse large XML API responses in order to respond.
Because Phoenix is fronted by Cowboy, every incoming request gets its own BEAM process. If most requests are small, but you occasionally get an expensive request, it's no problem; the other incoming requests won't be affected at all, because the heavy process will keep getting paused to give the other ones a turn. That's true whether the "heavy" process is waiting on IO or performing a long computation.
And if you have multiple cores, the BEAM will use them all to get work done in parallel. The more cores you have, the more this outweighs raw speed.
Of course, most web services will do heavy computation in the background and respond asynchronously. But when you're using the BEAM, that's an option and not a requirement; it won't kill your site to do it synchronously. You can make the decision based on user experience, not uptime.
If you want to know more, Bryan Hunter's talk, "What Every Node.js developer needs to know about Elixir", has a nice explanation of concurrency, parallelism, and the BEAM's scheduler starting here.
Saša Jurić's talk "Solid Ground", published since I wrote this article, talks about the BEAM's concurrency model as the basis for predictable performance and reliability. A nice quote:
"Your runtime is your foundation; it is your ground. When your ground is solid, then you can move on top of it with a lot of confidence at a steady, reasonable pace, and you can start building your systems and confidently watch them grow way beyond your original plans and imagination... Ask not only what your language syntax and your libraries and frameworks can do for you, but first and foremost ask what your runtime can do for you."