Coroutine
A coroutine is a function or routine that can pause its execution, yield control back to its caller, and later be resumed from where it left off. Unlike traditional subroutines, which run from start to finish, coroutines have multiple entry and exit points, enabling cooperative multitasking and asynchronous processing. Coroutines can be implemented using a variety of techniques, each offering different levels of control, complexity, and performance.
Coroutines can suspend execution (yield) at certain points and resume from there later. They rely on the program's flow or a scheduler to determine when to yield or resume, instead of being preemptively managed by the operating system like threads. When a coroutine is paused, its state (e.g. variables, execution context) is saved, so it can pick up exactly where it left off. Coroutines are more lightweight than threads or processes because they don't require the overhead of system-managed context switching.
Coroutines are widely used in scenarios that require efficient handling of asynchronous or I/O-bound tasks. They allow for non-blocking operations and offer lightweight multitasking.
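To make the suspend/resume cycle described above concrete, here is a minimal sketch in C of cooperative scheduling without any library support. The names task, task_resume, and run_round_robin are hypothetical and chosen only for illustration. Each task keeps its state in a struct, does one slice of work per call, and then returns to a round-robin loop that plays the role of the scheduler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical task type: the saved state of one hand-written coroutine. */
typedef struct {
    const char *name;
    int next;   /* next value to emit; survives across suspensions */
    int limit;  /* the task is finished once next reaches limit */
} task;

/* Runs one slice of work, then returns to the caller (the "yield").
   Appends "name:value " to trace. Returns false once the task is finished. */
bool task_resume(task *t, char *trace, size_t trace_size, size_t *used) {
    if (t->next >= t->limit)
        return false;
    int n = snprintf(trace + *used, trace_size - *used, "%s:%d ", t->name, t->next);
    if (n > 0 && (size_t)n < trace_size - *used)
        *used += (size_t)n;  /* advance only if the text fit completely */
    t->next++;
    return true;
}

/* Round-robin scheduler: resumes every unfinished task in turn until all
   of them are done. trace_size must be at least 1. */
void run_round_robin(task *tasks, size_t count, char *trace, size_t trace_size) {
    size_t used = 0;
    bool any_ran = true;
    trace[0] = '\0';
    while (any_ran) {
        any_ran = false;
        for (size_t i = 0; i < count; i++) {
            if (task_resume(&tasks[i], trace, trace_size, &used))
                any_ran = true;
        }
    }
}
```

With two tasks {"A", 0, 2} and {"B", 0, 3}, the trace buffer ends up as "A:0 B:0 A:1 B:1 B:2 ". The tasks interleave because each one gives up control voluntarily, and no operating-system thread is involved.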
Stackful vs. Stackless
Stackful coroutines
Stackful coroutines are similar to traditional threads: each coroutine has its own separate stack. When a stackful coroutine is paused (typically via yield), its entire state, including the call stack, is saved, so it can resume exactly where it left off with its local state preserved. Because the whole stack is kept, a stackful coroutine can yield from inside nested or recursive function calls, and each coroutine's stack is isolated from the others.
Languages like Lua and C++ (with libraries like Boost.Coroutine) use stackful coroutines.
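The defining property of a stackful coroutine, yielding from inside nested calls, can be sketched in C with the POSIX ucontext functions. This is an illustrative sketch rather than a reference implementation: it assumes a system that provides ucontext.h (for example Linux with glibc), the 64 KB stack size is an arbitrary assumption, and coro_yield, count_down, and collect_yields are hypothetical names.

```c
#include <stdlib.h>
#include <ucontext.h>

#define CO_STACK_SIZE (64 * 1024)  /* assumed size, generous for this demo */

static ucontext_t caller_ctx;  /* context of whoever resumed the coroutine */
static ucontext_t co_ctx;      /* context of the coroutine itself */
static int yielded_value;
static int co_finished;

/* Suspend the coroutine and hand a value back to the caller. */
static void coro_yield(int value) {
    yielded_value = value;
    swapcontext(&co_ctx, &caller_ctx);
}

/* Recursive helper: yields 3, 2, 1, each time one call deeper,
   while the outer frames stay alive on the coroutine's own stack. */
static void count_down(int depth) {
    if (depth == 0)
        return;
    coro_yield(depth);
    count_down(depth - 1);
}

static void co_body(void) {
    count_down(3);
    co_finished = 1;
}

/* Runs the coroutine to completion, storing up to max yielded values in out.
   Returns the number of values stored, or -1 if setup fails. */
int collect_yields(int *out, int max) {
    void *stack = malloc(CO_STACK_SIZE);
    if (stack == NULL)
        return -1;
    if (getcontext(&co_ctx) == -1) {
        free(stack);
        return -1;
    }
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = CO_STACK_SIZE;
    co_ctx.uc_link = &caller_ctx;  /* continue here when co_body returns */
    makecontext(&co_ctx, co_body, 0);

    int n = 0;
    co_finished = 0;
    for (;;) {
        swapcontext(&caller_ctx, &co_ctx);  /* resume the coroutine */
        if (co_finished)
            break;
        if (n < max)
            out[n++] = yielded_value;
    }
    free(stack);
    return n;
}
```

Calling collect_yields(values, 8) should return 3 and fill values with 3, 2, 1. The second and third yields happen two and three calls deep inside count_down, which works only because the coroutine's whole call stack is preserved while it is suspended.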
Stackless coroutines
Stackless coroutines, unlike stackful ones, do not have their own call stack. Instead, they run on the stack of whoever resumes them and save only minimal state when yielding, such as local variables and the current execution position. This makes them lightweight and memory-efficient, which is ideal for handling many concurrent tasks. Because there is no full call stack to save and restore, context switching is fast, and stackless coroutines are often simpler to implement and manage. The trade-off is that a stackless coroutine can only suspend from within its own body: an ordinary function it calls cannot yield on its behalf, so every function on the path to a suspension point must itself be a coroutine.
Languages like Python and JavaScript use stackless coroutines.
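The idea of saving only variables and an execution position can be sketched in plain C with the well-known switch-based technique, where a case label is placed inside a loop so that a later call jumps straight back to the suspension point. This is a hand-written illustration, not how Python or JavaScript implement their coroutines, and counter_coro and counter_next are hypothetical names.

```c
/* Hypothetical coroutine frame: everything that must survive a suspension. */
typedef struct {
    int resume_point;  /* which case label to continue from */
    int i;             /* loop counter, kept here instead of on the stack */
    int limit;         /* number of values to produce */
} counter_coro;

/* Resumes the coroutine. Stores the next square in *out and returns 1,
   or returns 0 once the sequence is exhausted. */
int counter_next(counter_coro *c, int *out) {
    switch (c->resume_point) {
    case 0:  /* first call: start from the top */
        for (c->i = 0; c->i < c->limit; c->i++) {
            *out = c->i * c->i;
            c->resume_point = 1;
            return 1;  /* suspend: only the struct keeps our state */
    case 1:;           /* resume here, at the end of the loop body */
        }
        c->resume_point = 2;
        /* fall through */
    default:  /* finished: every later call lands here */
        return 0;
    }
}
```

Starting from counter_coro c = {0, 0, 3}, successive calls to counter_next produce 0, 1, 4 and then return 0. The suspension is an ordinary return from counter_next itself, which is why a helper function called from the loop could not suspend the coroutine on its behalf.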
Implementations
Coroutines can be implemented in various ways, ranging from low-level approaches like setjmp/longjmp and ucontext in C, which save and restore execution state for context switching, to more modern, high-level approaches like async/await in languages like Python and JavaScript, which manage suspension and resumption automatically via event loops. Low-level techniques, including assembly and manual stack management, offer fine control but are complex and error-prone, while green threads and coroutine libraries provide lightweight, cooperative multitasking with simpler management. OS-level threads offer scalability but at the cost of higher overhead. Each method balances control, complexity, performance, and portability differently, with higher-level solutions generally being easier to use and maintain.
setjmp / longjmp
While setjmp/longjmp can be used to simulate coroutines in C or C++, they come with many challenges, such as portability issues, complex state management, stack corruption, and lack of proper resource management.
#include <setjmp.h>
#include <stdio.h>

// define a global jmp_buf variable to store the execution context
jmp_buf buf;

void function(void) {
    printf("calling setjmp\n");
    // saves the current execution context; the first call returns 0,
    // and a later longjmp makes it return again with a non-zero value
    int res = setjmp(buf);
    printf("setjmp res: %d\n", res);
    if (res != 0) {
        printf("done\n");
        return;
    }
    // calling longjmp to jump back to the setjmp point with a non-zero return value
    printf("calling longjmp\n");
    longjmp(buf, 1);
}

int main(void) {
    function();
    return 0;
}
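The stack-related challenge mentioned above can be illustrated with a second small sketch. It shows the one direction in which longjmp is well defined: jumping out of nested calls back to a function that is still active. The names inner and jump_out_of_nested_calls are hypothetical and used only for this illustration.

```c
#include <setjmp.h>

static jmp_buf resume_point;   /* context saved by the outer function */
static int calls_made;         /* how many nested calls ran before the jump */
static int jumped_from_depth;  /* recorded just before longjmp */

/* Recurses three levels deep, then jumps straight back to the setjmp site. */
static void inner(int depth) {
    calls_made++;
    if (depth == 3) {
        jumped_from_depth = depth;
        longjmp(resume_point, 1);  /* discards the frames of inner(1..3) */
    }
    inner(depth + 1);
}

/* Returns the depth the jump came from, or -1 if no jump happened. */
int jump_out_of_nested_calls(void) {
    calls_made = 0;
    jumped_from_depth = 0;
    if (setjmp(resume_point) != 0)
        return jumped_from_depth;  /* reached via longjmp */
    inner(1);
    return -1;  /* not reached: inner never returns normally */
}
```

Here jump_out_of_nested_calls() returns 3 after three nested calls. The frames of inner are discarded by the jump, so there is nothing left to resume inside them. Jumping in the opposite direction, into a function that has already returned, is undefined behavior, which is why coroutines built only on setjmp/longjmp need extra stack management.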
ucontext
ucontext in C provides functions to get and set execution contexts and control the flow of execution, similar to setjmp/longjmp but with more control over the state, including registers and stack.
#include <stdlib.h>
#include <stddef.h>
#include <stdio.h>
#include <ucontext.h>

// a 2 KB stack is too small for printf on many systems; 64 KB is a safer demo size
#define STACK_SIZE (64 * 1024)

ucontext_t main_ctx;
ucontext_t ctx[3];

void func1(void) {
    printf("func1 first\n");
    // saves the current context into the first argument
    // and switches to the context provided as the second argument (main context)
    swapcontext(&ctx[0], &main_ctx);
    printf("func1 second\n");
}

void func2(void) {
    printf("func2 first\n");
    swapcontext(&ctx[1], &main_ctx);
    printf("func2 second\n");
}

void func3(void) {
    printf("func3 first\n");
    swapcontext(&ctx[2], &main_ctx);
    printf("func3 second\n");
}

int main(void) {
    // init main context
    getcontext(&main_ctx);

    // setup context for func1
    getcontext(&ctx[0]);
    ctx[0].uc_link = &main_ctx; // after func1 finishes, it will return to main
    ctx[0].uc_stack.ss_sp = malloc(STACK_SIZE); // allocate stack for func1, the size depends on what we do in the function
    ctx[0].uc_stack.ss_size = STACK_SIZE;
    makecontext(&ctx[0], func1, 0); // set func1 to run in this context

    // setup context for func2
    getcontext(&ctx[1]);
    ctx[1].uc_link = &main_ctx;
    ctx[1].uc_stack.ss_sp = malloc(STACK_SIZE);
    ctx[1].uc_stack.ss_size = STACK_SIZE;
    makecontext(&ctx[1], func2, 0);

    // setup context for func3
    getcontext(&ctx[2]);
    ctx[2].uc_link = &main_ctx;
    ctx[2].uc_stack.ss_sp = malloc(STACK_SIZE);
    ctx[2].uc_stack.ss_size = STACK_SIZE;
    makecontext(&ctx[2], func3, 0);

    // acts as a simple scheduler: the first pass starts each function,
    // the second pass resumes each one after its swapcontext call
    int i;
    for (i = 0; i < 6; i++) {
        printf("switching from main to ctx %d\n", i % 3);
        swapcontext(&main_ctx, &ctx[i % 3]);
    }
    printf("done\n");
    return 0;
}
Output:
switching from main to ctx 0
func1 first
switching from main to ctx 1
func2 first
switching from main to ctx 2
func3 first
switching from main to ctx 0
func1 second
switching from main to ctx 1
func2 second
switching from main to ctx 2
func3 second
done
Assembly
Coroutines can also be implemented using assembly. In assembly, we need to manage things like saving and restoring registers, switching stack pointers, and maintaining the state of execution across different coroutine calls or execution contexts.
; assumes x86-64
; non-functional, just to show some ideas

; coroutine_start: Coroutine entry point
coroutine_start:
    ; Save current state (e.g., registers, stack pointer)
    push rbx                    ; Save registers
    push rdi
    push rsi
    push rdx

    ; Simulate some coroutine work
    ; Your coroutine code goes here

    ; Switch to another coroutine (restore the other coroutine's state)
    pop rdx                     ; Restore state
    pop rsi
    pop rdi
    pop rbx

    ; Resume the coroutine's execution
    ret                         ; Return from the coroutine, or yield control

; coroutine_switch: Coroutine switch function
coroutine_switch:
    ; Save the current coroutine state
    mov rdi, [state_ptr]        ; Load state pointer
    mov [rdi], rsp              ; Save the current stack pointer

    ; Switch to the other coroutine's state
    mov rsi, [other_state_ptr]  ; Load other state pointer
    mov rsp, [rsi]              ; Restore the other coroutine's stack pointer

    ; Return control to the other coroutine
    ret