Implementing Generators (yield) in C

The end result works better than you might expect.

Feb 19, 2024

Generators are a subset of coroutines that can be used to implement iterators, streams and state machines in a super convenient manner. I love them in Python, C# and JavaScript, where you’ll recognize them as functions that use the yield keyword instead of regular return.

While C does not natively support generators, we can implement them in a somewhat clever way by abusing a quirk of how the switch statement works1.

Why would you want to implement generators in C? I can think of a few reasons:

You want to have a better understanding of how generators (or coroutines in general) work.
You use C as a compilation target for your programming language and want to add support for yield.
They can be really convenient!

Fibonacci Numbers

As a running example we’ll be turning a function that prints the fibonacci numbers into a generator function that yields the fibonacci numbers on demand. Here’s our starting point:

void fib(void) {
  int a = 0;
  int b = 1;
  int n = 1;
  while (1) {
    printf("%d\n", a);
    a = b;
    b = n;
    n = a + b;
  }
}

This is a simple iterative fibonacci implementation that prints out an “infinite”2 stream of the fibonacci numbers. Hopefully nothing too surprising.

Instead of having fib print the numbers out directly (or store them in a list or whatever else), we want it to lazily produce the next number in the sequence on demand, with the caller then doing whatever they want with it.

To achieve this we’ll turn fib into a coroutine, meaning a function that can be suspended and resumed3.

Stack Frames

First thing we need to do is store fib’s local variables outside of it. We can’t have fib’s local variables in registers or the regular C stack because when we suspend it we’re returning back to the caller, which will trample all over those registers and the stack before we get to resume fib.

The way to do this in C is to manually put all those local variables into a separate struct, reifying fib’s stack frame, and storing it elsewhere4:

typedef struct {
  int a;
  int b;
  int n;
} fib_frame;

void fib(fib_frame* f) {
  f->a = 0;
  f->b = 1;
  f->n = 1;
  while (1) {
    printf("%d\n", f->a);
    f->a = f->b;
    f->b = f->n;
    f->n = f->a + f->b;
  }
}

// usage
fib_frame f = {};
fib(&f);

All we’ve managed to do so far is make fib less efficient and more annoying to call, but now we have fib’s stack frame stored outside of the machine’s stack, which we will need to implement suspension.

Suspending a Function

In order to suspend fib, we need to store some additional information in its stack frame: the equivalent of the program counter. We need to know in which “instruction” fib stopped at after the yield:

typedef struct {
  // program counter (sort of)
  int _pc;
  int a;
  int b;
  int n;
} fib_frame;

We’ll also change fib so that instead of printing numbers it returns them:

int fib(fib_frame *f) {
  f->a = 0;
  f->b = 1;
  f->n = 1;
  while (1) {
    return f->a;
    // we want to resume here somehow
    f->a = f->b;
    f->b = f->n;
    f->n = f->a + f->b;
  }
  return -1; // we'll never get here
}

We need to store the address of the instruction after the return, and jump directly there the next time we call fib. If you’re thinking goto, you’re on the right track. For example, we can do the following:

int fib(fib_frame *f) {
  if (f->_pc == 1) { goto resume_1; }
  f->a = 0;
  f->b = 1;
  f->n = 1;
  while (1) {
    // store the resumption point before returning
    f->_pc = 1;
    return f->a;
  resume_1:
    f->a = f->b;
    f->b = f->n;
    f->n = f->a + f->b;
  }
  return -1; // we'll never get here
}

And then we can use fib as follows:

int main() {
  fib_frame f = {};
  for (int i = 0; i < 10; i++) {
    printf("%d\n", fib(&f));
  }
  return 0;
}

The fib function just keeps producing the next number in the sequence each time we call it, so if we call it 10 times, we get the first 10 fibonacci numbers.

Labels in C aren’t first class, so there is no way to store resume_1 directly in _pc, and goto in C cannot be used to jump to a numerical address5, hence the conditional at the start translating the number 1 to the appropriate jump.

If we had more yield points, the same pattern would repeat:

if (f->_pc == X) { goto resume_X; }
if (f->_pc == Y) { goto resume_Y; }
...
f->_pc = X; return something; resume_X:  // yield
...
f->_pc = Y; return whatever;  resume_Y:  // yield

Those ifs could be a switch statement instead:

switch (f->_pc) { 
  case X: goto resume_X;
  case Y: goto resume_Y;
}

And those yields are asking for a macro:

#define yield(PC, X, R) PC = X; return R; resume_##X:

// usage
while (1) {
  yield(f->_pc, 1, f->a);
  ...

If we standardize on the name of the frame parameter (f) and its program counter field (_pc), we can remove the first argument:

#define yield(X, R) f->_pc = X; return R; resume_##X:

Much better, here’s the full example now:

int fib(fib_frame *f) {
  switch (f->_pc) { 
    case 1: goto resume_1;
  }
  f->a = 0;
  f->b = 1;
  f->n = 1;
  while (1) {
    yield(1, f->a);
    f->a = f->b;
    f->b = f->n;
    f->n = f->a + f->b;
  }
  return -1; // we'll never get here
}

Still, needing to spell out the jumps at the start of the function and explicitly numbering the yield points is a bit annoying. Good enough for a compiler backend, but annoying for direct usage in C, the explicit stack frame is bad enough… However, we can do better by taking advantage of a feature that’s usually considered a wart in C.

Switch Fall-Through

Most languages, including newer members of the C-family, have their equivalent of a switch statement work like this:

switch (x) {
  case 1 {
    // do this for case 1
  }
  case 2 {
    // do this for case 2
  }
  case 3 {
    // do this for case 3
  }
}

C’s switch statement doesn’t work like this, instead it works like this:

switch (x) {
  case 1:
    // do this for case 1
  case 2:
    // do this also for case 1
    // do this for case 2
  case 3:
    // do this also for case 1
    // do this also for case 2
    // do this for case 3
}

This is called fall-through and is a common beginner trap and source of bugs for sleep deprived programmers that forget to put a break at the end of each case. C# retains the same syntax but always requires an explicit break to avoid this issue.

Basically, those cases work like our resume_X label from before. You can even put a case inside a block from another case and C won’t bat an eye:

// this works, what the hell C!
switch (x) {
  case 1: while (1) {
    case 2: if (0) {
      case 3: 
        ..
    }
  }
}

Nasty, but we can take advantage of this. Instead of the switch at the start of the function mapping a number to a goto statement, we can use it to jump directly:

int fib(fib_frame *f) {
  switch (f->_pc) { 
  case 0:
    f->a = 0;
    f->b = 1;
    f->n = 1;
    while (1) {
      f->_pc = 1;
      return f->a;
    case 1:
      f->a = f->b;
      f->b = f->n;
      f->n = f->a + f->b;
    }
  }
  return -1; // we'll never get here
}

Might seem like a small change, but the start of the function is now always the same, regardless of the number of yields:

switch (f->_pc) { 
  case 0:

This not only reduces the amount of boilerplate, but it also enables us to handle the numbering of the yield points automatically6. Add another pair of convenience macros and we’re done7:

#define generator_init() switch (f->_pc) { case 0:
#define generator_done(R) } return R
#define yield(R) f->_pc = __LINE__; return R; case __LINE__:

typedef struct {
  int _pc;
  int a;
  int b;
  int n;
} fib_frame;

int fib(fib_frame *f) {
  generator_init();
  f->a = 0;
  f->b = 1;
  f->n = 1;
  while (1) {
    yield(f->a);
    f->a = f->b;
    f->b = f->n;
    f->n = f->a + f->b;
  }
  generator_done(-1);
}

This blog post could stop here but sadly I have no self control.

Bonus: Iterators

The above is a perfectly serviceable generator function but it doesn’t work quite the same way as generator functions in Python or C# do. If our generator function has any input arguments, we need to initialize them directly in the frame structure, which is a bit unusual to say the least. Easy fix, split the function in two, and lets rename frame to iterator while we’re at it:

typedef struct {
  // same fields as before, new name
} fib_iterator;

int fib_next(fib_iterator *f) {
  // same as our old fib
}

fib_iterator fib(void) {
  // no arguments so zero initialized
  // should have used a different example...
  return (fib_iterator){};
}

// no need to initialize the struct manually anymore
fib_iterator f = fib();
for(int i = 0; i < 10; i++) {
  printf("%d\n", fib_next(&f));
}

Now it looks more familiar, but it’s the same thing really.

Bonus: Foreach

Other languages also have foreach loops and various ways of combining iterators. Right now we don’t have a standardized way for an iterator to indicate that it has stopped, and our running example (fib) never terminates, so we need a new one:

#define generator_init() switch (i->_pc) { case 0:
#define generator_done() } return false
#define yield(R) i->_pc = __LINE__; *r = R; return true; case __LINE__:

typedef struct {
  int *array;
  int len;
  int idx;
  int _pc;
} rev_iterator;

rev_iterator rev(int *array, int len) {
  return (rev_iterator){array, len};
}

bool rev_next(rev_iterator *i, int *r) {
  generator_init();
  for (i->idx = i->len-1; i->idx >= 0; i->idx--) {
    yield(i->array[i->idx]);
  }
  generator_done();
}

The new example is a simple reverse iterator, yielding the elements of an array starting from the end and going backwards.

The next function now returns a boolean indicating if the iterator produced a value, while storing the actual output in an out parameter. We also adjust the macros accordingly. Now we can make a “foreach” loop:

rev_iterator i = rev((int[]){1, 2, 3}, 3);
for(int v; rev_next(&i, &v);) {
  printf("%d\n", v);
}

Bonus: Itertools

Ok ok, I swear this is the last one. Other languages let us combine iterators with fancy generic functions like zip, take_while, etc. We can also do those in C… somewhat.

First change is that we need to have a shared interface for our iterators:

typedef struct iterator iterator;
struct iterator {
  int _pc;
  bool (*next)(iterator*, void*);
};

typedef struct {
  iterator base;
  int *array;
  int len;
  int idx;
} rev_iterator;

Yup, we’re making objects.

We also need to update the constructor function8 and the macros. The rev_next function from before stays the same:

#define generator_init() switch (i->base._pc) { case 0:
#define generator_done() } return 0
#define yield(R) i->base._pc = __LINE__; *r = R; return 1; case __LINE__:

rev_iterator rev(int *array, int len) {
  return (rev_iterator){
    .base.next = (bool (*)(iterator*, void*))rev_next,
    .array = array, 
    .len = len,
  };
}

This shared interface allows us to implement a yield_from macro:

#define iter_next(I, R) ((iterator*)(I))->next((iterator*)(I), (R))
#define yield_from(I) while(iter_next((I), r)) { i->base._pc = __LINE__; return 1; case __LINE__:; }

And with it, we can implement a sequence iterator:

typedef struct {
  iterator base;
  iterator* fst;
  iterator* snd;
} seq_iterator;

bool seq_next(seq_iterator* i, void* r) {
  generator_init();
  yield_from(i->fst); 
  yield_from(i->snd);
  generator_done();
}

seq_iterator seq(iterator* fst, iterator* snd) {
  return (seq_iterator) {
    .base.next = (bool (*)(iterator*, void*))seq_next,
    .fst = fst,
    .snd = snd,
  }
}

Now we can pretend we’re a modern programming language with streams:

rev_iterator r1 = rev(int[]{1, 2, 3}, 3);
rev_iterator r2 = rev(int[]{4, 5, 6}, 3);
seq_iterator s = seq((iterator*)&r2, (iterator*)&r1);
for(int v; seq_next(&s, &v)) {
  printf("%d\n", v);
}

If we change the generator constructor functions to heap allocate and return opaque pointers9, we’ve reached parity with all these new language whippersnappers:

iterator* r1 = rev(int[]{1, 2, 3}, 3);
iterator* r2 = rev(int[]{4, 5, 6}, 3);
iterator* s = seq(r2, r1);
for(int v; iter_next(s, &v)) {
  printf("%d\n", v);
}

But in all seriousness this is getting out of hand 😬.

Conclusion

Generators in C with only mild macro abuse. This was quite a fun post to write. I don’t think I’d bother with generators in C most of the time (certainly not with fancy combinators), but it is nice the capability is there if I need it.

I’ll definitely use this for my toy language backend though, generators are awesome.

The same quirk used in Duff’s device.

There will eventually be an overflow and the function will start printing out garbage.

There are two main kinds of coroutine implementations: stackless and stackful. I won’t go into the distinction here because it gets confusing and would probably deserve an own post, but for generators stackless works well.

In Python, C# and JavaScript the local variables are stored as private fields of the iterator object returned by the function.

Storing the address of labels and jumping to them, known as a computed goto, is available as a GCC extension but it is not part of the C standard.

Only one yield per line is a small price to pay.

The two extra macros are really not necessary. I wrote them just in case you find the “switch (x) { case 0:” unbearably ugly for some reason.

If that function pointer cast makes you feel icky, add an additional wrapper function instead.

iterator* seq(iterator* fst, iterator* snd) {
  seq_iterator* i = malloc(sizeof(seq_iterator));
  *i = (seq_iterator){
    .base.next = (bool (*)(iterator*, void*))seq_next,
    .fst = fst,
    .snd = snd,
  };
  return (iterator*)i;
}

Burning the Midnight Coffee

Discussion about this post