Eli Arbel - *N Async, the next generation

Aug 10, 2016 • Eli Arbel

In the previous installment, I discussed how to use iterators (yield return) to create async methods. This time, we’re about to do almost the opposite – use async methods to implement async iterators.

Here’s what it looks like:

What are async iterators?

In .NET we use the IEnumerable<T> and IEnumerator<T> interfaces to create forward-only iterators. The enumerator interface contains a bool MoveNext() method that, when called, advances the iterator to the next item.

Async iterators replace this method with Task<bool> MoveNext(), so that each step can be performed asynchronously. This is useful when the next item has to be retrieved asynchronously, mainly because fetching it incurs IO. Examples include iterating over Entity Framework entities (materialized from a DbDataReader) and Service Fabric Reliable Collections (which may require disk IO if the items are not in memory). Both of these frameworks expose their own IAsyncEnumerable<T> and IAsyncEnumerator<T>, which work as I just described.
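To make the shape concrete, here is a minimal sketch of such a pair of interfaces, together with a hand-written enumerator and a consumer loop. The interfaces are hypothetical stand-ins written for this post, not the Entity Framework or Service Fabric definitions, and DelayedRange and Consumer are made-up names for illustration.

```csharp
using System;
using System.Threading.Tasks;

namespace AsyncIteratorSketch
{
    // Hypothetical minimal interfaces: the same shape as IEnumerator<T>,
    // except that MoveNext() returns Task<bool> instead of bool.
    public interface IAsyncEnumerator<out T> : IDisposable
    {
        T Current { get; }
        Task<bool> MoveNext();
    }

    public interface IAsyncEnumerable<out T>
    {
        IAsyncEnumerator<T> GetEnumerator();
    }

    // Hand-written async sequence 0..count-1; Task.Delay stands in for IO.
    public sealed class DelayedRange : IAsyncEnumerable<int>
    {
        private readonly int _count;
        public DelayedRange(int count) { _count = count; }

        public IAsyncEnumerator<int> GetEnumerator() => new Enumerator(_count);

        private sealed class Enumerator : IAsyncEnumerator<int>
        {
            private readonly int _count;
            private int _next;
            public Enumerator(int count) { _count = count; }

            public int Current { get; private set; }

            public async Task<bool> MoveNext()
            {
                if (_next >= _count) return false; // sequence exhausted
                await Task.Delay(10);              // simulated asynchronous fetch
                Current = _next++;
                return true;
            }

            public void Dispose() { }
        }
    }

    public static class Consumer
    {
        // Without language support, consuming means writing the loop by hand.
        public static async Task<int> SumAsync(IAsyncEnumerable<int> source)
        {
            var sum = 0;
            using (var e = source.GetEnumerator())
            {
                while (await e.MoveNext())
                    sum += e.Current;
            }
            return sum;
        }
    }
}
```

Writing the enumerator class by hand like this is exactly the boilerplate an async iterator method would let us skip.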

The Rx project has yet another implementation of async enumerables, which also provides a full LINQ implementation, so you can use operators such as Where and Select.

Language support

Unfortunately, C# is lagging a bit behind. While it does support creating iterators using yield return and async methods using async and await, currently there’s no way to combine the two. Or is there? 🙂

A very nifty feature has been added to the latest Roslyn beta (2.0.0-beta4 – not yet released) – “arbitrary async returns”. This was added mainly to enable some allocation optimizations when dealing with Tasks (which are reference types) by providing ValueTask, an awaitable value type that defers the creation of the task until absolutely necessary. But the compiler feature is much more flexible than that – it enables us to create custom “async method builder” classes that allow returning any¹ type from an async method.
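As a small illustration of the ValueTask side of this, here is a sketch of an async method that returns ValueTask<int> instead of Task<int>. It assumes a compiler and runtime where the feature has shipped (async ValueTask<T> methods are supported from C# 7 on); CachedReader and its members are made-up names.

```csharp
using System.Threading.Tasks;

namespace ValueTaskSketch
{
    public sealed class CachedReader
    {
        private int? _cached;

        // Counts how often the slow path ran (illustration only).
        public int Loads { get; private set; }

        // An async method with a non-Task return type.
        public async ValueTask<int> ReadAsync()
        {
            if (_cached.HasValue)
            {
                // Fast path: the method completes synchronously, so the
                // result is carried in the ValueTask<int> struct itself and
                // no Task<int> has to be allocated.
                return _cached.Value;
            }

            await Task.Delay(10); // simulated IO on the slow path
            Loads++;
            _cached = 42;
            return 42;
        }
    }
}
```

The first call goes through the slow path and is backed by a real task; later calls hit the cache and come back already completed.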

I realized this feature could be somewhat abused to create async iterators, as seen in the above example.

How does it work?

The implementation has two parts:

- YieldReturnAwaitable, which is what the extension method YieldReturn() returns. This awaitable/awaiter just wraps the task’s awaiter, except for the IsCompleted property. More on that later.
- AsyncEnumerableTaskMethodBuilder<T>, which allows the compiler to create the async state machine. It works differently from the task method builder, because when returning an enumerable, it can be invoked multiple times by calling GetEnumerator(). Also, the async state machine’s MoveNext() is not invoked automatically, but rather by the enumerator’s MoveNext().
  - The state machine is started by calling AsyncEnumerator<T>.MoveNext(). A TaskCompletionSource<bool> is created to hold the return value of MoveNext().
  - Each time there’s an await in the method, the builder’s AwaitOnCompleted method gets called (unless the awaited operation completes synchronously). If the awaiter is our special YieldReturnAwaiter, we stop executing and set the MoveNext() task result to true. We also fetch the value using the awaiter’s GetResult() method. Otherwise (as with the Task.Delay() in the example), we just hook up the continuation and let the method continue until it hits the next “yield return”.
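The awaiter half of this can be sketched in isolation. The code below is not the implementation from the gist. It rests on two assumptions of mine: that the special IsCompleted simply returns false, so that the builder's AwaitOnCompleted hook is reached on every yield, and that the awaitable wraps a plain value rather than a task's awaiter. To stay runnable under the ordinary Task builder, OnCompleted resumes the method immediately, where the custom builder described above would stop and complete the MoveNext() task instead. Probe is a made-up helper.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

namespace YieldAwaiterSketch
{
    public struct YieldReturnAwaitable<T>
    {
        private readonly T _value;
        public YieldReturnAwaitable(T value) { _value = value; }

        public Awaiter GetAwaiter() => new Awaiter(_value);

        public struct Awaiter : INotifyCompletion
        {
            private readonly T _value;
            public Awaiter(T value) { _value = value; }

            // Assumption: always report "not completed", so the compiler
            // generated code never skips the builder's AwaitOnCompleted call.
            public bool IsCompleted => false;

            // The yielded value; a custom builder could read it from here.
            public T GetResult() => _value;

            public void OnCompleted(Action continuation)
            {
                Probe.Suspensions++; // record that the hook was reached
                continuation();      // simplification: resume right away
            }
        }
    }

    public static class YieldExtensions
    {
        public static YieldReturnAwaitable<T> YieldReturn<T>(this T value)
            => new YieldReturnAwaitable<T>(value);
    }

    public static class Probe
    {
        public static int Suspensions;

        // An ordinary async Task method; each await below goes through
        // OnCompleted even though the value is already available.
        public static async Task<int> RunAsync()
        {
            var a = await 1.YieldReturn();
            var b = await 2.YieldReturn();
            return a + b;
        }
    }
}
```

Under these assumptions, every await of a YieldReturn() result passes through the completion hook, and that hook is the point where a custom builder can intercept the yielded value.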

There’s a small “type safety” issue – the compiler won’t stop us from using YieldReturn() on any type in the method. But of course we only take values from yields that match the method’s return type.

Lastly, this is just a prototype. I’ll have to review it more thoroughly to make sure it’s thread safe. I’m also not sure if ExecutionContext capturing was done correctly.

You can see the full implementation in this gist. Note that compiling it requires launching VS using the current Roslyn master branch. You can also view the decompilation results on Try Roslyn.

¹ Somewhat inaccurate – the return type must have a static method called CreateAsyncMethodBuilder, so you can’t extend types you don’t own.