Yielding functions proposal

Hi all!


While Swift 2 is not open source yet, there isn't a dedicated place for discussing new feature proposals, so maybe this is the most appropriate place for now. I want to propose Swift support for "yielding functions", a concept similar to coroutines, generators, fibers, etc. in other languages. These constructs provide an excellent framework for working with asynchronous code: they make it feel more like synchronous code and make error handling much easier than with callbacks.


My proposal is something along the lines of:


func createYieldingFunction(foo: String) -> String yields -> String {
    return { bar in      
        let baz = yield foo + bar
        return foo + baz
    }
}

do {
    let yieldingFunction = createYieldingFunction(foo: "foo") // yieldingFunction will be: String yields -> String
    let foobar = try yieldingFunction("bar")                  // foobar will be: "foobar"

    if yieldingFunction.isAlive {
        print("this will print")
    }

    let foobaz = try yieldingFunction("baz")                  // foobaz will be: "foobaz"

    if yieldingFunction.isAlive {
        print("this will not print")
    }

    try yieldingFunction("blah")                              // will throw error
} catch {
    print(error)                                              // will print something like: Dead yielding function called
}


The function createYieldingFunction is a regular function that returns a "yielding function". We use a function that generates a "yielding function" because these kinds of functions have a limited lifetime. Once the "yielding function" returns, it enters a "dead" state, meaning that it can't execute code anymore. So, if we have a dead yielding function, we can just create another one by calling createYieldingFunction again.


Yielding functions are very similar to "throwing functions". In fact, every yielding function is implicitly marked as throws (or acts as if it were) and must be called with "try", "try!" or "try?". It throws by default because a dead yielding function cannot honor a call, so it should throw a standard error such as DeadYieldingFunctionError. To find out at runtime whether the function is still alive, we can read "isAlive" on it, which returns true if the yielding function is still alive and false if it is dead.


The first time you call a yielding function, you pass the required parameters like you would to a regular function. When the function yields, it hands the yielded value to the caller like a regular return and passes control back to the caller. The next time the caller calls the yielding function with the required parameters, execution continues from where it left off. The "yield" expression inside the yielding function evaluates to the parameters passed in that call, and execution goes on until a new yield or a return appears.
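
To make the suspend/resume rules above concrete, here is a minimal sketch that emulates the first example in Swift as it exists today, with the yielding function written out by hand as a state machine. The class name YieldingFunction and its State enum are assumptions made for illustration and are not part of the proposal; callAsFunction (Swift 5.2 and later) is only used so that the call sites look like the proposed syntax.

```swift
// Hypothetical emulation: the proposal's compiler support is replaced by a hand-written state machine.
struct DeadYieldingFunctionError: Error {}

final class YieldingFunction {
    // Where the body is currently suspended.
    private enum State { case start, suspendedAtYield, dead }
    private var state = State.start
    private let foo: String

    init(foo: String) { self.foo = foo }

    var isAlive: Bool { return state != .dead }

    func callAsFunction(_ input: String) throws -> String {
        switch state {
        case .start:
            // First call: `input` plays the role of `bar`; run up to the yield.
            state = .suspendedAtYield
            return foo + input
        case .suspendedAtYield:
            // Second call: `input` becomes the value of the yield expression (`baz`),
            // then the body returns and the function dies.
            state = .dead
            return foo + input
        case .dead:
            throw DeadYieldingFunctionError()
        }
    }
}

func createYieldingFunction(foo: String) -> YieldingFunction {
    return YieldingFunction(foo: foo)
}

let yieldingFunction = createYieldingFunction(foo: "foo")
let foobar = try? yieldingFunction("bar")            // runs up to the yield
let aliveAfterFirstCall = yieldingFunction.isAlive
let foobaz = try? yieldingFunction("baz")            // resumes after the yield, then returns
let aliveAfterSecondCall = yieldingFunction.isAlive
let afterDeath = try? yieldingFunction("blah")       // nil, because the call throws
```

The point of the proposal is that the compiler would generate this kind of state tracking, so the body could be written as straight-line code.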


Another important feature of yielding functions is the ability to throw errors from the caller scope into the yielding function. This is done by calling throw before the yielding function with the specified error.


func createYieldingFunction() -> Void yields -> Void {
    return {
        do {
            yield
        } catch {
            print(error) // will print Error(description: "Error that will be thrown into the yielding function")
        }
    }
}

do {
    let yieldingFunction = createYieldingFunction()
    try yieldingFunction() // executes until the yield
    throw yieldingFunction Error(description: "Error that will be thrown into the yielding function")
} catch {
    print(error) // will never be called
}


If the yielding function catches the error, it can deal with the error internally. It's important to know that even if you catch errors inside the yielding function, the yielding function itself will still throw an error if it is called in a dead state.


func createYieldingFunction() -> Void yields -> Void {
    return {
        do {
            yield
        } catch {
            print(error) // will print Error(description: "Error that will be thrown into the yielding function")
        }
    }
}

do {
    let yieldingFunction = createYieldingFunction()
    try yieldingFunction() // executes until the yield
    throw yieldingFunction Error(description: "Error that will be thrown into the yielding function")
    try yieldingFunction() // throws DeadYieldingFunctionError
} catch {
    print(error) // will print something like: Dead yielding function called
}


If the yielding function doesn't catch the error, the error will be rethrown back to the caller.


func createYieldingFunction() -> Void yields -> Void {
    return {
        yield
    }
}

do {
    let yieldingFunction = createYieldingFunction()
    try yieldingFunction() // executes until the yield
    throw yieldingFunction Error(description: "Error that will be thrown into the yielding function")
} catch {
    print(error) // will print Error(description: "Error that will be thrown into the yielding function")
}
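
The three cases above (error caught inside, dead call after a caught error, uncaught error rethrown) can be approximated in current Swift. This is only a sketch under stated assumptions: the names YieldingFunction, resume(throwing:) and SampleError are made up, the optional handler closure stands in for the body's catch block, and throwing into a function that is not suspended at a yield is treated like calling a dead one, which the proposal does not specify.

```swift
struct DeadYieldingFunctionError: Error {}
struct SampleError: Error, CustomStringConvertible {
    let description: String
}

final class YieldingFunction {
    private enum State { case start, suspendedAtYield, dead }
    private var state = State.start
    // Stands in for the body's `catch` block; nil means the body has no do/catch.
    private let handler: ((Error) -> Void)?

    init(catching handler: ((Error) -> Void)? = nil) { self.handler = handler }

    // Emulates a plain call: `try yieldingFunction()`.
    func callAsFunction() throws {
        switch state {
        case .start: state = .suspendedAtYield   // run up to the yield
        case .suspendedAtYield: state = .dead    // resume after the yield and return
        case .dead: throw DeadYieldingFunctionError()
        }
    }

    // Emulates `throw yieldingFunction error`: the suspended yield throws `error`.
    func resume(throwing error: Error) throws {
        // Assumption: only a function suspended at a yield can receive an error.
        guard state == .suspendedAtYield else { throw DeadYieldingFunctionError() }
        state = .dead   // either way the body runs to its end
        guard let handler = handler else { throw error }   // uncaught: rethrown to the caller
        handler(error)  // caught inside the yielding function
    }
}

var log: [String] = []

// The body catches the injected error; the next call finds the function dead.
let catching = YieldingFunction(catching: { log.append("caught inside: \($0)") })
do {
    try catching()
    try catching.resume(throwing: SampleError(description: "injected"))
    try catching()
} catch {
    log.append("caller: \(type(of: error))")
}

// The body has no do/catch, so the injected error comes back to the caller.
let plain = YieldingFunction()
do {
    try plain()
    try plain.resume(throwing: SampleError(description: "injected"))
} catch {
    log.append("caller: \(error)")
}
```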


TL;DR


With yielding functions, instead of this pyramid of doom:


getString { string, error in
    if let error = error {
        print(error)
    } else {
        let upperCaseString = string!.uppercaseString
        getOtherString(upperCaseString) { anotherString, anotherError in
            if let error = anotherError {
                print(error)
            } else {
                print(anotherString!)
            }
        }
    }
}


We can write this:


async { _ in
    do {
        let string = yield getString()
        let upperCaseString = string.uppercaseString
        let anotherString = yield getOtherString(upperCaseString)
        print(anotherString)
    } catch {
        print(error)
    }
}


We basically turned asynchronous code into synchronous-looking code, improving readability and easing error handling. (getString() and getOtherString() are still async.)

Replies

What advantage has this compared to just using Grand Central Dispatch?

This proposal of yielding functions is very similar to the concept of coroutines, fibers and generators in other languages. It is a language construct, whereas GCD is a library that implements task parallelism based on the thread pool pattern. Actually, to get the most from yielding functions (with respect to async programming) you would need a library that provides async capabilities. But a very important thing to notice is that asynchronous programming is NOT the same as parallel programming. GCD provides async capabilities with concurrent powers, which is awesome. The real problem with async programming is something called callback **** (opposite of heaven), which is when you have horrible nesting of closures (used as callbacks) that obscures the real intent of the program. There are other constructs called futures and promises which mitigate the problem, but they still don't feel as natural as synchronous code. You can see the power of yielding functions if you look at the similar concept of generators in ECMAScript 6. Node.js is known for its async capabilities and its problems with callback ****. With generators you can write code that reads more like natural synchronous code. You can find more information here:


https://medium.com/@tjholowaychuk/callbacks-vs-coroutines-174f1fe66127


This is a post from TJ Holowaychuk. He's the creator of express.js, one of the most used node.js frameworks. In this post he explains the advantages of generators over callbacks. With the advent of generators he stopped working on express.js to start working on koa.js, his new framework, which is completely based on generators.


http://koajs.com/


tl;dr


GCD + yielding functions = asynchronous code that reads closer to synchronous code

First of all, the way I see it, it is more testable and composable than GCD. Try to write this with GCD and callbacks and see how verbose it becomes:


func asyncDouble(input: Int) -> Void yields -> Int {
    let delay = dispatch_time(DISPATCH_TIME_NOW, Int64(4 * Double(NSEC_PER_SEC)))
    dispatch_after(delay, dispatch_get_main_queue()) {
        yield 2 * input
    }
}

let a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].reduce(0) { $0 + try! asyncDouble($1)() }


Secondly, it makes for very good iterators/enumerators/generators, or whatever you call them:


struct Collection<T> {
    func next() -> T? {
          // ...
    }
  
    func enumerator() yields -> T {
        while let element = self.next() {
            yield element
        }
    }
}
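
For comparison, the closest thing in today's standard library is a closure-backed iterator. The sketch below is an illustration rather than the proposed feature: the function squares(upTo:) is a made-up example, and the state that a yield would keep implicitly (the loop variable) has to be captured by hand. AnyIterator itself is a real standard library type.

```swift
// Lazily produces 0, 1, 4, 9, ... one element per call to next().
func squares(upTo limit: Int) -> AnyIterator<Int> {
    var n = 0   // state that survives between calls, like a local in a suspended generator
    return AnyIterator {
        guard n < limit else { return nil }   // the generator is finished
        defer { n += 1 }
        return n * n                          // the "yielded" element
    }
}

// AnyIterator is also a Sequence, so it can be collected or used in for-in loops.
let firstSquares = Array(squares(upTo: 5))
```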

I may be strange, but I see zero value in hiding async code from the developer. Instead I value stating clearly when something is asynchronous.


Btw GCD has no callback-****. What would it need them for?

I gladly map them to sane code, but what is that yield nonsense supposed to do?

If I'm reading the asyncDouble example correctly, the outer thread iterates the array in order and blocks until each doubling computation completes. So the threading is more or less a no-op here. Is that right?


Even if you were to extend this to use a parallelized version of reduce(), I'm not seeing how the yield construct facilitates asynchronous coding. When you call a yielding function, you always block the current thread, yes?

It's not about hiding async code. It's always going to be obvious that the code is async. GCD is block-based (or closure-based, if we use the Swift term) and is mostly used with callbacks, so of course it can suffer from callback h3ll, unless you use futures and promises. Callback h3ll is not something a library would need or not need; that doesn't even make sense. It is a side effect of using closures to run code asynchronously.

I really don't get what you mean. In all my GCD code I never needed a callback. I create a block and send it off, if it has something else to do afterwards it creates new blocks to update UI, data, etc. The initial code isn't interested in this after the creation of the block and simply continues with other duties.

Actually, the asyncDouble example wouldn't be valid code, because you cannot yield from inside a closure, just as you cannot return from the enclosing function from inside a closure. Maybe that could be possible if the closure parameter of the receiving function (in this case dispatch_after) were marked as @noescape, but that wouldn't be the case with dispatch_after anyway. I'm writing an example that makes things clearer. I'll post it here when I get to it.

Exactly, in this scenario that you're stating, the block is the callback. From wikipedia:

In computer programming, a callback is a piece of executable code that is passed as an argument to other code, which is expected to call back (execute) the argument at some convenient time. The invocation may be immediate as in a synchronous callback, or it might happen at later time as in an asynchronous callback.


When you have a lot of callbacks inside of callbacks, you get callback h3ll. I'm writing an example that makes things more clear. I'll post it here when I get to it.

No, yielding functions alone have nothing to do with parallelism or blocking threads. A yielding function is simply a function that allows multiple entry points, that is, suspending and resuming execution at certain locations (the top of the function, each yield, and the return). When you first call a yielding function, it runs from the top until it reaches a yield or a return. If it reaches a return, the yielding function "dies" and becomes useless. If it reaches a yield instead, it hands back the yielded value just like a regular return would, BUT when the yielding function is called again, it continues executing from the yield until it reaches a return (and thus "dies") or another yield, which repeats the whole process.
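
A small sketch may help show that this transfer of control involves no threads at all. It uses the standard library's AnyIterator as a stand-in for a yielding function (an assumption made for illustration, since yield does not exist in Swift) and records the order in which the generator and the caller run.

```swift
var log: [String] = []
var step = 0

// Each call to next() runs the closure once: one "resume" up to the next "yield".
let generator = AnyIterator<Int> {
    guard step < 2 else {
        log.append("generator: return")   // the generator dies here
        return nil
    }
    step += 1
    log.append("generator: yield \(step)")
    return step
}

// The for-in loop is the caller; it resumes the generator once per iteration.
for value in generator {
    log.append("caller: got \(value)")
}
```

Everything above runs on the calling thread; the log simply alternates between generator and caller entries because control is handed back and forth at each yield.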

With these functions


import Foundation

struct Error : ErrorType, CustomStringConvertible {
    let description: String
}

func getString(completion: (String?, ErrorType?) -> Void) {
    let delay = dispatch_time(DISPATCH_TIME_NOW, Int64(1 * Double(NSEC_PER_SEC)))
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)

    dispatch_after(delay, queue) {
        if arc4random_uniform(2) == 0 {
            completion("hello", nil)
        } else {
            completion(nil, Error(description: "Error in getString"))
        }
    }
}

func getOtherString(string: String, completion: (String?, ErrorType?) -> Void) {
    let delay = dispatch_time(DISPATCH_TIME_NOW, Int64(1 * Double(NSEC_PER_SEC)))
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)

    dispatch_after(delay, queue) {
        if arc4random_uniform(2) == 0 {
            completion("\(string) world!", nil)
        } else {
            completion(nil, Error(description: "Error in getOtherString"))
        }
    }
}


Instead of this callback h3ll


getString { string, error in
    if let error = error {
        print(error)
    } else {
        let upperCaseString = string!.uppercaseString
        getOtherString(upperCaseString) { anotherString, anotherError in
            if let error = anotherError {
                print(error)
            } else {
                print(anotherString!)
            }
        }
    }
}
dispatch_main()


With yielding functions you could write a function like this one:


func async<T>(f: T! yields -> Future<T, Error>) {
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
    dispatch_async(queue) {   
        do {
            let semaphore = dispatch_semaphore_create(0);
            var future = try f(nil)
            while true {
                future.onSuccess { value in
                    future = try f(value)
                    dispatch_semaphore_signal(semaphore)
                }
                future.onError { error in
                    throw f error
                    dispatch_semaphore_signal(semaphore)
                }
                dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER);
                if !f.isAlive { break }
            }
        } catch {
            throw f error
        }
    }
}


And then with these functions


import Foundation

struct Error : ErrorType, CustomStringConvertible {
    let description: String
}


func getString() -> Future<String, Error> {
    let delay = dispatch_time(DISPATCH_TIME_NOW, Int64(1 * Double(NSEC_PER_SEC)))
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
    let promise = Promise<String, Error>()

    dispatch_after(delay, queue) {
        if arc4random_uniform(2) == 0 {
            promise.success("hello")
        } else {
            promise.failure(Error(description: "Error in getString"))
        }
    }
    return promise.future
}


func getOtherString(string: String) -> Future<String, Error> {
    let delay = dispatch_time(DISPATCH_TIME_NOW, Int64(1 * Double(NSEC_PER_SEC)))
    let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
    let promise = Promise<String, Error>()

    dispatch_after(delay, queue) {
        if arc4random_uniform(2) == 0 {
            promise.success("\(string) world!")
        } else {
            promise.failure(Error(description: "Error in getOtherString"))
        }
    }
    return promise.future
}


You could write this code with no callback h3ll.


async { _ in
    do {
        let string = yield getString()
        let upperCaseString = string.uppercaseString
        let anotherString = yield getOtherString(upperCaseString)
        print(anotherString)
    } catch {
        print(error)
    }
}

dispatch_main()


This makes the code more readable and makes errors easier to deal with. The code looks like regular synchronous code and only grows vertically, instead of turning into callback h3ll or a pyramid of doom with many levels of nested indentation.
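
The driver idea can be tried out in current Swift if the yielding body is written by hand as a state machine. The sketch below rests on several assumptions: Future is reduced to a plain function that takes a completion handler, getString and getOtherString are synchronous stubs so the result is deterministic, and the names Body, resume and drive are made up for illustration. It is not the proposed async function, only an approximation of how it would resume the body each time a future completes.

```swift
// Hypothetical stand-in for Future<T, Error>: a function that hands its
// result to a completion handler.
typealias Future<T> = (@escaping (Result<T, Error>) -> Void) -> Void

// Synchronous stubs; real versions would complete later on a dispatch queue.
func getString() -> Future<String> {
    return { completion in completion(.success("hello")) }
}
func getOtherString(_ string: String) -> Future<String> {
    return { completion in completion(.success("\(string) world!")) }
}

// The body of the yielding function, written out as a state machine.
// resume(_:) runs to the next "yield" and returns the yielded future,
// or nil once the body has returned.
final class Body {
    private var position = 0
    var output: [String] = []

    func resume(_ input: Result<String, Error>?) -> Future<String>? {
        do {
            switch (position, input) {
            case (0, _):
                position = 1
                return getString()                          // yield getString()
            case (1, let result?):
                position = 2
                let string = try result.get()               // value of the first yield
                return getOtherString(string.uppercased())  // yield getOtherString(...)
            case (2, let result?):
                position = 3
                let anotherString = try result.get()        // value of the second yield
                output.append(anotherString)                // print(anotherString)
                return nil
            default:
                return nil                                  // already finished
            }
        } catch {
            position = 3
            output.append("error: \(error)")                // the body's catch block
            return nil
        }
    }
}

// The driver: resume the body each time the yielded future completes.
func drive(_ body: Body, input: Result<String, Error>? = nil) {
    guard let future = body.resume(input) else { return }
    future({ result in drive(body, input: result) })
}

let body = Body()
drive(body)
```

With yielding functions the Body class would disappear: the compiler would keep track of the position, and the body would read like the straight-line code in the example above.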

Call me stupid, but the following code from you doesn't make any sense (neither to me nor to the Swift parser...). What is it supposed to do?


func getData(completion: (Data, Error) -> Void) { ... } 
func getMoreData(data: Data, completion: (Data, Error) -> Void) { ... } 
 
getData { data, error in 
    if let error = error { 
        // deal with error 
    } else { 
        // do some stuff with data 
        getMoreData(data) { moreData, anotherError in 
            if let error = anotherError { 
                // deal with error 
            } else { 
                // do stuff with more data 
            } 
        } 
    } 
}

This comparison is not fair:


func getData(completion: (Data) throws -> ()) rethrows  { ... }
func getMoreData(data: Data, completion: (Data) throws -> () ) rethrows { ... }

do {
   try getData(){ data in
    // do some stuff with data
    try getMoreData(data){ data in
        // do stuff with more data
    }
   }
} catch {
    // deal with error
}


And this example shows nothing about async, and is not a valid GCD block/callback anyway.


I hope we can get a better concurrency handler in Swift (like in Rust, maybe). GCD is good, but we can get something more Swift and less C. But this example doesn't help either.


Maybe you can show a "real" example using GCD vs Yielding.

I don't think I understand what you mean. The code that you showed is not possible. You can't throw out of a closure passed to a function whose closure parameter is not marked with @noescape, as is the case with dispatch_async, dispatch_after, etc.
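
A short sketch of the difference, written in current Swift syntax (where closures are non-escaping by default and @escaping replaces the absence of @noescape). The function names and the pending array, which simulates a queue that runs the block later, are assumptions made for illustration.

```swift
struct SampleError: Error {}

// Non-escaping closure: it runs before the function returns,
// so `rethrows` can forward its error to the caller's do/catch.
func getDataSync(_ completion: (Int) throws -> Void) rethrows {
    try completion(42)
}

// Escaping closure: it is stored and run later, after the caller's
// do/catch has already finished, so the error must travel as a value.
var pending: [() -> Void] = []
func getDataLater(_ completion: @escaping (Result<Int, Error>) -> Void) {
    pending.append({ completion(.failure(SampleError())) })
}

var log: [String] = []

do {
    try getDataSync { _ in throw SampleError() }
} catch {
    log.append("sync error caught by caller")
}

getDataLater { result in
    if case .failure = result {
        log.append("async error delivered as a value")
    }
}
pending.forEach { $0() }   // simulate the queue running the block later
```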


I'll say it again: yielding functions have nothing to do with concurrency by themselves. And yes, Swift could have something like async/await from C# baked into the language. But then it would have to handle concurrency in the language itself. With yielding functions we can use GCD to work with concurrency without the need for Swift to handle concurrency. If you use GCD through a wrapper you don't even feel its C origin.


The example I showed is GCD + yielding functions. It's not about GCD vs Yielding, I never stated that.


I'll edit the code to make it even more explicit...