Wednesday, September 8, 2010

Standalone Generic Scala Generator

A generic Scala generator class built directly on Scala's shift and reset operators.

In my previous post I showed a generic generator class built on top of my coroutines library. I commented in that post that I was taking the expedient approach of using my existing library, but that it would be possible to package all of the functionality into the Generator class.

I soon realized that it would in fact be relatively easy to do that packaging, and that the resulting relatively simple class would probably make a good vehicle for demonstrating the usefulness of Scala's delimited continuations. Thus I give you the StandaloneGenerator class:

package net.jimmc.scoroutine

import scala.collection.Iterator
import scala.util.continuations._

class StandaloneGenerator[T] extends Iterator[T] {
    private var nextValue:Option[T] = None
    private var nextStep:Option[Unit=>Unit] = None

    /** Subclass calls this method to generate values.
     * @param body The code for your generator.
     */
    protected def generate(body: => Unit @suspendable) {
        reset {
            suspend
            body
        }
    }

    /** Yield the next generated value.
     * Call this code from your generator to deliver the next value.
     */
    protected def yld(x:T):Unit @suspendable = {
        nextValue = Some(x)
        suspend
    }

    /** Retrieve the next generated value.
     * Call this from your main code.
     */
    def next:T = {
        step
        nextValue match {
            case None => throw new NoSuchElementException("next on empty generator")
                    //make it similar to the equivalent Iterator exception
            case Some(x) => nextValue = None; x
        }
    }

    /** True if there is another value to retrieve.
     * Call this from your main code.
     */
    def hasNext:Boolean = {
        step
        nextValue.isDefined
    }

    /** Save our continuation to resume later. */
    private def suspend:Unit @suspendable = {
        shift { k:(Unit=>Unit) =>
            nextStep = Some(k)
        }
    }

    /** If we have a next step but we don't have a value queued up,
     * run the generator until it calls yld or completes. */
    private def step = {
        if (nextValue.isEmpty) nextStep foreach { nextStep = None; _() }
    }
}
The StandaloneGenerator class is a plug-compatible replacement for the Generator class described in my previous post, as long as the derived generator does not use fancy scheduling as in the Primes Generator example. So, for example, you could take the Integers Generator example from that post, replace the two occurences of Generator with StandaloneGenerator, and everything would work the same way.

We have two variables: nextValue is our one-element queue where we store the next value to be returned by the generator, and nextStep is our "scheduler queue" where we store our continuation each time we suspend the generator to return a value. Both of these variables are of type Option so that we can tell when they hold a value.

Control is managed by the two functions suspend and step. The suspend method has a shift block that could not be much simpler: all it does is store the passed-in continuation in the nextStep variable. Since the body of a shift block is always the last thing executed within the scope of the CPS code (delimited either by the enclosing reset block or an explicitly invoked continuation), the fact that suspend does not execute its continuation means that after suspend does its thing, control returns to the point just past the enclosing reset, or to the next statement after the explicit continuation call.

I considered calling the step method resume instead, to make it clear that it was the complement to suspend, but from the point of view of the code calling step, by the time control returns to the point after the call to step, the generator code has already been suspended again: it has completed running one step, which is from one yld call to the next.

The step function executes our continuation if we have one, but only if we don't already have a value in nextValue. Using foreach on an Option is a neat way to execute a block of code only if the Option has a value (i.e. is not None). In this case, since the contents of nextStep (if it has any) is a function, the placeholder variable _ gets set to the continuation and the code fragment _() is what actually executes the continuation. Here are two other ways this function could be implemented that do the same thing:
    private def step1 = {
        if (nextValue.isEmpty) nextStep match {
            case None => //do nothing
            case Some(p) => nextStep = None; p()
        }
    }

    private def step2 = {
        if (nextValue.isEmpty && nextStep.isDefined) {
            val p = nextStep.get; nextStep = None; p()
        }
    }
Let's walk through this and see how it works (assuming the generator generates at least one value):
  1. The main code instantiates a generator, which passes its generator body to the generate function, which calls suspend, which saves that body in nextStep without executing it.
  2. The main code calls hasNext, which calls step, which runs the continuation. This starts the generator body code running, which runs until it calls yld with the first value. That first value is stored in nextValue and the generator code is suspended, with the continuation stored in nextStep. Since we now have a value stored in nextValue, hasNext returns true.
  3. The main code calls next. Since we have a value in nextValue, the call to step does nothing (it is only there in case the caller calls next without first calling hasNext). The value in nextValue is returned, and that variable is cleared (set to None).
  4. Each time the main code calls hasNext and it calls step, the continuation stored in nextStep is executed, the generator code calls yld with the next value, that value is stored into nextValue, the continuation is stored in nextStep, and the generator is suspended, and hasNext returns true.
  5. Eventually (for a finite generator) there is a call to step where the generator finishes execution without calling yld. When this happens, no values are stored in either nextValue or nextStep. Since there is no value in nextValue, hasNext returns false. Since there is also no value in nextStep, further calls to step do nothing, and hasNext will continue to return false.
I dare say this might be one of the simplest realistic examples of the use of Scala's reset and shift operators that you will find. Two variables, six functions each not more than four lines of body code, 16 lines of function body code in total, 35 lines of code altogether.

3 comments:

Dawid Marcin Grzesiak said...

How about this problem http://stackoverflow.com/questions/3797699/block-to-stream-convertion ?

Dawid Marcin Grzesiak said...

I tried this but it doesn't work:

def data(block: Int => Unit) { for (i <- 0 to 10) block(i) }

class IntGen extends StandaloneGenerator[Int] {
generate {
var x = 1

data { x =>
yld(x)
}
}
}

Jim McBeath said...

Dawid: Your example can be made to work by fixing two problems:

1) Annotations. Since the block you are passing to your data function gets suspended by a call to yld, you must mark it as suspendable. Now, since your data function is calling a suspendable function, it too must be marked as suspendable. Thus you must change the signature of your data function to

def data(block: Int => Unit @suspendable): Unit @suspendable = ...

2) You can not currently use a "for" loop with a body that is suspendable. Instead, you can use a "while" loop as in Integer Generator in my previous post. In general you will need to be aware of the CPS restrictions that apply to any code that is to be suspended.

Once you have your generator working using the above StandaloneGenerator, you can easily convert it from an Interator to a Stream by calling toStream on it.