Reading List

The Selfish Gene
The Psychopath Test: A Journey Through the Madness Industry
Bad Science
The Feynman Lectures on Physics
The Theory of Everything: The Origin and Fate of the Universe


ifknot's favorite books »

Sunday, 24 August 2014

A simplified libfbp component and a general purpose work-stealing, back-off threadpool to get things going.

Sucked into a general purpose threadpool!


TL;DR To paraphrase Thoreau "Decouple, decouple, decouple!" the clumsy component that needed to take a dequeuer is simplified to a virtual interface PPM API facilitating generalisation for PXMs. There's also a threadpool PXM so that FBP can run out of the box.

As previously stated the primary strategic goal here is that the libfbp implementation of Flow-Based Programming (FBP) as a Parallel Programming Model (PPM), though the FBP paradigm is a lot more more than simply a PPM, should be agnostic to its underlying Program eXecution Model (PXM) but must, never-the-less, provide an Application Programming Interface (API) that remains versatile in its implementation. 

In an effort to achieve this goal the original component here was overly complex and took on too much of the scheduling burden the simplified component presents a single virtual operator()() member function that returns a component*. The API contract is simple, the returned component* points to the next component to be executed and a returned value of null_ptr removes the component from the execution pipeline.


The resulting component interface is much more flexible and ultimately, satisfying:


In order for libfbp to have "out of the box" immediate usability then a fully functional PXM implementation is required and, in keeping with the quick and simple (huzzah!), ideology of libfbp a threadpool is presented based upon the Java threadpool described by Herlihy & Shavit[1].

The work stealing executor is a template class that wraps a C++11 system thread of execution and takes a concurrent queue type T, that is compliant with the queue ADT, presented to it as a vector of queues that it indexes with its own ordinal identifier.

The executing worker has 4 states: WAITING, RUNNABLE, STEALING & TERMINATED that are processed by a simple state machine implemented using a switch statement in the body of the executors private run() member function and cycles as follows:

Executor state diagram:

I find the whiteboard at work is a great place for ideas :)


Unlike the random stealing example presented by Herlihy & Shavit[1] the libfbp approach is one of complete search and back-off. Such that an executor thread that finds no work in the set of work queues available to the pool simply shuts itself down. This approach allows the whole threadpool to gracefully shut down when the FBP application itself no longer has any components that require servicing.

This executor work balancing strategy is effected by searching all the queues in the vector until a non-empty queue is found from which it steals a functor to execute. However, if the search completes without finding a non-empty queue then the executor simply shuts itself down.

From a fungibility perspective the load balancing policy of the executor is eminently suited to a template policy class and this has been reserved for future work, which does at least offer readability for the executor class itself:


Should I add more in-code explanation despite the clutter it brings?


The threadpool (executor-pool) itself is a trivial class:


Usage is straight forward as per this contrived example:


Having previously presented mathematical semantics for FBP and now developed a compliant rudimentary implementation it is time to move from the theoretical into the experimental via the canonical for the next and subsequent post(s).

References:

[1] Maurice Herlihy and Nir Shavit. 2012. The Art of Multiprocessor Programming, Revised Reprint (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA - chapter 16 Futures, Scheduling and Work Distribution

No comments:

Post a Comment