FennelExecStreamInventory
Introduction
This page was motivated by a post from Julian.
For background on ExecStreams, read:
http://fennel.sourceforge.net/doxygen/html/structExecStreamDesign.html
If you would like to fill in one of the missing pages below, copy from an existing page as a starting template.
Abstract Base ExecStreams
These are ExecStream classes which serve as abstract bases for concrete ExecStream implementations. Each one abstracts a common ExecStream input/output connectivity pattern. Derived classes inherit the base structure and prepare/open/close behavior, saving a bit of work and avoiding duplication.
Location: fennel/exec
- ConduitExecStream: an abstract base for streams with exactly one input and output
- ConfluenceExecStream: an abstract base for streams with many inputs but one output
- DiffluenceExecStream: an abstract base for streams with one input but many outputs
- SingleInputExecStream: an abstract base for streams with one input (with output count determined by derived class)
- SingleOutputExecStream: an abstract base for streams with one output (with input count determined by derived class)
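To make the connectivity patterns above concrete, here is a minimal sketch in which each abstract base validates the input/output counts it stands for. All names here (SketchStream, SketchConduit, wiringIsValid, and so on) are hypothetical and are not taken from the Fennel sources; the real classes also carry prepare/open/close behavior that this sketch omits. Treating "many inputs" as "at least one" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a stream's wiring; not Fennel's real interface.
struct SketchStream {
    std::size_t nInputs = 0;
    std::size_t nOutputs = 0;
    virtual ~SketchStream() = default;
    // Each abstract base checks the connectivity pattern it represents.
    virtual void prepare() const = 0;
protected:
    static void require(bool ok, const char* msg) {
        if (!ok) throw std::invalid_argument(msg);
    }
};

// Exactly one input and one output.
struct SketchConduit : SketchStream {
    void prepare() const override {
        require(nInputs == 1 && nOutputs == 1, "conduit: 1 input, 1 output");
    }
};

// Many inputs, one output (assumed here to mean at least one input).
struct SketchConfluence : SketchStream {
    void prepare() const override {
        require(nInputs >= 1 && nOutputs == 1, "confluence: N inputs, 1 output");
    }
};

// One input, many outputs (assumed here to mean at least one output).
struct SketchDiffluence : SketchStream {
    void prepare() const override {
        require(nInputs == 1 && nOutputs >= 1, "diffluence: 1 input, N outputs");
    }
};

// Convenience wrapper: true if prepare() accepts the wiring.
bool wiringIsValid(const SketchStream& s) {
    try {
        s.prepare();
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

A derived class in this sketch would inherit the count check for free and only add its own processing, which is the duplication-saving idea described above.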
General-purpose ExecStreams
These are concrete ExecStream classes which implement general-purpose operations commonly needed in implementing relational query processing.
Location: fennel/exec
- BarrierExecStream: a control-flow confluence which drains all of its upstream producers before allowing its consumer to execute
- BernoulliSamplingExecStream: a conduit which filters rows from its input to produce a sample with a Bernoulli distribution
- CartesianJoinExecStream: a confluence which computes the CROSS PRODUCT of two input streams
- CollectExecStream: a conduit which implements the COLLECT operator, reading multiple tuples from its input and writing them out as a single large tuple (encoded as a binary MULTISET)
- CopyExecStream: a conduit which copies tuples unchanged from input to output
- CorrelationJoinExecStream: an alternate implementation of nested-loop join
- MergeExecStream: a confluence which computes the UNION ALL of its inputs
- NestedLoopJoinExecStream: a confluence which computes the JOIN of two input streams, optionally using a third input stream when joining via a temporary index
- ReshapeExecStream: a conduit which performs simple transformation and filtering
- ScratchBufferExecStream: a conduit which provides a memory buffer without actually doing any processing
- SegBufferExecStream: a conduit which buffers its input either to memory or disk before passing it on to its output, optionally allowing restarts
- SortedAggExecStream: a conduit which computes aggregations over a pre-sorted input stream
- SplitterExecStream: a diffluence which replicates tuples from a single input stream into all of its outputs
- UncollectExecStream: a conduit which implements the UNCOLLECT operator, reading each binary MULTISET-encoded tuple from its input and decoding it into multiple output tuples
- ValuesExecStream: a source which produces a predefined VALUES list of tuples
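As an illustration of why a pre-sorted input matters for a stream such as SortedAggExecStream, the following sketch computes a SUM per group in a single pass while holding only the current group's state. The Row type and the sortedSum function are hypothetical and only illustrate the general technique; they are not the Fennel implementation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical row layout: (group key, value to be summed).
using Row = std::pair<int, long>;

// One-pass SUM ... GROUP BY over input that is already sorted by key.
// Because equal keys are adjacent, only the most recent group is ever open.
std::vector<Row> sortedSum(const std::vector<Row>& input) {
    std::vector<Row> out;
    for (const Row& r : input) {
        if (!out.empty() && out.back().first == r.first) {
            out.back().second += r.second;  // same group: accumulate
        } else {
            out.push_back(r);               // key changed: start a new group
        }
    }
    return out;
}
```

If the input were not sorted, the same key could reappear later and produce a second output row, which is why this approach depends on the ordering guarantee.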
Mock ExecStreams
These are concrete ExecStream classes used in constructing unit test scenarios as part of test-driven development of ExecStreams. They are not intended for use in implementing real query plans.
Location: fennel/exec
- MockConsumerExecStream: a sink which saves and/or prints the input it receives
- MockProducerExecStream: a source which produces a stream of synthetically generated tuples
- MockResourceExecStream: a "dog-in-the-manger" source which attempts to allocate resources without actually using them
FTRS ExecStreams
These are ExecStreams which are used to implement the FTRS local data wrapper. Some of them also serve as useful building blocks and base classes for other kinds of indexing implementations such as LucidDB's bitmap indexing. Except where noted as abstract, all classes are concrete.
Location: fennel/ftrs
- BTreeExecStream: an abstract base for any ExecStream which accesses a BTree
- BTreeInsertExecStream: a conduit which inserts tuples from its input into a BTree, writing a final insertion count to its output
- BTreeReadExecStream: an abstract base for any ExecStream which reads tuples from a BTree and writes them to its output
- BTreeScanExecStream: a source which reads all of the tuples from a BTree unconditionally and writes them to its output
- BTreeSearchExecStream: a conduit which reads search directives from its input, performs corresponding searches on a BTree, and writes the matching results to its output
- BTreeSearchUniqueExecStream: an optimized derivative of BTreeSearchExecStream for the case where it is guaranteed that each search will find at most one match
- BTreeSortExecStream: a conduit which sorts its input by inserting each tuple into the correct location in a BTree, and then scans the final tree in order and writes the results to its output
- FtrsTableWriterExecStream: a conduit which reads insert/update/delete tuples from its input and applies them to multiple BTrees at once in transactional fashion, writing a final modification count to its output
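The sort-by-insertion idea behind BTreeSortExecStream can be sketched with any ordered tree. The example below uses std::multiset as a stand-in; it is typically implemented as a red-black tree rather than a BTree, and the function name sortViaOrderedTree is hypothetical. Keeping duplicate keys is an assumption of the sketch, not a statement about Fennel's behavior.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Sort by inserting every key into an ordered tree, then scanning it in order.
// std::multiset stands in for the BTree; duplicates are kept (an assumption).
std::vector<int> sortViaOrderedTree(const std::vector<int>& input) {
    std::multiset<int> tree;
    for (int key : input) {
        tree.insert(key);  // each insert places the key at its sorted position
    }
    // An in-order scan of the finished tree yields the sorted output.
    return std::vector<int>(tree.begin(), tree.end());
}
```

Note that no output can be produced until the last input tuple has been inserted, since a later tuple may sort before everything seen so far.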
Java Hybrid-Plan ExecStreams
These are concrete ExecStream classes which implement JNI glue for efficiently passing tuples across the C++/Java memory boundary.
Location: fennel/farrago
- JavaSinkExecStream: a sink which reads tuples from its input and transmits them to a Java peer (where they appear as a source in the Java portion of the execution plan)
- JavaTransformExecStream: a confluence which reads tuples from its inputs, transmits them to corresponding Java peers, and then writes the result of Java-level processing to its output
Calculator ExecStreams
This is a concrete ExecStream class which implements scalar expression evaluation. It has its own subdirectory since it has a large number of helper classes for implementing a virtual machine.
Location: fennel/calculator
- CalcExecStream: a conduit which runs a calculator program on each input tuple, filtering and/or transforming to produce its output
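The notion of running a program on each input tuple to filter and/or transform it can be sketched with a toy interpreter. Everything below (the Op instruction set, Instr, eval, runCalc) is hypothetical, and a stack machine is used purely for brevity; the sketch makes no claim about how Fennel's calculator virtual machine is actually organized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction set; not the Fennel calculator's.
enum class Op { PushCol, PushConst, Add, Mul, Greater };
struct Instr { Op op; long arg; };
using Program = std::vector<Instr>;
using Tuple = std::vector<long>;

// Evaluate a well-formed program against one tuple; result is top of stack.
long eval(const Program& prog, const Tuple& t) {
    std::vector<long> st;
    for (const Instr& i : prog) {
        switch (i.op) {
        case Op::PushCol:
            st.push_back(t.at(static_cast<std::size_t>(i.arg)));
            break;
        case Op::PushConst:
            st.push_back(i.arg);
            break;
        default: {  // binary operators pop two operands and push one result
            long b = st.back(); st.pop_back();
            long a = st.back(); st.pop_back();
            if (i.op == Op::Add) st.push_back(a + b);
            else if (i.op == Op::Mul) st.push_back(a * b);
            else st.push_back(a > b ? 1 : 0);
        }
        }
    }
    return st.back();
}

// Conduit-style loop: keep tuples passing the filter, emit the projection.
std::vector<long> runCalc(const Program& filter, const Program& project,
                          const std::vector<Tuple>& input) {
    std::vector<long> out;
    for (const Tuple& t : input) {
        if (eval(filter, t) != 0) {
            out.push_back(eval(project, t));
        }
    }
    return out;
}
```

For example, a filter program of PushCol 0, PushConst 2, Greater combined with a projection that computes column 0 times 10 plus column 1 would drop tuples whose first column is 2 or less and emit one computed value for each remaining tuple.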
LucidDB Column-Store and Bitmap Index ExecStreams
These are ExecStream implementations contributed by LucidEra as part of LucidDB's implementation of column-store and bitmap indexing. All classes are concrete except where noted as abstract.
Location: fennel, subdirectories lcs and lbm
- LbmBitOpExecStream: an abstract base for bitmap intersection and subtraction
- LbmChopperExecStream: a conduit which passes bitmaps through from input to output, chopping them into smaller ones if they exceed a certain limit
- LbmGeneratorExecStream: a conduit which reads control variables from its input, together with data from one or more previously loaded column-store clusters, and writes corresponding bitmap index entries to its output
- LbmIntersectExecStream: a confluence which performs bitmap intersection on its inputs, writing the resulting bitmaps to its output
- LbmMinusExecStream: a confluence which performs bitmap subtraction of second and subsequent inputs from its first input, writing the resulting bitmaps to its output
- LbmNormalizerExecStream: a conduit which reads bitmaps as input and converts them to normal tuples as output
- LbmSearchExecStream: an adaptation of BTreeSearchExecStream for use in bitmap index searching
- LbmSortedAggExecStream: a conduit which computes aggregations against a bitmap representation
- LbmSplicerExecStream: a diffluence which combines bitmaps from its input and writes them into a bitmap index BTree (either creating new entries or unioning into existing ones), writing final modification counts to one output and a stream of UNIQUE violation row IDs to the other output
- LbmUnionExecStream: a confluence which performs bitmap union on its inputs, writing the result of the union to its output (currently only handles windowed self-union for one input)
- LcsClusterAppendExecStream: a conduit which reads tuples from its input and writes them to a column-store cluster BTree, writing the final modification count to its output
- LcsRowScanBaseExecStream: an abstract base for LcsRowScanExecStream and LbmGeneratorExecStream, factoring out the commonality of scanning column-store cluster BTrees
- LcsRowScanExecStream: a confluence which reads scan directives from its inputs and uses them to access and filter one or more column-store BTrees in synchronized fashion, writing combined tuples as output
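The set operations named above (intersection, subtraction, union) can be sketched over fixed-size bitmaps in which bit i stands for row ID i. The names Bitmap, kRows, intersectAll, minusRest, and unionAll are hypothetical, and std::bitset is only a stand-in; LucidDB's actual bitmap encoding and segment handling are not represented here.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size bitmap: bit i set means row ID i qualifies.
constexpr std::size_t kRows = 16;
using Bitmap = std::bitset<kRows>;

// Intersection of all inputs (an empty input list yields all rows).
Bitmap intersectAll(const std::vector<Bitmap>& inputs) {
    Bitmap result;
    result.set();
    for (const Bitmap& b : inputs) result &= b;
    return result;
}

// First input minus the second and subsequent inputs.
Bitmap minusRest(const std::vector<Bitmap>& inputs) {
    Bitmap result = inputs.empty() ? Bitmap() : inputs.front();
    for (std::size_t i = 1; i < inputs.size(); ++i) result &= ~inputs[i];
    return result;
}

// Union of all inputs.
Bitmap unionAll(const std::vector<Bitmap>& inputs) {
    Bitmap result;
    for (const Bitmap& b : inputs) result |= b;
    return result;
}
```

With a = rows {1, 2, 3}, b = rows {2, 3, 4}, and c = row {3}, intersecting a and b leaves rows {2, 3}, while subtracting b and c from a leaves only row 1.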
LucidDB Special-Purpose ExecStreams
These are concrete ExecStream classes contributed by LucidEra which implement specific LucidDB query-processing operators.
Location: fennel, subdirectories hashexe, flatfile, and sorter
- LhxAggExecStream: a conduit which computes aggregations on its input using hash aggregation (partitioning to disk as necessary)
- LhxJoinExecStream: a confluence which computes the JOIN of its two inputs using hash join (partitioning to disk as necessary)
- FlatFileExecStream: a source which parses a text file and produces tuples as output
- ExternalSortExecStream: a conduit which sorts its input and writes the results to its output (using disk-based external sort as necessary)
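As a rough illustration of the hash join technique mentioned for LhxJoinExecStream, the sketch below builds an in-memory hash table on one input and probes it with the other. The types and the hashJoin function are hypothetical, only an inner equi-join on an integer key is shown, and the partitioning to disk that the real stream performs when memory runs short is omitted entirely.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row layouts: (join key, payload) on each side.
using KeyedRow = std::pair<int, std::string>;
struct Joined {
    int key;
    std::string buildPayload;
    std::string probePayload;
};

// In-memory inner equi-join: build a hash table on one input, probe with the other.
std::vector<Joined> hashJoin(const std::vector<KeyedRow>& build,
                             const std::vector<KeyedRow>& probe) {
    std::unordered_multimap<int, std::string> table;
    for (const KeyedRow& row : build) {
        table.emplace(row.first, row.second);  // build phase
    }
    std::vector<Joined> out;
    for (const KeyedRow& row : probe) {        // probe phase
        auto range = table.equal_range(row.first);
        for (auto it = range.first; it != range.second; ++it) {
            out.push_back({row.first, it->second, row.second});
        }
    }
    return out;
}
```

Unlike the sorted approaches elsewhere on this page, neither input needs to arrive in key order; the cost is that the build side must fit in memory, which is the situation disk partitioning is meant to address.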