FennelExecStreamInventory

From LucidDB Wiki

Introduction

The creation of this page was motivated by a post from Julian:

http://sourceforge.net/mailarchive/message.php?msg_id=20060310073510.83B6EF77F3%40c9mailgw26.amadis.com

For background on ExecStreams, read:

http://fennel.sourceforge.net/doxygen/html/structExecStreamDesign.html

If you would like to fill in one of the missing pages below, copy an existing page to use as a starting template.

Abstract Base ExecStreams

These are ExecStream classes which serve as abstract bases for concrete ExecStream implementations. Each one abstracts a common ExecStream input/output connectivity pattern. Derived classes inherit the base structure and prepare/open/close behavior, saving a bit of work and avoiding duplication.

Location: fennel/exec

General-purpose ExecStreams

These are concrete ExecStream classes which implement general-purpose operations commonly needed in implementing relational query processing.

Location: fennel/exec

  • BarrierExecStream: a control-flow confluence which drains all of its upstream producers before allowing its consumer to execute
  • BernoulliSamplingExecStream: a conduit which filters rows from its input to produce a sample with a Bernoulli distribution
  • CartesianJoinExecStream: a confluence which computes the CROSS PRODUCT of two input streams
  • CollectExecStream: a conduit which implements the COLLECT operator, reading multiple tuples from its input and writing them out as a single large tuple (encoded as a binary MULTISET)
  • CopyExecStream: a conduit which copies tuples unchanged from input to output
  • CorrelationJoinExecStream: an alternate implementation of nested-loop join
  • MergeExecStream: a confluence which computes the UNION ALL of its inputs
  • NestedLoopJoinExecStream: a confluence which computes the JOIN of two input streams, optionally using a third input stream when joining via a temporary index
  • ReshapeExecStream: a conduit which performs simple transformation and filtering
  • ScratchBufferExecStream: a conduit which provides a memory buffer without actually doing any processing
  • SegBufferExecStream: a conduit which buffers its input either to memory or disk before passing it on to its output, optionally allowing restarts
  • SortedAggExecStream: a conduit which computes aggregations over a pre-sorted input stream
  • SplitterExecStream: a diffluence which replicates tuples from a single input stream into all of its outputs
  • UncollectExecStream: a conduit which implements the UNCOLLECT operator, reading each binary MULTISET-encoded tuple from its input and decoding it into multiple output tuples
  • ValuesExecStream: a source which produces a predefined VALUES list of tuples
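
As an illustration of why SortedAggExecStream requires pre-sorted input, the following is a minimal sketch of aggregation over a sorted stream. It is not Fennel code: the `KeyedRow` type and the `sortedSum` function are hypothetical names invented for this example, and only a SUM aggregate is shown. Because rows with equal keys arrive next to each other, only the current group has to be held in memory.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical row type for illustration: (group key, value).
using KeyedRow = std::pair<int, std::int64_t>;

// Computes SUM(value) GROUP BY key over input that is already sorted by key.
// Equal keys are adjacent, so each group is finished as soon as the key changes.
std::vector<KeyedRow> sortedSum(const std::vector<KeyedRow> &sortedInput)
{
    std::vector<KeyedRow> output;
    for (const KeyedRow &row : sortedInput) {
        if (!output.empty() && output.back().first == row.first) {
            // Same group as the previous row: fold the value into the running sum.
            output.back().second += row.second;
        } else {
            // Key changed (or first row): start a new group.
            output.push_back(row);
        }
    }
    return output;
}
```

If the input were not sorted, the same key could start several groups, which is the case hash aggregation is meant to handle.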

Mock ExecStreams

These are concrete ExecStream classes used in constructing unit test scenarios as part of test-driven development of ExecStreams. They are not intended for use in implementing real query plans.

Location: fennel/exec

FTRS ExecStreams

These are ExecStreams which are used to implement the FTRS local data wrapper. Some of them also serve as useful building blocks and base classes for other kinds of indexing implementations such as LucidDB's bitmap indexing. Except where noted as abstract, all classes are concrete.

Location: fennel/ftrs

  • BTreeExecStream: an abstract base for any ExecStream which accesses a BTree
  • BTreeInsertExecStream: a conduit which inserts tuples from its input into a BTree, writing a final insertion count to its output
  • BTreeReadExecStream: an abstract base for any ExecStream which reads tuples from a BTree and writes them to its output
  • BTreeScanExecStream: a source which reads all of the tuples from a BTree unconditionally and writes them to its output
  • BTreeSearchExecStream: a conduit which reads search directives from its input, performs corresponding searches on a BTree, and writes the matching results to its output
  • BTreeSearchUniqueExecStream: an optimized derivative of BTreeSearchExecStream for the case where it is guaranteed that each search will find at most one match
  • BTreeSortExecStream: a conduit which sorts its input by inserting each tuple into the correct location in a BTree, and then scans the final tree in order and writes the results to its output
  • FtrsTableWriterExecStream: a conduit which reads insert/update/delete tuples from its input and applies them to multiple BTrees at once in transactional fashion, writing a final modification count to its output

Java Hybrid-Plan ExecStreams

These are concrete ExecStream classes which implement JNI glue for efficiently passing tuples across the C++/Java memory boundary.

Location: fennel/farrago

  • JavaSinkExecStream: a sink which reads tuples from its input and transmits them to a Java peer (where they appear as a source in the Java portion of the execution plan)
  • JavaTransformExecStream: a confluence which reads tuples from its inputs, transmits them to corresponding Java peers, and then writes the result of Java-level processing to its output

Calculator ExecStreams

This is a concrete ExecStream class which implements scalar expression evaluation. It has its own subdirectory since it has a large number of helper classes for implementing a virtual machine.

Location: fennel/calculator

  • CalcExecStream: a conduit which runs a calculator program on each input tuple, filtering and/or transforming to produce its output
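
To make the idea of running a program on each input tuple concrete, here is a minimal sketch of a tiny expression interpreter. Everything in it is an assumption made for illustration: the stack-based design, the `Op` instruction set, and the names `evaluate` and `runCalc` are hypothetical and do not reflect the actual calculator's instruction set or API. It shows one program acting as a filter and another as a transformation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical instruction set, invented for this sketch.
enum class Op { PushInput, PushConst, Add, Mul, Less };

struct Instr {
    Op op;
    std::int64_t arg;  // column index for PushInput, literal for PushConst
};

using Tuple = std::vector<std::int64_t>;

// Runs a well-formed program against one tuple and returns the top of stack.
std::int64_t evaluate(const std::vector<Instr> &program, const Tuple &tuple)
{
    std::vector<std::int64_t> stack;
    for (const Instr &instr : program) {
        if (instr.op == Op::PushInput) {
            stack.push_back(tuple.at(static_cast<std::size_t>(instr.arg)));
        } else if (instr.op == Op::PushConst) {
            stack.push_back(instr.arg);
        } else {
            // Binary operators pop two operands and push one result.
            std::int64_t rhs = stack.back(); stack.pop_back();
            std::int64_t lhs = stack.back(); stack.pop_back();
            if (instr.op == Op::Add) stack.push_back(lhs + rhs);
            else if (instr.op == Op::Mul) stack.push_back(lhs * rhs);
            else stack.push_back(lhs < rhs ? 1 : 0);
        }
    }
    return stack.back();
}

// Keeps tuples for which the filter program is nonzero and emits the
// projection program's result for each of them.
std::vector<std::int64_t> runCalc(const std::vector<Instr> &filter,
                                  const std::vector<Instr> &project,
                                  const std::vector<Tuple> &input)
{
    std::vector<std::int64_t> output;
    for (const Tuple &tuple : input) {
        if (evaluate(filter, tuple) != 0) {
            output.push_back(evaluate(project, tuple));
        }
    }
    return output;
}
```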

LucidDB Column-Store and Bitmap Index ExecStreams

These are ExecStream implementations contributed by LucidEra as part of LucidDB's implementation of column-store and bitmap indexing. All classes are concrete except where noted as abstract.

Location: fennel, subdirectories lcs and lbm

  • LbmBitOpExecStream: an abstract base for bitmap intersection and subtraction
  • LbmChopperExecStream: a conduit which passes bitmaps through from input to output, chopping them into smaller ones if they exceed a certain limit
  • LbmGeneratorExecStream: a conduit which reads control variables from its input, together with data from one or more previously loaded column-store clusters, and writes corresponding bitmap index entries to its output
  • LbmIntersectExecStream: a confluence which performs bitmap intersection on its inputs, writing the resulting bitmaps to its output
  • LbmMinusExecStream: a confluence which performs bitmap subtraction of second and subsequent inputs from its first input, writing the resulting bitmaps to its output
  • LbmNormalizerExecStream: a conduit which reads bitmaps as input and converts them to normal tuples as output
  • LbmSearchExecStream: an adaptation of BTreeSearchExecStream for use in bitmap index searching
  • LbmSortedAggExecStream: a conduit which computes aggregations against a bitmap representation
  • LbmSplicerExecStream: a diffluence which combines bitmaps from its input and writes them into a bitmap index BTree (either creating new entries or unioning into existing ones), writing final modification counts to one output and a stream of row IDs that violate UNIQUE constraints to the other
  • LbmUnionExecStream: a confluence which performs bitmap union on its inputs, writing the result of the union to its output (currently only handles windowed self-union for one input)
  • LcsClusterAppendExecStream: a conduit which reads tuples from its input and writes them to a column-store cluster BTree, writing the final modification count to its output
  • LcsRowScanBaseExecStream: an abstract base for LcsRowScanExecStream and LbmGeneratorExecStream, factoring out the commonality of scanning column-store cluster BTrees
  • LcsRowScanExecStream: a confluence which reads scan directives from its inputs and uses them to access and filter one or more column-store BTrees in synchronized fashion, writing combined tuples as output
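
The set operations behind LbmIntersectExecStream, LbmMinusExecStream, and LbmUnionExecStream can be sketched on plain, uncompressed bitmaps. This is a simplification under stated assumptions: the `Bitmap` type (a vector of 64-bit words, one bit per row) and the three function names are hypothetical, and the sketch ignores whatever segment encoding the real streams use.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical uncompressed bitmap: bit i of word w represents row 64*w + i.
using Bitmap = std::vector<std::uint64_t>;

// Rows present in both a and b.
Bitmap bitmapIntersect(const Bitmap &a, const Bitmap &b)
{
    Bitmap out(std::min(a.size(), b.size()));
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = a[i] & b[i];
    }
    return out;
}

// Rows present in a but not in b.
Bitmap bitmapMinus(const Bitmap &a, const Bitmap &b)
{
    Bitmap out(a);
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        out[i] &= ~b[i];
    }
    return out;
}

// Rows present in a, in b, or in both.
Bitmap bitmapUnion(const Bitmap &a, const Bitmap &b)
{
    Bitmap out(std::max(a.size(), b.size()), 0);
    for (std::size_t i = 0; i < a.size(); ++i) out[i] |= a[i];
    for (std::size_t i = 0; i < b.size(); ++i) out[i] |= b[i];
    return out;
}
```

Subtracting several inputs from the first, as described for LbmMinusExecStream, corresponds to applying `bitmapMinus` repeatedly.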

LucidDB Special-Purpose ExecStreams

These are concrete ExecStream classes contributed by LucidEra which implement specific LucidDB query-processing operators.

Location: fennel, subdirectories hashexe, flatfile, and sorter

  • LhxAggExecStream: a conduit which computes aggregations on its input using hash aggregation (partitioning to disk as necessary)
  • LhxJoinExecStream: a confluence which computes the JOIN of its two inputs using hash join (partitioning to disk as necessary)
  • FlatFileExecStream: a source which parses a text file and produces tuples as output
  • ExternalSortExecStream: a conduit which sorts its input and writes the results to its output (using disk-based external sort as necessary)
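
As a rough illustration of the hash join technique named for LhxJoinExecStream, the sketch below builds a hash table on one input and probes it with the other. It is an in-memory simplification under stated assumptions: partitioning to disk is omitted entirely, only an inner equi-join on an integer key is shown, and the names `JoinRow`, `Joined`, and `hashJoin` are hypothetical rather than taken from Fennel.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical row type for illustration: (join key, payload).
using JoinRow = std::pair<int, std::string>;

struct Joined {
    int key;
    std::string buildPayload;
    std::string probePayload;
};

// Inner equi-join: hash the build side once, then look up each probe row.
std::vector<Joined> hashJoin(const std::vector<JoinRow> &build,
                             const std::vector<JoinRow> &probe)
{
    // Build phase: a multimap keeps every build row, including duplicate keys.
    std::unordered_multimap<int, std::string> table;
    for (const JoinRow &row : build) {
        table.emplace(row.first, row.second);
    }

    // Probe phase: emit one output row per matching build row.
    std::vector<Joined> output;
    for (const JoinRow &row : probe) {
        auto range = table.equal_range(row.first);
        for (auto it = range.first; it != range.second; ++it) {
            output.push_back({row.first, it->second, row.second});
        }
    }
    return output;
}
```

Output follows probe order; when a key has several build-side matches, the order among those matches is unspecified.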