This page walks through the query preparation-time performance impact of adding a new rule to the optimizer.
The example query is the implementation for the JDBC DatabaseMetaData.getColumns call:
select * from sys_boot.jdbc_metadata.columns_view WHERE table_cat = ? AND table_name = ? AND column_name IS NOT NULL ORDER BY table_schem,table_name,ordinal_position,table_cat
The definition of columns_view can be found in farrago/initsql/createJdbcViews.sql. After expansion of underlying views, it involves a 5-way join across the following MDR tables:
The underlying view definitions are phrased with explicit INNER JOIN syntax so the correct join conditions are already available in the ON clauses. The optimizer finds the same plan (a left-deep MDR join tree in the order of the above bullet-list with deepest listed first) regardless of the addition of the new rule; what we want to examine is the effect of that new rule on the search space.
The rule in question is org.eigenbase.rel.ExtractJoinFilterRule. It matches a JoinRel containing a join condition and converts it into a FilterRel on top of a cartesian JoinRel (no join condition). This rule is useful when a cartesian product is the only applicable join implementation (e.g. a complex theta-join), but the query already has the join condition specified explicitly. Unfortunately, in this case study, the rule just leads Volcano astray, because the join conditions were already hand-coded to make sure they would be compatible with the MDR join implementation.
I used the following procedure for comparing the effect of running with and without the new rule:
- Turn on trace net.sf.farrago.query.plandump to record the optimizer execution time.
- Run sqllineEngine
- Execute alter system set "codeCacheMaxBytes"=min to turn off plan caching.
- Execute !columns DEPTS to exercise the query and get everything loaded (first query on Farrago always takes longer as stuff like MDR cache gets warmed up).
- Re-execute !columns DEPTS (we'll use the timing from this one).
- Get the timing for the second execution from FarragoTrace.log; the plandump trace setting results in trace entries before UDT flattening, after UDT flattening, and after optimization. The interval between the last two is the Volcano execution time, and involves no other work. See FarragoPreparingStmt.optimize for details.
- Run several times with and without the rule (by changing FarragoDefaultPlanner.addStandardRules) to get the average time.
With the rule enabled, average optimization time was 10 seconds. Without the rule, average optimization time was less than 1 second.
Next step was to turn on optimizer tracing (org.eigenbase.relopt=FINE) and count the number of rule-firings with and without the new rule (grep "Apply Rule" FarragoTrace.log | wc -l, being sure to delete the log before each run). Without the new rule, the count was 935. With the new rule, the count was 2602. So the number of rule firings increased by a factor of 2.8, but the runtime increased by at least a factor of ten. Without a profiler or more optimizer tracing, it's not possible to know whether the extra time was spent in the planner itself or in the rules, but the former seems more likely.
Tracing note: the planner wasn't tracing rule firing at level FINE as specified in FarragoTracing, so I changed it to do so.
The MDR table rels are able to provide precise cost information for the leaves because we are able to ask for the exact number of instances of any object. The join costing is accurate for all but the column-to-type join because MedMdrJoinRel assumes these are one-to-many joins, and the cartesian product is easy. But the filter costing is completely wrong (assumes zero selectivity).
Postscript: the pattern-trimming from eigenchange 5816 mostly fixed the performance problem, bringing the time down to around two seconds, although the number of rules fired is still higher with ExtractJoinFilterRule (1445) than without (602).