Accelerate PRODUCT JOIN by decomposition

How to optimize slow product joins

Case description

Some SQL queries cannot be executed any other way than with the product join method, because of the content of the join condition. E.g.:

  • OR predicate
    ON (a.x = b.x OR a.y = b.y)
  • BETWEEN / LIKE operator

        ON (a.x LIKE b.y)
        ON (a.x LIKE b.y || '%')
        ON (a.x BETWEEN b.y AND b.z)
  • Comparison (=) of different datatype fields
    Example (a.x is INTEGER, b.x is VARCHAR)
    ON (a.x = b.x)
  • Arithmetic expression usage
    ON (a.x = b.x + 1)
  • SQL function (eg. substr() ) or UDF usage
    ON (substr(a.x,1,characters(b.x)) = b.x)

Product join is a technique in which execution matches every record combination from the two joined tables and evaluates the join condition on each pair. Product join usually causes huge CPU consumption and long response times.

How to identify

The query typically runs for a long time, contains a “PRODUCT JOIN” step in the Explain description, and that step consumes high AMPCPUTime and lasts long. Such queries usually have an LHR (Larry Higa Ratio, showing the CPU-to-I/O rate) far greater than 1, typically in the tens, hundreds or more.

In the DBQL you should look for high-CPU queries (dbc.DBQLogtbl.AMPCPUTime > 1000, depending on system size) which also have the “product join” expression in the execution plan text (dbc.DBQLExplaintbl.ExplainText like ‘%product join%’).
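The DBQL search described above can be sketched as follows. This is a minimal sketch assuming standard DBC query-logging tables joined on ProcID/QueryID; the 1000-second threshold is illustrative and should be tuned to your system size:

```sql
-- Find high-CPU queries whose execution plan contains a product join.
-- The AMPCPUTime threshold is illustrative; adjust it to your system.
SELECT l.ProcID,
       l.QueryID,
       l.AMPCPUTime,
       l.TotalIOCount
FROM   dbc.DBQLogTbl l
JOIN   dbc.DBQLExplainTbl e
  ON   l.ProcID  = e.ProcID
 AND   l.QueryID = e.QueryID
WHERE  l.AMPCPUTime > 1000
  AND  e.ExplainText LIKE '%product join%'
ORDER BY l.AMPCPUTime DESC;
```

Both tables require the corresponding DBQL logging options (query and explain logging) to be enabled.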

PRISE Tuning Assistant supplies easy-to-use GUI based search function.

Explanation of product join execution

Let’s assume that we join Table1 (N records, the bigger table) and Table2 (M records, the smaller table). Join processing requires that matchable record pairs reside on the same AMP. Since product join compares each Table1 record to each Table2 record, all records of one of the tables must reside on all AMPs; therefore the PRODUCT JOIN is preceded by a “Duplicated to all AMPs” step on the smaller table.
Each record pair is evaluated: if the join condition is satisfied, the pair goes to the result spool, otherwise it is discarded.
The number of required comparisons is N x M, and the cost (approximately the required CPU time) of one comparison depends on the complexity of the join expression.


In most cases the join condition of a product join is satisfied by only a small fraction of all possible combinations. In practice we can often identify this situation:
A significant subset of the bigger table’s records can only match a small subset of the smaller table’s records.
Telco example: number analysis. Most call records are directed to national number areas (>90%), but the number-area description table contains dominantly international number regions (80–95%). We can declare that national calls will never match international areas. In addition, it is very simple to determine for both a “Number” and a “Number area” whether it is national or international.
The base query looks like this:

select Table1.Number,Table2.Area_code
from Table1
join Table2 ON Table1.Number BETWEEN Table2.Area_Start_number and Table2.Area_end_number;

Let’s decompose the query into two parts:

select Table1.Number,Table2.Area_code
from Table1
join Table2 ON Table1.Number BETWEEN Table2.Area_Start_number and Table2.Area_end_number
where substr(Table1.Number,1,1) = '1'            -- USA area code
and substr(Table2.Area_Start_number,1,1) = '1'   -- USA area code
and substr(Table2.Area_end_number,1,1)   = '1'   -- USA area code

UNION ALL

select Table1.Number,Table2.Area_code
from Table1
join Table2 ON Table1.Number BETWEEN Table2.Area_Start_number and Table2.Area_end_number
where NOT (substr(Table1.Number,1,1) = '1')            -- non-USA area code
and NOT (substr(Table2.Area_Start_number,1,1) = '1')   -- non-USA area code
and NOT (substr(Table2.Area_end_number,1,1)   = '1')   -- non-USA area code
;

This way we added some low-cost operations (a full scan of the tables to classify records as national or international) and the cost of UNION ALL-ing the results, but we eliminated lots of trivially non-satisfying comparisons.

The following figures show the processing cost; the red area represents the number of comparisons, and therefore the cost:

Figure1: Original case
Figure2: Decomposed case

Let’s do some maths, with assumed proportions:
N1: International calls
N2: National calls
M1: International area descriptions
M2: National area descriptions
90% of the calls (N) are national (N2);
90% of the area descriptions (M) are international (M1).
Originally we have to do N x M comparisons.
The decomposed query must do:
((0.9 x N) x (0.1 x M)) + ((0.1 x N) x (0.9 x M)) = 0.09 x N x M + 0.09 x N x M = 0.18 x N x M

The optimized query does only 18% of the original comparisons, with the tradeoff
of two full scans (I/O intensive) of the base tables and one UNION ALL (low cost)
of the results.

In this case we get roughly 4 times faster, CPU-saving execution.

Sometimes it can be worth decomposing into 3 or more processing phases, depending on the data.

It is important that if there are further joins or transformations on the result data, they should be done on the UNION ALL-ed result, and should not be duplicated in the decomposed phases, for code-management reasons.
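Following the advice above, a further join is applied once, on top of the UNION ALL-ed result. Here is a sketch with a hypothetical Table3 lookup (Table3 and Area_name are illustrative, not from the original example):

```sql
-- Hypothetical follow-up join: done once on the combined result,
-- not repeated inside each decomposed branch.
SELECT u.Number,
       u.Area_code,
       t3.Area_name                      -- Table3/Area_name are illustrative
FROM (
    SELECT Table1.Number, Table2.Area_code
    FROM Table1
    JOIN Table2
      ON Table1.Number BETWEEN Table2.Area_Start_number AND Table2.Area_end_number
    WHERE substr(Table1.Number,1,1) = '1'                    -- USA partition
      AND substr(Table2.Area_Start_number,1,1) = '1'
      AND substr(Table2.Area_end_number,1,1)   = '1'
    UNION ALL
    SELECT Table1.Number, Table2.Area_code
    FROM Table1
    JOIN Table2
      ON Table1.Number BETWEEN Table2.Area_Start_number AND Table2.Area_end_number
    WHERE NOT (substr(Table1.Number,1,1) = '1')              -- non-USA partition
      AND NOT (substr(Table2.Area_Start_number,1,1) = '1')
      AND NOT (substr(Table2.Area_end_number,1,1)   = '1')
) u
JOIN Table3 t3
  ON u.Area_code = t3.Area_code;
```

This keeps the decomposition logic in one place: adding or removing a partition branch does not require touching the follow-up join.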

Why can’t the Teradata PE do the same?
Decomposition requires knowledge of the data and varies from query to query, which is currently beyond the scope and intelligence of an automatic optimizer.


Eliminate trivially invalid record pairs from the PRODUCT JOIN by breaking the query into multiple parts.

What’s next

Next post will discuss slow OUTER (LEFT/RIGHT) JOIN.

4 thoughts on “Accelerate PRODUCT JOIN by decomposition”

  1. Hi Akos,
Doesn't the second select, with the NOT conditions in its where clause, also undergo a product join like the first select? Can you please explain?

  2. Hi Cheeli,
Both select sections will be executed as product joins, because of the "BETWEEN" join condition. The point is: the original select must do N*M compare operations, while the decomposed one does far fewer. We do not eliminate the product join (we cannot, because of the join condition type), but we significantly reduce the number of comparisons.
I hope it helps.
    I hope it helps.

  3. Hi Akos,

SUBSTR() is killing my query performance.

I am using substr('column1',10,1)='2' in my where clause. Is there any way to get the same output some other way, with good performance?

Column1 has average skew, but many records for one value.


  4. Hi Agilan,

I bet it is another problem; this should not be a killer expression. I think there are other circumstances causing the problem (table skewness, redistribution for other steps). I suggest using PRISE Tuning Assistant to look at the query execution (requires DBQL switched on); it will highlight the most consuming execution step. If you provide me details (SQL, explain log, Tuning Assistant screenshot, step data from DBQL), I can help you more.

