Avoiding Product Joins

How to eliminate Product Joins

What is product join?

Avoiding Product Cross Join in Oracle
Product join is one of the implementation methods of an SQL JOIN operation.
Do not mix up with cross join (Cartesian product), which is one type of SQL joins.

SQL join types, eg.: inner join, left outer join, full outer join, cross (Cartesian) join
Join implementation types, eg.: nested join, merge join, hash join, product join.

Product join (of tables A and B ) is the most simple method of join implementation:

  • Produce each of <A;B> record combinations, say take each records from A singly, and match it with each records of B one-by-one.
  • Test the join condition on each produced <A;B> record pairs, and eliminate those combinations where the condition fails.
The two steps are often combined, and the “testing phase” is executed right after a record combination is generated, and the non valid combinations right after dropped. This saves a lot of temp space.

Why don’t we like product joins?

Well, it has a really bad reputation. It is slow, stuffs CPU, etc.
Yes, it usually is, does. It is the brute force method for executing a join, with costs in order of N*M (where N, M are the record numbers of the joinable tables)

Indeed there are situations when it is the best choice, or the only feasible way.

When is it good or necessary?

Please note that product join is the method what is always applicable, independently of all circumstances.


Product join is typically simple, dumb and slow algorithm, this is why we do not like it, but has a very important advantage: requires no pre-processing.* This is why we LIKE IT:)
If we have to join a really large table to a very small table (couple of records) product join is far the most effective method, since the sort of a very large table ( order of N*logN ) can cost a lot, while joining to 1-2 records is really not a big deal.


There are join situations when the only way to go is the product join. Why? Because of the join condition. The “clever joins” (merge, hash) require some information and/or condition that somehow enables to cheat the A x B comparisons: reduce them to the ones that really necessary, and be done in a more effective manner.

* OK, in Teradata this means: only requires that the matchable records from both tables must be on the same AMP. This implies the “small” table to be duplicated to all AMPs.

Merge join example

from A
join  B on A.customer_id = B.customer_id
and A.trx_dt between B.eff_dt and B.exp_dt

  • Customer_id clause is in AND condition with the others
  • Customer_id is selective enough that hash(customer_id) can reduce the comparisons reasonably
  • Note that A and B must be sorted (re-sorted) by the hash of customer_id

Product join example

from A
join   B on substr(A.telephone_no,1,B.prefix_length) = B.telephone_no_prefix

  • There is no comparison reducing partial-condition
  • Note that neither of the tables required to be sorted in a specific order.

Unavoidable product joins

  • Non-eqality condition
  • Function used (eg. substr())
  • Dependent expression is used (eg. A.x+B.y = A.z)
  • Cross join: intentional Cartesian product

Avoidable product joins

Data type mismatch

The merge join example above works only if customer_no in A and B tables have the same “style” data types, since their hash value will match only in this case. Say hash(13674) <> hash(‘13674’), however integer is compatible with decimal, and char is compatible with varchar.
Pay attention on data type consistence during physical data modeling.
  • Use domains to eliminate the possibility of mismatch
  • Align to used data types when defining temp tables, or use “create table as …” statements
  • If you cannot avoid mismatch, relocate the necessary data to temp tables with proper data types during processing.

OR condition

Let’s assume the following join condition:
select ...
from A
join  B on A.col1 = B.Col1

           A.Col2 = B.Col2

This is equivalent, w/o compulsory product join :

select ... 
from A
join  B on A.col1 = B.Col1 

select ...

from A
join  B on A.Col2 = B.Col2

Missing/stale statistics

As I mentioned before product join is the most effective join between a very large and a really small (couple of records) table. If the optimizer thinks that a table is pretty small, but it is not indeed, it may choose a product join in all good faith, misleaded by a stale or missing statistics.
Define and keep fresh those statistics by the optimizer can determine the size of the joinable record sets  properly.

How to find avoidable product joins

It is not trivial to list the avoidable product joins. Practically all product joins are required to be examined one-by-one and judged to be avoidable or not. And if avoidable, what to do for.

I strongly recommend to use PRISE Tuning Assistant for both finding the product joins and analyzing the possibility and necessity of elimination:

  • List top consuming queries with product join(s)
  • Check the PROD JOIN steps: which tables are processed that way
  • Check those join conditions for cases described above

What to do if cannot be avoided?

In this case I recommend to try the decomposition, described here.
It can help reducing the number of comparisons, saving CPU and runtime.

Have a successful optimization!

The “No more spool space” Error Explained

Why do I get “No more spool space” error?

This is the most familiar error message in Teradata world:
“Failure 2646 No more spool space”
What does it really mean, what is it caused by?
Let’s get back to the basics.

What is spool space?

Spool space is a temporary area, that can be used to store the part-results during query processing, as well as volatile tables. All free space in the database that are not allocated by PERM data, technically can be used for spool area, since a PERM data does not want to allocate that area.

Each database users may have a “spool limit” that restricts the user to allocate more spool area at a time, than its limit. Keep in mind that all active sessions of a username must share the spool limit together.

Teradata is a massive parallel system, therefore the spool limit must be interpreted on AMP level:
Eg: 100AMP system, a user with 10G spool limit means: 100M spool/AMP

What is spool space limit good for?

This limitation method is a quite simple way to cut those queries from the system that would suck too much resources. There is no exact relationship between high spool usage and ineffective query, but statistically the correlation is high.
Practically: a bad query is being kicked off before is consumes too much resources unnecessarily.

No more spool space scenarios

System ran out of spool space

This is the most rare situation, forget about. There are too few free space on the system, but this situation used to be avoided by defining a “SpoolReserve” database, where no objects are created, this way that area is always available for spool.
If many “big spool limit” users run high spool queries parallel, then this rare situation can yet occure.

Multiple session of the user are active together

This is a quite rare situation also. Check the active users from dbc.sessioninfo.

Volatile tables

All existing volatile tables reside in your available spool space, reducing the available. If you create many, and even with skewed distribution, you can stuff your spool up. Choose “primary index” carefully, when defining volatile tables also.

Improper execution plan

These are the >90% of cases that cause the “No more spool space” errors. Let’ see how:
  • “Duplication to all AMPs” of a non-small set of records
    The root cause is typically missing or stale statistics. Either system thinks that much less records will be duplicated than the real (sometimes billions of records get in this kind of spools), or knows this exactly, but on the other branch of the query there are more low quality estimations, and this execution seems to be cheaper.
  • Redistribution of records by a hash that causes skewed distribution
    Check the corresponding blog post: Accelerate skewed joins
  • Retrieve huge amount of records into spool (locally or redistributed onto the AMPs)
    Specific query structures imply this execution, like: join to a view that “union all”-s big tables.
I suggest to use PRISE Tuning Assistant to identify what is the problem. It spectacularly displays which execution step falls in the problems above.
Increasing the spool limit will not solve the problems in the most cases. 

Too big task

Sometimes a given SQL query requires big spool area to be performed, even with the best execution plan.

This is the only case when raising spool limit is the solution. But first you have to understand that the task is really big. PRISE Tuning Assistant is a good tool for identify this in a minute.

Teradata performance optimization project
I. Basics and prerequisites

Teradata performance optimization


The Teradata database is a pretty expensive box, which serves a lot of people in the company – in a direct or indirect way. The efficient utilization of its resources is essential, from both financial and user satisfaction aspects.

This series of posts will follow a real-life Teradata optimization project’s methodology, steps and results, with this achievement:

System level gain: 31,6%

Levels of performance optimization

We can typically apply 3 levels of Teradata performance optimization:


Meaning: Analyze user requests and system activity, then refine them, eliminating unnecessary activities.
This level checks if the executed processes are really needed, and they do what the user really needs. Eg: eliminate unnecessary data loads or reports; limit a report to a shorter period

  • Big gains possible w/o any trade-off
  • Comprehensive analysis (incl. find and interview appropriate persons)


Meaning: Pick individual SQLs and optimize its execution
This approach finds the top resource consumer SQL statements and reduces the resource consumption, while produces the same results.
Eg: Change PI; add/refresh statistics; rewrite SQL

  • Easy analysis
  • Low risk (small affect, good testing possibilities)
  • Needs SQL level analysis


Meaning: Tune low level system (RDBMS, OS, Hardware) parameters
Tuning on infrastructure level is a global intervention regardless of users/business areas or importance

  • Miracle in some special cases
  • High risk (affects the all system)
  • Complex cross-dependencies

I prefer and recommend SQL level optimization (b.) in the beginning, because it delivers high value along with low risk and scaleablity of project. Later on can the others come into focus.

Goal of SQL optimization

The goals of an optimization project can be different. Some examples:
“Finish load procedure until 6am”
“Present my report 2 hours earlier”
“Reduce CPU load by 20%”
“Let hardware extension postponed by 6 months”

The goal of our project was: speed up the “load batch” process, in order to let a specific identified data area load finished and available for further usage 1 hour earlier than before.


The key input for an SQL optimization is the information about what exactly happens:

What, when, how:

Which SQL statements ran, when, what steps were executed, how much resources did the consume with what skewness, etc.
All these can be found in the DBQL tables of Teradata (dbc.dbql*)
Switching on them will not cause significant extra load, but it is required to archive/purge them regularly (see PDCR tool in Teradata package).
Teradata Administrator enables to switch logging on (Tools/Query logging…) and one can check the actual settings:

select * from dbc.dbqlruletbl;
Explain/Obj/Sql/Step flags should be set to ‘T’, others are not necessary. Global (1 record with UserID: 00000000 ) logging is recommended.
If continuous logging is not possible, logging should be switched on 1 week before starting the optimization project

What / Who runs the statements?

Teradata provides a great feature for this: Query Banding
All executed SQL statements can be flagged with several information elements regarding the environment, job, script, version etc. which helps identifying a specific one.
The queryband data will appear in the DBQL logs – if the running environment is configured to flag the queries being run.
Querybanding is not essential for an optimization, but helps really a lot to identify the exact point in the data warehouse ecosystem, where the modifications should be done (which job/jobstep/script).

What are the environmental conditions?

Eg. Data volumes, statistics, etc.
This information is available in the data dictionary (DBC tables and views), but only as an actual – not for the time the statement run.
Most of the cases it is OK for us, but this property needs to be kept in mind.

When all these are finished, we are ready to do a successful performance optimization project.

The next post is about how to identify the good candidate SQL statements for optimization.

Stay tuned!

The hidden product join

Hiding enemy: the hidden “product” join


When we check query performance one of the first things we search for is the PRODUCT JOIN. Usually this is the component we hate and pursue, because we belive that it is the root problem. It is not always true, but it is a different topic.
Sometimes we find merge join, but it is still very slow, in spite there is no skewness, PIs seem OK.

So the question is: how can a simple merge join be so pathetically slow sometimes?

What can be the problem?

The real evil in this case is – my favourite – the hash collision.
Let’s imagine the tables below (transactions’ columns splitted into two tables):

  User_id INTEGER,
  User_trx_seq INTEGER

  User_id INTEGER,
  User_trx_seq INTEGER

The individual users produce generally equal number of transaction in our cited case, so table skewness is not experienced. The INSERTs into these tables are not that fast, but the daily increment is not that big, so it makes no outstanding problem – in our imagined case.
We have all required statistics either.

Let’s take this query:

JOIN TRX_2   b on a.User_id =b.User_id and a.User_trx_seq = b.User_trx_seq

How does the optimizer think?
Distribution of the tables are excellent, both have equal PI and in addition the column is part of an “AND”-ed join condition. The matchable records reside on the same AMP – the only thing to do is evaluate the “residual” condition: a.User_trx_id = b.User_trx_id.
However it is not that simple.
In the beginning each user has few transactions, eg. 10.
What does the system do? Each records of User1 in TRX_1 table must be compared with 10 records from TRX_2. That is 100 comparisons.
What if each user has 1000 transactions? 1million comparisons for each user!!
The growth is in the order of X2, where X is the number of records per PI (exactly PI hash) value.
In this type of cases the number of records per PI value grows linearly or quicker as time goes by, what means longer and longer execution times.

Please note thet the total time required for the join step is determined by the maximum number of records per PI value, since all these records will fall to the same AMP, therefore that one will perform the slowest.


What is the solution?

We can experience this problem in two cases:

  • Sources are regular tables of our datamodel
  • Sources (at least one of them) is a Spool during a query execution

In the first case we have to think on changing the PI, if we can (to <User_id,User_trx_seq> in our case). This will speed up the INSERT procedures, so does the JOINED DELETE and UPDATE operations. The disadvantage is that the simple User_id based joins will collapse. They will probably require redistribution of out TRX_x table. The User_id based filters will turn to full-table-scans instead of PI access, which can be handled by secondary- or join indices.
Another option is to do a pre-fetch from the TRX_1 and TRX_2 tables into volatile tables having PI: (User_id,User_trx_seq). But note that it may be expensive, if we cannot apply a pretty selective filtering. This means redistribution of the tables.

In the second case we may apply the method of storing the partial result. Put the Spool into a volatile table with the appropriate PI (User_id,User_trx_seq), and this will hopefully enforce the other table/spool to be redistributed to this hash-collision free distribution.


How can we catch those guys?

Unfortunately the “explain plan” will not have a blinking LED to highlight this situation, we need to have an experience to suspect that someting’s wrong. The main indicator is the cost/input ratio. Check how much records are in the bigger source table and compare with the step’s impactCPU, while your other eye is on the sources’ and target’s skew. If the skews are low, but the CPU (or I/O) is unreasonably high, you probably got it!

Does it seem complicated? Maybe yes, but quite simple if using PRISE Tuning Assistant:


PRISE Tuning Assistant use cases

What is PRISE Tuning Assistant (PTA) good for?

PRISE developed this tool to help DBAs and developers solve performance issues quickly and easily. In this post I describe a couple of typical situations when it can assist you.
A full functioning trial version of PRISE Tuning Assistant is available for download.

In general, PTA application consist of two main functions:

  • Query finder
    Enables applying lots of search criteria like: thresholds, skewness, queryband, SQL text, Explain text, used objects, time window, etc.
  • Query analyzer
    Unfolds the execution process of one selected query, helps understanding the query and its performance bottlenecks. Collects all required supplementary information like table sizes, DDLs, statistics, etc. in one comprehensive dashboard, and provides a “walking tour” feature, you can roam in the execution plan with.
All this enables even an average skilled Teradata expert to find which queries to be optimized and figure out the weak points of even a complex query in minutes.

PTA’s key architectural properties:

  • Supports active DBQL (dbc.dbqlogtbl and friends…) and also archived DBQL tables.
  • Optimized DBQL querying (fast response times and low resource usage)
  • Requires read only access to database
  • Portable, multi platform Java application, requires no installation

Let’s see typical use cases of the application.

Find the TOP consuming queries

The optimization’s “low hanging fruits” are the top consuming queries, I mean CPU or I/O  stuffing SQL statements. They are often small in number, however consume significant percent of the total system resources, as I mentioned in this post.

How to find them?
Simply set your focused time interval and select the “CPU”, “I/O” or “Duration” (for long time running ones) as TOP criteria, and click the “Get DBQL data” button. You will be given the list of the qualifying queries in descending order of your TOP criteria. After you can sort it by the other KPIs, like:

  • Skewness
  • Spool usage
  • LHR/RHL (Larry Higa ratio)
  • QueryBand (and specific variables of it)
  • Etc.

Skewed CPU or I/O usage

Teradata is a massive parallel system. It can utilize its full power if the work is spread evenly across the AMPs, otherwise (skewed queries) the response time increases dramatically while virtual resource consumption can remain low.

How to find the bad guys?
Start same as “TOP queries” case, but choose “CPU Impact” or “I/O Impact” as TOP criteria. This will result those queries that have the highest “Impact” (skew-normalized resource consumption) on the system.

“No more spool space”

Teradata uses Spool limiting in fact for cancelling those queries that have bad plans. Therefore “Failure 2646: No more spool space error” means a bad query execution plan, and the solution is usually not spool barrier raising. But often those spool limits are raised, until query is able to finish, and bad queries throttle the system… Why does a query usually run out of spool?

  • Duplicates a large spool/table to all AMPS
    Thinks it is small
  • Redistributes spool/table by a skewed value
    Thinks it is non-skewed
  • Processes (copies to spool) unreasonably large number of records

First two cases are usually caused by missing or stale statistics that can be easily found and supplied.

How to select?

Start same as “TOP queries” case, but choose “Spool” as TOP criteria. This will result those queries that have the highest spool consumption, and SQL dashboard will help you to figure out which statistics should be collected. In addition you can filter for “Error 2646” with Error code option, which will result those queries that has reached the spool limit.

Find a specific query

The DBA gets a question: “My query runs slow, please help me!” – occasionally an SQL is attached in the email. PTA lets figuring out which query was it in the DBQL. You can filter on:

  • Username
  • Session ID
  • Full query text (with LIKE)
  • QueryID (internal DBQL ID)
  • QueryBand (with LIKE)
  • Error code

Search disliked plan elements

PTA enables text search in Explain text of the queries. Eg. find quickly:

  • Product joins
  • Sliding-window joins
  • “Redistributed by”
If it is combined with CPU and/or I/O limitation (CPU >=; I/O >= criteria) then we can get those queries where they really cause problem.

Object usage

We have a table or index in the database, and we look for all the queries that utilize that object.
It is very useful for:

  • Preparing for a PI change
    PI change usually triggered by skewed data distribution, but the more important factor is the access path it provides. Before you change a PI, you need to examine the workload, accessing that table. What is the typical filter/join criteria, does the current/new PI support good access path? PTA enables to do it quickly and easily.
  • Determine usage of a table
    How many times a table is accessed a day? Is it used extensively, or just kept loaded for nothing?
  • Determine usage of a hash/join index
    A hash/join index requires processing and storage resources. Is it pretty utilized, or not?
Simply set the “Object used” filter citeria. In addition you can use the “Thru join/hash index” option for including the join/hash index based data accesses.

Recurring queries

Which query to be optimized? Maybe there are a medium consuming queries, but run lot of times a day, consuming much resource in total. Use the “Recurring query finder” function of the PTA to see aggregated information: total and average Duration, CPU and I/O.

What was a query run by?

If your ETL and OLAP systems use QueryBand properly, you can set specific variables (max of 6) to be parsed into separate columns in the result query list. Sort the result by it, and find which queries of a load job should be optimized.

DBQL analysis II. – Skewness – The big resource robbery detection

The big resource robbery detection


Please note that the solutions found in the article works on the DBQL logs, which covers only that users’ activity, for whom the logging is switched on.


About Skew

If you are not comfortably familiar with Skewness or DBQL, please read the corresponding posts before:

As you can see the difference between Impact[Resource] and Total[Resource] is a net technological loss, which should be minimized.

Skewness is a good indicator for highlight those workload that wastes lots of valuable CPU and I/O resources because of inefficient parallelism.

Find “Top skewed queries” in DBQL

With this SQL you can filter top impacting queries (replace the date value or maybe you have to adjust the date filtering according to local settings):

select top 50
    , QueryID
    , AMPCPUTime
    , MaxAMPCPUTime * (hashamp () + 1) CPUImpact
    , CAST (100 - ((AmpCPUTime / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPCPUTime)) AS INTEGER) "CPUSkew%"
    , TotalIOCount
    , MaxAMPIO * (hashamp () + 1) IOImpact
    , CAST (100 - ((TotalIOCount / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPIO) ) AS INTEGER) "IOSkew%"
    , AMPCPUTime * 1000 / nullifzero (TotalIOCount) LHR
    , TotalIOCount / nullifzero (AMPCPUTime * 1000) RHL
    , ParserCPUTime
    , Queryband
    , Substr(QueryText,1,2000) QueryText
/* For archived DBQL
    and ampcputime > 0
/* For online DBQL*/
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and ampcputime>0
order by 
    CPUImpact desc

Explanation of extra fields:

Time parser spent on producing the execution plan. This can be high if SQL is too complex or too many random AMP sampling has to be done.
Larry Higa ( inverse Larry Higa) index. Empirical index that shows the CPU vs I/O rate. By experience it should be usually around one (can be different depending on your system configuration, but is a constant). If it is far from 1, that indicates CPU or I/O dominance, which means unbalanced resource consumption, but it is a different dimension that skew.
Labels that sessions use to identify themselves within the DBQL logs
QueryText: First 200 characters of the query (depending on DBQL log settings)

OK, we’ve listed the terrible top consumers, but what’s next?
Have to identify those queries. If your ETL and Analytics software is configured to user QueryBand properly (this area deserves a separate post…), you can find which job or report issued that SQL, anyway, you can see the QueryText field.

If you want to get the full SQL text, select it from the DBQLSQLTbl (SQL logging needs to be switched on), replace the appropriate <procid> and <queryid> values:

    procid = <procid> 
    and queryid = <queryid> 
order by 
    SQLRowNo asc

You will get the SQL in several records, broken up to 30K blocks, simply concatenate them. Unfortunately the SQL will have very ugly make up, you can use PRISE Tuning Assistant to beautify and highlight it for easy reading.

System level Skewness


We have found those bad queries, nice. But what can we say about the whole system? What is the total parallel efficiency? Can we report how much resources were wasted due to bad parallel efficiency?

The answer is: yes, we can estimate quite closely. The exact value we cannot calculate because DBQL does not log AMP information for the query execution, but the most important metrics.

We can not calculate that situation when more skewed queries run the same time, but have peaks on different AMPs. This reduces the system level resource wasting, but is hard to calculate with, however its probability and effect is negligible now.

      sum(AMPCPUTime) AMPCPUTimeSum
    , sum(MaxAMPCPUTime * (hashamp () + 1)) CPUImpactSum
    , sum(TotalIOCount) TotalIOCountSum
    , sum(MaxAMPIO * (hashamp () + 1)) IOImpactSum
    , cast(100 - (AMPCPUTimeSum / CPUImpactSum) * 100 as integer) "CPUSkew%"
    , cast(100 - (TotalIOCountSum / IOImpactSum) * 100 as integer) "IOSkew%"
/* For archived DBQL
    logdate = '2013-12-18' (date)
    and (ampcputime > 0 or TotalIOCount > 0)
/* For online DBQL*/
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and (ampcputime > 0 or TotalIOCount > 0)

Look at the last two columns. That percent of your CPU and I/O goes to the sink…

Top bad guys

OK, let’s check how many queries accumulate 5%, 10%, 25%, 50%, 75%, 90% of this loss?
Here you are (CPU version, transform for I/O implicitly):

      'How many queries?' as "_"
    , min(limit5) "TOP5%Loss"
    , min(limit10) "TOP10%Loss"
    , min(limit25) "TOP25%Loss"
    , min(limit50) "TOP50%Loss"
    , min(limit75) "TOP75%Loss"
    , min(limit90) "TOP90%Loss"
    , max(rnk) TotalQueries
    , sum(ResourceTotal) "TotalResource"
    , sum(ResourceImpact) "ImpactResource"
         case when ResRatio < 5.00 then null else rnk end limit5
        ,case when ResRatio < 10.00 then null else rnk end limit10
        ,case when ResRatio < 25.00 then null else rnk end limit25
        ,case when ResRatio < 50.00 then null else rnk end limit50
        ,case when ResRatio < 75.00 then null else rnk end limit75
        ,case when ResRatio < 90.00 then null else rnk end limit90
        , ResourceTotal
        , ResourceImpact
              sum(ResourceLoss) over (order by ResourceLoss desc ) totalRes
            , sum(ResourceLoss) over (order by ResourceLoss desc  rows unbounded preceding) subtotalRes
            , subtotalRes *100.00 / totalRes Resratio
            , sum(1) over (order by ResourceLoss desc  rows unbounded preceding) rnk
            , ResourceTotal
            , ResourceImpact
                AMPCPUTime ResourceTotal
                , (MaxAMPCPUTime * (hashamp () + 1)) ResourceImpact
                , ResourceImpact - ResourceTotal ResourceLoss
            /* For archived DBQL
                and ampcputime > 0
            /* For online DBQL*/
                cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
                and ampcputime > 0
        ) x
    ) y
) z
group by 1

I expect you are a bit shocked now, how few queries waste how much golden resources.

I think we will agree that it is worth to tune those dozen of queries,
and you save in orders of $100K..M USD for your company annually, am I right?

PRISE Tuning Assistant helps you to find those queries and to get the hang of how to accelerate them.

Typical reasons of skewness – in a nutshell

  • Skewed tables: Bad choice of PI, Skewed data 
  • Bad execution plans (typically skewed redistributions)
    • Bad data model (normalization,data types,PI, etc.)
    • Missing or stale statistics
    • Too many joins (break up the query!)
  • Hash collision (load time problem)