DBQL analysis IV – Monitor index usage

Analyze “Index usage” in DBQL

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The “Object” option must be switched “on” in DBQL to use the scripts provided.

About Indices

Teradata provides the possibility of creating INDEX objects that allow alternative access paths to the data records. They are quite different structures from the good old B*Tree or Bitmap indices common in non-MPP RDBMSes.
The main goal of index objects is to improve data access performance in exchange for storage space and maintenance processing capacity.
Since having indices is not free (storage and maintenance resources), those that do not bring enough gain are better dropped.
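As a minimal illustration, this is how a secondary index is created and, if DBQL later shows it brings no gain, dropped again. The database, table and column names below are made up:

```sql
-- NUSI on one column of a hypothetical table
CREATE INDEX Idx_Cust (Customer_Id) ON Sales_DB.Sales_Trx;

-- If DBQL proves it is (almost) never used, reclaim the resources:
DROP INDEX Idx_Cust ON Sales_DB.Sales_Trx;
```

The reports below help you decide, based on actual usage counts, which of your existing indices fall into the "better dropped" category.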

Index usage footprint in DBQL

If a query uses an index to access the data, this is declared in the “explain text” and also registered in DBQL: DBQLObjTbl. How an index appears depends on its type. In case of a primary/secondary index only the IndexId and the columns of the index are registered, while join/hash indices appear like a regular table: at database, object and column level alike.

If a join/hash index is covering, the base table may not be listed among the DBQL objects, therefore be careful when analyzing table usage from DBQLObjTbl.
I recommend using PRISE Tuning Assistant to easily find all types of access to a table’s data.

Examples

Primary index access (2 columns):
Plan:
  1) First, we do a single-AMP RETRIEVE step from d01.z by way of the
     primary index "d01.z.i = 1, d01.z.j = 1" with no residual

DBQLobjtbl:
ObjectDatabaseName ObjectTableName ObjectColumnName ObjectNum ObjectType
D01 Z i 1 Idx
D01 Z j 1 Idx

Secondary index access (1 column):
Plan:
  3) We do an all-AMPs RETRIEVE step from d01.x by way of index # 4
     without accessing the base table "d01.x.j = 1" with no residual

DBQLobjtbl:
ObjectDatabaseName ObjectTableName ObjectColumnName ObjectNum ObjectType
D01 X 4 Idx

Join index:
create join index d01.ji as sel a,b from d01.q primary index (a);
select b from d01.q where a=1;
Plan:
  1) First, we do a single-AMP RETRIEVE step from D01.JI by way of the
     primary index "D01.JI.a = 1" with no residual conditions into

DBQLobjtbl:
ObjectDatabaseName ObjectTableName ObjectColumnName ObjectNum ObjectType
D01 JI 0 Jix
D01 JI a 1 Idx
D01 JI b 1026 Col

Please note that

  • ObjectNum identifies the index (it refers to dbc.Indices.IndexNumber)
  • As many rows appear as the index has columns
  • E.g. in the V13.10 Teradata Express edition the single-column secondary index lacks the column name in the logs

Analyzing DBQL data 

Prepare data

CREATE VOLATILE TABLE DBQLIdx_tmp1 AS (
SELECT databasename,tablename,indexnumber,indextype,uniqueflag,Columnposition,Columnname
, SUM (1) OVER (partition BY databasename,tablename,indexnumber ORDER BY Columnname ROWS UNBOUNDED PRECEDING) ABCOrder
FROM dbc.indices WHERE indextype IN ('K','P','Q','S','V','H','O','I')
) WITH DATA
PRIMARY INDEX (databasename,tablename,indexnumber)
ON COMMIT PRESERVE ROWS
;


CREATE VOLATILE TABLE DBQLIdx_tmp2 AS (
WITH RECURSIVE idxs (Databasename,Tablename,Indexnumber,Indextype,Uniqueflag,Indexcolumns,DEPTH)
AS (
SELECT
databasename,tablename,indexnumber,indextype,uniqueflag,TRIM (Columnname) (VARCHAR (1000)),ABCorder
FROM DBQLIdx_tmp1 WHERE ABCorder = 1
UNION ALL
SELECT
b.databasename,b.tablename,b.indexnumber,b.indextype,b.uniqueflag,b.Indexcolumns||','||TRIM (a.Columnname),a.ABCOrder
FROM DBQLIdx_tmp1 a
JOIN idxs b ON a.databasename = b.databasename AND a.tablename = b.tablename AND a.indexnumber = b.indexnumber AND a.ABCOrder = b.Depth + 1
)
SELECT databasename db_name,tablename table_name,indextype,uniqueflag,indexcolumns
,indexnumber
,CASE WHEN uniqueflag = 'Y' AND indextype IN ('P','Q','K') THEN 'UPI'
WHEN uniqueflag = 'N' AND indextype IN ('P','Q') THEN 'NUPI'
WHEN uniqueflag = 'Y' AND indextype IN ('S','V','H','O') THEN 'USI'
WHEN uniqueflag = 'N' AND indextype IN ('S','V','H','O') THEN 'NUSI'
WHEN indextype = 'I' THEN 'O-SI'
ELSE NULL
END Index_code
FROM idxs
QUALIFY SUM (1) OVER (partition BY db_name,table_name,indexnumber ORDER BY DEPTH DESC ROWS UNBOUNDED PRECEDING) = 1
) WITH DATA
PRIMARY INDEX (db_name,table_name)
ON COMMIT PRESERVE ROWS
;

UPDATE a
FROM DBQLIdx_tmp2 a,DBQLIdx_tmp2 b
SET Index_code = 'PK'
WHERE a.db_name = b.db_name
AND a.table_name = b.table_name
AND a.Index_code = 'UPI'
AND a.indextype = 'K'
AND b.Index_code = 'NUPI'
AND b.indextype <> 'K'
;

Report: How many times have the indices been used?

You may need to modify the script:
  • Date filtering (use between for interval)
  • Online/archived DBQL: use commented section for archived
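For the interval filtering, a BETWEEN-style date filter could look like the sketch below (the archived table's logdate column follows the naming used elsewhere in this article; depending on your session date settings you may need the cast-to-char trick from the script instead):

```sql
-- Archived DBQL: filter on the log date column
WHERE logdate BETWEEN DATE '2014-02-01' AND DATE '2014-02-20'

-- Online DBQL: derive the date from collecttimestamp
WHERE CAST(collecttimestamp AS DATE) BETWEEN DATE '2014-02-01' AND DATE '2014-02-20'
```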

SELECT
  COALESCE(usg.objectdatabasename,idx.db_name) db
, COALESCE(usg.objecttablename,idx.table_name) tbl
, COALESCE(usg.ObjectNum,idx.IndexNumber) idxNo
, idx.Index_code
, idx.Indexcolumns Index_columns
, COALESCE(usg.drb,0) Nbr_of_usg
FROM
(SELECT objectdatabasename,objecttablename,objecttype,ObjectNum,COUNT (*) drb

--  Archived DBQL
--  FROM dbql_arch.dbqlobjtbl_hst WHERE logdate = '2014-02-20' (date)

--  Online DBQL
  FROM dbc.dbqlobjtbl WHERE cast(collecttimestamp as char(10)) = '2014-02-20'
AND objecttablename IS NOT NULL
AND ((objecttype IN ('JIx','HIx') AND objectcolumnname IS NULL)
OR
(objecttype IN ('Idx'))
)
AND objectnum <> 1
GROUP BY 1,2,3,4
) usg
FULL OUTER JOIN
( SELECT db_name,table_name,Indextype,Uniqueflag,indexcolumns,Indexnumber,Index_code

  FROM DBQLIdx_tmp2 WHERE indextype NOT IN ('P','Q','K')
UNION ALL
SELECT databasename,tablename,tablekind,cast(null as char(1))
     ,cast(null as varchar(1000)),cast(null as smallint)
     ,case when tablekind='I' then 'JIX' else 'HIX' end from dbc.tables where tablekind in ('I','N')
) idx ON usg.objectdatabasename = idx.db_name
AND usg.objecttablename = idx.table_name
AND ((usg.ObjectNum = idx.IndexNumber) or usg.objecttype IN ('JIx','HIx'))
ORDER BY 6 DESC
;

DBQL analysis III – Monitor “collect statistics”

Analyze “collect statistics” in DBQL

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on. The “Object” and “SQL” options must be switched “on” in DBQL to use the scripts provided.
This article is applicable up to V13.10 without modifications; statistics handling changed from V14.

About Statistics

“Statistics” is a descriptive object in the Teradata database that is used by the optimizer for transforming SQLs into effective execution plans.
Statistics reflect the key data demographic information of one or more table columns.
These objects have to be created and maintained explicitly; the RDBMS will not do it by itself.
Statistics internally contain a value histogram, the production of which requires the table data (or a sample of it) to be analyzed, which is an expensive task.

Summarized: appropriate statistics are required for getting good and effective execution plans for SQLs, but statistics consume resources to be collected or refreshed.
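For reference, a (pre-V14 style) statistics collection looks roughly like this; the object names are hypothetical and the exact syntax should be checked against your version's documentation:

```sql
-- Full statistics on one column of a hypothetical table
COLLECT STATISTICS ON Sales_DB.Sales_Trx COLUMN (Customer_Id);

-- Cheaper, sampled variant
COLLECT STATISTICS USING SAMPLE ON Sales_DB.Sales_Trx COLUMN (Customer_Id);
```

Each such statement is what leaves the DBQL footprint discussed below.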

“Statistics” footprint in DBQL

When a “statistics” is created or refreshed, it is executed by an SQL command: collect statistics….
This command creates a log entry in DBQL if the logging is switched on.

One can track when which “statistics” was collected, and how much CPU and I/O it consumed.
These statements are very easy to identify in the central table:

select * from dbc.DBQLogTbl where StatementType='Collect statistics'

Analyzing DBQL data 

Prepare data

You may need to modify the script:

  • Date (interval)
  • Online/archived: use commented section
  • QueryBand: “JOB” variable is used, modify according to your ETL settings

create volatile table DBQLStat_tmp1
as
(
sel a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount

,case when a.querytext like '% sample %' then 'S' else 'F' end Full_Sample
,UserName,(FirstRespTime - StartTime) DAY(4) TO SECOND(4) AS RUNINTERVAL      
,(EXTRACT(DAY FROM RUNINTERVAL) * 86400 + EXTRACT(HOUR FROM RUNINTERVAL)  * 3600 + EXTRACT(MINUTE FROM RUNINTERVAL)  * 60 + EXTRACT(SECOND FROM RUNINTERVAL) ) (decimal(10,1)) Duration
,b.ObjectDatabaseName DatabaseName,b.ObjectTableName TableName,c.ObjectColumnName ColumnName
,case when d.SQLTextInfo like any ('%"PARTITION"%', '%,PARTITION %', '%,PARTITION,%', '% PARTITION,%', '% PARTITION %', '%(PARTITION,%', '%(PARTITION %', '%,PARTITION)%', '% PARTITION)%', '%(PARTITION)%') then 'Y' else 'N' end inclPartition
,CAST((case when index(queryband,'JOB=') >0 then  substr(queryband,index(queryband,'JOB=') ) else '' end) AS VARCHAR(500)) tmp_Q
,case when queryband = '' then 'N/A'
         when tmp_q = '' then '-Other'
else CAST( (substr(tmp_Q,characters('JOB=')+1, nullifzero(index(tmp_Q,';'))-characters('JOB=')-1)) AS VARCHAR(500)) end QB_info
,sum(1) over (partition by a.procid,a.Queryid order by c.ObjectColumnName, a.QueryID rows unbounded preceding) Rnk
from
/* For archived tables
     dbql_arch.DBQLogTbl_hst       a
join dbql_arch.DBQLObjTbl_hst      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID and a.logDate=b.logDate
left join dbql_arch.DBQLObjTbl_hst c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID and a.logDate=c.logDate
join dbql_arch.DBQLSQLTbl_hst      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID and a.logDate=d.logDate
where a.logDate=1140113
*/
/*end*/
/* For online tables */
     dbc.DBQLogTbl       a
join dbc.DBQLObjTbl      b on b.ObjectType='Tab' and a.procid=b.procid and a.QueryID=b.QueryID
left join dbc.DBQLObjTbl c on c.ObjectType='Col' and a.procid=c.procid and a.QueryID=c.QueryID
join dbc.DBQLSQLTbl      d on d.SQLRowNo=1       and a.procid=d.procid and a.QueryID=d.QueryID
where cast(cast(a.starttime as char(10)) as date) = '2014-01-13' (date)
/*end*/
and a.StatementType='Collect statistics'
) with data
primary index (procId,QueryId)
on commit preserve rows
;

create volatile table DBQLStat
as
(
WITH RECURSIVE rec_tbl
(
 procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,SColumns
)
AS
(
select
 procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,inclPartition,ColumnName,Rnk,cast(case when ColumnName is null and inclPartition='Y' then '' else '('||ColumnName end as varchar(10000)) SColumns
from DBQLStat_tmp1 where Rnk=1
UNION ALL
select
  a.procId,a.QueryId,a.StartTime,a.AMPCpuTime,a.TotalIOCount,a.Duration,a.Full_Sample,a.UserName,a.DatabaseName,a.TableName,a.QB_info,a.inclPartition,a.ColumnName,a.Rnk,b.SColumns ||','||a.ColumnName
from DBQLStat_tmp1     a
join rec_tbl b on a.procId=b.ProcId and a.QueryId=b.QueryID and a.Rnk=b.Rnk+1
)
select   procId,QueryId,StartTime,AMPCpuTime,TotalIOCount,Duration,Full_Sample,UserName,DatabaseName,TableName,QB_info,Rnk NumOfColumns
        ,case when SColumns = '' then '(PARTITION)' else SColumns || case when inclPartition='Y' then ',PARTITION)' else ')' end end StatColumns
from rec_tbl qualify sum(1) over (partition by procid,queryid order by Rnk desc, QueryID rows unbounded preceding) = 1
) with data
primary index (procid,queryid)
on commit preserve rows
;

Reports

  • How many statistics have been collected, and at what resource cost?

select
  UserName /*Or: DatabaseName*//*Or: Full_sample*/
, count(*) Nbr
, sum(AMPCpuTIme) CPU
, sum(TotalIOCount) IO
from DBQLStat
group by 1
order by 1
;

  • Which statistics have been collected multiple times?
    (If several days are in the prepared data, the collection frequency can be determined; erase the “qualify”)

select a.*,
sum(1) over (partition by databasename,tablename,statcolumns)  Repl
from DBQLStat a

/* Comment for frequency report*/
qualify sum(1) over (partition by databasename,tablename,statcolumns) > 1

/*end*/
order by repl desc, databasename,tablename,statcolumns
;

DBQL analysis II. – Skewness – The big resource robbery detection

Skewness

Please note that the solutions in this article work on the DBQL logs, which cover only the activity of those users for whom logging is switched on.

About Skew

If you are not comfortably familiar with skewness or DBQL, please read the corresponding posts beforehand.

The difference between Impact[Resource] and Total[Resource] is a net technological loss, which should be minimized.

Skewness is a good indicator for highlighting workloads that waste a lot of valuable CPU and I/O resources because of inefficient parallelism.
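As a quick sanity check of the skew formula used in the query below, take a made-up query on a 10-AMP system that consumed 80 CPU seconds in total, 40 of them on its hottest AMP (all literals here are hypothetical):

```sql
SELECT
      40.0 * 10 AS CPUImpact   -- MaxAMPCPUTime * number of AMPs = 400
    , CAST(100 - ((80.0 / 10) * 100 / NULLIFZERO(40.0)) AS INTEGER) AS "CPUSkew%"
      -- 100 - (avg per AMP / max per AMP) * 100 = 100 - 20 = 80% skew
;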

Find “Top skewed queries” in DBQL

With this SQL you can filter the top impacting queries (replace the date value, or you may have to adjust the date filtering according to your local settings):

select top 50
      ProcID
    , QueryID
    , AMPCPUTime
    , MaxAMPCPUTime * (hashamp () + 1) CPUImpact
    , CAST (100 - ((AmpCPUTime / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPCPUTime)) AS INTEGER) "CPUSkew%"
    , TotalIOCount
    , MaxAMPIO * (hashamp () + 1) IOImpact
    , CAST (100 - ((TotalIOCount / (hashamp () + 1)) * 100 / NULLIFZERO (MaxAMPIO) ) AS INTEGER) "IOSkew%"
    , AMPCPUTime * 1000 / nullifzero (TotalIOCount) LHR
    , TotalIOCount / nullifzero (AMPCPUTime * 1000) RHL
    , ParserCPUTime
    , Queryband
    , Substr(QueryText,1,2000) QueryText
from
/* For archived DBQL
    dbql_arch.dbqlogtbl_hst 
where
    logdate=1131201
    and ampcputime > 0
*/
/* For online DBQL*/
    dbc.dbqlogtbl 
where
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and ampcputime>0
order by 
    CPUImpact desc

Explanation of extra fields:

ParserCPUTime
Time the parser spent on producing the execution plan. This can be high if the SQL is too complex or too many random AMP samplings had to be done.
LHR/RHL
Larry Higa (and inverse Larry Higa) index. An empirical index that shows the CPU vs I/O rate. From experience it should usually be around one (the exact value can differ depending on your system configuration, but it is a constant for a given system). If it is far from 1, it indicates CPU or I/O dominance, which means unbalanced resource consumption, but that is a different dimension from skew.
QueryBand
Labels that sessions use to identify themselves within the DBQL logs
QueryText
First 200 characters of the query (depending on DBQL log settings)

OK, we’ve listed the terrible top consumers, but what’s next?
We have to identify those queries. If your ETL and analytics software is configured to use QueryBand properly (this area deserves a separate post…), you can find out which job or report issued that SQL; in any case, you can see the QueryText field.

If you want to get the full SQL text, select it from the DBQLSQLTbl (SQL logging needs to be switched on), replace the appropriate <procid> and <queryid> values:

select 
    SQLTextInfo 
from 
    dbc.dbqlsqltbl 
where 
    procid = <procid> 
    and queryid = <queryid> 
order by 
    SQLRowNo asc

You will get the SQL in several records, broken up into 30K blocks; simply concatenate them. Unfortunately the SQL will come in a very ugly make-up; you can use PRISE Tuning Assistant to beautify and highlight it for easy reading.

System level Skewness

Totals

We have found those bad queries, nice. But what can we say about the whole system? What is the total parallel efficiency? Can we report how much resource was wasted due to bad parallel efficiency?

The answer is: yes, we can estimate it quite closely. We cannot calculate the exact value, because DBQL does not log AMP-level information for the query execution, only the most important metrics.

We also cannot account for the situation when several skewed queries run at the same time but peak on different AMPs. This reduces the system-level resource waste, but it is hard to calculate; however, its probability and effect are negligible for now.

select
      sum(AMPCPUTime) AMPCPUTimeSum
    , sum(MaxAMPCPUTime * (hashamp () + 1)) CPUImpactSum
    , sum(TotalIOCount) TotalIOCountSum
    , sum(MaxAMPIO * (hashamp () + 1)) IOImpactSum
    , cast(100 - (AMPCPUTimeSum / CPUImpactSum) * 100 as integer) "CPUSkew%"
    , cast(100 - (TotalIOCountSum / IOImpactSum) * 100 as integer) "IOSkew%"
from
/* For archived DBQL
    dbql_arch.dbqlogtbl_hst
where 
    logdate = '2013-12-18' (date)
    and (ampcputime > 0 or TotalIOCount > 0)
*/
/* For online DBQL*/
    dbc.dbqlogtbl
where
    cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
    and (ampcputime > 0 or TotalIOCount > 0)

Look at the last two columns. That percentage of your CPU and I/O goes down the drain…

Top bad guys

OK, let’s check how many queries accumulate 5%, 10%, 25%, 50%, 75% and 90% of this loss.
Here you are (CPU version; transform it to the I/O version analogously):

select 
      'How many queries?' as "_"
    , min(limit5) "TOP5%Loss"
    , min(limit10) "TOP10%Loss"
    , min(limit25) "TOP25%Loss"
    , min(limit50) "TOP50%Loss"
    , min(limit75) "TOP75%Loss"
    , min(limit90) "TOP90%Loss"
    , max(rnk) TotalQueries
    , sum(ResourceTotal) "TotalResource"
    , sum(ResourceImpact) "ImpactResource"
from
(
    select
         case when ResRatio < 5.00 then null else rnk end limit5
        ,case when ResRatio < 10.00 then null else rnk end limit10
        ,case when ResRatio < 25.00 then null else rnk end limit25
        ,case when ResRatio < 50.00 then null else rnk end limit50
        ,case when ResRatio < 75.00 then null else rnk end limit75
        ,case when ResRatio < 90.00 then null else rnk end limit90
        ,rnk
        , ResourceTotal
        , ResourceImpact
    from
    (
        select
              sum(ResourceLoss) over (order by ResourceLoss desc ) totalRes
            , sum(ResourceLoss) over (order by ResourceLoss desc  rows unbounded preceding) subtotalRes
            , subtotalRes *100.00 / totalRes Resratio
            , sum(1) over (order by ResourceLoss desc  rows unbounded preceding) rnk
            , ResourceTotal
            , ResourceImpact
        from
        (
            select
                AMPCPUTime ResourceTotal
                , (MaxAMPCPUTime * (hashamp () + 1)) ResourceImpact
                , ResourceImpact - ResourceTotal ResourceLoss
            /* For archived DBQL
            from 
                dbql_arch.dbqlogtbl_hst 
            where 
                logdate=1131207
                and ampcputime > 0
            */
            /* For online DBQL*/
            from 
                dbc.dbqlogtbl
            where
                cast(cast(starttime as char(10)) as date) = '2013-12-18' (date)
                and ampcputime > 0
        ) x
    ) y
) z
group by 1

I expect you are a bit shocked now at how few queries waste how much golden resource.

I think we will agree that it is worth tuning those dozen queries: you can save on the order of $100K..M USD annually for your company, am I right?

PRISE Tuning Assistant helps you to find those queries and to get the hang of how to accelerate them.

Typical reasons for skewness – in a nutshell

  • Skewed tables: Bad choice of PI, Skewed data 
  • Bad execution plans (typically skewed redistributions)
    • Bad data model (normalization,data types,PI, etc.)
    • Missing or stale statistics
    • Too many joins (break up the query!)
  • Hash collision (load time problem)

DBQL analysis I. – Monitor “Top CPU consumers”

DBQL analysis

About DBQL

What is it?

DataBase Query Logging.

It is a nice feature of the Teradata RDBMS that comprehensively logs the execution of the issued queries – if it is switched on.
The configuration can be checked/administered e.g. in the Teradata tools or from DBC.DBQLRuleTbl.
Logging can be set at global/user level, and in respect of the details to capture (see DBQL tables).

For detailed information please refer to the Teradata documentation of your version.
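To give a feel for it, switching detailed logging on and off for all users can look like the sketch below (check the exact syntax and options against the documentation of your version before use):

```sql
-- Log with object info and full SQL text (SQLTEXT=0 stores the text in DBQLSQLTbl only)
BEGIN QUERY LOGGING WITH SQL, OBJECTS LIMIT SQLTEXT=0 ON ALL;

-- Switch logging off again
END QUERY LOGGING ON ALL;
```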

DBQL tables

Table Content
DBQLogTbl Central table, 1 record for each query.
DBQLSQLTbl Whole SQL command, broken up to 30k blocks
DBQLStepTbl Execution steps of the query, one row for each step.
DBQLObjTbl Objects participated in the query. Logged on different levels (db,table, column, index, etc.)
DBQLExplainTbl English explain text, broken up to 30k blocks
DBQLXMLTbl Explain in XML format, broken up to 30k blocks
DBQLSummaryTbl The PEs’ aggregated table, which accounts at the desired level.

The DBQL tables are logically organized into a 1:N structure, where DBQLogTbl is the master entity and the others (except DBQLSummaryTbl) are the children.
The join fields are ProcID and QueryID together, e.g.:

...
from 
    DBQLogTbl a
    join DBQLStepTbl b on 
        a.ProcID = b.ProcID 
        and 
        a.QueryID = b.QueryID
...

Unfortunately the PIs of the DBQL tables are not in sync with the logical PK-FK relations (even in the latest V14.10), therefore JOINed selects against the online DBQL tables are not optimal.

Cost of using DBQL

DBQL basically consumes a negligible amount of processing resources, since it uses cached and batched writes and generates data proportional to the issued queries (the flush rate is a DBSControl parameter).
It is important to regularly purge/archive the logs from the DBC tables; Teradata has a recommendation for this. That keeps the PERM space consumption of DBQL low.
In an environment where ~1M SQLs are issued a day, comprehensive logging generates ~8..10GB of DBQL data daily without XML and Summary. Fewer SQLs generate proportionally less data.

It is worth switching on all options except XML and Summary, since the former generates a huge data volume (it roughly doubles the total), and the latter is similar to the Acctg info. If you want to utilize them, they should of course be switched on.

What is it good for?

It contains:

  • Query run time, duration
  • Consumed resources
  • Environment info (user, default db, etc)
  • SQL text
  • Explain
  • Step resource info
  • Objects involved
  • Etc.

One can get a lot of useful aggregated and query-specific tuning information from it; I will share some of these in this blog.

CPU usage distribution info

(Everything applies to I/O also, just replace CPU with I/O, AMPCPUTime with TotalIOCount…)

Do you think query optimization is rewarding?

Yes, I know it is hard work to find out why ONE query runs sub-optimally, and what to do about it.

But guess how many queries consume how many percent of the processing resources (CPU) within a whole day’s workload.
Take a guess and write it down for CPU%: 5%, 10%, 25% and 50%.

Now run the query below, which will give you the result. (Replace the date value, or you may have to adjust the date filtering according to your local settings.)

select 
    'How many queries?' as "_"
    ,min(limit5) as "TOP5%CPU"
    ,min(limit10) as "TOP10%CPU"
    ,min(limit25) as "TOP25%CPU"
    ,min(limit50) as "TOP50%CPU" 
    ,max(rnk) as TotalQueries
from
(
    select
        case when CPURatio < 5.00 then null else rnk end as limit5
        ,case when CPURatio < 10.00 then null else rnk end as limit10
        ,case when CPURatio < 25.00 then null else rnk end as limit25
        ,case when CPURatio < 50.00 then null else rnk end as limit50
        ,rnk
    from
    (
        select
            sum(ampcputime) over (order by ampcputime desc) as totalCPU
            ,sum(ampcputime) over (
                order by ampcputime desc  
                rows unbounded preceding
             ) as subtotalCPU
            ,subtotalCPU * 100.00 / totalCPU as CPUratio
            ,sum(1) over (
                order by ampcputime desc 
                rows unbounded preceding
             ) as rnk
        from
        (
            select *
            /* For archived DBQL
            from dbql_arch.dbqlogtbl_hst 
            where 
                logdate=1131201 
                and 
                ampcputime > 0 
            */
            /* For online DBQL*/
            from dbc.dbqlogtbl 
            where
                cast(cast(starttime as char(10)) as date) = '2013-12-10' (date)
                and 
                ampcputime > 0
        ) x
    ) y
) z
group by 1;

Are you surprised?
I bet:

  • Less than 10 queries will consume 5% of the CPU
  • Less than 1% of the queries will consume 50% of the CPU

Let’s calculate.
How much does your Teradata system cost a year? It is all paid for storage and processing capacity.
If you can save e.g. X% of CPU & I/O and X% of storage using MVC optimization, you have saved X% of the price of the Teradata system, through:

  • Improved user experience (earlier loads, faster responses)
  • Resources freed up for additional reports and applications
  • Postponing a very expensive Teradata hardware upgrade
PRISE Tuning Assistant helps you to find those queries and to get the hang of how to accelerate them.