Thursday, May 28, 2009

New blogs

Hi,

I have found some interesting blogs:

Tanel Poder Oracle blog
Miladin Modrakovic Oracle blog

Take a look if you have time.

Marcin

Saturday, May 16, 2009

A short story about Oracle CBO and data – part 3 – 11g

I decided to check the same query on 11g.
And there is an improvement: after gathering statistics (a minimal sketch of the call follows below) the query runs without any problems.
So an upgrade to 11g can be a step in the right direction.
The main thing is that 11g uses a new multicolumn join selectivity algorithm. I still have to do some more research on that; for now I only want to show the difference, using my query as an example.
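
For reference, a minimal sketch of the statistics gathering, assuming a plain DBMS_STATS call (the schema name is a placeholder, not the real one):

-- Hedged sketch: gather statistics on one of the joined tables
-- (the owner name is a placeholder)
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'MY_SCHEMA',
    tabname          => 'TAB8',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
    cascade          => TRUE);
END;
/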



Trace file from 10.2.0.4


Using multi-column join key sanity check for table TAB8
Revised join sel:6.1576e-04 = 0.0021556 * (1/1624.00) * (1/0.0021556)
Join Card: 126.39 = outer (373.19) * inner (550.00) * sel (6.1576e-04)
Join Card - Rounded: 126 Computed: 126.39


Trace file from 11.1.0.7


ColGroup cardinality sanity check: ndv for TAB8[H] = 550 TAB7[S] = 1
Join selectivity using 1 ColGroups: 0.002156 (sel1 = 0.000000, sel2 = 0.000000)
Join Card: 442.439495 = = outer (373.189655) * inner (550.000000) * sel (0.002156)
Join Card - Rounded: 442 Computed: 442.44


Please compare line 3 in both of them. Did you notice that the computed join cardinality in 11g is almost 3.5 times bigger?
In 10g the join selectivity was based on the selectivity of the join columns; what is it based on in 11g?
I don't know yet.

Update 18.05.2009
I shouldn't write anything during a weekend ;)
In 11g there is a new feature called multicolumn (column group) statistics,
and this is now the source of cardinality in multicolumn joins.
See the example on the Pythian blog.
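
For completeness, the column group statistics behind the "ColGroup" line in the 11g trace can also be created explicitly as extended statistics. A minimal sketch, assuming hypothetical owner and join column names (they are not the real ones):

-- Hedged sketch: create a column group on the two join columns of TAB8;
-- the owner and column names are placeholders for illustration only
DECLARE
  l_ext_name VARCHAR2(32);
BEGIN
  l_ext_name := DBMS_STATS.CREATE_EXTENDED_STATS(
                  ownname   => 'MY_SCHEMA',
                  tabname   => 'TAB8',
                  extension => '(JOIN_COL1, JOIN_COL2)');
  -- regather statistics afterwards so the column group gets its own NDV
END;
/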



ps.
Thanks to Alex Gorbatchev I have new syntax highlighting.

regards,
Marcin

Monday, May 11, 2009

Do we still need a DBA? A short story about Oracle CBO and data – part 2

This is a deeper investigation of potential cardinality errors in the CBO.
At this point I want to thank Jonathan Lewis for his brilliant book about Cost Based Optimization
in Oracle. It was, and still is, the basis of all my investigations.
I have a query (described in part 1) which caused me a lot of problems with the temporary tablespace.

I used the 10053 event to generate details about the CBO decisions for that particular query. Here I want to show only a part of that 5 MB text file. According to that file, join order number 295 has been chosen as the best join order.
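
The trace itself was produced with the 10053 event; switching it on and off for a session looks like this (the query has to be hard parsed while the event is active):

-- Enable the CBO trace (event 10053) for the current session,
-- parse/execute the problem query, then switch the event off again
ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';
-- ... run or EXPLAIN PLAN for the problem query here ...
ALTER SESSION SET EVENTS '10053 trace name context off';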

...
Final - All Rows Plan: Best join order: 295
...

Here is a part of the 10053 trace file

Join order[295]: TAB1_MV[P]#0 TAB2_MV[E]#1 TAB3_MV[S]#13 TAB4_MV[H]#11 TAB5_MV[D]#9 TAB6_MV[AC]#12 TAB7_MV[SR]#7 SP_MV[R]#19 ... rest of tables

***************
Now joining: SP_MV[R]#19
***************
....
Join Card: 0.00 = outer (11.29) * inner (30335.98) * sel (1.8863e-09)
Join Card - Rounded: 1 Computed: 0.00

There is no information about joining the previous tables TAB1 to TAB7, but the join cardinality for those 7 tables has been calculated as 11.29. I want to know how it was calculated. The CBO can reuse calculations from other join orders, so I have to find where that happened.
Here is some information about the tables. This is the single table cardinality which is used in the join calculations.

TAB1_MV[P]#0 Card: Original: 1 Rounded: 1 Computed: 1.00 Non Adjusted: 1.00
TAB2_MV[E]#1 Card: Original: 3 Rounded: 3 Computed: 3.00 Non Adjusted: 3.00
TAB3_MV[S]#13 Card: Original: 21645 Rounded: 21645 Computed: 21645.00 Non Adjusted: 21645.00
TAB4_MV[H]#11 Card: Original: 550 Rounded: 550 Computed: 550.00 Non Adjusted: 550.00
TAB5_MV[D]#9 Card: Original: 117 Rounded: 117 Computed: 117.00 Non Adjusted: 117.00
TAB6_MV[AC]#12 Card: Original: 199128 Rounded: 2797 Computed: 2796.88 Non Adjusted: 2796.88
TAB7_MV[SR]#7 Card: Original: 23 Rounded: 23 Computed: 23.00 Non Adjusted: 23.00

As you can see, the computed cardinality can differ from num_rows, so the CBO adjusts the number of rows. Here are the detailed table stats:

Table Stats::
Table: TAB6_MV Alias: AC
#Rows: 199128 #Blks: 2688 AvgRowLen: 171.00
Column (#16): TAB6_CR_ID
AvgLen: 2.00 NDV: 6262 Nulls: 151566 Density: 8.4931e-05 Min: 1000 Max: 70516
Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 140
Column (#17): TAB6_CR_NUMBER
AvgLen: 2.00 NDV: 5 Nulls: 150840 Density: 7.1446e-04 Min: 1 Max: 33
Histogram: Freq #Bkts: 5 UncompBkts: 48288 EndPtVals: 5
Column (#2): TAB6_ID
AvgLen: 5.00 NDV: 14584 Nulls: 0 Density: 6.1554e-05 Min: 607 Max: 70943
Histogram: HtBal #Bkts: 254 UncompBkts: 254 EndPtVals: 230
Column (#25): TAB_CR_SRCE_CD
AvgLen: 2.00 NDV: 2 Nulls: 150840 Density: 0.0016153
Histogram: Freq #Bkts: 2 UncompBkts: 48288 EndPtVals: 2

There are NULL values in 3 of the 4 columns, and because Oracle does not use NULL values in a join condition, it tries to adjust the number of rows accordingly.
For a single column with NULL values the calculated table cardinality is equal to

Cal cardinality = num_rows - number of nulls

# of rows   # of nulls   # of not nulls   # of not nulls / # of rows
199128      151566       47562            0.238851
199128      150840       48288            0.242497
199128      0            199128           1
199128      150840       48288            0.242497


If there is more than one column with NULL values, some more calculations take place:

Cal cardinality col 1 and 2 = Cal cardinality col 1 * (number of not nulls / num_rows) for col 2

and so on:

Cal cardinality col 1 and 2 and 3 = Cal cardinality col 1 and 2 * (number of not nulls / num_rows) for col 3

So in our example:

Cal card = 47562 * 0.242497 * 1 * 0.242497 = 2796.88
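
The same factors can be read back from the dictionary; a small sketch, assuming the statistics shown in the trace are the current ones:

-- Not-null fractions used in the single table cardinality adjustment,
-- taken from the dictionary statistics of TAB6_MV
SELECT c.column_name,
       t.num_rows,
       c.num_nulls,
       t.num_rows - c.num_nulls                          AS not_null_rows,
       ROUND((t.num_rows - c.num_nulls) / t.num_rows, 6) AS not_null_fraction
FROM   user_tab_col_statistics c
       JOIN user_tables t ON t.table_name = c.table_name
WHERE  c.table_name = 'TAB6_MV'
AND    c.column_name IN ('TAB6_CR_ID', 'TAB6_CR_NUMBER', 'TAB6_ID', 'TAB_CR_SRCE_CD');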

Did you read part 1? The calculation above is based only on the number of rows and the number of nulls; there is no information about the data values. Neither the number of distinct values nor the histogram information was used.
So let's let the CBO do its work:

Join order[294]: TAB1_MV[P]#0 TAB2_MV[E]#1 TAB3_MV[S]#13 TAB4_MV[H]#11 TAB5_MV[D]#9 TAB6_MV[AC]#12 TAB7_MV[SR]#7 ...
***************
Now joining: TAB2_MV[E]#1
***************
...
Join Card: 1.00 = outer (1.00) * inner (3.00) * sel (0.33333)
Join Card - Rounded: 1 Computed: 1.00
...
***************
Now joining: TAB3_MV[S]#13
***************
...
Join Card: 373.19 = outer (1.00) * inner (21645.00) * sel (0.017241)
Join Card - Rounded: 373 Computed: 373.19
...
***************
Now joining: TAB4_MV[H]#11
***************
...
Join Card: 126.39 = outer (373.19) * inner (550.00) * sel (6.1576e-04)
Join Card - Rounded: 126 Computed: 126.39
...
***************
Now joining: TAB5_MV[D]#9
***************
...
Join Card: 126.39 = outer (126.39) * inner (117.00) * sel (0.008547)
Join Card - Rounded: 126 Computed: 126.39
...
***************
Now joining: TAB6_MV[AC]#12
***************
...
Join Card: 11.29 = outer (126.39) * inner (2796.88) * sel (3.1939e-05)
Join Card - Rounded: 11 Computed: 11.29
...
***************
Now joining: TAB7_MV[SR]#7
***************
...
Join Card: 11.29 = outer (11.29) * inner (23.00) * sel (0.043478)
Join Card - Rounded: 11 Computed: 11.29
...

As you can see, the calculated output cardinality is 11.29.
I made a small experiment and cut out of the SQL query only the part related to those 7 tables.

select count(*) FROM TAB4_MV H, TAB3_MV S,TAB1_MV P,TAB2_MV E,TAB5_MV D,TAB6_MV AC,TAB7_MV SR
WHERE S.TAB3_TC = H.TAB4_TC AND S.TAB3_CON_ID = H.TAB4_ID
AND S.TAB3_CON_NUMBER = H.TAB4_CON_NUMBER
AND H.TAB4_DETAIL_PAY_CODE = D.TAB5_DETAIL_CODE
AND S.TAB3_CON_ID = AC.TAB6_CR_ID
AND S.TAB3_CON_NUMBER = AC.TAB6_CR_NUMBER
AND AC.TAB6_CR_SRCE_CODE = SR.TAB7_CODE
AND S.TAB3_CON_ID = P.TAB1_ID
AND P.TAB1_ECAT_CODE = E.TAB2_CODE;

Here is the result of that query:

COUNT(*)
---------
141 354 138

Comparing that to the estimated 11.29 rows, you can see that the real join cardinality is more than 12 million times greater than calculated.
OK, so let's execute the full SQL query. The best join order is 295 (see the calculation at the beginning of this post). As you can see, the cardinality is now estimated as 1, but the result of the query

select count(*) FROM SP_MV R, TAB4_MV H,TAB3_MV S,TAB1_MV P,TAB2_MV E,TAB5_MV D,
TAB6_MV AC,TAB7_MV SR
WHERE R.SP_ENTITY_IND = 'P'
AND R.SP_CHANGE_IND IS NULL
AND R.SID = S.TAB3_STU_ID
AND S.TAB3_TC = H.TAB4_TC
AND S.TAB3_CON_ID = H.TAB4_ID
AND S.TAB3_CON_NUMBER = H.TAB4_CON_NUMBER
AND H.TAB4_DETAIL_PAY_CODE = D.TAB5_DETAIL_CODE
AND S.TAB3_CON_ID = AC.TAB6_CR_ID
AND S.TAB3_CON_NUMBER = AC.TAB6_CR_NUMBER
AND R.SID = AC.TAB6_ID
AND AC.TAB6_CR_SRCE_CODE = SR.TAB7_CODE
AND S.TAB3_CON_ID = P.TAB1_ID
AND P.TAB1_ECAT_CODE = E.TAB2_CODE;

is

ORA-01652: unable to extend temp segment by 64 in tablespace TEMP

This is not what a customer expects as the result of a report. I found out that when the cardinality of a table is set via dbms_stats.set_table_stats, it is not changed in the Single Table Cardinality section. I still have to investigate more deeply why, but for now this is my solution to that problem.
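
As an illustration of that workaround, the call looks roughly like this (the owner name and the row count are placeholders, not the values I actually used):

-- Hedged sketch: set the table cardinality manually with DBMS_STATS
-- (owner name and numrows value are illustrative only)
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname       => 'MY_SCHEMA',
    tabname       => 'TAB6_MV',
    numrows       => 100000,
    no_invalidate => FALSE);
END;
/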

See you in part 3.

regards,
Marcin

Thursday, May 7, 2009

Do we still need a DBA? A short story about Oracle CBO and data – part 1

What is a Cost Based Optimizer?
It is a mathematical model which tries to estimate, through a series of calculations, what the best execution plan for a particular SQL statement will be. All the calculations are based on object (table and index) statistics, instance statistics and some built-in rules. Step by step the Oracle CBO calculates the cost of a query (join by join, filter by filter), but one very important element is missing from these calculations. What is missing?
The data itself. The CBO (in normal mode) does not look at the data; of course there can be histograms or column selectivities, but in general it has no view of the actual row values.

Why am I writing about this? I have created 14 databases for 14 branches of the same institution. There is one source system which is replicated to the BI databases, and in each database materialized views based on the same queries are generated.
For 12 databases it was working perfectly well, but for 2 it did not work at all. During the creation or refresh phase the following error appeared:

ORA-01652: unable to extend temp segment by 64 in tablespace TEMP



My first try was to increase the TEMP tablespace from 2 to 4 GB, but it didn't help.
OK, let's resize it once again: I set up autoextend for that file with a maximum file size of 20 GB. That should be enough, since those tables have no more than 100 000 rows. But no.
After 15 minutes, still:

ORA-01652: unable to extend temp segment by 64 in tablespace TEMP
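
For the record, the autoextend change was along these lines (the tempfile name and sizes are illustrative):

-- Let the tempfile grow automatically up to 20 GB (file name is a placeholder)
ALTER DATABASE TEMPFILE '/u01/oradata/DB/temp01.dbf'
  AUTOEXTEND ON NEXT 256M MAXSIZE 20G;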

This query joined about 12 tables. Nine of them were really small and had good filter conditions; the number of rows produced by joining those 9 tables was really small, between 20 and 500. But those tables were joined to 2 big tables with thousands of rows. The join condition between them was nearly unique, so the overall result of the query was limited to hundreds or thousands of rows.
I started by reviewing the query to find a join which generates a lot of rows that have to be saved in temporary space, and after a few tries I had one!
Two tables were joined on two columns; unfortunately it was not a unique join.

In table A there were four pairs of values which meet the join condition:

Col1   Col2   COUNT(*)
6001   3      8811
6001   4      292
6001   2      8246
6001   1      3292

In table B I got the same pairs of values:

Col1   Col2   COUNT(*)
6001   3      2700
6001   4      75
6001   2      10495
6001   1      6813

So if we multiply the counts for each matching pair of values, that join generates about 130 million rows.
Of course the design is a bit different: there are more tables and joins which limit the number of rows returned to the user, but the CBO decided to apply those joins and filters afterwards.
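
The same estimate can be produced directly by multiplying the per-pair counts from both sides; a sketch with placeholder table and column names:

-- Estimate how many rows a non-unique two-column join will generate
-- (table_a, table_b, col1, col2 are placeholders for the real objects)
SELECT SUM(a.cnt * b.cnt) AS joined_rows
FROM  (SELECT col1, col2, COUNT(*) AS cnt FROM table_a GROUP BY col1, col2) a
JOIN  (SELECT col1, col2, COUNT(*) AS cnt FROM table_b GROUP BY col1, col2) b
  ON  a.col1 = b.col1
 AND  a.col2 = b.col2;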
So yes, at this stage I can say we still need a DBA.

The next investigation will follow in the near future.