Wednesday, April 27, 2011

MySQLPerformance Schema - is this introduction of Oracle event based monitoring in MySQL ?

Today I read presentation about MySQL performance MySQL Essentials part 8  recommended by @surachart and I learned that Performance Schema has been introduced - see documentation.
After basic review it looks like introduction of well know Oracle wait interface and statistics. I'm going to download MySQL version 5.5 shortly and check it out. I have a lot of concerns what happen to MySQL after Oracle acquisition but at least this is step in good direction.

regards,
Marcin

Sunday, April 3, 2011

SASH visualization using Excel

OraSASH is a free implementation of Oracle Active Session History mechanism but it is providing only raw tables with data. To help other people use that functionality and inspired by Charles Hooper blog I have created a Excel based simple GUI interface for SASH. It is providing historical and real time view based on repository. It has been tested on Windows 7 64 bit with 32 bit Excel 2007. To make it running it is necessary to install and configure Oracle ODAC 11.2.0.2. Zip file can be downloaded from SourceForge repository. Please be aware that this is alpha stage of this Excel sheet and it can contain errors. Read only access is needed to OraSASH repository. Macros have to be enabled in Excel as all functionality is based on Visual Basic.

Setup sheet
Fill out repository user and tns alias name on Setup sheet and click Start button.


Historic data should be read from repository for whole available period of time and available instances list will be fill out as well. When all data will be collected active sheet will be switched to Graph sheet.
Graph sheet

There are two charts on Graph sheet – upper one is displaying Average Active Session - for AAS concept see John Beresniewicz presentation or Kyle Hailey blog.
Chart on the right side is displaying SQL activity for selected period of time. Percent of activity is calculated against number of all samples from that time. This same activity percent can be seen on OEM performance tab. SQL details like query text and plan will be displayed in Detail SQL view when chart bar for appropriate SQL ID will be clicked.
Historic view
To change period displayed on screen change start and stop date and then click Refresh Historic View button. 
Real time view
To see current database activity switch radio button to Real Time refresh. Refresh rate can be change on this same sheet in Refresh Rate field. It take one full refresh cycle to pick up a refresh new value.
Detail SQLsheet

Separate sheet for every SQL query. Sheet name is SQL_ID and detail available information about query text and plan will be displayed there. In addition to that detail information about wait event and CPU utilization will be displayed as well.

If you have any comments or questions please feel free to contact me.
regards,
Marcin

Saturday, April 2, 2011

Oracle XE 11g - beta release

Hello,

Good news Oracle XE 11g has been released. Right now it is a beta version but we can expect production release soon. I was waiting for that release as my idea was to use it a repository database for S-ASH. I'm going to check if all new 11g functionality like PIVOT functions will be working fine.


regards,
Marcin

Friday, April 1, 2011

New version of S-ASH

Hello,

I'm pleased to inform you that new version of Simulate ASH has been published.
You can download it from ASH Masters website or from Source Forge project site.

If you have any questions or ideas how to improve it please feel free to contact me


regards,
Marcin

Monday, March 14, 2011

Rolling partition and global index coalesce

Resolving one of performance issues I figure out that primary key index is much bigger that it should be. Most of database activity is inserting data into partitioning table using some date field as a partitioning key. Rows can be inserted to every partition not only to most recent one. Daily job is dropping one oldest partition per day. Unfortunately primary key index is not partitioned and is maintained during partition drop as a global index. Correlation of these two things is a root cause of index size explosion.
Short test case below:
  • script test.sql has been used to check index size, number of rows in table and partition drop
  • in second session simple script inserting data into random partition was running continuously
Table definition:
  CREATE TABLE LOGS
   (    "LOG_ID" NUMBER NOT NULL ENABLE,
        "DELDATE" DATE NOT NULL ENABLE,
        "FILL" VARCHAR2(100),
         CONSTRAINT "LOGS_PK" PRIMARY KEY ("LOG_ID")
   )
  TABLESPACE "USERS"
  PARTITION BY RANGE ("DELDATE")
 (PARTITION "PARTITION1"  VALUES LESS THAN (TO_DATE(' 2011-02-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION2"  VALUES LESS THAN (TO_DATE(' 2011-03-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION3"  VALUES LESS THAN (TO_DATE(' 2011-04-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION4"  VALUES LESS THAN (TO_DATE(' 2011-05-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION5"  VALUES LESS THAN (TO_DATE(' 2011-06-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION6"  VALUES LESS THAN (TO_DATE(' 2011-07-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION7"  VALUES LESS THAN (TO_DATE(' 2011-08-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION8"  VALUES LESS THAN (TO_DATE(' 2011-09-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION9"  VALUES LESS THAN (TO_DATE(' 2011-10-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION10"  VALUES LESS THAN (TO_DATE(' 2011-11-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')),
  PARTITION "PARTITION11"  VALUES LESS THAN (TO_DATE(' 2011-12-28 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
 );
Source of test.sql script 
prompt number of rows in table
 select count(*) from pioro.logs;
 
 prompt LOGS_PK index size
 select sum(bytes)/1024/1024 from dba_extents where segment_name = 'LOGS_PK';

 prompt calculating stats
 exec dbms_stats.gather_table_stats('PIORO','LOGS',cascade=>true, estimate_percent=>null);
 
 prompt rows/block without update global indexes
 select blevel, leaf_blocks, distinct_keys, NUM_ROWS, NUM_ROWS/leaf_blocks "rows/block" from dba_indexes 
where owner='PIORO' and table_name='LOGS' and leaf_blocks <> 0;

 prompt number of rows in table partition
 select count(*) from pioro.logs partition(&partname);
 set timing on 

 alter table pioro.logs drop partition &partname update global indexes;
Inserts script running in loop looks like

insert into logs select seq_log_id.nextval, to_date('2011-'|| trunc(dbms_random.value(2,9)) ||'-15','yyyy-mm-dd'),
lpad('x',70,'x') from all_source a, all_source b where rownum <= 3000000;
commit;
Here is a output from test.sql for first partition drop
SQL> @test
number of rows in table COUNT(*)                                                                                                 
----------                                                                                                          
  12000000                                                                                                          

Elapsed: 00:00:13.74
LOGS_PK index size

SUM(BYTES)/1024/1024                                                                                                
--------------------                                                                
242           

Elapsed: 00:00:00.51
calculating stats

PL/SQL procedure successfully completed.

Elapsed: 00:01:11.28
rows/block without update global indexes

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS rows/block                                                          
---------- ----------- ------------- ---------- ----------                                                          
         2       30000      12000000   12000000        400         

Elapsed: 00:00:00.03
number of rows in table partition
Enter value for partname: partition1
old   1:  select count(*) from pioro.logs partition(&partname)
new   1:  select count(*) from pioro.logs partition(partition1)

  COUNT(*)                                                                                                          
----------                                                                                                          
   3001361                                                                              
Elapsed: 00:00:00.35
Enter value for partname: partition1
old   1:  alter table pioro.logs drop partition &partname update global indexes
new   1:  alter table pioro.logs drop partition partition1 update global indexes

Table altered.

Elapsed: 00:00:55.37
As you can see before first drop average number of rows in one index leaf for primary key was 400. ( 8 kB database block ) Output from next run looks like:
SQL> @test
number of rows in table

  COUNT(*)   
----------   
  11998639   

Elapsed: 00:00:09.59
LOGS_PK index size

SUM(BYTES)/1024/1024  
--------------------  
                 296  

Elapsed: 00:00:01.12
calculating stats

PL/SQL procedure successfully completed.

Elapsed: 00:01:13.16
rows/block without update global indexes

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS rows/block    
---------- ----------- ------------- ---------- ----------    
         2       37500      11998639   11998639 319.963707    

Elapsed: 00:00:00.05
number of rows in table partition
Enter value for partname: partition2
old   1:  select count(*) from pioro.logs partition(&partname)
new   1:  select count(*) from pioro.logs partition(partition2)

  COUNT(*)                  
----------                  
   3749280                  

Elapsed: 00:00:00.43
Enter value for partname: partition2
old   1:  alter table pioro.logs drop partition &partname update global indexes
new   1:  alter table pioro.logs drop partition partition2 update global indexes

Table altered.

Elapsed: 00:01:22.85
Numbers of rows in one leaf block drop from 400 to 320 rows per block. After a few next runs it looks like:
SQL> @test
number of rows in table

  COUNT(*)                                                                                                          
----------                                                                                                          
   9748583                                                                                                          

Elapsed: 00:00:17.67
LOGS_PK index size

SUM(BYTES)/1024/1024      
--------------------      
                 414      

Elapsed: 00:00:00.85
calculating stats

PL/SQL procedure successfully completed.

Elapsed: 00:01:11.78
rows/block without update global indexes

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS rows/block        
---------- ----------- ------------- ---------- ----------        
         2       52500       9748583    9748583 185.687295        

Elapsed: 00:00:00.01
Now is really bad - only 186 rows in one leaf block and size of index segment is now 414 MB. It will have huge impact on database performance as much more index blocks have to be read during index range scans. Solution is quite simple and can be done online without huge overhead. It is a index coalesce - this command will consolidate a free space in existing index and provide more free blocks for particular index segment. See example:
SQL> alter index pioro.LOGS_PK coalesce;

Index altered.

Elapsed: 00:03:57.12
SQL> prompt number of rows in table partition
number of rows in table partition
SQL> select count(*) from pioro.logs partition(&partname);
Enter value for partname: partition4
old   1: select count(*) from pioro.logs partition(&partname)
new   1: select count(*) from pioro.logs partition(partition4)

  COUNT(*)                    
----------                    
   5248055                    

Elapsed: 00:00:10.47
SQL> alter table pioro.logs drop partition &partname update global indexes;
Enter value for partname: partition4
old   1: alter table pioro.logs drop partition &partname update global indexes
new   1: alter table pioro.logs drop partition partition4 update global indexes

Table altered.

Elapsed: 00:02:09.65
SQL> @test
number of rows in table

  COUNT(*)                          
----------                          
   7500528                          

Elapsed: 00:00:15.78
LOGS_PK index size

SUM(BYTES)/1024/1024      
--------------------      
                 414      

Elapsed: 00:00:01.47
calculating stats

PL/SQL procedure successfully completed.

Elapsed: 00:00:46.30
rows/block without update global indexes

    BLEVEL LEAF_BLOCKS DISTINCT_KEYS   NUM_ROWS rows/block      
---------- ----------- ------------- ---------- ----------      
         2       26682       7500528    7500528 281.108163      

Elapsed: 00:00:00.08
After index coalesce we can see that Oracle is able to allocate space from free block and size of index remain stable and number of rows in one leaf block is now increasing.
This post has been created based on test data and test scenario but after sucesfull implementation in life systems I will blog about real numbers from production databases.

Wednesday, January 12, 2011

Moving standby database or Oracle know limitation story

It has been a long time since I last blogged. There were some changes in my professional live and now I'm again more close with 24/7 databases. 
I really like Oracle approach to know bugs or limitation and dealing with that from version to version. I have hit one of those recently and I want to share it with all my readers. This know limitation/bug appear first time in 10g and still exist in 11g R1.
I had a simple task to do - move standby database managed by DataGuard to new host. According to some bandwidth limitation I couldn't copy files from old host but I need to copy it from production box. In 11g it is not a problem we can use duplication from active database so we don't need to wait until backup solution will be setup on new box. This is what I have chosen  - RMAN and duplicate from active database.
Here is a command line used by me:
$ rman target sys@production auxiliary sys@standby
RMAN> duplicate target database for standby from active database dorecover NOFILENAMECHECK; 
And here is my point - I have specified NOFILENAMECHECK so Oracle allow me to keep same disk layout on production and standby database and will no claim that datafiles have the same names on both boxes. So far so good - duplication process has been finished without any issues.
I have started standby database and there is where I my problem begun.
Here are an example entry from alert.log
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Clearing online redo logfile 1 /redo1/dbs0/redo11.log
Clearing online log 1 of thread 1 sequence number 2357
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
Clearing online redo logfile 1 complete
Oracle is trying to open redo log file which doesn't exist on standby machine - this is a first error and of course expected one. Than Oracle is trying to clear (in that case recreate) redo log and it failed with error ORA-19527 and of course at the end it is informing DBA that clearing online redo log 1 has been completed.
Unfortunately this files has not been created and this error appear every time standby database has been restarted or recovery process has been restarted. The main issue is ORA-19527 "physical standby redo log must be renamed" even if in that case this is not a standby redo but online redo log. Simplest solution like drop and add a new groups of redo logs failed as Oracle is claiming that all those unexisting files will be used to database recovery. Hmmm - i like that. This is what Oracle support answer for that problem (783113.1) "These are not bug's, they are known limitations" and proposed solution is to set LOG_FILE_NAME_CONVERT to convert old redo log name to new one - even if both are same. As a part of explanation other note (352879.1) has following sentence "It is the equivalent of asking - Are you sure you want the logs to be called this....".
I can agree with that and this can be some protection to prevent overwriting of existing redo logs but why this same error happen if
  • I have specified NOFILENAME checking (ok I could do it by mistake) 
  • There were not online redo logs files nor standby logs files on my new box so I can't imagine how Oracle could overwritten one of those files
This is KNOW limitation so why RMAN is not forcing user to specify LOG_FILE_NAME_CONVERT or not checking if this parameter is set in new instance ?
regards,
Marcin

Tuesday, November 2, 2010

Unsafe deinstall using Oracle Univeral Installer.

Are Oracle Homes totally independent from OUI point of view? This was my impression until yesterday when I hit Oracle OUI de-installer bug in Oracle Real Application Cluster environment. I did 3 simple steps:
  • I have installed new Oracle Home into new directory on both nodes
  • There was timing issue between servers so I need to remove new Oracle Home using OUI de-install functionality.
  • I have corrected timing issue and installed new Oracle Home once again.
All old homes have been used by working Oracle ClusterWare, ASM and database and I believed that there should not be any issues. Unfortunately a few minutes after that work I have got first error message but only for new session:
ORA-27504: IPC error creating OSD context 
ORA-27300: OS system dependent operation: IPC init failed with status:65 
ORA-27301: OS failure message: Package not installed 
ORA-27302: failure occurred at: skgxpcini 
ORA-27303: additional information: libskgxpd.so called
I was surprised and I have checked $ORACLE_HOME/lib/ directory and I have found root cause - libskgxpd.so has very recent modification date and it was date of my installation. I have checked all OUI logs immediately looking for that file and in deinstallation log I found that entry:
INFO: The output of this make operation is also available at: '/opt/app/oracle/product/10.2.0/Db_new/install/make.log'
INFO:
INFO: Start output from spawned process:
INFO: ----------------------------------
INFO:
INFO: rm -f /opt/app/oracle/product/10.2.0/Db_1/lib/libskgxp10.so
INFO: cp /opt/app/oracle/product/10.2.0/Db_1/lib//libskgxpd.so /opt/app/oracle/product/10.2.0/Db_1/lib/libskgxp10.so
But in that same log I have found following message at the end:
INFO: Current Inventory:
        Oracle Home: OraCrs10g_home
                This Oracle Home has not been changed in this session.
        Oracle Home: ASM_HOME
                This Oracle Home has not been changed in this session.
        Oracle Home: RDBMS_HOME
                This Oracle Home has not been changed in this session.
INFO: This deinstallation was successful
So OUI replaced library libskgxp10.so in existing old home (RDBMS_HOME) and at the end of work stated that this home has not been change. There are bugs related to that : 7006848, 5474623.

If you ever hit that error – there is two possible solutions:
  1. Copy backup libskgxp10.so from untouched home if you have home with this same patch level (in my case it was ASM)
  2. Rebuild Oracle using the following steps   
    • cd $ORACLE_HOME/rdbms/lib   
      rename the original library (if exists)
      mv libskgxp10.so libskgxp10.so.old  
    • Relink to configure UDP for IPC    
      make -f ins_rdbms.mk rac_on ipc_udp ioracle   
    • Check whether the library exists    
      ls -l $ORACLE_HOME/lib/libskgxp10.so
    • startup the instance
And good advise at the end – your testing environment should be same as production one. If you are going to do some work on RAC test it on your test RAC too.