Wednesday, January 12, 2011

Moving standby database or Oracle know limitation story

It has been a long time since I last blogged. There were some changes in my professional live and now I'm again more close with 24/7 databases. 
I really like Oracle approach to know bugs or limitation and dealing with that from version to version. I have hit one of those recently and I want to share it with all my readers. This know limitation/bug appear first time in 10g and still exist in 11g R1.
I had a simple task to do - move standby database managed by DataGuard to new host. According to some bandwidth limitation I couldn't copy files from old host but I need to copy it from production box. In 11g it is not a problem we can use duplication from active database so we don't need to wait until backup solution will be setup on new box. This is what I have chosen  - RMAN and duplicate from active database.
Here is a command line used by me:
$ rman target sys@production auxiliary sys@standby
RMAN> duplicate target database for standby from active database dorecover NOFILENAMECHECK; 
And here is my point - I have specified NOFILENAMECHECK so Oracle allow me to keep same disk layout on production and standby database and will no claim that datafiles have the same names on both boxes. So far so good - duplication process has been finished without any issues.
I have started standby database and there is where I my problem begun.
Here are an example entry from alert.log
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Clearing online redo logfile 1 /redo1/dbs0/redo11.log
Clearing online log 1 of thread 1 sequence number 2357
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Errors in file /logs/diag/rdbms/dbs0_a/dbs0/trace/dbs0_mrp0_23247.trc:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/redo1/dbs0/redo11.log'
Clearing online redo logfile 1 complete
Oracle is trying to open redo log file which doesn't exist on standby machine - this is a first error and of course expected one. Than Oracle is trying to clear (in that case recreate) redo log and it failed with error ORA-19527 and of course at the end it is informing DBA that clearing online redo log 1 has been completed.
Unfortunately this files has not been created and this error appear every time standby database has been restarted or recovery process has been restarted. The main issue is ORA-19527 "physical standby redo log must be renamed" even if in that case this is not a standby redo but online redo log. Simplest solution like drop and add a new groups of redo logs failed as Oracle is claiming that all those unexisting files will be used to database recovery. Hmmm - i like that. This is what Oracle support answer for that problem (783113.1) "These are not bug's, they are known limitations" and proposed solution is to set LOG_FILE_NAME_CONVERT to convert old redo log name to new one - even if both are same. As a part of explanation other note (352879.1) has following sentence "It is the equivalent of asking - Are you sure you want the logs to be called this....".
I can agree with that and this can be some protection to prevent overwriting of existing redo logs but why this same error happen if
  • I have specified NOFILENAME checking (ok I could do it by mistake) 
  • There were not online redo logs files nor standby logs files on my new box so I can't imagine how Oracle could overwritten one of those files
This is KNOW limitation so why RMAN is not forcing user to specify LOG_FILE_NAME_CONVERT or not checking if this parameter is set in new instance ?
regards,
Marcin