Latest version of Oracle 11g R2 has new Recovery Manager functionality - database duplication from backup without connectivity to target database. During tests I have found unexpected behavior of that functionality. I'm still not sure if it is bug or RMAN is doing it by design but if backup doesn't contain all files and SKIP TABLESPACE is not included, duplication process can overwrite target database files. When database is using ASM it can happen regardless of database state. Due to file locking mechanism this issue occur on file system only when target database is down.
All has been performed using Oracle version 11.2.0.2 on Red Hat 5.3 running target and auxiliary database on one server using file system. I have tested it with ASM as well but log files are gone together with my VM machine.
Target database is down, Backup has been taken without one tablespace,
no SKIP TABLESPACE clause.
Duplication script:
What happen to target database ? The following error will be found in target alert.log file
Target database is running, backup has been taken without one table space,
no SKIP TABLESPACE clause.
Duplication process is starting in similar way to previous scenario but instead of overwriting target file the following error is reporting during recovery process.
Target database is down, Backup has been taken without one tablespace,
SKIP TABLESPACE clause has been setup
This time backup still doesn't have all files but SKIP_TABLESPACE has been added to script. Recovery manager is working now without any issues
This exercise is showing that all new features should be tested in safe environment and in addition to that every operation on production database has to be well tested. I can't blame only Recovery Manager here but it would nice to have more checks in place (ex. if not all files are in backup just cancel duplication after control file has been restored). This is a perfect example of human error (at least someone created that scripts and run it) which can in complex environment.
regards,
Marcin
All has been performed using Oracle version 11.2.0.2 on Red Hat 5.3 running target and auxiliary database on one server using file system. I have tested it with ASM as well but log files are gone together with my VM machine.
Target database is down, Backup has been taken without one tablespace,
no SKIP TABLESPACE clause.
Duplication script:
Let's take a look on output - unimportant lines have been skippedrun { set newname for tempefile 1 to '/opt/app/oracle/oradata/clon/temp01.dbf'; duplicate target database to clon SPFILE PARAMETER_VALUE_CONVERT '/opt/app/oracle/admin/testdb/', '/opt/app/oracle/admin/clon/' SET SGA_MAX_SIZE '500M' SET CONTROL_FILES '/opt/app/oracle/oradata/clon/control01.ctl','/opt/app/oracle/oradata/clon/control02.ctl' SET SGA_TARGET '450M' SET CLUSTER_DATABASE 'false' SET db_create_file_dest '/opt/app/oracle/oradata/clon/' SET db_create_online_log_dest_1 '/opt/app/oracle/oradata/clon/' SET LOG_FILE_NAME_CONVERT '/opt/app/oracle/oradata/testdb/','/opt/app/oracle/oradata/clon/' SET DB_FILE_NAME_CONVERT '/opt/app/oracle/oradata/testdb/','/opt/app/oracle/oradata/clon/' BACKUP LOCATION '/opt/app/oracle/backup/'; }
So far so good - new name has been set for all files. What is nextRecovery Manager: Release 11.2.0.2.0 - Production on Tue Jun 21 07:17:34 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. connected to auxiliary database: CLON (not mounted) Starting Duplicate Db at 21-JUN-11 ... Starting restore at 21-JUN-11 allocated channel: ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: SID=171 device type=DISK channel ORA_AUX_DISK_1: restoring control file channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03 output file name=/opt/app/oracle/oradata/clon/control01.ctl output file name=/opt/app/oracle/oradata/clon/control02.ctl Finished restore at 21-JUN-11 database mounted released channel: ORA_AUX_DISK_1 allocated channel: ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: SID=171 device type=DISK contents of Memory Script: { set until scn 95159515; set newname for datafile 1 to "/opt/app/oracle/oradata/clon/system-01.dbf"; set newname for datafile 2 to "/opt/app/oracle/oradata/clon/undo_t1-01.dbf"; set newname for datafile 3 to "/opt/app/oracle/oradata/clon/sysaux-01.dbf"; set newname for datafile 4 to "/opt/app/oracle/oradata/clon/administrator-01.dbf"; set newname for datafile 5 to "/opt/app/oracle/oradata/clon/administrator_idx-01.dbf"; set newname for datafile 6 to "/opt/app/oracle/oradata/clon/replication-01.dbf"; set newname for datafile 7 to "/opt/app/oracle/oradata/clon/replication_idx-01.dbf"; set newname for datafile 8 to "/opt/app/oracle/oradata/clon/soe.dbf"; set newname for datafile 9 to "/opt/app/oracle/oradata/clon/soeindex.dbf"; set newname for datafile 10 to "/opt/app/oracle/oradata/clon/users-01.dbf"; restore clone database ; } executing Memory Script executing command: SET until clause executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME
All files from backupset have been restored and Oracle realize that one file is missing. Keep in mind that new name has been prepared for all files.file 9 is excluded from whole database backup channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set channel ORA_AUX_DISK_1: restoring datafile 00001 to /opt/app/oracle/oradata/clon/system-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00002 to /opt/app/oracle/oradata/clon/undo_t1-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00003 to /opt/app/oracle/oradata/clon/sysaux-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00004 to /opt/app/oracle/oradata/clon/administrator-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00005 to /opt/app/oracle/oradata/clon/administrator_idx-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00006 to /opt/app/oracle/oradata/clon/replication-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00007 to /opt/app/oracle/oradata/clon/replication_idx-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00008 to /opt/app/oracle/oradata/clon/soe.dbf channel ORA_AUX_DISK_1: restoring datafile 00010 to /opt/app/oracle/oradata/clon/users-01.dbf channel ORA_AUX_DISK_1: reading from backup piece /opt/app/oracle/backup/bck_13_testdb_1 channel ORA_AUX_DISK_1: piece handle=/opt/app/oracle/backup/bck_13_testdb_1 tag=TAG20110621T064346 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:06:36 failover to previous backup Finished restore at 21-JUN-11
This switch command is starting this issue - missing file hasn't been switch to new locationcontents of Memory Script: { switch clone datafile all; } executing Memory Script datafile 1 switched to datafile copy input datafile copy RECID=10 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/system-01.dbf datafile 2 switched to datafile copy input datafile copy RECID=11 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/undo_t1-01.dbf datafile 3 switched to datafile copy input datafile copy RECID=12 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/sysaux-01.dbf datafile 4 switched to datafile copy input datafile copy RECID=13 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/administrator-01.dbf datafile 5 switched to datafile copy input datafile copy RECID=14 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/administrator_idx-01.dbf datafile 6 switched to datafile copy input datafile copy RECID=15 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/replication-01.dbf datafile 7 switched to datafile copy input datafile copy RECID=16 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/replication_idx-01.dbf datafile 8 switched to datafile copy input datafile copy RECID=17 STAMP=754384699 file name=/opt/app/oracle/oradata/clon/soe.dbf datafile 10 switched to datafile copy input datafile copy RECID=18 STAMP=754384700 file name=/opt/app/oracle/oradata/clon/users-01.dbf
The worst part is here - RMAN was trying to apply logs to other database using entries from control file - the TARGET database entries. DBID at that time for target and clone database are still the same and target database was down so there are no locks. Nothing could prevent this process to overwrite target database file header.contents of Memory Script: { set until scn 95159515; recover clone database delete archivelog ; } executing Memory Script executing command: SET until clause Starting recover at 21-JUN-11 using channel ORA_AUX_DISK_1 starting media recovery channel ORA_AUX_DISK_1: starting archived log restore to default destination channel ORA_AUX_DISK_1: restoring archived log archived log thread=1 sequence=1024 channel ORA_AUX_DISK_1: restoring archived log archived log thread=1 sequence=1025 channel ORA_AUX_DISK_1: reading from backup piece /opt/app/oracle/backup/bck_arch_16_testdb_1 channel ORA_AUX_DISK_1: piece handle=/opt/app/oracle/backup/bck_arch_16_testdb_1 tag=TAG20110621T070623 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:03 archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1024_700kfxcx_.arc thread=1 sequence=1024 channel clone_default: deleting archived log(s) archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1024_700kfxcx_.arc RECID=2 STAMP=754384704 archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1025_700kfxg6_.arc thread=1 sequence=1025 Oracle Error: ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below ORA-01152: file 9 was not restored from a sufficiently old backup ORA-01110: data file 9: '/opt/app/oracle/oradata/testdb/soeindex.dbf'
What happen to target database ? The following error will be found in target alert.log file
File in target database has been corrupted during duplication process and it has to be restored from backup or dropped.Read of datafile '/opt/app/oracle/oradata/testdb/soeindex.dbf' (fno 9) header failed with ORA-01206 Rereading datafile 9 header failed with ORA-01206 Errors in file /opt/app/oracle/diag/rdbms/testdb/testdb/trace/testdb_dbw0_4984.trc: ORA-01186: file 9 failed verification tests ORA-01122: database file 9 failed verification check
Target database is running, backup has been taken without one table space,
no SKIP TABLESPACE clause.
Duplication process is starting in similar way to previous scenario but instead of overwriting target file the following error is reporting during recovery process.
This time duplication process failed but target database wasn't touched.RMAN-11001: Oracle Error: ORA-00283: recovery session canceled due to errors ORA-01110: data file 9: '/opt/app/oracle/oradata/testdb/soeindex.dbf' ORA-01157: cannot identify/lock data file 9 - see DBWR trace file ORA-01110: data file 9: '/opt/app/oracle/oradata/testdb/soeindex.dbf'
Target database is down, Backup has been taken without one tablespace,
SKIP TABLESPACE clause has been setup
This time backup still doesn't have all files but SKIP_TABLESPACE has been added to script. Recovery manager is working now without any issues
Additional step has been run before recover process - file from skipped table space has been dropped. In that case recovery process should touch it anymore from any location.Recovery Manager: Release 11.2.0.2.0 - Production on Tue Jun 21 09:32:21 2011 Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved. connected to auxiliary database: CLON (not mounted) RMAN> run { 2> set newname for tempfile 1 to '/opt/app/oracle/oradata/clon/temp01.dbf'; 3> 4> duplicate target database to clon 5> SPFILE 6> PARAMETER_VALUE_CONVERT '/opt/app/oracle/admin/testdb/', 7> '/opt/app/oracle/admin/clon/' 8> SET SGA_MAX_SIZE '500M' 9> SET CONTROL_FILES '/opt/app/oracle/oradata/clon/control01.ctl','/opt/app/oracle/oradata/clon/control02.ctl' 10> SET SGA_TARGET '450M' 11> SET CLUSTER_DATABASE 'false' 12> SET db_create_file_dest '/opt/app/oracle/oradata/clon/' 13> SET db_create_online_log_dest_1 '/opt/app/oracle/oradata/clon/' 14> SET LOG_FILE_NAME_CONVERT '/opt/app/oracle/oradata/testdb/','/opt/app/oracle/oradata/clon/' 15> SET DB_FILE_NAME_CONVERT '/opt/app/oracle/oradata/testdb/','/opt/app/oracle/oradata/clon/' 16> SKIP TABLESPACE 'SOEINDEX' 17> UNDO TABLESPACE 'UNDO_T1' 18> BACKUP LOCATION '/opt/app/oracle/backup/'; 19> } 20> executing command: SET NEWNAME Starting Duplicate Db at 21-JUN-11 ... allocated channel: ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: SID=171 device type=DISK Not connected to TARGET or TARGET not open, cannot verify that subset of tablespaces is self-contained Not connected to TARGET, cannot verify that set of tablespaces being duplicated does not have SYS objects Datafile 9 skipped by request contents of Memory Script: { set until scn 95159515; set newname for datafile 1 to "/opt/app/oracle/oradata/clon/system-01.dbf"; set newname for datafile 2 to "/opt/app/oracle/oradata/clon/undo_t1-01.dbf"; set newname for datafile 3 to "/opt/app/oracle/oradata/clon/sysaux-01.dbf"; set newname for datafile 4 to "/opt/app/oracle/oradata/clon/administrator-01.dbf"; set newname for datafile 5 to "/opt/app/oracle/oradata/clon/administrator_idx-01.dbf"; set newname for datafile 6 to "/opt/app/oracle/oradata/clon/replication-01.dbf"; set newname for datafile 7 to "/opt/app/oracle/oradata/clon/replication_idx-01.dbf"; set newname for datafile 8 to "/opt/app/oracle/oradata/clon/soe.dbf"; set newname for datafile 10 to "/opt/app/oracle/oradata/clon/users-01.dbf"; restore clone database skip forever tablespace "SOEINDEX" ; } executing Memory Script executing command: SET until clause executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME executing command: SET NEWNAME Starting restore at 21-JUN-11 using channel ORA_AUX_DISK_1 channel ORA_AUX_DISK_1: starting datafile backup set restore channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup set channel ORA_AUX_DISK_1: restoring datafile 00001 to /opt/app/oracle/oradata/clon/system-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00002 to /opt/app/oracle/oradata/clon/undo_t1-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00003 to /opt/app/oracle/oradata/clon/sysaux-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00004 to /opt/app/oracle/oradata/clon/administrator-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00005 to /opt/app/oracle/oradata/clon/administrator_idx-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00006 to /opt/app/oracle/oradata/clon/replication-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00007 to /opt/app/oracle/oradata/clon/replication_idx-01.dbf channel ORA_AUX_DISK_1: restoring datafile 00008 to /opt/app/oracle/oradata/clon/soe.dbf channel ORA_AUX_DISK_1: restoring datafile 00010 to /opt/app/oracle/oradata/clon/users-01.dbf channel ORA_AUX_DISK_1: reading from backup piece /opt/app/oracle/backup/bck_13_testdb_1 channel ORA_AUX_DISK_1: piece handle=/opt/app/oracle/backup/bck_13_testdb_1 tag=TAG20110621T064346 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:06:25 Finished restore at 21-JUN-11 contents of Memory Script: { switch clone datafile all; } executing Memory Script datafile 1 switched to datafile copy input datafile copy RECID=10 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/system-01.dbf datafile 2 switched to datafile copy input datafile copy RECID=11 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/undo_t1-01.dbf datafile 3 switched to datafile copy input datafile copy RECID=12 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/sysaux-01.dbf datafile 4 switched to datafile copy input datafile copy RECID=13 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/administrator-01.dbf datafile 5 switched to datafile copy input datafile copy RECID=14 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/administrator_idx-01.dbf datafile 6 switched to datafile copy input datafile copy RECID=15 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/replication-01.dbf datafile 7 switched to datafile copy input datafile copy RECID=16 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/replication_idx-01.dbf datafile 8 switched to datafile copy input datafile copy RECID=17 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/soe.dbf datafile 10 switched to datafile copy input datafile copy RECID=18 STAMP=754393152 file name=/opt/app/oracle/oradata/clon/users-01.dbf contents of Memory Script: { set until scn 95159515; recover clone database skip forever tablespace "SOEINDEX" delete archivelog ; } executing Memory Script executing command: SET until clause Starting recover at 21-JUN-11 using channel ORA_AUX_DISK_1 Executing: alter database datafile 9 offline drop starting media recovery channel ORA_AUX_DISK_1: starting archived log restore to default destination channel ORA_AUX_DISK_1: restoring archived log archived log thread=1 sequence=1024 channel ORA_AUX_DISK_1: restoring archived log archived log thread=1 sequence=1025 channel ORA_AUX_DISK_1: reading from backup piece /opt/app/oracle/backup/bck_arch_16_testdb_1 channel ORA_AUX_DISK_1: piece handle=/opt/app/oracle/backup/bck_arch_16_testdb_1 tag=TAG20110621T070623 channel ORA_AUX_DISK_1: restored backup piece 1 channel ORA_AUX_DISK_1: restore complete, elapsed time: 00:00:07 archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1024_700sp3lj_.arc thread=1 sequence=1024 channel clone_default: deleting archived log(s) archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1024_700sp3lj_.arc RECID=2 STAMP=754393158 archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1025_700sp3p1_.arc thread=1 sequence=1025 channel clone_default: deleting archived log(s) archived log file name=/opt/app/oracle/fra/CLON/archivelog/2011_06_21/o1_mf_1_1025_700sp3p1_.arc RECID=1 STAMP=754393155 media recovery complete, elapsed time: 00:00:04 Finished recover at 21-JUN-11 Oracle instance started
This exercise is showing that all new features should be tested in safe environment and in addition to that every operation on production database has to be well tested. I can't blame only Recovery Manager here but it would nice to have more checks in place (ex. if not all files are in backup just cancel duplication after control file has been restored). This is a perfect example of human error (at least someone created that scripts and run it) which can in complex environment.
regards,
Marcin
0 comments:
Post a Comment