Wednesday, February 4, 2009

RMAN - random errors for years


I have been working with RMAN for 8 years and I still wonder why some RMAN errors seem to be taken from /dev/random ;)

Last example:

Environment: Linux 32 bit - Oracle 10g 10.2.0.4 on ASM

Performed steps:
  1. Drop the existing test DB from ASM - using drop database
  2. Copy the backup from the production server to the test server
  3. Restore the controlfile from the new location
  4. Mount the database
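
Steps 3 and 4 might look roughly like the sketch below. The backup piece path is hypothetical (the original post does not give it), and the instance is assumed to be started in NOMOUNT mode first.

```
# Sketch only: '/backup/prod/controlfile_piece.bkp' is a made-up path
RMAN> startup nomount;
RMAN> restore controlfile from '/backup/prod/controlfile_piece.bkp';
RMAN> alter database mount;
```
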
After that I wanted to restore the database, so I cataloged all the necessary backup pieces
in the controlfile and checked them using the list backupset command.
There was one backupset containing all the datafiles, with a correct status. So it should be simple; let's try the restore:

RMAN> restore database;

creating datafile No=1 name=+DATA/oracle/orcl/datafile/o1_mf_system_3n5w1nky_.dbf
released channel: t1
released channel: t2
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 07/04/2008 10:54:29
ORA-01180: can not create datafile 1
ORA-01110: data file 1: '+DATA/dataprd/ORCL/datafile/o1_mf_system_3n5w1nky_.dbf'

Yeah, nice error.
There are some notes on Metalink that relate this error to duplicated incarnations and a corrupted controlfile.
(BTW, one suggested solution is to recreate the controlfile from the command line before restoring the datafiles. Is it even possible to recreate a controlfile without datafiles???)

Anyway, that was not my case.

The solution is very simple. I found out that during the catalog phase RMAN scans the existing flash recovery area, and there it picked up a backupset with archive logs from the previous (dropped) database.
Because those archive logs belonged to a different incarnation, the incarnation of my new database was changed too. And now we get strange behaviour from RMAN.

RMAN> list backupset;

still displays valid backups for that incarnation

RMAN> restore database;

raises the error (see above)


RMAN> reset database to incarnation xxx;

where xxx is the previous incarnation of the database, fixes the problem.
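
As a rough sketch, the whole check-and-fix sequence could look like this. The incarnation key stays as the placeholder xxx from above; the real value has to be read from the list incarnation output, and the database is assumed to still be mounted.

```
# Sketch only: replace xxx with the key of the previous incarnation
RMAN> list incarnation of database;
RMAN> reset database to incarnation xxx;
RMAN> restore database;
```

Running list incarnation first shows which incarnation is marked as current, which is the hint that the error message itself does not give.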

I can understand that Oracle might use a backup from a previous incarnation in the new one (but why?),
but why is there such a stupid error about datafile number 1?

Is it impossible to display something more useful, like "there is no backup for that incarnation"?

All the databases have the same DBID - they are clones.
I know it is a bad idea to keep one DBID for many databases, but I thought that with the RMAN catalog there would be no issue.



Unknown said...

Yep, this is exactly what I found out - the same stupid error, but after resetting the incarnation, the restore started nicely.