Oracle Performance and Backup Blog: Oracle RAC One Node

In this post I want to share with you some scenarios about RAC One Node and client behaviour. In that part I’m using SQL*Plus from 10g so there is no FAN notification, next part will be with FAN enabled client. The main difference between using and not using FAN is that FAN client has a feedback from Oracle Cluster (formally Grid Infrastructure in 11R2 or Oracle ClusterWare in previous releases) and has information about current cluster status. Non FAN client has to wait for TCP/IP time-out for current session before switching to other node. More information about client failover and cluster configuration can be found here.

My server and client configuration has been presented in part 1 (entry from TNSNAMES.ORA) so this is only short recall:

Two nodes – RAC1 and RAC2 with SCAN configured
SQL*Plus 10g Client

I want to present 3 scenarios:

One node in cluster is running

Two nodes are running – current instance is crashing

Active query

Non active session

Fixing configuration

One node in cluster is running
Due to failure or maintenance only one node in cluster is running. RAC One node service is starting on that node and it is opened for all sessions. Irrespective of which node was current previously instance is open on node which is running now. Current status can be checked via raconestatus


[oracle@rac1 ~]$ raconestatus

RAC One Node databases on this cluster:


Database UP Fix Required        Current Server                    Candidate Server Names
======== == ============ ==============================  ========================================
testone  Y       N                                rac1                                 rac1 rac2                             


Available Free Servers:

As we can see there is no available server to move our instance on. If we start Omotion process we will hit error.


[oracle@rac1 ~]$ Omotion

RAC One Node databases on this cluster:

#      Database                    Server                      Fix Required
===     ========        ==============================          ============
[1]      testone                                  rac1                N

Enter number of the database to migrate [1]:

Specify maximum time in minutes for migration to complete (max 30) [30]: 2

Available Target Server(s) :
#            Server            Available
===     ==================      =========
ERROR: Cannot find any available servers in this cluster.

Exiting...

[oracle@rac1 ~]$

In case of shutdown node RAC1 all session will be terminated as in single node configuration. Whatever server will be started after both nodes are down will become a current one.

Two nodes are running – current instance is crashing
Both servers are up – instance is running on RAC1 server. Current cluster configuration can be checked via crsctl command


[oracle@rac1 grid]$ $GRID_HOME/bin/crsctl status resource -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
        ONLINE  ONLINE       rac1
        ONLINE  ONLINE       rac2
ora.LISTENER.lsnr
        ONLINE  ONLINE       rac1
        ONLINE  ONLINE       rac2
ora.asm
        ONLINE  ONLINE       rac1                     Started
        ONLINE  ONLINE       rac2                     Started
ora.eons
        ONLINE  ONLINE       rac1
        ONLINE  ONLINE       rac2
ora.gsd
        OFFLINE OFFLINE      rac1
        OFFLINE OFFLINE      rac2
ora.net1.network
        ONLINE  ONLINE       rac1
        ONLINE  ONLINE       rac2
ora.ons
        ONLINE  ONLINE       rac1
        ONLINE  ONLINE       rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1        ONLINE  ONLINE       rac1
ora.oc4j
1        OFFLINE OFFLINE
ora.rac1.vip
1        ONLINE  ONLINE       rac1
ora.rac2.vip
1        ONLINE  ONLINE       rac2
ora.scan1.vip
1        ONLINE  ONLINE       rac1
ora.testone.db
1        ONLINE  ONLINE       rac1                     Open
ora.testone.serviceone.svc
1        ONLINE  ONLINE       rac1
[oracle@rac2 grid]$

Session is connected to RAC1 server


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select * from test;

 ID
----------
  1
  2
  3
  4
  5
  6

6 rows selected.

SQL> insert into test values (7);

1 row created.

SQL> commit;

Commit complete.

SQL> insert into test values (8);

1 row created.

SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME              HOST_NAME
----------------           ----------------------------------------------------------------
testone_2                  rac1

SQL>

Active Query
Now first scenario when I’m repeating a select command in SQL*Plus and in that same time I’m turning off RAC1 server. In that case a active statement is cancelled and client is trying to reconnect at that time.


SQL> select to_char(sysdate,'hh24:mi') from dual;

TO_CH
-----
09:55

SQL> select to_char(sysdate,'hh24:mi') from dual;
select to_char(sysdate,'hh24:mi') from dual
*
ERROR at line 1:
ORA-03135: connection lost contact


SQL> select to_char(sysdate,'hh24:mi') from dual;
ERROR:
ORA-03114: not connected to ORACLE

As you can see command has been cancelled and session has been terminated. There is no other instance up and running yet to there is no target to failover a session. In client session network trace file you can find following order of errors:

ORA -12547 – TNS Lost Contact

ORA- 12151 – Bad packages

ORA-03115 – connection lost contact

After that client is going to reconnect session to other nodes like in normal RAC configuration. Client is calling SCAN address to get another VIP address but cluster is still in that same state – a reconfiguration has been just started – and Oracle Cluster is returning a rac1-vip once again as it is only one VIP configured for database service.
Keep in mind that current, not committed transaction in that case will be rollback.

Idle session
When a session is in an idle state a failover will take place when a first SQL query will be executed. See example


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options

SQL> select to_char(sysdate,'hh24:mi') from dual;

TO_CH
-----
15:30


SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME            HOST_NAME
----------------         ----------------------------------------------------------------
testone_2                rac1


SQL> select to_char(sysdate,'hh24:mi') from dual;

TO_CH
-----
15:38

SQL>
SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME            HOST_NAME
----------------         ----------------------------------------------------------------
testone_1                rac2

SQL>

As you can see a connection has been migrated between servers without any error. This is because a before a last query a new instance was up and running. If we take a look into client network trace file we will see following errors

ORA- 12151
ORA- 3113

After that client is going to reconnect session to other nodes. Client is calling SCAN address to get another VIP address and cluster in returning a new VIP (rac2-vip) as a database service has been successfully migrated to other node.
Keep in mind that current, not committed transaction has to be rollback after session migration.

Fixing configuration
After unexpected crash it is possible that RAC One Node configuration need to be fixed. I have figure out two possible scenarios:

After node crash and migration of instance and service the crashed node has been rebooted and there are two running instances


[oracle@rac1 grid]$ raconestatus

RAC One Node databases on this cluster:


Database UP Fix Required        Current Server                    Candidate Server Names
======== == ============ ==============================  ========================================
testone  Y       Y                           rac1 rac2                                 rac1 rac2


Available Free Servers:

[oracle@rac1 grid]$

Fixing of configuration will close one of instance (on rebooted node) and will clean a configuration.


[oracle@rac1 rac1]$ raconefix

RAC One Node databases on this cluster:

#      Database                    Server                      Fix Required
===     ========        ==============================          ============
[1]      testone                                  rac1                Y

Enter number of the database to fix [1]:

If database is up, it will be checked and cleaned after a previous fail over.

Very similar situation like in previous point but only one instance is working and there is a “mess” in RAC service configuration. Oracle RAC One Node fix will clean up a configuration.

At the end I want to mention that I have hit some strange behavior related to migration of instance between servers and starting and stopping servers. I’m not sure if it a bug or not but after whole cluster restart instances have been migrated across.
First configuration was

RAC1 – instance name : testone_2
RAC2 – instance name : testone_1

Yes I know it was my inconsequence in naming so Oracle decided to fix it. No, not at the beginning but after a week of testing and restarts. And now it looks like:

RAC1 – instance name : testone_1
RAC2 – instance name : testone_2

Unfortunately after that I was unable to play with RAC One Node and I have to clear a whole configuration and create it from scratch.

Next part will be about FAN enabled clients and more about transactions.

1 comments:

Vladimir said...: This is an excellent article.
Still, if you want something more hands-on, try these:
http://vgrigorian.com/11gsimulator/1_rac11gr2.htm
http://vgrigorian.com/11gsimulator/2_rac11gr2rdbms1.htm
http://vgrigorian.com/11gsimulator/3_rac11gasm.htm
http://vgrigorian.com/11gsimulator/4_11gr2dbcreate.htm

You can find more demos (including dataguard, goldengate, streams) there at http://vgrigorian.com/

Thanks.
Vladimir Grigorian; May 16, 2011 at 6:04 AM

Monday, December 14, 2009

Oracle RAC One Node – part 2

1 comments:

Subscribe Now: Feed Icon

Search This Blog

Author

Blog Archive

Label Cloud

My Blog List - Alphabetically ordered

Counter