Friday, April 2, 2010

Shutdown immediate and background sqlplus process

Hello,

I have been playing with Oracle 11g and memory utilization and I finished myself with strange problem. I have started about 60 background sessions using following command:
$ sqlplus pioro/pioro @a.sql
Script a.sql is a simple one and it is allocating about 8 MB of memory for PL/SQL table but this is story for different post.
After some tests I was going to restart database. First thought was simple - shutdown immediate and that's it.
When I come back with next cup of tea database was still open. Hmmm.
Let's check alter.log

License high water mark = 59
Waiting for dispatcher 'D000' to shutdown
Waiting for shared server 'S000' to die
All dispatchers and shared servers shutdown
Tue Mar 30 09:31:00 2010
SHUTDOWN: Active processes prevent shutdown operation
Tue Mar 30 09:36:01 2010
SHUTDOWN: Active processes prevent shutdown operation
and trace file
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle/app/product/11.1.0/db_1
...
*** 2010-03-30 09:25:58.372
*** SESSION ID:(170.15) 2010-03-30 09:25:58.372
*** CLIENT ID:() 2010-03-30 09:25:58.372
*** SERVICE NAME:(SYS$USERS) 2010-03-30 09:25:58.372
*** MODULE NAME:(sqlplus@piorovm.localdomain (TNS V1-V3)) 2010-03-30 09:25:58.372
*** ACTION NAME:() 2010-03-30 09:25:58.372
...
ksukia: Starting kill, force = 0
ksukia: killed 57 out of 57 processes.

*** 2010-03-30 09:26:03.421
ksukia: Starting kill, force = 0
ksukia: Attempt 1 to re-kill process OS PID=24040.
...
ksukia: Attempt 1 to re-kill process OS PID=23958.
ksukia: Attempt 1 to re-kill process OS PID=23954.
ksukia: Attempt 1 to re-kill process OS PID=23951.
ksukia: Attempt 1 to re-kill process OS PID=23949.
...
Hmm - all this session was idle and there was no transactions at all - so why "shutdown immediate" can't shut down database. Next what has to be done is check of status of processes in Linux
$ ps -ef 
...
oracle    7426 11402  0 09:34 pts/1    00:00:00 sqlplus             @a.sql
oracle    7427  7426  0 09:34 ?        00:00:00 [oracle] <defunct>
...

I just put two lines as example - as you can see oracle process 7427 becomes zombie.
So let's kill it and solve problem

$ kill -9 7427
$ ps -ef
...
oracle    7426 11402  0 09:34 pts/1    00:00:00 sqlplus             @a.sql
oracle    7427  7426  0 09:34 ?        00:00:00 [oracle] <defunct>
...

Still working. I google a little bit and found answer (http://www.linuxsa.org.au/tips/zombies.html)
This is true - you can kill dead process. What you have to do is kill his parent process.
PPID for my process is 7426 and this is a sqlplus process starting my script.

$ kill -9 7426 

and database is going down. (of course I have to kill all sqlplus processes without that one with shutdown immediate command running)

Most funny thing is that is sqlplus process is running in foreground or when is started from cron there is no issue with other sessions
and shutdown immediate is working well.

Example:

$ ps -ef
...
oracle    8831 11402  0 09:40 pts/1    00:00:00 sqlplus             @a.sql
oracle    8832  8831  1 09:40 ?        00:00:00 oraclepioro (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
...

Shutdown immediate executed on other sqlplus session and following result in observed session

SQL> select * from dual;
select * from dual
*
ERROR at line 1:
ORA-03135: connection lost contact
Process ID: 8832
Session ID: 153 Serial number: 5

This is expected result of shutdown immediate command.

Where is a problem ?
If sqlplus is running as background process and it become idle (no active SQL) will be not finished until it will be killed or taken to foreground using fg.
Be aware if you want to shutdown database and you have some sqlplus'es in background – especially in this same Linux session.

regards,
Marcin

2 comments:

Anonymous said...

mostly would due to wrong ORACLE_HOME, wrong oracle binary path

Anonymous said...

Absolutely correct, I had the same problem and late found that I kept many SQLPLUS sessions in background and that caused the problem. Oracle should correct this.