Sunday, July 31, 2011

Unable to create database on 11g release 2 grid: SCAN IP was not running on either node

After successfully installing the Oracle 11g release 2 patchset 1 grid and database software, I tried to create a new 11g RAC database using DBCA, but it stopped with the error 'database creation requires scan IP to be running, can not create database'. On further investigation I found that the SCAN IP had not been configured in my grid infrastructure by default, so I followed the sequence of commands below to create and start the SCAN.

On node1
========
srvctl add vip -n testnode01 -k 1 -A testnode01-vip/255.255.255.0/en0

srvctl add scan -n <scan-name>
(in my case the scan name is 'testnode-scan')

srvctl add scan_listener

srvctl start vip -n testnode01
srvctl start scan
srvctl start scan_listener

On node2
========
srvctl add vip -n testnode02 -k 1 -A testnode02-vip/255.255.255.0/en0

srvctl start vip -n testnode02
srvctl start scan
srvctl start scan_listener


After completing the above steps I was able to create a new database.
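
Before going back to DBCA it is worth confirming that the SCAN and its listener really came up. The sketch below is one way to do that from either node. It only uses 'srvctl config scan', 'srvctl status scan' and 'srvctl status scan_listener'; the function name check_scan is made up for illustration and is not part of any Oracle tool.

```shell
# Hypothetical helper: show the SCAN configuration and status.
# Stops at the first srvctl command that exits non-zero.
check_scan() {
    if ! command -v srvctl >/dev/null 2>&1; then
        echo "srvctl not found in PATH; set the grid environment first" >&2
        return 1
    fi
    srvctl config scan &&
    srvctl status scan &&
    srvctl status scan_listener
}
```

Run 'check_scan' as the grid software owner. If any of the three commands reports a problem, fix that before retrying the database creation.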

Regards,

Farrukh Salman.

Sunday, July 17, 2011

Unable to connect ASM instance using sqlplus/asmcmd at 11gR2 'Connected To An Idle Instance'

Just after installing 11gR2 patch set 1, I checked all of my cluster services and everything was up and running. Then I tried to connect to the ASM instance, which is part of the grid infrastructure from 11gR2 onward, and got the error 'Connected To An Idle Instance'. I checked all the services and processes related to ASM, i.e. ora.asm, pmon, smon etc., and they were all up and running. Then I found an Oracle support document saying I had to remove the trailing forward slash '/' from the ORACLE_HOME environment variable setting in my .profile file.

Incorrect value:
    ORACLE_HOME=/u01/app/11.2.0/grid/

Correct value:
    ORACLE_HOME=/u01/app/11.2.0/grid

After this I could connect to the ASM instance properly :).
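
To guard against the trailing slash creeping back into .profile, the value can be normalised before it is exported. This is only a minimal sketch: the function name strip_trailing_slash is hypothetical, and the path is the one from this post.

```shell
# Hypothetical helper: strip any trailing slashes from a path.
# A bare "/" is left untouched.
strip_trailing_slash() {
    path=$1
    while [ "$path" != "/" ] && [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s\n' "$path"
}

# Normalise the value before exporting it (path taken from this post).
ORACLE_HOME=$(strip_trailing_slash "/u01/app/11.2.0/grid/")
export ORACLE_HOME
echo "$ORACLE_HOME"
```

It uses only POSIX parameter expansion, so it should behave the same in ksh on AIX and in bash on Linux.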

Regards,

Farrukh Salman,

Reference Link:

https://support.oracle.com/CSP/ui/flash.html#tab=KBHome(page=KBHome&id=()),(page=KBNavigator&id=(viewingMode=1143&bmDocID=1179825.1&bmDocDsrc=KB&bmDocType=PROBLEM&bmDocTitle=Unable%20To%20Connect%20To%20ASM%20Due%20To%20SQL*Plus%20Shows%20%E2%80%9CConnected%20To%20An%20Idle%20Instance.&from=BOOKMARK))

root.sh failed at first node when installing 11gr2

Hello buddies,

I finally got Oracle 11gR2 patchset 1 installed on AIX 6.1. This one gave me a hard time and I was stuck on one issue for a long time, so I am sharing it with you guys.

During execution of root.sh on the first node, the configuration failed with the error below:

Start of resource "ora.cluster_interconnect.haip" failed
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'node1'
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'node1' failed
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'node1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'mpdwsr01' succeeded
CRS-4000: Command Start failed, or completed with errors.
Failed to start Oracle Clusterware stack
Failed to start High Availability IP at /oracle/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1047.
/oracle/app/11.2.0/grid/perl/bin/perl -I/oracle/app/11.2.0/grid/perl/lib -I/oracle/app/11.2.0/grid/crs/install /oracle/app/11.2.0/grid/crs/install/rootcrs.pl execution failed


On checking crsd.log I found the following entry:

2011-07-12 10:07:22.751: [GIPCXCPT][1543] gipchaInternalResolve: failed to resolve ret gipcretKeyNotFound (36), host 'ouptdbb01', port '14be-277c-83fd-56f3', hctx 110ea2210 [0000000000000010] { gipchaContext : host 'ouptdbb01', name 'a83b-2d41-7a3d-b0b6', luid '56bbeab0-00000000', numNode 0, numInf 1, usrFlags 0x0, flags 0x1 }, ret gipcretKeyNotFound (36)
Oracle support says it is an undocumented bug, 9593552, which is fixed in 11.2.0.2 PSU3, 11.2.0.3 and above. Unfortunately, at the time of this blog post I could not find the interim patch 9593552, which Oracle says is the right solution.

On exploring this error further, I saw abnormal behaviour on my private interconnect ethernet port: it was going down during root.sh execution, which was causing ora.cluster_interconnect.haip to fail to start. I tried to start the clusterware manually using the command 'crsctl start cluster', and the good thing is that ora.cluster_interconnect.haip came up without any issues. I then executed root.sh on the second node and completed my 11gR2 patchset 1 grid installation.

All services are up and running on both nodes.
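
A quick way to confirm that the HAIP resource really is online after the manual start is to query it with 'crsctl stat res'. The sketch below makes two assumptions: the function name haip_is_online is made up for illustration, and the output is expected to contain a line starting with 'STATE=ONLINE'. Check the output on your own release before relying on it.

```shell
# Hypothetical helper: succeed if the HAIP resource reports STATE=ONLINE.
# HAIP is started by the lower (ohasd) stack, hence the -init flag.
# Assumption: 'crsctl stat res' prints a line like "STATE=ONLINE on <node>".
haip_is_online() {
    crsctl stat res ora.cluster_interconnect.haip -init 2>/dev/null |
        grep -q '^STATE=ONLINE'
}

# Usage, on each node:
#   haip_is_online && echo "HAIP is online" || echo "HAIP is NOT online"
```

If crsctl is not in the PATH the function simply returns non-zero, so it is safe to call from a wrapper script.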

Regards.

Farrukh Salman,
Oracle DBA.

Update kernel version of RHEL 5 which has Oracle 10g RAC installed

My client was facing a problem of limited memory utilisation, i.e. a maximum of 2GB was being used while the machine actually had 16GB. The root cause was that Redhat Linux 5.1 was using kernel version 2.6.18.128, which supports a maximum of 3GB of memory (RAM). The way to fix this was to replace the existing kernel with the 2.6.18.128PAE kernel. However, Oracle 10g Real Application Clusters was already installed on this machine, which meant it was not just a simple kernel update: RAC also had to be kept intact and come back up afterwards. Below are the steps to upgrade the kernel so that Oracle RAC keeps running after the kernel update.

1) The ASM library (ASMLib) and Oracle cluster file system packages should have a version compatible with the new kernel version, i.e. something similar to asmlib-2.6.18.128PAE.****.rpm and ocfs-2.6.18.128PAE.rpm
2) Install the above mentioned RPMs while the system is running on the old kernel.
3) Install the new kernel version, kernel-2.6.18.128PAE, while the system is running on the old kernel.
4) Reboot the machine with the new PAE kernel.
5) Uninstall the RDAC library for the old kernel version and install a fresh RDAC specific to the new PAE kernel version.
6) Reboot the machine.
7) All CRS services will be up and running.
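
After the reboot in step 4, it is worth checking that the machine really booted the PAE kernel before touching RDAC. A minimal sketch follows. The function name is_pae_kernel is made up, and it assumes the PAE kernel's release string from 'uname -r' ends in 'PAE' (for example 2.6.18-128.el5PAE).

```shell
# Hypothetical helper: succeed if a kernel release string names a PAE kernel.
# With no argument it checks the running kernel via 'uname -r'.
# Assumption: PAE kernel release strings end in "PAE".
is_pae_kernel() {
    case "${1:-$(uname -r)}" in
        *PAE) return 0 ;;
        *)    return 1 ;;
    esac
}

if is_pae_kernel; then
    echo "running PAE kernel: $(uname -r)"
else
    echo "running non-PAE kernel: $(uname -r)"
fi
```

Once the PAE kernel is confirmed, 'free -m' should show the full amount of RAM instead of the earlier limit.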



Cheers,

Farrukh Salman,
Oracle Certified Professional 10g/11g,
Oracle Database Administrator,
eOman Portal Project, ITA, Muscat, Oman.