
Oracle RAC instances are composed of the following background processes (a quick way to check which of them are running is sketched after this list):

ACMS (11g) - Atomic Control File to Memory Service
GTX0-j (11g) - Global Transaction Process
LMON - Global Enqueue Service Monitor
LMD - Global Enqueue Service Daemon
LMS - Global Cache Service Process
LCK0 - Instance Enqueue Process
DIAG - Diagnosability Daemon
RMSn - Oracle RAC Management Processes
RSMN - Remote Slave Monitor
DBRM - Database Resource Manager (from 11g R2)
PING - Response Time Agent (from 11g R2)
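
A quick way to verify which of these background processes are alive on a node is to check the OS process list or to query V$BGPROCESS. The sketch below assumes a Linux node and a hypothetical instance SID of ORCL1:

# RAC-related background processes at the OS level (ora_<name>_<SID> naming)
ps -ef | egrep 'ora_(acms|gtx|lmon|lmd|lms|lck|diag|rms|rsmn|dbrm|ping)'

# Ask the instance itself which background processes are running
export ORACLE_SID=ORCL1        # hypothetical SID
sqlplus -s / as sysdba <<'EOF'
SELECT name, description
  FROM v$bgprocess
 WHERE paddr <> '00'           -- only processes that are actually started
 ORDER BY name;
EOF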

Oracle Real Application Clusters New Features

Oracle 9i RAC
OPS (Oracle Parallel Server) was renamed to RAC
CFS (Cluster File System) was supported
OCFS (Oracle Cluster File System) for Linux and Windows
Watchdog timer replaced by the hangcheck timer

Oracle 10g R1 RAC

Cluster Manager replaced by CRS (Cluster Ready Services)
ASM introduced
Concept of Services expanded
ocrcheck introduced
ocrdump introduced (a usage sketch for both OCR utilities follows this list)
AWR was instance specific
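
As a quick illustration of the two new OCR utilities, a typical integrity check and dump would look like the sketch below (run from the CRS home as a privileged user; the dump file name is just an example):

# Verify the integrity, size and location of the Oracle Cluster Registry
ocrcheck

# Dump the OCR contents into a readable text file (the default name is OCRDUMPFILE)
ocrdump /tmp/ocr_contents.txt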

Oracle 10g R2 RAC

CRS was renamed to Oracle Clusterware
asmcmd introduced
CLUVFY (Cluster Verification Utility) introduced (see the sketch after this list)
OCR and voting disks can be mirrored
Can use FAN/FCF with TAF for OCI and ODP.NET
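
For example, cluvfy can verify the nodes before and after a clusterware install stage, and asmcmd gives a command-line view into ASM storage; the node names and the DATA disk group below are assumptions:

# Pre-installation check for CRS on two (hypothetical) nodes
cluvfy stage -pre crsinst -n node1,node2 -verbose

# Post-installation check of the clusterware stack
cluvfy stage -post crsinst -n node1,node2

# Browse ASM from the command line (DATA is an example disk group)
asmcmd lsdg
asmcmd ls +DATA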

Oracle 11g R1 RAC

1. Oracle 11g RAC parallel upgrades - Oracle 11g has rolling upgrade features whereby a
RAC database can be upgraded without any downtime.
2. Hot patching - Zero downtime patch application.

3. Oracle RAC load balancing advisor - Starting from 10g R2 we have the RAC load balancing
advisor utility. The 11g RAC load balancing advisor is only available to clients that use
.NET, ODBC, or the Oracle Call Interface (OCI).

4. ADDM for RAC - Oracle has incorporated RAC into the Automatic Database Diagnostic
Monitor, for cross-node advisories. The addmrpt.sql script reports on a single instance
only and does not cover all instances in the RAC; this is known as instance ADDM. Using
the new DBMS_ADDM package, we can generate a report for all instances of the RAC,
known as database ADDM (a minimal sketch follows this list).

5. Optimized RAC cache fusion protocols - moves on from the general cache fusion
protocols in 10g to deal with specific scenarios where the protocols could be further
optimized.

6. Oracle 11g RAC Grid provisioning - The Oracle grid control provisioning pack allows us to
"blow-out" a RAC node without the time-consuming install, using a pre-installed
"footprint".

Oracle 11g R2 RAC

1. We can store everything on ASM, including the OCR and voting files.
2. ASMCA (ASM Configuration Assistant) introduced.

3. Single Client Access Name (SCAN) - eliminates the need to change the TNS entry when
nodes are added to or removed from the cluster. RAC instances register with the SCAN
listeners as remote listeners. SCAN is a fully qualified name; Oracle recommends
assigning three addresses to it, which creates three SCAN listeners (see the sketch
after this list).
4. AWR is consolidated for the database.

5. 11g Release 2 Real Application Clusters (RAC) has server pooling technologies, so it is
easier to provision and manage database grids. This update is geared toward
dynamically adjusting servers as corporations manage the ebb and flow of data
requirements between data warehousing and applications.

6. By default, LOAD_BALANCE is ON.

7. GSD (Global Services Daemon), gsdctl introduced.

8. GPnP (Grid Plug and Play) profile.

9. Oracle RAC One Node is a new option that makes it easier to consolidate databases that
aren't mission critical but need redundancy.

10. raconeinit - converts a database to RAC One Node.

11. raconefix - fixes a RAC One Node database in case of failure.

12. racone2rac - converts a RAC One Node database back to RAC.

13. Oracle Restart - the feature of Oracle Grid Infrastructure's High Availability Services
(HAS) that manages associated listeners, ASM instances and Oracle database instances.

14. Oracle Omotion - Oracle 11g Release 2 RAC introduces a new feature called Omotion, an
online migration utility. Omotion relocates an instance from one node to another
whenever an instance failure happens.

15. The Omotion utility uses the Database Area Network (DAN) to move Oracle instances.
Database Area Network (DAN) technology enables seamless database relocation without
losing transactions.

16. Cluster Time Synchronization Service (CTSS) is a new feature in Oracle 11g R2 RAC,
used to synchronize time across the nodes of the cluster. CTSS can act as a
replacement for NTP.

17.Grid Naming Service (GNS) is a new service introduced in Oracle RAC 11g R2. With GNS,
Oracle Clusterware (CRS) can manage Dynamic Host Configuration Protocol (DHCP) and
DNS services for the dynamic node registration and configuration.

18. Oracle Local Registry (OLR) - introduced in Oracle 11g R2 as a new part of Oracle
Clusterware. The OLR is a node-local repository, similar to the OCR but local, and is
managed by OHASD. It contains data for the local node only and is not shared among the
other nodes (see ocrcheck -local in the sketch after this list).

19.Multicasting is introduced in 11gR2 for private interconnect traffic.

20. I/O fencing prevents updates by failed instances by detecting the failure and preventing
split brain in the cluster. When a cluster node fails, the failed node needs to be fenced off
from all the shared disk devices or disk groups. This methodology is called I/O fencing,
sometimes called disk fencing or failure fencing.

21.Re-bootless node fencing (restart) - instead of fast re-booting the node, a graceful
shutdown of the stack is attempted.

22.Virtual Oracle 11g RAC cluster - Oracle 11g RAC supports virtualization.
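
As an illustration of a few of the items above, the sketch below shows how the SCAN configuration and the node-local registry can be inspected, and what a TNS entry looks like when it points at the SCAN instead of individual node VIPs; the names racdb-scan.example.com, port 1521 and service racdb are assumptions:

# Inspect the SCAN and SCAN listener configuration (run from the Grid Infrastructure home)
srvctl config scan
srvctl config scan_listener

# Check the node-local registry (OLR) rather than the cluster-wide OCR
ocrcheck -local

# Example TNS entry that uses the SCAN; append to tnsnames.ora (all names are hypothetical)
cat >> $ORACLE_HOME/network/admin/tnsnames.ora <<'EOF'
RACDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racdb-scan.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = racdb))
  )
EOF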

SPLIT BRAIN CONDITION AND IO FENCING MECHANISM IN ORACLE CLUSTERWARE


Oracle Clusterware provides mechanisms to monitor cluster operation and detect
potential issues with the cluster. One particular scenario that needs to be prevented is
the split brain condition. A split brain condition occurs when a single cluster node has a failure
that results in the reconfiguration of the cluster into multiple partitions, with each partition
forming its own sub-cluster without knowledge of the existence of the others. This would lead to
collision and corruption of shared data, as each sub-cluster assumes ownership of the shared
data [1]. For a cluster database like an Oracle RAC database, data corruption is a serious issue
that has to be prevented at all times. Oracle Clusterware's solution to the split brain condition is
to provide IO fencing: if a cluster node fails, Oracle Clusterware ensures that the failed node is
fenced off from all IO operations on the shared storage. One of the IO fencing methods is
called STOMITH, which stands for Shoot The Other Machine In The Head.

In this method, once a potential split brain condition is detected, Oracle Clusterware
automatically picks a cluster node as a victim to reboot in order to avoid data corruption. This
process is called node eviction. DBAs and system administrators need to understand how this IO
fencing mechanism works and learn how to troubleshoot clusterware problems. When they
experience a cluster node reboot event, DBAs and system administrators need to be able to
analyze the events and identify the root cause of the clusterware failure.

Oracle Clusterware uses two Cluster Synchronization Service (CSS) heartbeats:

1. the network heartbeat (NHB) and
2. the disk heartbeat (DHB)

and two CSS misscount values associated with these heartbeats to detect potential
split brain conditions.

The network heartbeat crosses the private interconnect to establish and confirm valid
node membership in the cluster. The disk heartbeat is between the cluster node and the voting
disk on the shared storage. Each heartbeat has its own maximum value in seconds, called the
CSS misscount, within which the heartbeat must complete; otherwise a node eviction is
triggered.
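
On a running cluster, the current network heartbeat misscount can be read directly from the clusterware, for example (run from the CRS/Grid home as a privileged user):

# Show the current CSS misscount for the network heartbeat, in seconds
crsctl get css misscount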

The CSS misscount for the network heartbeat has the following default values,
depending on the version of Oracle Clusterware and the operating system:
OS        10g (R1 & R2)   11g
Linux     60              30
Unix      30              30
VMS       30              30
Windows   30              30

The CSS misscount for the disk heartbeat also varies with the version of Oracle
Clusterware. For Oracle 10.2.1 and later, the default value is 200 seconds.
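
Depending on the clusterware version, the disk heartbeat timeout can be checked in the same way:

# Show the current CSS disk timeout for the disk heartbeat, in seconds
crsctl get css disktimeout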

NODE EVICTION DIAGNOSIS CASE STUDY


When a node eviction occurs, Oracle Clusterware usually records error messages in various log
files. These log files provide the evidence and the starting points for DBAs and system
administrators to troubleshoot the problem. The following case study illustrates a troubleshooting
process based on a node eviction that occurred in an 11-node 10g RAC production database. The
symptom was that node 7 of the cluster was automatically rebooted around 11:15 AM. The
troubleshooting started with examining the syslog file /var/log/messages, where the following
error messages were found:

Jul 23 11:15:23 racdb7 logger: Oracle clsomon failed with fatal status 12.
Jul 23 11:15:23 racdb7 logger: Oracle CSSD failure 134.
Jul 23 11:15:23 racdb7 logger: Oracle CRS failure. Rebooting for cluster integrity.

Next, the OCSSD log file at $CRS_HOME/log/<hostname>/cssd/ocssd.log was examined, and the
following error messages were found. They show that node 7's network heartbeat did not
complete within the 60-second CSS misscount, which triggered a node eviction event:

[ CSSD]2008-07-23 11:14:49.150 [1199618400] >WARNING:
clssnmPollingThread: node racdb7 (7) at 50% heartbeat fatal, eviction in 29.720 seconds
..
clssnmPollingThread: node racdb7 (7) at 90% heartbeat fatal, eviction in 0.550 seconds

[ CSSD]2008-07-23 11:15:19.079 [1220598112] >TRACE:
clssnmDoSyncUpdate: Terminating node 7, racdb7, misstime(60200) state(3)
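
When reproducing this kind of analysis, a simple starting point is to scan the CSS log for the polling-thread warnings and the eviction decision around the reboot time, and to cross-check the syslog; the hostname and timestamp filter below come from this case study:

# Heartbeat warnings and the eviction decision in the CSS log
grep -E 'clssnmPollingThread|clssnmDoSyncUpdate' $CRS_HOME/log/$(hostname)/cssd/ocssd.log | grep '2008-07-23 11:1'

# Reboot messages recorded by the operating system
grep -Ei 'CSSD failure|CRS failure|clsomon' /var/log/messages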

CRS REBOOTS TROUBLESHOOTING PROCEDURE


Besides node evictions caused by the failure of the network heartbeat or the disk heartbeat,
other events may also cause a CRS node reboot. Oracle Clusterware provides several processes
to monitor the operation of the clusterware. When certain conditions occur, to protect data
integrity, these monitoring processes may automatically kill the clusterware or even reboot the
node, and leave critical error messages in their log files. The following lists the roles of these
clusterware processes in a server reboot and where their logs are located:

Three of the clusterware processes - OCSSD, OPROCD and OCLSOMON - can initiate a CRS reboot
when they run into certain errors (a log-collection sketch follows this list):
1. OCSSD (CSS daemon) monitors inter-node health, such as the interconnect and the
membership of the cluster nodes. Its log file is located at
$CRS_HOME/log/<host>/cssd/ocssd.log
2. OPROCD (Oracle Process Monitor Daemon), introduced in 10.2.0.4, detects hardware
and driver freezes that result in a node eviction, then kills the node to prevent any IO
from reaching the shared disk. Its log file is /etc/oracle/oprocd/<hostname>.oprocd.log
3. OCLSOMON monitors the CSS daemon for hangs or scheduling issues. It may
reboot the node if it sees a potential hang. Its log file is
$CRS_HOME/log/<host>/cssd/oclsomon/oclsmon.log
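
A small collection script along these lines can be used to gather the relevant logs from a node right after a reboot, before they are rotated; the destination directory is an arbitrary example and the paths follow the list above, plus the syslog discussed below:

# Gather the clusterware logs relevant to a node reboot (run as root)
HOST=$(hostname -s)
DEST=/tmp/crs_reboot_logs_$(date +%Y%m%d)     # arbitrary example destination
mkdir -p $DEST
cp $CRS_HOME/log/$HOST/cssd/ocssd.log $DEST/
cp /etc/oracle/oprocd/$HOST.oprocd.log $DEST/
cp $CRS_HOME/log/$HOST/cssd/oclsomon/oclsmon.log $DEST/
cp /var/log/messages $DEST/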

One of the most important log files is the syslog file; on Linux, the syslog file is
/var/log/messages.

The CRS reboot troubleshooting procedure starts with reviewing the various log files to identify
which of the three processes above contributed to the node reboot, and then isolating the root
cause within that process. Figure 6 illustrates the CRS reboot troubleshooting flowchart.
