Oracle Active Data Guard – Considerations for the Wide Area Network

Oracle customers use Oracle Active Data Guard to create and maintain one or many standby databases that protect their mission-critical primary databases from disaster. Typically, in such deployments, the primary and standby databases are in geographically separate locations connected via a WAN (Wide Area Network). Redo transport services transfer large volumes of redo from the primary location to the standby using SQL*Net.

We have to ensure that all the components from source to target are set up correctly so that the data transfer achieves the best possible throughput. If sufficient network bandwidth is not available at reasonable latencies, we will start seeing the log transfer and apply lagging on the standby site (which is Oracle-speak for: your primary and standby databases are now out of sync from a data perspective).

One key point to keep in mind is that the lower the network round trip time (aka latency), the higher your data transfer throughput; the higher the round trip time, the lower your throughput. So it is very important to maintain low round trip times on your wide area network.

To understand network data transfer throughput, it is important to understand the concepts of TCP window size and bandwidth delay product (aka BDP).

TCP window size is the number of bytes that can be transmitted without receiving an acknowledgement from the other side. Once a window's worth of bytes has been sent, the sender stops sending and waits for an acknowledgement from the receiver.

Bandwidth delay product is calculated as the product of the network bandwidth and the network round trip time: BDP = network bandwidth * round trip time. This is the amount of data that leaves the sender before the first acknowledgement is received. If the sender's output bandwidth is stable, and the bandwidth is fully used, then the BDP measures the amount of data in flight on the network. If we set the TCP window size equal to the bandwidth delay product, then in theory we should be able to fully utilize the available bandwidth.

Set up the network

We have to start by setting up the networking components to support the desired bandwidth. So if your WAN is a 10GigE network, all the NICs (network interface cards), ports, and switches in the configuration should be configured for 10GigE, full duplex. After setup, run the configuration display utilities and ensure that the transfer speed at each of these components is in fact 10GigE. Customers often run into trouble when auto-negotiation sets some NICs to 1GigE because of configuration mismatches.
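
As a quick sanity check on Linux, something like the following confirms the negotiated speed and duplex (the interface name eth0 is an assumption; adjust for your environment):

# Check the negotiated link speed and duplex on each node
ethtool eth0 | grep -E 'Speed|Duplex'
# Expected output on a correctly configured 10GigE link (illustrative):
#        Speed: 10000Mb/s
#        Duplex: Full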

Use tools like iperf to test the transfer speeds that your network is capable of achieving.
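
For example, a minimal iperf test between the two sites might look like this (the hostname is a placeholder; exact flags vary between iperf versions):

# On the standby (receiving) host, start an iperf server
iperf -s

# On the primary (sending) host, run a 30 second test with 4 parallel streams
iperf -c standby-host -P 4 -t 30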

One important aspect to keep in mind is that the WAN is probably shared by other traffic (e-mail, data replication, SAN replication). This has two important implications that we should consider.

  • If some of this miscellaneous traffic consumes a lot of bandwidth, round trip times could be degraded periodically on the network.
  • We should be careful in our calculations not to consume the entire bandwidth for redo transport (this could impact other processes).
    • So it is important to figure out (working with the network admins) what the bandwidth entitlement for redo transport is, and to base our calculations on that number.

Calculate the BDP

Use the following formula to calculate the bandwidth delay product (BDP):

BDP (bytes) = (bandwidth / 8) * round trip time in seconds

The network bandwidth is expressed in bits per second, so we divide by 8 to convert to bytes.
Round trip time is usually in milliseconds, so we divide by 1000 to convert to seconds.

So for example, if we have 10 Gbit/s of network bandwidth and a 40ms round trip time:

BDP = (10,000,000,000 / 8) * (40 / 1000) = 50,000,000 bytes
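
The same arithmetic as a small shell sketch, using the example values above:

#!/bin/bash
# Compute the BDP in bytes from bandwidth (bits/sec) and round trip time (ms)
BANDWIDTH_BPS=10000000000   # 10 Gbit/s
RTT_MS=40                   # 40 ms round trip time
BDP=$(( BANDWIDTH_BPS / 8 * RTT_MS / 1000 ))
echo "BDP = ${BDP} bytes"   # prints: BDP = 50000000 bytes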

Set up SQL*Net parameters

The current recommendations for Data Guard redo transport are as follows.

Set the SDU size to 65535

  • We can set the SDU on a per-connection basis using the SDU parameter in the local naming configuration file (TNSNAMES.ORA) and the listener configuration file (LISTENER.ORA); an example follows below.
  • We can set the SDU for all Oracle Net connections with the profile parameter DEFAULT_SDU_SIZE in the SQLNET.ORA file.
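
As an illustration, a per-connection SDU setting in TNSNAMES.ORA might look like the following (the alias, host, and service names are placeholders):

# Hypothetical TNSNAMES.ORA entry for the standby, with the SDU set per connection
STANDBY_DB =
  (DESCRIPTION =
    (SDU = 65535)
    (ADDRESS = (PROTOCOL = TCP)(HOST = standby-host)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = standby_db))
  )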

Set TCP.NODELAY to YES

To preempt delays in buffer flushing in the TCP protocol stack, disable the TCP Nagle algorithm by setting TCP.NODELAY to YES in the SQLNET.ORA file on both the primary and standby systems.

Set up RECV_BUF_SIZE and SEND_BUF_SIZE

The current recommendation is to set the SEND_BUF_SIZE and RECV_BUF_SIZE parameters (which are the send and receive socket buffer sizes for SQL*Net) to three times the BDP.

Per the above example, we would set them to 50,000,000 * 3 = 150,000,000 bytes.
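
Putting the three recommendations together, a sketch of the SQLNET.ORA entries on both primary and standby (the buffer values are the ones computed in the example above; derive yours from your own BDP):

# Illustrative SQLNET.ORA entries for Data Guard redo transport
DEFAULT_SDU_SIZE = 65535
TCP.NODELAY = YES
SEND_BUF_SIZE = 150000000
RECV_BUF_SIZE = 150000000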

Set up operating system kernel parameters

If you are using the Linux operating system, make sure that the values of the following kernel parameters are set at least as high as the values chosen for RECV_BUF_SIZE and SEND_BUF_SIZE.

net.core.rmem_max
net.core.wmem_max
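
For example, in /etc/sysctl.conf (values sized to match the 150,000,000-byte buffers above):

# Socket buffer maximums, at least as large as SEND_BUF_SIZE/RECV_BUF_SIZE
net.core.rmem_max = 150000000
net.core.wmem_max = 150000000

Then apply the settings without a reboot:

sysctl -p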

Once we have configured the network, the operating system, and SQL*Net, and redo transport is running, we can perform further network monitoring to see how the network bandwidth is being utilized, and make appropriate adjustments.

Links to helpful documents

Iperf

How to calculate TCP throughput for long distance links (blog)

Oracle Net Performance Tuning (MOS)

Setting Send and Receive Buffer Sizes (MOS)

Tuning SQL*Net performance (Oracle Docs)

Configuring Oracle Data Guard (Oracle Docs)

Best Practices for Sync Data Transport (White Paper)

OraChk Collection Manager

OraChk (previously known as RACcheck) is a utility from Oracle that performs configuration checks on Oracle database platforms and reports on configurations that do not match Oracle's best practices. OraChk has the ability to upload the results of its checks into an Oracle database. Details about this utility can be found in MOS Note 1268927.1.

Oracle has now released OraChk Collection Manager, a companion application to OraChk with an Oracle Application Express front end that can be used as a dashboard, in which customers can track their ORAchk, RACcheck, and Exachk collection data in one easy-to-use interface.

Details about downloading and using OraChk Collection Manager can be found in MOS Note 1602329.1.

Managing Processes and Memory Consumption on Oracle Exadata Database Machines.

This post is equally applicable to RAC databases deployed both on Oracle database machines and on traditional server/storage architectures, running RHEL5 or OEL5 (OEL6 has a new feature called transparent hugepages).

The white paper Best Practices for Database Consolidation on Exadata Database Machine outlines a lot of good guidelines. However, these best practices occasionally seem to get overlooked, in some cases resulting in node evictions/restarts in the cluster.

High memory and CPU consumption can result in some really bad things happening (like, for example, node evictions). So it is important to configure the operating system and the databases you deploy on the machine optimally, to use the available CPU and memory resources.

Let us first review what those available memory and CPU resources are on each database node of an Oracle Exadata Database Machine.

(Table not reproduced here: available memory and CPU resources per database node, by Exadata Database Machine model.)

Anytime a workload deployed on a database server exceeds the operating system's ability to efficiently use the available CPU and memory resources mentioned above, bad things can happen. (I know I am oversimplifying, but you get the idea.)

Managing Memory Resources.

Setting up HugePages

First we need to make sure that HugePages on each node is set up correctly for the databases you have deployed on the node.
Tanel Poder has explained the reasons for setting up HugePages on Exadata systems.
When the database machines are installed, HugePages is configured to accommodate the one database that is pre-installed. So if you change the SGA setting for that database, or deploy more databases, you should adjust the HugePages setting accordingly. It goes without saying that if you have already configured HugePages to accommodate all the databases you have, and then you remove some databases from the node or resize the SGAs to be smaller, you should again resize HugePages to free up the excess pages.

Once all the database instances are configured and started up, you can run the script hugepages_settings.sh from MOS Note 401749.1, which calculates the number of hugepages you need to set up. I usually recommend adding 10% to the value that hugepages_settings.sh suggests.

With 11.2.0.2 and 11.2.0.3 you should also set USE_LARGE_PAGES, preferably to ONLY, following the instructions in MOS Note 1392497.1.

You can then follow the instructions in MOS Note 361323.1, under the section "Kernel version 2.6", to set the value. You need to restart the databases so the SGA gets allocated from the hugepages.

You should also follow the instructions in MOS Note 1546861.1 and set vm.min_free_kbytes = 524288 in /etc/sysctl.conf, to avoid page allocation failure messages when kernel memory is depleted.
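
A minimal sketch of the resulting /etc/sysctl.conf entries (the hugepages count is purely illustrative; use the output of hugepages_settings.sh, plus the ~10% headroom, for your node):

# Example HugePages configuration for a database node
vm.nr_hugepages = 22000        # hypothetical value: hugepages_settings.sh output + 10%
vm.min_free_kbytes = 524288    # per MOS Note 1546861.1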

Managing Memory Usage

Ideally, for critical implementations, your SGA + PGA + individual server process memory allocations should not exceed 75% of the physical memory on the database node.

The Exadata consolidation white paper mentioned above suggests:

For critical Hardware Pools, we recommend an even more conservative approach by not exceeding 75% physical memory per database node.
OLTP applications:
SUM of databases (SGA_TARGET + PGA_AGGREGATE_TARGET) + 4 MB * (Maximum PROCESSES) < Physical Memory per Database Node
DW/BI applications:
SUM of databases (SGA_TARGET + 3 * PGA_AGGREGATE_TARGET) < Physical Memory per Database Node

You can monitor PGA usage in a few different ways.

1) SELECT s.inst_id, s.sid, s.serial#, p.spid, s.username, s.program, p.pga_used_mem
     FROM gv$session s
     JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id
    WHERE s.type != 'BACKGROUND'
      AND s.program NOT LIKE '%(P%'
      AND p.pga_used_mem > <APPLICATION_MEMORY_THRESHOLD>
    ORDER BY s.inst_id, s.sid, s.serial#;

The value you use for APPLICATION_MEMORY_THRESHOLD depends on your application and how much PGA it needs to run your queries efficiently (performing sorts in memory using PGA space is usually faster than using the TEMP tablespace). The white paper suggests using 1 GB for OLTP applications and 10 GB for DSS applications.

2) You can monitor the sysmetric "Total PGA Allocated" from dba_hist_sysmetric_summary to see how much PGA is being allocated.
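
For instance, a quick look at that metric over time might use a query like this (a sketch; assumes AWR is licensed and that the metric is reported in bytes):

SELECT snap_id, ROUND(average / 1024 / 1024) AS avg_pga_mb
  FROM dba_hist_sysmetric_summary
 WHERE metric_name = 'Total PGA Allocated'
 ORDER BY snap_id;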

3) You can use the "PGA Memory Advisory" and "SGA Target Advisory" sections of the AWR report to gauge the impact of increasing or decreasing the SGA and PGA allocations.

4) You can monitor memory usage at the OS level using the top (Mem: free), free, or vmstat (bo (page-out), bi (page-in)) commands.

Managing O/S Processes (and, as a consequence, CPU usage).

It is important to control the number of processes that are actively executing on each node. Each of these processes consumes CPU and memory resources.

The following guidelines are important to follow.
– Use application connection pooling to appropriately limit the number of sessions connected to the Oracle database.

This is important for two reasons.
– It right-sizes the resource consumption on each database node.
– It minimizes the impact of connection storms.

– Configure the parallel query servers appropriately.
– For all the databases deployed on a node, the total setting for PARALLEL_MAX_SERVERS should be less than or equal to the following (see the sketch after these limits):

X2-2 or X3-2: <= 240
X2-8 or X3-8: <= 1280
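
As a sketch, capping one instance's share (the value 60 is purely illustrative; what matters is that the settings of all instances on the node sum to within the limits above):

-- Hypothetical per-instance cap, run in each database on the node
ALTER SYSTEM SET parallel_max_servers = 60 SCOPE=BOTH SID='*';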

I often see implementations where parallel_max_servers = 240 (or some other high value) in each instance on the machine, which results in not-so-good consequences. I have occasionally seen systems with sustained 100% CPU usage and load averages above 300, resulting in node reboots.

– Use Instance Caging and Database Resource Manager to manage CPU allocation on the server (a minimal sketch follows).
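
A minimal instance caging sketch (the CPU count and plan name are examples; instance caging only takes effect while a resource manager plan is active):

-- Hypothetical instance caging: limit this instance to 8 CPUs
ALTER SYSTEM SET cpu_count = 8 SCOPE=BOTH;
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SCOPE=BOTH;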

It would be a good idea to always review the Best Practices for Database Consolidation on Exadata Database Machine white paper and follow its guidelines, to have a well-performing and stable database environment.

It would also be a good idea to review the performance management screens in Enterprise Manager 12c every day to maintain a good understanding of the database and server workloads, and/or analyze the OSWatcher data to understand the server workload profile.

Oracle Exadata Best Practices Tutorials

There is a set of recorded tutorials available from the Oracle Maximum Availability Architecture team that covers a range of topics.

The tutorials can be accessed at Oracle Exadata Best Practices Series

The topics include implementing EBS, Siebel, and PeopleSoft, resource manager, migration, DBFS, monitoring, backup and recovery, troubleshooting, patching, and healthcheck.

Lots and lots of great information.

RMAN duplicate from an incrementally updated backup

In the last post I addressed how to create and maintain an incrementally updated RMAN backup, and how to switch to the backup in case some corruption happens to the production database.

In this post, instead of doing the switch, we will use the incrementally updated backup to create a cloned copy of the original database. The practical use of this method is that, if you are creating incrementally updated backups every night, you can use the backup as the source to clone the database (i.e., you don't have to run a whole backup of your database again).

RMAN has a command named DUPLICATE that allows you to clone an existing database.

In this case we are cloning a database named rk01 to another database named rk02 on the same host, using an incrementally updated backup, with the RMAN DUPLICATE command.

First follow the steps to create the incrementally updated backup from my previous post.

Enable Archivelog And Block Change Tracking
Run a level 0 full image copy backup of the database
Perform updates in the database
Perform an incremental level 1 backup
Apply the incremental level 1 backup to the level 0 image copy

At this point, after the incrementally updated backup is created, the previous post did a switch. Instead of that, we will do the RMAN duplicate.

Create an spfile for the rk02 database. (I usually create a pfile from the spfile of rk01, edit the pfile to change all occurrences of rk01 to rk02, and create the spfile for rk02 from this pfile. Also create the necessary directories for bdump, adump, etc.)

Create a password file for the rk02 database (orapwd file=$ORACLE_HOME/dbs/orapwrk02 entries=30)

Start up the database rk02 in nomount mode

export ORACLE_SID=rk02

sqlplus / as sysdba

startup nomount

Switch the current logfile in the rk01 database

This is an important step. Miss this and your RMAN duplicate will likely fail. The reason is that the SCN up to which the incremental backup was taken (and which was applied to the original backup copy) is still in the online redo log. When you do a duplicate, RMAN looks for a backup whose checkpoint SCN is less than the FIRST_CHANGE# of your current online redo log. Once you do a switch and then run the duplicate, RMAN will be able to find the incrementally updated backup as the source to clone from.

export ORACLE_SID=rk01

sqlplus / as sysdba

alter system archive log current;

Perform the RMAN duplicate.

Then issue the RMAN DUPLICATE command. (If you are cloning to a physically separate host, you will have to create appropriate listener.ora and tnsnames.ora entries, and make the location of the backup copy available to your target host.)

export ORACLE_SID=rk02

Run the following script.

#!/bin/bash
rman target="sys/manager@rk01" auxiliary / log=/u01/oraback/clonefromiub.log << EOF
RUN
{
  SET NEWNAME FOR DATAFILE 1 TO '/u01/oradata/rk02/system01.dbf';
  SET NEWNAME FOR DATAFILE 2 TO '/u01/oradata/rk02/sysaux01.dbf';
  SET NEWNAME FOR DATAFILE 3 TO '/u01/oradata/rk02/undotbs01.dbf';
  SET NEWNAME FOR DATAFILE 4 TO '/u01/oradata/rk02/users01.dbf';
  SET NEWNAME FOR DATAFILE 5 TO '/u01/oradata/rk02/example01.dbf';
  SET NEWNAME FOR TEMPFILE 1 TO '/u01/oradata/rk02/temp01.dbf';
  DUPLICATE  TARGET DATABASE  TO rk02
    LOGFILE
      GROUP 1 ('/u01/oradata/rk02/redo01a.log',
               '/u01/oradata/rk02/redo01b.log') SIZE 4M REUSE,
      GROUP 2 ('/u01/oradata/rk02/redo02a.log',
               '/u01/oradata/rk02/redo02b.log') SIZE 4M REUSE;
}
EXIT;
EOF

RMAN incrementally updated backups: switch to copy

Some Oracle customers use SAN-based technologies to keep a backup copy of the database in sync (think EMC TimeFinder and BCVs). The typical backup strategy is to keep a full copy of the database and, on a daily basis, to update this backup copy (typically a BCV synchronize operation) with only the incremental changes that have happened at the disk level.

In case of a loss of the file system on the primary disks, they can switch the database to run from the other copy, which is maintained as a backup by the storage subsystem.

A similar strategy can be implemented for Oracle databases using RMAN incrementally updated backups (without requiring any expensive SAN-level software for maintaining these copies). These backup copies can be updated on a daily basis using just the incremental changes in the database. In case of a loss in the database, you can switch to this copy to be used as the database. You can also generate clones from these backups using the RMAN DUPLICATE command.

At a high level the steps involved are

  • Configure the database to run in archivelog mode
  • Configure block change tracking at the database level
  • Create a level 0 full image copy of the database using RMAN
  • Create level 1 incremental backups of the database and update the level 0 copy with these incremental backups
  • In case of need, use the RMAN SWITCH command to switch the database to this incrementally updated backup

Below is an example of how this is done.

Enable Archivelog and Block Change Tracking

sqlplus / as sysdba
startup mount;
alter database archivelog;
alter system set log_archive_dest_1='LOCATION=/u01/orarch/rk01' scope=spfile;
alter database open;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
USING FILE '/u01/orarch/rk01/rman/rman_change_track.f' REUSE;

Run a level 0 full image copy backup of the database.

A sample script to run the full image copy backup is below:

#!/bin/bash
rman target="sys/manager@rk01" log=/u01/oraback/fullcopy.log << EOF
RUN {
  ALLOCATE CHANNEL ch1 TYPE DISK FORMAT '/u01/oraback/%b';
  BACKUP INCREMENTAL LEVEL 0 AS COPY DATABASE TAG 'fullcopy';
  RELEASE CHANNEL ch1;
}
RUN {
  ALLOCATE CHANNEL ch1 TYPE DISK FORMAT '/u01/oraback/%U';
  BACKUP AS COPY ARCHIVELOG ALL;
  BACKUP AS COPY CURRENT CONTROLFILE;
  RELEASE CHANNEL ch1;
}
EXIT;
EOF
In the above script there are two distinct RMAN scripts.
The first one backs up the database, and the second one backs up the control file and archivelogs.
I am using the %b format string for the database backup to keep the datafile names intact, in preparation for a switch. (You don't have to keep the datafile names the same; it just makes things easier to understand.)
You cannot do controlfile and archivelog backups using the %b format string, which is why I have two separate scripts.
I tag the backup with the name 'fullcopy', which later helps us apply the incremental backup to this tag.

Now perform some updates in the database. (This simulates database changes, after which we can run an incremental backup).

#!/bin/bash
sqlplus hr/hr@rk01 << EOC
set echo on;
select employee_id, first_name, last_name from employees
where employee_id = 194
/
update employees set last_name = 'Adams'
where employee_id = 194
/
select employee_id, first_name, last_name from employees
where employee_id = 194
/
EOC
I updated the last_name for the employee number 194.

Now perform an incremental level 1 backup of the database

#!/bin/bash
rman target="sys/manager@rk01" log=/u01/oraback/incr.log << EOF
RUN {
  ALLOCATE CHANNEL ch1 TYPE DISK FORMAT '/u01/oraback/%U';
  BACKUP INCREMENTAL LEVEL 1
    FOR RECOVER OF COPY WITH TAG 'fullcopy'
    DATABASE;
  RELEASE CHANNEL ch1;
}
EXIT;
EOF

Now Apply the incremental backup to the original level 0 image copy

#!/bin/bash
rman target="sys/manager@rk01" log=/u01/oraback/incrapply.log << EOF
RUN {
  ALLOCATE CHANNEL ch1 TYPE DISK FORMAT '/u01/oraback/%U';
  RECOVER COPY OF DATABASE WITH TAG 'fullcopy';
  RELEASE CHANNEL ch1;
}
EXIT;
EOF
Run the following command to capture the location of the datafiles for the database:
SQL> select file_name from dba_data_files;
FILE_NAME
——————————————————————————–
/u01/oradata/rk01/users01.dbf
/u01/oradata/rk01/undotbs01.dbf
/u01/oradata/rk01/sysaux01.dbf
/u01/oradata/rk01/system01.dbf
/u01/oradata/rk01/example01.dbf

Now switch the database to run from the copy

Shut down the database, then start it up in mount mode (do not open the database).
oracle@rramads-us2:~$ rman target /
Recovery Manager: Release 11.2.0.2.0 – Production on Sat Nov 6 17:44:40 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: RK01 (DBID=2035601811, not open)
RMAN> switch database to copy;
using target database control file instead of recovery catalog
datafile 1 switched to datafile copy "/u01/oraback/system01.dbf"
datafile 2 switched to datafile copy "/u01/oraback/sysaux01.dbf"
datafile 3 switched to datafile copy "/u01/oraback/undotbs01.dbf"
datafile 4 switched to datafile copy "/u01/oraback/users01.dbf"
datafile 5 switched to datafile copy "/u01/oraback/example01.dbf"
RMAN> recover database;
Starting recover at 06-NOV-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=133 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/oraback/system01.dbf
destination for restore of datafile 00002: /u01/oraback/sysaux01.dbf
destination for restore of datafile 00003: /u01/oraback/undotbs01.dbf
destination for restore of datafile 00004: /u01/oraback/users01.dbf
destination for restore of datafile 00005: /u01/oraback/example01.dbf
channel ORA_DISK_1: reading from backup piece /u01/oraback/14lsbd4b_1_1
channel ORA_DISK_1: piece handle=/u01/oraback/14lsbd4b_1_1 tag=FULLCOPY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 06-NOV-10
RMAN> alter database open;
database opened
Now if you query the file names, they will point to the /u01/oraback directory.

SQL> select file_name from dba_data_files;
FILE_NAME
——————————————————————————–
/u01/oraback/users01.dbf
/u01/oraback/undotbs01.dbf
/u01/oraback/sysaux01.dbf
/u01/oraback/system01.dbf
/u01/oraback/example01.dbf
Check that the changes we made to the hr.employees table are still there (to ensure that the incremental backup did work).
SQL> select last_name from hr.employees where employee_id = 194;
LAST_NAME
————————-
Adams

Transportable tablespace from RMAN backup

Most of Oracle's MAA documentation for transportable tablespaces seems to recommend that you put the tablespaces being transported in read-only mode in the source database before copying the datafiles to the target. In most cases this means application downtime.

In order to minimize the downtime, the recommendations seem to be:

  • Create a Data Guard physical standby database and use this standby database as the source for the transport
  • Create a duplicate (aka clone) of your source and use this new duplicate as the source for the transport

If you are transporting only a subset of your tablespaces and you want to minimize your downtime, a third good option is to create and use transportable tablespace sets from your already existing RMAN backups. This process is documented in the Backup and Recovery User's Guide, Chapter 26 (11gR2 manual).

Since this process uses an existing RMAN backup, you incur no downtime (i.e., no need to place tablespaces in read-only mode) on your production systems.

Below are the steps to accomplish this (in my example, rk01 is the source database and rk02 is the target database).

  • First off, your source database should be running in archivelog mode
  • Take a full database backup from your source database

export ORACLE_SID=rk01
rman target /

run {
  allocate channel oem_backup_disk1 type disk format '/u01/orarch/rk01/backup/%U';
  backup as BACKUPSET tag '%TAG' database;
  backup as BACKUPSET tag '%TAG' archivelog all not backed up;
  release channel oem_backup_disk1;
}
run {
  allocate channel oem_backup_disk1 type disk format '/u01/orarch/rk01/backup/%U' maxpiecesize 1000 G;
  backup as BACKUPSET tag '%TAG' current controlfile;
  release channel oem_backup_disk1;
}

exit;

  • Create a transportable tablespace set for the tablespaces you need to transport

export ORACLE_SID=rk01

rman target /

RMAN> transport tablespace example
2>   tablespace destination '/u01/orarch/rk01/datafile'
3>   auxiliary destination '/u01/orarch/rk01/tmp';

Once the process is complete, RMAN leaves a copy of the datafile for the tablespace EXAMPLE (an operating system file, not a backup set file) in the directory /u01/orarch/rk01/datafile. It also leaves an export dump file, containing the metadata needed for the transport, in the same directory.

  • Do endianness conversion on the files if you need to go cross-platform (use RMAN CONVERT)
  • Attach the tablespace to your target database

export ORACLE_SID=rk02

sqlplus / as sysdba

create directory tts_dir as '/u01/orarch/rk01/datafile';

grant all on directory tts_dir to public;

exit;

Before you run the next import, make sure that you have created the schemas (with appropriate privileges) that own objects in the tablespace you are transporting in the target database rk02; also make sure any roles that are required are created in the target. E.g., the schema HR has objects in the EXAMPLE tablespace, so use the CREATE USER command to create the HR user with appropriate privileges in the database rk02, as sketched below.
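
A minimal sketch of pre-creating the owning schema in rk02 (the password and privilege list are illustrative; grant what your application actually needs):

export ORACLE_SID=rk02
sqlplus / as sysdba

create user hr identified by hr;
grant create session, create table, create procedure,
      create sequence, create view, create trigger to hr;
exit;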

impdp system/manager dumpfile=dmpfile.dmp directory=tts_dir transport_datafiles=/u01/oradata/rk02/example01.dbf logfile=tts_import.log

So, as you can see, the whole process is executed without shutting down the source database rk01.

Cloud computing definition.

The National Institute of Standards and Technology has a good, concise definition of cloud computing. Sushil Kumar of Oracle used the same language to define cloud computing in an article in the current release of Oracle Magazine.

Essential Characteristics

  • On Demand Self-Service
  • Broad Network Access
  • Resource Pooling
  • Rapid Elasticity
  • Measured Service

Service Models

  • Cloud Software as a Service (SaaS)
  • Cloud Platform as a Service (PaaS)
  • Cloud Infrastructure as a Service (IaaS)

Deployment Models

  • Private Cloud
  • Community Cloud
  • Public Cloud
  • Hybrid Cloud

Grid computing sessions at Oracle Openworld 2009

If you are attending Oracle OpenWorld 2009 and are interested in learning a lot about Oracle RAC and grid computing, you can find a full list of Oracle RAC and grid computing events Here (starting at page 2 of the PDF doc).

In preparation for the event, you could read the following new 11gR2 white papers from Oracle, to understand the latest developments and arm yourself with questions.

Oracle Real Application Clusters 11g Release 2 Technical Overview

Oracle Real Application Clusters 11g Release 2 Overview of SCAN

Oracle Real Application Clusters One Node 11g Release 2 Technical Overview

11gR2 RAC installation on 64-bit Linux, step by step

Yesterday I completed an 11g Release 2 Real Application Clusters installation on 64-bit Oracle Enterprise Linux 4. The installation process is very similar to the 10g and 11gR1 installations, but much simpler. This was a two node cluster. There are some new concepts introduced in 11gR2 Real Application Clusters. Below are some of my notes on 11gR2 new features for RAC, and the detailed steps that I followed to complete the installation.

Some new concepts in 11gR2 RAC


Oracle Clusterware and ASM are now installed into the same Oracle home; this is now called the grid infrastructure install.

Raw devices are no longer supported for anything (read: Oracle Cluster Registry, voting disk, ASM disks) for new installs.

The OCR and voting disks can now be stored in ASM, or on a certified cluster file system.

The redundancy level of the ASM diskgroup on which you choose to place the voting disks determines the number of voting disks you can have.
You can place:

  • only one voting disk on an ASM diskgroup configured with external redundancy
  • only three voting disks on an ASM diskgroup configured with normal redundancy
  • only five voting disks on an ASM diskgroup configured with high redundancy


The contents of the voting disks are automatically backed up into the OCR

ACFS (ASM Cluster File System) is only supported on Oracle Enterprise Linux 5 (and RHEL5), not on OEL4.

There is a new service called Cluster Time Synchronization Service that can keep the clocks on all the servers in the cluster synchronized (in case you don't have Network Time Protocol (NTP) configured).

Single Client Access Name (SCAN) is a hostname in the DNS server that resolves to three (or at least one) IP addresses in your public network. This hostname is to be used by client applications to connect to the database (as opposed to the VIP hostnames you were using in 10g and 11gR1). SCAN provides location independence to the client connections connecting to the database, and makes node additions and removals transparent to the client application (meaning you don't have to edit your tnsnames.ora entries every time you add or remove a node from the cluster). An illustrative client entry is sketched below.
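
As an illustration, a client tnsnames.ora entry pointing at the SCAN might look like this (all names here are placeholders):

# Hypothetical client entry using the SCAN name instead of node VIPs
MYDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = cluster-scan.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = mydb.example.com))
  )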

Oracle Grid Naming Service (GNS) provides a mechanism to make the allocation and removal of VIP addresses a dynamic process (using dynamic IP addresses).

Intelligent Platform Management Interface (IPMI) integration provides a new mechanism to fence servers in the cluster when a server is not responding.

The installer can now check the O/S requirements, report on the requirements that are not met, and give you fixup scripts to fix some of them (like setting kernel parameters).

The installer can also help you set up SSH between the cluster nodes.

There is a new deinstall utility that cleans up an existing or failed install.

And the list goes on and on.

I have broken up the installation process into 3 distinct documents, which can be found below

Installing 11gr2 grid infrastructure

Installing 11gr2 Real Application Clusters

Creating the 11gr2 Clustered database