Category Archives: highAvailability

Oracle Exadata Best Practices Tutorials

There is a set of recorded tutorials available from the oracle maximum availability architecture team, that cover a range of topics.

The tutorials can be accessed at  Oracle Exadata Best Practices Series

The topics include implementing Ebs, Siebel and Peoplesoft, resource manager, migration, DBFS, monitoring, backup and recovery, troubleshooting, patching and healthcheck.

Lots and Lots of great information.

Rman, duplicate from incrementally updated backup

In the last post i addressed how to create and maintain an incrementally updated rman backup, and how to switch to the backup, in case some corruption happens to the production database.

In this post, instead of doing the switch, we will use the incrementally updated backup to create a cloned copy of the original database. The practical use of this method will be that, if you are creating incrementally updated backups every night, you can use the backup as the source to clone the database (ie you dont have to run a whole backup of your database again).

rman has a command named duplicate that allows you to clone an existing database.

In this case we are cloning a database named rk01, to another database named rk02 on the same host, using a incrementally updated backup, with the rman duplicate command.

First follow the steps to create the incrementally updated backup from my previous post.

Enable Archivelog And Block Change Tracking
Run a level 0 full image copy backup of the database
Perform updates in the database
Perform an incremental level 1 backup
Apply the incremental level 1 backup to the level 0 image copy

At this point after the incrementally updated backup is created, in the previous post, i was doing a switch. Instead of that we will do the rman duplicate.

Create a spfile for the rk02 database (I usually make a pfile from spfile from rk01, edit the pfile change all the occurances of rk01 to rk02 and create the spfile for rk02 from this pfile. Also create the necessary directories for bdump,adump etc.)

Create a password file for the rk02 database (orapwd file=$ORACLE_HOME/dbs/orapwrk02 entries=30)

Startup the database rk02 in nomount mode

export ORACLE_SID=rk02

sqlplus / as sysdba

startup nomount

Switch the current logfile in the rk01 database

This is an important step. Miss this and your rman duplicate will likely fail. The reason is that the scn until which the incremental backup is taken (And was applied to the original backup copy), is still in the online redo log. When you do a duplicate rman looks for a backup that has a ckp scn that is < the first_change# in your current online redo log. Once you do a switch, and then run the duplicate, rman will be able to find the incrementally updated backup as the source to clone from.

export ORACLE_SID=rk01

sqlplus / as sysdba

alter system archive log current;

Perform the rman duplicate.

Then issue the rman duplicate command (If you are cloning to a physically separate host, you will have to create appropriate listener.ora entries, tnsnames.ora entries and make the location of the backup copy available to your target host).

export ORACLE_SID=rk02

run the following script.

#!/bin/bash
rman target="sys/manager@rk01" auxiliary / log=/u01/oraback/clonefromiub.log &lt;&lt; EOF
RUN
{
  SET NEWNAME FOR DATAFILE 1 TO '/u01/oradata/rk02/system01.dbf';
  SET NEWNAME FOR DATAFILE 2 TO '/u01/oradata/rk02/sysaux01.dbf';
  SET NEWNAME FOR DATAFILE 3 TO '/u01/oradata/rk02/undotbs01.dbf';
  SET NEWNAME FOR DATAFILE 4 TO '/u01/oradata/rk02/users01.dbf';
  SET NEWNAME FOR DATAFILE 5 TO '/u01/oradata/rk02/example01.dbf';
  SET NEWNAME FOR TEMPFILE 1 TO '/u01/oradata/rk02/temp01.dbf';
  DUPLICATE  TARGET DATABASE  TO rk02
    LOGFILE
      GROUP 1 ('/u01/oradata/rk02/redo01a.log',
               '/u01/oradata/rk02/redo01b.log') SIZE 4M REUSE,
      GROUP 2 ('/u01/oradata/rk02/redo02a.log',
               '/u01/oradata/rk02/redo02b.log') SIZE 4M REUSE;
}
EXIT;
EOF

Rman, incrementally updated backups, switch to copy

Some oracle customers use SAN based technologies to keep a backup copy of the database in sync (Think EMC Timefinder and BCV’s). The typical backup strategy tends to be, to keep a full copy of the database, and on a daily basis to update this backup copy of the database (Typically a BCV synchronize  operation), with only the incremental changes that have happened at the disk level.

In case off a loss of file system in the primary disks, they can switch the database to run from another copy which is maintained as a backup by the Storage subsystem.

A similar strategy can be implemented for  oracle databases, using RMAN incrementally updated backups (Without requiring any expensive SAN level software for maintaining these copies). These backup copies can be updated on a daily basis using just the incremental changes in the database. In case of a loss in the database, you can switch to this copy to be used as the database. You can also generate clones from these backups using the rman duplicate command.

At a high level the steps involved are

  • Configure the database to run in archivelog mode
  • Configure block change tracking at the database level
  • Create a level 0 full image copy of the database using rman
  • Create level 1 incremental backups of the database and update the level 0 copy with this incremental backup
  • In case of need, use the rman switch command to switch the database to this incrementally updated backup.

Below is an example of how this is done.

Enable Archivelog and Block Change Tracking

sqlplus / as sysdba
startup mount;
alter database archivelog;
alter system set log_archive_dest_1=’LOCATION=/u01/orarch/rk01′ scope=spfile;
alter database open;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
USING FILE ‘/u01/orarch/rk01/rman/rman_change_track.f’ REUSE;

Run a level 0 full image copy backup of the database.

A sample script to run the full image copy backup is below
#!/bin/bash
rman target=”sys/manager@rk01″ log=/u01/oraback/fullcopy.log << EOF
RUN {
ALLOCATE CHANNEL ch1 TYPE
DISK FORMAT ‘/u01/oraback/%b’;
BACKUP INCREMENTAL LEVEL 0 as copy DATABASE TAG ‘fullcopy’;
RELEASE CHANNEL ch1;
}
RUN {
ALLOCATE CHANNEL ch1 TYPE
DISK FORMAT ‘/u01/oraback/%U’;
BACKUP as copy ARCHIVELOG ALL;
BACKUP as copy CURRENT CONTROLFILE;
RELEASE CHANNEL ch1;
}
EXIT;
EOF
In the above script there are two distinct rman scripts.
The first one backs up the database and the second one backs up the control file and archivelogs.
I am using the %b format string for the database backup to keep the database file names intact (In preparation for a switch) (You dont have to keep the datafile names the same, just makes it easy to understand).
You cannot do controlfile and archivelog backups using the %b format string. This is why i have two separate scripts.
I tag the backup with a name ‘fullcopy’, which later helps us with applying the incremental backup to this tag.

Now perform some updates in the database. (This simulates database changes, after which we can run an incremental backup).

#!/bin/bash
sqlplus hr/hr@rk01 <<EOC
set echo on;
select employee_id,first_name,last_name from
employees
where
employee_id = 194
/
update employees set last_name=’Adams’
where
employee_id=194
/
select employee_id,first_name,last_name from
employees
where
employee_id = 194
/
EOC
I updated the last_name for the employee number 194.

Now perform an incremental level 1 backup of the database

#!/bin/bash
rman target=”sys/manager@rk01″ log=/u01/oraback/incr.log << EOF
RUN {
ALLOCATE CHANNEL ch1 TYPE
DISK FORMAT ‘/u01/oraback/%U’;
BACKUP
INCREMENTAL LEVEL 1
FOR RECOVER OF COPY WITH TAG ‘fullcopy’
DATABASE;
RELEASE CHANNEL ch1;
}
EXIT;
EOF

Now Apply the incremental backup to the original level 0 image copy

#!/bin/bash
rman target=”sys/manager@rk01″ log=/u01/oraback/incrapply.log << EOF
RUN {
ALLOCATE CHANNEL ch1 TYPE
DISK FORMAT ‘/u01/oraback/%U’;
RECOVER COPY OF DATABASE WITH TAG ‘fullcopy’;
RELEASE CHANNEL ch1;
}
EXIT;
EOF
Run the following commands to capture the location of the datafiles for the database
SQL> select file_name from dba_data_files;
FILE_NAME
——————————————————————————–
/u01/oradata/rk01/users01.dbf
/u01/oradata/rk01/undotbs01.dbf
/u01/oradata/rk01/sysaux01.dbf
/u01/oradata/rk01/system01.dbf
/u01/oradata/rk01/example01.dbf

Now switch the database to run from the copy

Shutdown and startup and mount the database; (Do not open the database)
oracle@rramads-us2:~$ rman target /
Recovery Manager: Release 11.2.0.2.0 – Production on Sat Nov 6 17:44:40 2010
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
connected to target database: RK01 (DBID=2035601811, not open)
RMAN> switch database to copy;
using target database control file instead of recovery catalog
datafile 1 switched to datafile copy “/u01/oraback/system01.dbf”
datafile 2 switched to datafile copy “/u01/oraback/sysaux01.dbf”
datafile 3 switched to datafile copy “/u01/oraback/undotbs01.dbf”
datafile 4 switched to datafile copy “/u01/oraback/users01.dbf”
datafile 5 switched to datafile copy “/u01/oraback/example01.dbf”
RMAN> recover database;
Starting recover at 06-NOV-10
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=133 device type=DISK
channel ORA_DISK_1: starting incremental datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
destination for restore of datafile 00001: /u01/oraback/system01.dbf
destination for restore of datafile 00002: /u01/oraback/sysaux01.dbf
destination for restore of datafile 00003: /u01/oraback/undotbs01.dbf
destination for restore of datafile 00004: /u01/oraback/users01.dbf
destination for restore of datafile 00005: /u01/oraback/example01.dbf
channel ORA_DISK_1: reading from backup piece /u01/oraback/14lsbd4b_1_1
channel ORA_DISK_1: piece handle=/u01/oraback/14lsbd4b_1_1 tag=FULLCOPY
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 06-NOV-10
RMAN> alter database open;
database opened
Now if you query the file_names they will point to the /u01/oraback directory

SQL> select file_name from dba_data_files;
FILE_NAME
——————————————————————————–
/u01/oraback/users01.dbf
/u01/oraback/undotbs01.dbf
/u01/oraback/sysaux01.dbf
/u01/oraback/system01.dbf
/u01/oraback/example01.dbf
Check if the changes we made to the hr.employees table are still there (To ensure that the incremental backup did work).
SQL> select last_name from hr.employees where employee_id = 194;
LAST_NAME
————————-
Adams

Transportable Tablespace from rman backup

Most of oracle’s MAA documentation for transportable tablespaces, seems to recommend that one should put the tablespaces one is transporting, in read-only mode in the source database, before copying the datafiles to the target. This in most cases means application downtime.

In order to minimize the downtime the recommendations seem to be

  • Create a dataguard physical standby database and use this standby database as the source for transport
  • Create a duplicate (aka clone) of your source and use this new duplicate as the source for transport

If you are only transporting a subset of your tablespaces and you want to minimize your downtime, a 3rd good option is to create and use transportable tablespace sets from your already existing rman backups. This process is documented in the Backup and Recovery users guide, Chapter 26 (11gR2 manual).

Since this process uses an existing rman backup you incur no downtime (ie no need to place tablespaces in read-only mode) on your production systems.

Below are the steps to accomplish this (In my example rk01 is the source database and rk02 is the target database).

  • First off, your source database should be running in archivelog mode
  • Take a full database backup from your source database

export ORACLE_SID=rk01
rman target /

run {
allocate channel oem_backup_disk1 type disk format ‘/u01/orarch/rk01/backup/%U’;
backup as BACKUPSET tag ‘%TAG’ database;
backup as BACKUPSET tag ‘%TAG’ archivelog all not backed up;
release channel oem_backup_disk1;
}
run {
allocate channel oem_backup_disk1 type disk format ‘/u01/orarch/rk01/backup/%U’ maxpiecesize 1000 G;
backup as BACKUPSET tag ‘%TAG’ current controlfile;
release channel oem_backup_disk1;
}

exit;

  • Create a transportable tablespace set for the tablespaces you need to transport

export ORACLE_SID=rk01

rman target /

RMAN>  transport tablespace example

2> tablespace destination ‘/u01/orarch/rk01/datafile’

3> auxiliary destination ‘/u01/orarch/rk01/tmp’;

Once the process is complete rman leaves a copy of the datafile (An operating system file, not a backup set file) for the tablespace example , in the directory /u01/orarch/rk01/datafile. It also leaves a export dump file that has the metadata needed for the transport in the same directory /u01/orarch/rk01/datafile.

  • Do endianness conversions on the files, if you need to go cross platform (Use rman convert)
  • Attach the tablespace to your target database

export ORACLE_SID=rk02

sqlplus / as sysdba

create directory tts_dir as ‘/u01/orarch/rk01/datafile’

/

grant all on directory tts_dir to public

/

Exit;

Before you run the next import, make sure that you have created the schema’s (with appropriate privileges) that are in the tablespace you are transporting in the target database rk02, also make sure any roles that are required are created in the target. Eg: The schema HR has objects in the example tablespace. use the create user command to create the HR user with appropriate privileges in the database rk02.

impdp system/manager dumpfile=dmpfile.dmp directory=tts_dir transport_datafiles=/u01/oradata/rk02/example01.dbf logfile=tts_import.log

So As you can see the whole process is executed without shutting down the source database rk01.

Cloud computing definition.

The National Institute of Standards and technology has a good, concise  definition of cloud computing. Sushil kumar of Oracle, was using the same language to define cloud computing in an article the current release of the oracle magazine.

Essential Charachteristics

  • On Demand Self-Service
  • Broad Network Access
  • Resource Pooling
  • Rapid Elasticity
  • Measured Service

Service Models

  • Cloud Software as a Service (SaaS)
  • Cloud Platform as a Service (PaaS)
  • Cloud Infrastructure as a Service (IaaS)

Deployment Models

  • Private Cloud
  • Community Cloud
  • Public Cloud
  • Hybrid Cloud

Grid computing sessions at Oracle Openworld 2009

If you are attending Oracle Openworld 2009, and are interested in learning a lot about oracle Rac and Grid computing, you can find a full list of Oracle Rac and Grid computing events Here (Starting at page 2 of the pdf doc).

In preparation for the event, you could read the following new 11gR2 white papers from Oracle, to understand the latest developments and arm yourself with questions.

Oracle Real Application Clusters 11g Release 2 Technical Overview

Oracle Real Application Clusters 11g Release 2 Overview of SCAN

Oracle Real Application Clusters One Node 11g Release 2 Technical Overview

11gR2 rac installation on 64 bit Linux step by step

Yesterday i completed a 11g Release 2 real application clusters installation on 64 bit Oracle Enterprise Linux 4. The installation process is very similar to the 10g and 11gr1 installations, but much simpler. This was a two node cluster. There are some new concepts that are introduced in 11gR2 real application clusters. Below are some of my notes on 11gr2 new features for Rac and detailed steps that i followed to complete the installation.

Some new concepts in 11gR2 Rac


Oracle clusterware and ASM now are installed into the Same Oracle Home, and is now called the grid infrastructure install.

Raw devices are no longer supported for use for anything (Read oracle cluster registry, voting disk, asm disks), for new installs.

OCR and Voting disk can now be stored in ASM, or a certified cluster file system.

The redundancy level of your ASM diskgroup (That you choose to place voting disk on) determines the number of voting disks you can have.
You can place

  • Only One voting disk on an ASM diskgroup configured as external redundancy
  • Only Three voting disks on an ASM diskgroup configured as normal redundancy
  • Only Five voting disks on an ASM diskgroup configured as high redundancy


The contents of the voting disks are automatically backed up into the OCR

ACFS (Asm cluster file system) is only supported on Oracle Enterprise Linux 5 (And RHEL5), not on OEL4.

There is a new service called cluster time synchronization service that can keep the clocks on all the servers in the cluster synchronized (In case you dont have network time protocol (ntp) configured)

Single Client Access Name (SCAN), is a hostname in the DNS server that will resolve to 3 (or at least one) ip addresses in your public network. This hostname is to be used by client applications to connect to the database (As opposed to the vip hostnames you were using in 10g and 11gr1). SCAN provides location independence to the client connections connecting to the database. SCAN makes node additions and removals transparent to the client application (meaning you dont have to edit your tnsnames.ora entries every time you add or remove a node from the cluster).

Oracle Grid Naming Service (GNS), provides a mechanism to make the allocation and removal of VIP addresses a dynamic process (Using dynamic Ip addresses).

Intelligent Platform Management Interface (IPMI) integration, provides a new mechanism to fence server’s in the cluster, when the server is not responding.

The installer can now check the O/S requirements, report on the requirements that are not met, and give you fixup scripts to fix some of them (like setting kernel parameters).

The installer can also help you setup SSH between the cluster nodes.

There is a new deinstall utility that cleans up a existing or failed install.

And the list goes on an on.

I have broken up the installation process into 3 distinct documents, which can be found below

Installing 11gr2 grid infrastructure

Installing 11gr2 Real Application Clusters

Creating the 11gr2 Clustered database

Rac Starter Kit

Oracle has for a while had a rac assurance team (A team within oracle support, in the HA/Rac support team) that engages proactively with new rac customers. The Rac Assurance team used to provide the new customers with a “starter kit” of documents that  include

  1. A Rac best practices document
  2. A Step by Step installation guide
  3. Recommended patches
  4. A test plan.

If the customer follows these best practices, it sets them up with a solid foundation to be successful with their new  Rac implementation.

Now these test kits are public, you can access them by accessing metlink note 810394.1

You can also log a tar in metalink and ask support for the “Rac Starter Kit”, and they will give you the platform specific starter kit that includes the list of recommended patches.

Rac how to determine interconnect speed

During a Recent  Oracle 11g Rac installation on Solaris, i ran into  the following issue. After installing and configuring oracle clusterware, when we were trying to create the ASM instance, the ASM instance would only stay alive on one node of the cluster. The customer had configured the private interconnect to be a 100 base T connection (As opposed to GiGE). Once the customer re-configured the interconnect to be a GiGE, the ASM instance came up properly. Oracle recommends that you have a GiGE connection for your private interconnect.

Before starting your installation you can check if the interface you are using for the private interconnect, is configured to be a GIGE connection.

On Redhat or Oracle Enterprise Linux

Install the rpm ethtool

ethtool <interfacename> | grep Speed ,will give you the speed of the interface

On Solaris



kstat <interfacename> | grep link_speed  ,will give you the speed of the interface

ORA-00845 Memory_Target Not supported on this system

I was working on testing some 11g streams configurations today. I needed to startup 3 databases instances on the same server. I was using AMM (Automatic memory management). When i was trying to startup the 3rd database, i kept getting the error message “ORA-00845: MEMORY_TARGET not supported on this system”. I also had error messages in the alert log.

This is because, the space allocated for /dev/shm is not sufficient to allocate the SGA+PGA’s for all the 3 database instances (When using the initialization parameter memory_target). The space allocated needs to be >= the total SGA+PGA size of all the 3 instances together.

You can increase the space allocated using the command  “mount -t tmpfs shmfs -o size=2000m /dev/shm” ( I had 3 instances 600mb each SGA+PGA). You can persist this allocation across reboots by adding it to the /etc/fstab.