Managing Processes and Memory Consumption on Oracle Exadata Database Machines

This post is equally applicable to RAC databases deployed on Oracle database machines as well as on traditional server/storage architectures, running RHEL5 or OEL5 (OEL6 has a new feature called Transparent HugePages).

The white paper Best Practices for Database Consolidation on Exadata Database Machine outlines a number of good guidelines. However, these best practices are occasionally overlooked, in some cases resulting in node evictions or restarts in the cluster.

Excessive memory and CPU consumption can lead to serious problems (for example, node evictions). It is therefore important to configure the operating system, and the databases you deploy on the machine, to use the available CPU and memory resources optimally.

Let us first review the memory and CPU resources available on each database node of an Oracle Exadata Database Machine.

(Table not available: physical memory and CPU resources per database node, by Exadata model.)

Any time a workload deployed on a database server exceeds the operating system's ability to efficiently use the available CPU and memory resources mentioned above, bad things can happen. (I know I am oversimplifying, but you get the idea.)

Managing Memory Resources

Setting up HugePages

First, we need to make sure that HugePages is set up correctly on each node for the databases deployed there.
Tanel Poder has explained the reasons for setting up HugePages on Exadata systems.
When a database machine is installed, HugePages is configured to accommodate the one database that is pre-installed. So if you change the SGA setting for that database, or deploy more databases, you should adjust the HugePages setting accordingly. It goes without saying that if you have already configured HugePages to accommodate all your databases, and you then remove some databases from the node or resize their SGAs to be smaller, you should resize HugePages again to free up the excess pages.

Once all the database instances are configured and started up, you can run the script hugepages_settings.sh from MOS Note 401749.1, which calculates the number of HugePages you need to set up. I usually recommend adding 10% to the value that hugepages_settings.sh suggests.
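A run might look like this (all instances, including ASM, should be up; the recommended page count below is made up for illustration):

./hugepages_settings.sh
Recommended setting: vm.nr_hugepages = 11520

Adding 10% headroom to that illustrative value gives roughly 12672 pages.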

With 11.2.0.2 and 11.2.0.3 you should also set USE_LARGE_PAGES, preferably to ONLY, following the instructions in MOS Note 1392497.1.
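Since USE_LARGE_PAGES is a static parameter, a minimal sketch of setting it (SID='*' applies it to every RAC instance; it takes effect at the next restart) would be:

ALTER SYSTEM SET use_large_pages='ONLY' SCOPE=SPFILE SID='*';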

You can then follow the instructions in MOS Note 361323.1, under the section “Kernel version 2.6”, to set the value. You need to restart the databases so their SGAs are allocated from HugePages.

You should also follow the instructions in MOS Note 1546861.1 and set vm.min_free_kbytes = 524288 in /etc/sysctl.conf, to avoid page allocation failure messages when kernel memory runs low.
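Putting the two settings together, the relevant /etc/sysctl.conf entries would look something like this (the HugePages count is illustrative; use your own hugepages_settings.sh result plus headroom):

vm.nr_hugepages = 12672
vm.min_free_kbytes = 524288

Apply them with sysctl -p, and after restarting the databases verify the allocation with grep Huge /proc/meminfo.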

Managing Memory Usage

Ideally, for critical implementations, the memory allocated to the SGAs, the PGAs, and the individual server processes should not exceed 75% of the physical memory on the database node.

The Exadata consolidation white paper mentioned above suggests the following:

For critical Hardware Pools, we recommend an even more conservative approach by not exceeding 75% physical memory per database node.
OLTP applications:
SUM of databases (SGA_TARGET + PGA_AGGREGATE_TARGET) + 4 MB * (Maximum PROCESSES) < Physical Memory per Database Node
DW/BI applications:
SUM of databases (SGA_TARGET + 3 * PGA_AGGREGATE_TARGET) < Physical Memory per Database Node
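To make the OLTP formula concrete with made-up numbers: on a node with 96 GB of physical memory, two databases each running with SGA_TARGET=20G, PGA_AGGREGATE_TARGET=5G, and PROCESSES=1000 add up to 2 * (20 + 5) GB + 4 MB * 2000 = roughly 58 GB, which stays comfortably under the 72 GB (75%) budget recommended for critical Hardware Pools.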

You can monitor PGA usage in a few different ways.

1) SELECT s.inst_id, s.sid, s.serial#, p.spid, s.username, s.program, p.pga_used_mem
   FROM gv$session s
   JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id
   WHERE s.type != 'BACKGROUND'
   AND s.program NOT LIKE '%(P%'
   AND p.pga_used_mem > <APPLICATION_MEMORY_THRESHOLD>
   ORDER BY s.inst_id, s.sid, s.serial#;

The value you use for APPLICATION_MEMORY_THRESHOLD depends on your application and how much PGA it needs to run your queries efficiently (performing sorts in PGA memory is usually faster than spilling to the TEMP tablespace). The white paper suggests using 1 GB for OLTP applications and 10 GB for DSS applications.

2) You can monitor the system metric “Total PGA Allocated” from dba_hist_sysmetric_summary to see how much PGA is being allocated over time.
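A rough sketch of such a query (the metric is reported in bytes in dba_hist_sysmetric_summary):

SELECT snap_id, instance_number,
       ROUND(average/1024/1024) avg_pga_mb,
       ROUND(maxval/1024/1024) max_pga_mb
FROM dba_hist_sysmetric_summary
WHERE metric_name = 'Total PGA Allocated'
ORDER BY snap_id, instance_number;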

3) You can use the “PGA Memory Advisory” and “SGA Target Advisory” sections of the AWR report to gauge the impact of increasing or decreasing the SGA and PGA allocations.

4) You can monitor memory usage at the OS level using the top (the Mem: line), free, or vmstat (the si/so columns show pages swapped in and out) commands.
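For example:

free -g      # free/used physical memory, in gigabytes
vmstat 5 5   # watch the si/so columns for sustained swap activity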

Managing O/S Processes (and, as a Consequence, CPU Usage)

It is important to control the number of processes that are actively executing on each node, since each of these processes consumes CPU and memory resources.

The following guidelines are important to follow.
– Use application connection pooling to appropriately limit the number of sessions connected to the Oracle database.

This is important for two reasons:
– It right-sizes the resource consumption on each database node.
– It minimizes the impact of connection storms.

– Configure the parallel query servers appropriately.
– Across all the databases deployed on a node, the total of the PARALLEL_MAX_SERVERS settings should be less than or equal to the following:

X2-2 or X3-2, <= 240
X2-8 or X3-8, <= 1280
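To check the current settings, you can run something along these lines in each database on the node, and add up the results:

SELECT inst_id, value AS parallel_max_servers
FROM gv$parameter
WHERE name = 'parallel_max_servers';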

I often see implementations where parallel_max_servers = 240 (or some similarly high value) in each instance on the machine, with unpleasant consequences. I have occasionally seen systems with sustained 100% CPU usage and load averages above 300, resulting in node reboots.

– Use instance caging and Database Resource Manager to manage the CPU allocation on the server.
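A minimal instance caging sketch (the CPU count of 4 is illustrative; setting cpu_count together with any enabled resource manager plan, even DEFAULT_PLAN, turns caging on):

ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN' SID='*';
ALTER SYSTEM SET cpu_count = 4 SID='*';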

It is always a good idea to review the Best Practices for Database Consolidation on Exadata Database Machine white paper and follow its guidelines, to maintain a well-performing and stable database environment.

It is also a good idea to review the performance management screens in Enterprise Manager 12c every day to maintain a good understanding of the database and server workloads, and/or to analyze the OSWatcher data to understand the server workload profile.

11gR2 RAC installation on 64-bit Linux, step by step

Yesterday I completed an 11g Release 2 Real Application Clusters installation on 64-bit Oracle Enterprise Linux 4. The installation process is very similar to the 10g and 11gR1 installations, but much simpler. This was a two-node cluster. 11gR2 Real Application Clusters introduces some new concepts. Below are some of my notes on the 11gR2 RAC new features, and the detailed steps I followed to complete the installation.

Some new concepts in 11gR2 RAC


Oracle Clusterware and ASM are now installed into the same Oracle Home, in what is called the Grid Infrastructure install.

Raw devices are no longer supported for anything (read: Oracle Cluster Registry, voting disks, ASM disks) in new installs.

The OCR and voting disks can now be stored in ASM, or on a certified cluster file system.

The redundancy level of the ASM diskgroup on which you choose to place the voting disks determines the number of voting disks you can have.
You can place

  • Only one voting disk on an ASM diskgroup configured with external redundancy
  • Only three voting disks on an ASM diskgroup configured with normal redundancy
  • Only five voting disks on an ASM diskgroup configured with high redundancy
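To see where the voting disks currently live, you can run the following as the Grid Infrastructure owner; it lists each voting disk and the diskgroup it belongs to:

crsctl query css votedisk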


The contents of the voting disks are automatically backed up into the OCR.

ACFS (ASM Cluster File System) is supported only on Oracle Enterprise Linux 5 (and RHEL5), not on OEL4.

There is a new service called Cluster Time Synchronization Service that can keep the clocks on all the servers in the cluster synchronized (in case you don't have Network Time Protocol (NTP) configured).

Single Client Access Name (SCAN) is a hostname in the DNS server that resolves to 3 (or at least one) IP addresses on your public network. This hostname is to be used by client applications to connect to the database (as opposed to the VIP hostnames you were using in 10g and 11gR1). SCAN provides location independence to the client connections connecting to the database, and makes node additions and removals transparent to the client application (meaning you don't have to edit your tnsnames.ora entries every time you add or remove a node from the cluster).
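For illustration, a client tnsnames.ora entry pointing at a hypothetical SCAN name (the hostname and service name below are made up) could look like this:

MYDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = cluster-scan.example.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = mydb.example.com)
    )
  )

Because the entry references only the SCAN, it remains valid as nodes are added to or removed from the cluster.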

Oracle Grid Naming Service (GNS) provides a mechanism to make the allocation and removal of VIP addresses a dynamic process (using dynamic IP addresses).

Intelligent Platform Management Interface (IPMI) integration provides a new mechanism to fence servers in the cluster when a server stops responding.

The installer can now check the O/S requirements, report on the requirements that are not met, and give you fixup scripts to fix some of them (like setting kernel parameters).

The installer can also help you set up SSH between the cluster nodes.

There is a new deinstall utility that cleans up an existing or failed install.

And the list goes on and on.

I have broken up the installation process into 3 distinct documents, which can be found below:

Installing 11gr2 grid infrastructure

Installing 11gr2 Real Application Clusters

Creating the 11gr2 Clustered database

11g RAC Installation Checklists

In my numerous engagements with customers, assisting them with Oracle RAC installations, I have found that providing the customer with a checklist of tasks to be performed before the installation, and reviewing it with their database, server, storage, and network administrators, helps in a smooth RAC installation. The checklist only lists the tasks; it does not go into step-by-step detail on how to perform them. For the details, I point customers to the Oracle installation and configuration guides (links to which can be found in the checklist itself) and the relevant Metalink notes.

Below are links to the checklists I use for 11g installations on Linux and Solaris. All pointers to errors and improvements are welcome.

Oracle Real Application Clusters, 11g Installation Checklist for Redhat Enterprise Linux 5 and Oracle Enterprise Linux 5

Oracle Real Application Clusters, 11g Installation Checklist for Solaris

RAC: how to determine interconnect speed

During a recent Oracle 11g RAC installation on Solaris, I ran into the following issue. After installing and configuring Oracle Clusterware, when we tried to create the ASM instance, it would stay alive on only one node of the cluster. The customer had configured the private interconnect as a 100BASE-T connection (as opposed to GigE). Once the customer reconfigured the interconnect as GigE, the ASM instance came up properly. Oracle recommends a GigE connection for the private interconnect.

Before starting your installation, you can check whether the interface you are using for the private interconnect is configured as a GigE connection.

On Red Hat or Oracle Enterprise Linux

Install the ethtool rpm. Then:

ethtool <interfacename> | grep Speed

will give you the speed of the interface.
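For example, on a hypothetical interface eth1, output along these lines indicates a GigE link:

ethtool eth1 | grep Speed
        Speed: 1000Mb/s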

On Solaris

kstat <interfacename> | grep link_speed

will give you the speed of the interface.
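For example, against a hypothetical e1000g interface, you would look for a link_speed value of 1000 (i.e. GigE):

kstat e1000g | grep link_speed
        link_speed                      1000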