Script to compare tkprof output files

I often use the oracle 10046 event tracing mechanism to capture sql’s from a session to identify why certain transactions are running slower in different env’s or at different points in time. Oracle does have a mechanism where you can save the trace information in database tables. One can use the INSERT parameter in tkprof to store the trace information into a database table. Once in the table you can write sql’s that compare multiple runs or multiple statements.

I wrote a python program that compares two different tkprof output files. The files are compared, and the following aspects of each of the sqlid’s in the tkprof output file,s are printed side by side. The output is sorted by the Difference in Elapsed Time, in Descending order.

  • Sql text
  • Plan Hash Value
  • Total Elapsed time
  • Total Logical Reads
  • Total Rows processed
  • Difference in Elapsed Time
  • Difference in Number of Rows processed
  • Difference in Logical reads.

Other columns can be added to this, if you desire.
I use this script output as a quick way to see which sql’s are running slower and are probably candidates for further analysis/tuning.

The sqlid’s from the file provided as the first argument to the script (referred to as the left file) are compared to the same sqlid’s in the file provided as the second argument to the script (referred to as the right file). The following columns are displayed.

sqlid                           sqlid being compared
text                             First 20 chars of the sql text
lplan                           Plan hash value from the left file
rplan                          Plan hash value from the right file
lela                             Total Elapsed time from the left file
rela                            Total Elapsed time from the right file
llreads                       Total Logical reads (query+current) from the left file
rlreads                      Total Logical reads (query+current) from the right file
lrows                         Total rows processed from the left file
rrows                        Total rows processed from the right file
eladiff                        Lela – Rela
rowsdiff                    Lrows – Rrows
lreadsdiff                  Llreads – rlreads

Here is a sample syntax for running the script. (You need the python pandas package to be installed for this to execute successfully)

python ./difftk.py /u01/tkprofout/Newplans.prf /u01/tkprofout/Stage.prf

Here is a sample output

difftk

Click on the image to view a larger version.

The full script is below

#Python script to list differences between sql executions in two tkprof output files
#useful if comparing tkprof from prod and dev for example
#Author : rajeev.ramdas

import sys
import os
import pandas as pd
from pandas import DataFrame

# Define a class to hold the counters for each sqlid
class sqliddet:
     def init(self):
            sqlid=''
            text=''
            plan_hash=''
            tcount=0
            tcpu=0
            tela=0
            tdisk=0
            tquery=0
            tcurr=0
            trows=0

# Define 2 empty dictionaries to store info about each input file
leftsqliddict={}
rightsqliddict={}

# Process each file and add one row per sqlid to the dictionary
# We want to add the row to the dictionary only after the SQLID row and the total row has been read
# So the firstsqlid flag is used to make sure that we do not insert before total is read for the first row.

def processfile(p_file,p_sqliddict):

    myfile=open(p_file,"r")
    line=myfile.readline()
    firstsqlid=True
    linespastsqlid=99
    while line:
        linespastsqlid+=1
        line=myfile.readline()
        if line.startswith('SQL ID'):
            linespastsqlid=0
            if firstsqlid==True:
                firstsqlid=False
            else:
                p_sqliddict[currsqlid.sqlid]=[currsqlid.plan_hash,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery
                            ,currsqlid.tcurr,currsqlid.trows,currsqlid.text]
            currsqlid=sqliddet()
            currsqlid.sqlid=line.split()[2]
            currsqlid.plan_hash=line.split()[5]
        if linespastsqlid==2:
            currsqlid.text=line[0:20]
        if line.startswith('total'):
            a,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery,currsqlid.tcurr,currsqlid.trows=line.split()
        if line.startswith('OVERALL'):
            p_sqliddict[currsqlid.sqlid]=[currsqlid.plan_hash,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery
                       ,currsqlid.tcurr,currsqlid.trows,currsqlid.text]
        continue
    myfile.close()

# Main portion of script
if len(sys.argv) != 3:
   print('Syntax : python ./difftk.py tkprof1.out tkprof2.out')
   sys.exit()

if not os.path.isfile(sys.argv[1]) or not os.path.isfile(sys.argv[2]):
   print ("File Does not Exist")
   sys.exit()

processfile(sys.argv[1],leftsqliddict)
processfile(sys.argv[2],rightsqliddict)

t_difftk_lst=[]

# Match the sqlid's from the file on the left to the file on the right
# Gather up the statistics form both sides, insert into a list
# Transfer the list to a pandas dataframe, add some computed columns

for sqlid,stats in leftsqliddict.items():
    l_totlogical=int(stats[5])+int(stats[6])
    if sqlid in rightsqliddict:
       t_difftk_lst.append([sqlid,stats[8].rstrip(),stats[0],rightsqliddict[sqlid][0],float(stats[3])
                            ,float(rightsqliddict[sqlid][3]),float(l_totlogical),float(rightsqliddict[sqlid][5])+float(rightsqliddict[sqlid][6])
                            ,float(stats[7]),float(rightsqliddict[sqlid][7])
                           ])
    else:
       t_difftk_lst.append([sqlid,stats[8].rstrip(),stats[0],0,float(stats[3]),0
                            ,float(l_totlogical),0,float(stats[7]),0
                           ])

difftk_df=DataFrame(t_difftk_lst,columns=['sqlid','sqltext','lplan','rplan','lela','rela','llreads','rlreads','lrows','rrows'])
difftk_df['eladiff']=difftk_df['lela']-difftk_df['rela']
difftk_df['rowsdiff']=difftk_df['lrows']-difftk_df['rrows']
difftk_df['lreadsdiff']=difftk_df['llreads']-difftk_df['rlreads']

pd.set_option('display.width',1000)
print (difftk_df.sort(columns='eladiff',ascending=False))

Using Pandas for CSV Analysis

As Oracle Dba’s we often come across situations where we are handed CSV (Comma separated values) files, by our managers, or customers, as Raw data, based on which we need to do some work. The first task would be to analyze the file and come up with some summary satistics, so we can quantify the amount of work involved.

When faced with such circumstances, my favorite method is to use sqlloader to upload the file into a database table, and then run sql statements on the data to produce summary info. Most people would probably use excel, formulas, macros and pivot tables to achieve similar results.

In this blog post, i present an alternate method that i’ve been using recently, for csv file summarization.

Pandas is a library written for the Python language for data manipulation and analysis. In order to proceed, first install Python and then install the Python package named ‘pandas’. Pandas is a real good alternative to the R programming language.See my previous post on how to install python and pandas.

For the examples in this post, i am using a Csv file, that has NFL game, play by play statistics for 2014.

Start by invoking the python interactive interpreter.

 

     $ python3
Python 3.4.2 (default, Dec 18 2014, 14:18:16) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.

First import the following libraries that we are going to use.

     
import pandas as pd
import numpy as np

Read the csv file into a Pandas DataFrame

 

df=pd.DataFrame(pd.read_csv('pbp-2014.csv',header=0))

Check how many rows the dataframe has

 
df.size
2056275

List the columns in the DataFrame

 

list(df)
['GameId', 'GameDate', 'Quarter', 'Minute', 'Second', 'OffenseTeam', 'DefenseTeam', 'Down', 'ToGo', 'YardLine', 'Unnamed: 10', 'SeriesFirstDown', 'Unnamed: 12', 'NextScore', 'Description', 'TeamWin', 'Unnamed: 16', 'Unnamed: 17', 'SeasonYear', 'Yards', 'Formation', 'PlayType', 'IsRush', 'IsPass', 'IsIncomplete', 'IsTouchdown', 'PassType', 'IsSack', 'IsChallenge', 'IsChallengeReversed', 'Challenger', 'IsMeasurement', 'IsInterception', 'IsFumble', 'IsPenalty', 'IsTwoPointConversion', 'IsTwoPointConversionSuccessful', 'RushDirection', 'YardLineFixed', 'YardLineDirection', 'IsPenaltyAccepted', 'PenaltyTeam', 'IsNoPlay', 'PenaltyType', 'PenaltyYards']

Check how many games the dataset covers. The GameId column is a unique identifier that identifies each game. The nunique method returns the number of unique elements in that object.

 

df.GameId.nunique()
256

List details of all games played by the New England Patriots. This command shows how you can provide a filter condition. The filter specifies that all the rows, where the OffensiveTeam is NE or DefensiveTeam is NE be listed.

 

df[(df['OffenseTeam'] == 'NE') | (df['DefenseTeam'] == 'NE')]

Subset a specific set of columns

 

df[['GameId','PlayType','Yards']]

           GameId     PlayType  Yards
0      2014090400     KICK OFF      0
1      2014090400         RUSH      6
2      2014090400         RUSH      3
3      2014090400         RUSH     15
4      2014090400         RUSH      2
5      2014090400         PASS     -2
...
...

Display all Pass and Rush plays Executed by New England. Here we are using a and filter condition to limit the rows to those of New England and the PlayType is either a PASS or a RUSH.

 

df[['GameId','PlayType','Yards']][(df['OffenseTeam'] == 'NE') & (df['PlayType'].isin(['PASS','RUSH']))]   
           GameId PlayType  Yards
1092   2014090705     RUSH      2
1093   2014090705     PASS      4
1094   2014090705     PASS      0
1102   2014090705     PASS      8
1103   2014090705     PASS      8
1104   2014090705     RUSH      4
     

Display the Number of Plays, Total Yards Gained, and Average Yards gained per PASS and RUSH play, per game.

 

df[['GameId','PlayType','Yards']][(df['OffenseTeam'] == 'NE') & (df['PlayType'].isin(['PASS','RUSH']))].groupby(['GameId','PlayType']).agg({'Yards': [np.sum,np.mean],'GameId':[np.size]}) 
                    Yards           GameId
                      sum      mean   size
GameId     PlayType                       
2014090705 PASS       277  4.540984     61
           RUSH       109  5.190476     21
2014091404 PASS       209  8.038462     26
           RUSH       158  4.157895     38
2014092105 PASS       259  6.641026     39
           RUSH        91  2.935484     31
2014092900 PASS       307  9.903226     31
           RUSH        75  4.687500     16
2014100512 PASS       301  7.921053     38
           RUSH       223  5.309524     42
2014101201 PASS       407  9.465116     43
           RUSH        60  2.307692     26
2014101600 PASS       267  6.675000     40
           RUSH        65  4.062500     16
2014102605 PASS       376  9.400000     40
           RUSH       121  3.781250     32
2014110208 PASS       354  6.210526     57
           RUSH        66  2.640000     25
2014111611 PASS       267  8.343750     32
           RUSH       248  6.048780     41
2014112306 PASS       393  6.894737     57
           RUSH        90  4.285714     21
2014113010 PASS       245  6.621622     37
           RUSH        85  5.000000     17
2014120713 PASS       317  6.604167     48
           RUSH        80  3.333333     24
2014121408 PASS       287  7.972222     36
           RUSH        92  3.407407     27
2014122105 PASS       189  5.250000     36
           RUSH        78  4.333333     18
2014122807 PASS       188  5.222222     36
           RUSH        92  4.181818     22


From the above example’s you can see how easy it is to read a csv file, apply filters and summarize the data set using pandas.

Oracle Database In-Memory an introduction Part 2 – What do i need to do, to use the Oracle In-Memory Database ?

Step 1) Define the INMEMORY_SIZE

Customer has to setup the correct value for a database initialization parameter , INMEMORY_SIZE. This parameter specifies the amount of memory, from the SGA, that is to be used for the In-Memory column store. This is a static pool (ie Automatic memory management cannot extend or shrink this area), which means that you have to restart the database if any changes to this parameter needs to take effect. The In-Memory area is sub-divided into two pools: a 1MB pool used to store the actual column formatted data populated into memory, and a 64K pool used to store metadata about the objects that are populated into the IM column store.

 

sho parameter inmemory_size

NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
inmemory_size			     big integer 500M

select pool,alloc_bytes,used_bytes,populate_status from v$inmemory_area;

POOL			   ALLOC_BYTES USED_BYTES POPULATE_STATUS
-------------------------- ----------- ---------- --------------------------
1MB POOL		     418381824		0 DONE
64KB POOL		     100663296		0 DONE

Step 2) Mark the performance critical objects in your database, with the attribute INMEMORY

select partition_name,bytes/(1024),inmemory,inmemory_compression from dba_segments where
owner = 'SH' and segment_name = 'SALES'

PARTITION_NAME		       BYTES/(1024) INMEMORY INMEMORY_COMPRESS
------------------------------ ------------ -------- -----------------
SALES_Q1_1998			       8192 DISABLED

ALTER TABLE SH.SALES MODIFY PARTITION SALES_Q1_1998 INMEMORY PRIORITY CRITICAL MEMCOMPRESS FOR QUERY HIGH;

Table altered.

select partition_name,bytes/(1024),inmemory,inmemory_compression from dba_segments where
owner = 'SH' and segment_name = 'SALES'  2  ;

PARTITION_NAME		       BYTES/(1024) INMEMORY INMEMORY_COMPRESS
------------------------------ ------------ -------- -----------------
SALES_Q1_1998			       8192 ENABLED  FOR QUERY HIGH


Step 3) Populate the In-Memory datastore

Objects are populated into the In-Memory Datastore, in a prioritized list, immediately after the database is opened, or after they are scanned for the first time. There are 7 levels for the keyword PRIORITY (CRITICAL, HIGH, MEDIUM, LOW, NONE).

The IM column store is populated by a set of background processes referred to as worker processes (ora_w001_orcl). The database is fully active / accessible while this occurs.Each worker process is given a subset of database blocks from the object to populate into the IM column store. Population is a streaming mechanism, simultaneously columnizing and compressing the data. There is a new IMCO (In memory co-ordinator) background process which wakes up every 2 minutes and checks to see if there are any population tasks that need to be completed. Eg: A new object has been marked as InMemory with a PRIORITY other than None.

select v.owner,v.segment_name,v.partition_name,v.bytes orig_size,v.inmemory_size in_mem_size

OWNER	   SEGMENT_NA PARTITION_NAME		      ORIG_SIZE IN_MEM_SIZE
---------- ---------- ------------------------------ ---------- -----------
SH	   SALES      SALES_Q1_1998			8388608     1179648

select * from 
(
select /*+ full(sales) */ channel_id,count(*)
from sh.sales partition (sales_q1_1998)
group by channel_id
order by count(*) desc
)
where rownum < 6

CHANNEL_ID   COUNT(*)
---------- ----------
	 3	32796
	 2	 6602
	 4	 3926
	 9	  363

Elapsed: 00:00:00.09

select * from table (dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
SQL_ID	40pjk921r3jrc, child number 0
-------------------------------------
select * from ( select /*+ full(sales) */ channel_id,count(*) from
sh.sales partition (sales_q1_1998) group by channel_id order by
count(*) desc ) where rownum < 6

Plan hash value: 2962696457

---------------------------------------------------------------------------------------------------------
| Id  | Operation			| Name	| Rows	| Bytes | Cost (%CPU)| Time	| Pstart| Pstop |
---------------------------------------------------------------------------------------------------------

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		|	|	|	|    12 (100)|		|	|	|
|*  1 |  COUNT STOPKEY			|	|	|	|	     |		|	|	|
|   2 |   VIEW				|	|     4 |   104 |    12  (34)| 00:00:01 |	|	|
|*  3 |    SORT ORDER BY STOPKEY	|	|     4 |    12 |    12  (34)| 00:00:01 |	|	|
|   4 |     HASH GROUP BY		|	|     4 |    12 |    12  (34)| 00:00:01 |	|	|
|   5 |      PARTITION RANGE SINGLE	|	| 43687 |   127K|     9  (12)| 00:00:01 |     5 |     5 |
|   6 |       TABLE ACCESS INMEMORY FULL| SALES | 43687 |   127K|     9  (12)| 00:00:01 |     5 |     5 |
---------------------------------------------------------------------------------------------------------

For much more in-depth technical details of the Oracle Database In-Memory, please see this whitepaper.

Oracle Database In-Memory an introduction Part 1 – What is Oracle Database In-Memory, and how is it different ?

On July 22nd 2014, Oracle corporation, announced Oracle Database 12c’s latest patch release 12.1.0.2. This latest patch release includes the new Oracle Database In-Memory functionality.
The Oracle Database In-Memory enables a single database to efficiently support mixed workloads. It uses a “dual-format” architecture, that retains the Record-Setting OLTP performance of the oracle databases, while simultaneously supporting real-time analytics and reporting. This is achieved by retaining the traditional oracle memory architecture, but adding a new purely in-memory column format (Automatically created and maintained by oracle), optimized for analytical processing.

So now you have the data stored in the Oracle database in your database files, in a row format, and for any of the objects marked as INMEMORY, oracle creates an In-Memory column store, where the data resides in a column format.The IM column store does not replace the buffer cache, but acts as a supplement, so that data can now be stored in memory in both a row and a column format.

Ok why the dual format, one would ask ?

The Row format is retained as is, so that there is no compromise/degradation in the OLTP performance of the database. In a Row format database each row is made up of multiple columns, with each column representing an attribute about that record. A column format database stores each of the attributes about the transaction in a separate column structure. A column format is ideal for Analytics, but is not very efficient in processing DML requests like insert, update and deletes (Which operates on the whole row). Oracle Database In-Memory (Database In-Memory) provides the best of both worlds by allowing data to be simultaneously populated in both an in-memory row format (the buffer cache) and a new in-memory column format (The In-Memory Store).

cncpt_vm_379

The picture above shows the In-Memory area in the SGA

cncpt_vm_378

The above picture shows an example of the Column Store.

No changes are necessary to your existing applications in order to take advantage of the Oracle Database In-Memory option. Any query that will benefit from the In-Memory column store will be automatically directed there, by the Optimizer. The In-Memory store is kept transactionally consistent with the buffer cache. There are numerous optimizations that have been implemented that speed up this data access in the In-Memory store. All the database functionality that Oracle has built over the last 30 years, continues to work in this new version.

It is normal that folks would go out and compare, Oracle Databaes In-Memory, with other In Memory Database products available in the Market today. So let us explore the differences with some of those products next.

TimesTen

Oracle TimesTen In-Memory Database is a,row oriented, memory optimized relational database, that resides entirely in the physical memory of a server. Oracle TimesTen In-Memory Database operates on databases that fit entirely in physical memory using standard SQL interfaces. In contrast, the Oracle Database In-Memory, only stores selected objects, in memory, in a column format (As opposed to a row format in Oracle TimesTen), the remaining objects continue to exist on the storage subsystem.

SAP HANA

SAP HANA is an in-memory, column-oriented relational database. So the entire database has to reside in physical memory of the server. As mentioned earlier in this article, OLTP transactions have some disadvantages while operating on column stores.

Todays databases can be in the 10’s or 100’s of TeraBytes. Storing this entire data in physical memory can be an expensive proposition, not to mention that, it is quite prevalent that ,only a small subset of this data, ends up having daily usage. This is where Oracle’s approach of storing only selected objects in memory, has significant benefits.

Here is a link to the Oracle Database In-Memory Launch.

Exadata Database Machine Specifications – Quick Reference

With the different combinations of the Oracle Exadata database machines and storage expansion racks available to customers, it is hard for me to remember, each system’s specifications.

So i have created a JavaScript page, that can be used to quickly lookup the specifications of the current generation (x3-8 and x4-2) of Exadata database machines and storage expansion racks.

The page can be accessed from a web browser from your desktop,laptop, or favorite mobile device at http://dbastreet.com/exaspecs

You can narrow the results down by the exadata rack size, type of disk, and type of specification (cpu, storage or performance).

Here is a screenshot

exaspecs

Setting up a Python 2.7.6 Virtual Env for Python development

Python is an excellent language to learn, for DBA’s who want to automate all the repetitive tasks, they need to perform. Once you start using python it is likely that you want to setup multiple environments with different versions of Python and libraries, based on the project you are working on. Virtualenv is a tool to create isolated python environments.

Below are the steps that i followed, to install a brand new working Python 2.7.6 environment with the following packages.

 
           SQLAlchemy    - Object Relational Mapper and SQL Toolkit for Python
           numpy         - Fundamental package for scientific computing
           matplotlib    - Python 2D plotting library
           ipython       - Interactive Python Shell
           pandas        - Python Data Analysis Library 
           Flask         - An easy to use Python Lightweight Micro Framework

This installation is performed on Ubuntu Linux, and i have already installed the libsqlite3-dev package and the oracle instant client.

Install Python 2.7.6

Download Python-2.7.6.tgz from http://www.python.org/getit

Install python 2.7.6 to your directory of choice.

		tar -xvf Python-2.7.6.tgz
		cd Python-2.7.6/

		./configure --prefix=/u01/Rk/Apps/Python/Python276
		make
		make install

Now you have python 2.7.6 installed into the /u01/Rk/Apps/Python/Python276 directory. (You will have a python binary in /u01/Rk/Apps/Python/Python276/bin)

Download virtualenv

curl -O https://pypi.python.org/packages/source/v/virtualenv/virtualenv-1.11.1.tar.gz

Install virtualenv

tar -xzvf virtualenv-1.11.1.tar.gz
cd virtualenv-1.11.1/
/u01/Rk/Apps/Python/Python276/bin/python virtualenv.py /u01/Rk/Apps/Python/p276env1

Activate the virtualenv

           . /u01/Rk/Apps/Python/p276env1/bin/activate

Install the additional Python Modules you need

	   pip install SQLAlchemy
	   pip install numpy
	   pip install matplotlib
	   pip install ipython
	   pip install pyzmq
	   pip install tornado
	   pip install jinja2
	   pip install pandas
	   pip install Flask
	   pip install Flask-SQLAlchemy

Install cx_Oracle

Ensure that the oracle instant client is installed, and the environment variables ORACLE_HOME and LD_LIBRARY_PATH are setup correctly.

Download cx_Oracle (a python extension module that allows access to oracle.) source from from http://cx-oracle.sourceforge.net/

tar -xzvf cx_Oracle-5.1.2.tar.gz
cd  cx_Oracle-5.1.2/
python setup.py install

Setup an alias (In your .bash_profile) to simplify invoking the virtualenv every time you want to use it.

alias p276env1='. /u01/Rk/Apps/Python/p276env1/bin/activate'

Now, anytime you want to execute a python program in this environment, you can invoke the Linux command line and

p276env1
python

OraChk Collection Manager

OraChk (Previously known as Raccheck) is a utility from oracle to perform configuration checks on Oracle database platforms, and report on configurations that do not match oracle’s best practices. OraChk has the ability to upload the results of its checks into an oracle database. Details about this utility can be found in Mos Note 1268927.1

Oracle has now released  OraChk Collection Manager which is a companion application to OraChk, which has an Oracle Application Express, Front End which can be used  as a dashboard, in which customers can track their ORAchk, RACcheck and Exachk collection data in one easy to use interface.

Details about downloading and using “OraChk Collection Manager” can be found in Mos Note 1602329.1

Exadata X4-2 Whats New – A Technical Review

 

On Dec 11 2013 Oracle corporation announced the general availability of, the 5th generation of the Oracle Exadata Database Machine X4. This new version of the database machine introduces new hardware and software to accelerate performance, increase capacity, and improve efficiency and quality of service for database deployments.

In this blog post i’ll review in detail the enhancements in the Oracle Exadata x4-2 database machine.

The 2 socket version of the Database machine, X4-2 gets new database servers and new storage servers. The 8 socket version of the database machine, still called the x3-8, gets the new storage servers but keeps the same database serves from the previous release. Hence the Lack of change in Name to the X4-8. The name stays as X3-8.

Improvements in the database servers (X4-2)

Cpu Improvements

The database servers in the x4-2 are the Sun X4-2  servers. They come with the 2 Twelve-Core Intel® Xeon® E5-2697 v2 Processors (Ivy Bridge) (2.7GHz), with turbo boost enabled by default. This cpu at times can clock upto 3.5Ghz.

Comparing this with the previous version x3-2, this is a 50% increase in number of cores per database server. The x3-2 had 16 cores per node and the x4-2 has 24 cores per node. In a x3-2 full rack there were 128 cores and in the x4-2 full rack there are 192 cores. That gives 64 more cores in the x4-2 full rack compared to the x3-2 full rack.

Memory Improvements

Each database server has 256Gb of Dram. This is optionally expandable to 512Gb per node.So you can have either 2Tb or 4Tb of Ram in a full rack of x4-2.

Storage Improvements

The database servers now have 4, 600Gb hard disks in them (Internal storage, used for O/S, Oracle Binaries, Logs etc). The x3-2 used to have 300Gb disks. The disk controller batteries are online replaceable

Network Improvements

The database servers have 2 X Infiniband 4 X QDR (40Gb/S) Ports (PCIe 3.0). Both ports are Active. The improvements in the x4-2 are that these are now PCIe 3.0, and that the ports are used active active. (The ports were used Active/Passive in the previous release)

Improvements in the storage servers (Exadata cells x4-2)

Cpu Improvements

Each x4-2 exadata cells come with 2, Six-Core Intel® Xeon® E5-2630 v2 Processors (2.6 GHz).

Memory Improvements

Each x4-2 cell, now has 96Gb of Ram. The x3-2 used to have 64 Gb of Ram.

Flash Cache Improvements

The Exadata x4-2 cells, now have the F80 PCIe Flash cards. There are 4, 800Gb F80 flash cards in each cell. So each cell has 3.2Tb of flash cache. In an Oracle Exadata database machine x4-2 full rack there is 44TB of flash cache (Used to be 22Tb in the x3-2). The Oracle Exadata database machine x4-2 full rack now provides, 2.66Million read iops , 1.96Million write iops from the flash cache. This is 70% more flash iops than the previous generation.

Storage Improvements

The Exadata cell x4-2 High Performance version, now uses 1.2Tb 10k Rpm High Performance SAS disks.

  • A full rack of High Performance disks gives 200TB of raw space, 90TB of usable space with normal mirroring and 60TB of usable space with High Mirroring

The Exadata cell x4-2 High Capacity version, now uses 4Tb 7.2k Rpm High Capacity SAS disks.

  • A full rack of High Capacity disks gives 672TB of raw space, 300TB of usable space with normal mirroring and 200TB of usable space with High Mirroring

The Disk controller batteries are online replacable.

Network Improvements

The Exadata storage servers have 2 X Infiniband 4 X QDR (40Gb/S) Ports (PCIe 3.0). Both ports are Active. The improvements in the x4-2 are that these are now PCIe 3.0, and that the ports are used active active. (The ports were used Active/Passive in the previous release)

 

Software Improvement Highlights

The new x4-2’s get a new version of the exadata storage server software version 11.2.3.3.0 (This version of the cell software can be installed on the v2,x2,x3 and x4 platforms).

Exadata Flash Cache Improvements

Data written to the flash cache is now compressed. This compression is done by the controller (Hence near Zero overhead). This data is automatically decompressed when it is read from the flash cache. Up to 2X more data fits in smart flash cache (If you get 2x compression of the data, you could fit upto 80TB of data on the flash cache), so flash hit rates will improve and performance will improve for large data sets. This feature can be turned on in both x3-2 and x4-2 storage servers. Customers need to license the Advanced Compression option to use this new feature.

Enhancements to the smart flash cache software, enables exadata software to understand database table and partition scans and automatically caches them when it makes sense (This will help eliminate the need to specify CELL FLASH CACHE KEEP).

Exadata network resource management

With a new version of the Infiniband switch firmware 2.1.3-4, Exadata network resource management now prioritizes messages through the entire infiniband fabric. Latency sensitive messages like redo log writes are prioritized over batch, reporting and backup messages.

Infiniband Active Active Ports

Double-ported infiniband PCIe-3.0 Cards used in the database servers and storage servers, implement active-active mode usage of the infiniband ports, providing a 80GigaBits Per Second network bandwidth (Used to be 40Gigbits ber second on the x3-2, since it was Active Passive bonding).

The Rdbms software and the clusterware software already had the ability to send packets via multipe interfaces. Enhancements have been done to the RDS kernel drivers, which now have the ability to sense if one of the ports is down, and route the network traffic through the surviving port. On the x4-2’s when active active Infiniband networking is setup you will not see the bondib0 interface, instead you will see a ib0 and ib1.

Miscellaneious Info

  • The power,cooling and airflow requirements remain similar to that of the x3-2.
  • The storage expansion racks have also been refreshed and provides the increased flash cache and increased disk space.
  • A single Database Machine configuration can have servers and storage from different generations V2, X2, X3, X4.
  • Databases and Clusters can span across multiple hardware generations.
  • The half and full x4-2 racks, do not ship the spine switch anymore. The storage expansion racks still ship with the spine switch.
  • A new One Command available through patch 17784784 has the new Exadata Deployment assistant that supports the x4-2’s.

Datasheets and White Papers

Datasheet: Oracle Exadata database machine x4-2

Datasheet: Oracle Exadata database machine x3-8

Datasheet: Oracle Exadata storage expansion Rack x4-2

Datasheet: Oracle Exadata storage server x4-2

A Technical Overview of the Oracle Database Machine and Exadata Storage Server (Updated for x4-2)

Exadata Smart Flash Cache (Updated For X4-2)

 

Managing Processes and Memory Consumption on Oracle Exadata database machines.

This post is equally applicable to Rac databases deployed, both on oracle database machines as well as traditional server/storage architectures, running RHEL5 or OEL5 (OEL6 has a new feature called transparent hugepages).

The Best practices for database consolidation on exadata database machine, White Paper outlines a lot of good guidelines. However the best practices seems to be getting occasionally overlooked, in some cases resulting in Node evictions/restarts in the cluster.

High amounts of memory and cpu consumption can result in some real bad things happening (Like for eg: Node evictions). So it is important to configure the operating system and the databases you deploy on the machine optimally, to use the available CPU and memory resources.

Let us first review what those available memory and cpu resources are on an Oracle Exadata Database Machine, on each database node.

[table “1” not found /]

Anytime a workload is deployed on a database server, that exceeds the, operating systems ability to efficiently use the above mentioned, available cpu and memory resources, bad things can happen. (I know i am over simplifying in the prior comment, but you get the idea).

Managing Memory Resources.

Setting up HugePages

First we need to make sure that HugePages on each node, is setup correctly, for the databases you have deployed on the node.
Tanel Poder has explained the reasons for setting up HugePages on Exadata systems.
When the database machines are installed, the HugePages is configured to accomodate the one database that is pre-installed. So if you change the SGA setting for that database or deploy more databases you should now adjust the HugePages setting accordingly. It goes without saying that if you have already configured the hugepages to accomodate all the databases you have, and then you removed some databases from the node, or resized the SGA’s to be smaller, you should again resize your hugepages to free up the excess hugepages.

Once all the database instances are configured and started up you can run the script, hugepages_settings.sh from Mos Note 401749.1, which calculates the number of hugepages you need to setup. I usually recommend adding 10% to the value that hugepages_settings.sh suggests.

With 11.2.0.2 and 11.2.0.3 you should also be setting USE_LARGE_PAGES, preferably to ONLY, following the instructions in Mos Note 1392497.1.

You can then follow the instructions in MOS note 361323.1, under the section “Kernel version 2.6”, to set the value. You need to restart the databases so the SGA gets allocated with the hugepages.

You should also follow instructions from MOS note 1546861.1, and set the value of vm.min_free_kbytes = 524288, in /etc/sysctl.conf, to avoid page allocation failure messages when kernel memory is depleted.

Managing Memory Usage

Ideally for critical implementations, your SGA+PGA+Individual server processes, memory allocations should not exceed 75% of the physical memory on the database node.

The Exadata consolidation white paper above suggests that

For critical Hardware Pools, we recommend an even more conservative approach by not exceeding 75% physical memory per database node.
OLTP applications:
SUM of databases (SGA_TARGET +PGA_AGGREGATE_TARGET) + 4 MB * (Maximum PROCESSES) < Physical Memory per Database Node
DW/BI applications:
SUM of databases (SGA_TARGET + 3 * PGA_AGGREGATE_TARGET) < Physical Memory per Database Node

You can monitor the pga usage a few different ways.

1) SELECT s.inst_id, s.sid, s.serial#, p.spid, s.username, s.program, p.pga_used_mem FROM gv$session s JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id WHERE s.type != ‘BACKGROUND’ and s.program not like ‘%(P%’ and p.pga_used_mem > <APPLICATION_MEMORY_THRESHOLD>
order by s.inst_id, s.sid, s.serial#;

The value you use for APPLICATION_MEMORY_THRESHOLD is dependent on your application and howmuch pga it needs to efficiently run your queries (Performing the sorts in memory using the space for PGA is usually faster than using the TEMP tablespace). However the white paper suggests using 1Gb for OLTP applications and 10Gb for DSS applications.

2) You can monitor the sysmetric “Total PGA Allocated” from dba_hist_sysmetric_summary, to see how much PGA is being allocated.

3) You can use the “PGA Memory Advisory” and “SGA Target Advisory” sections of the awr report, to guage, the impacts of increasing or decreasing the SGA and PGA Allocations.

4) You can monitor the memory usage at the o/s level using the top (Mem: free),free or vmstat (bo (page-out),bi (page-in)) commands.

Managing O/S Processes (And as a consequence CPU usage).

It is important to control the number of processes that are actively executing on each node. Each of these processes consume cpu and memory resources.

The following guidelines are important to follow.
– Use Application connection pooling to appropriately limit the number of sessions connected to the oracle database.

This is important for two reasons.
– Right-Size the resource consumption on each database node.
– Minimize the impact of connection storms.

– Configure the parallel query servers appropriately.
– For all the database’s deployed on a node the total setting for PARALLEL_MAX_SERVERS should be less than or equal to the following

X2-2 or X3-2, <= 240
X2-8 or X3-8, <= 1280

I often see implimentations where parallel_max_servers = 240 (or a high value) in each instance on the machine, which results in not so good consequences. I have occassionaly seen systems where there is sustained 100% cpu usage with load averages of > 300, resulting in node reboots.

– Use Instance Caging and Database Resource Manager to manage cpu allocation on the server.

It would be a good idea to always review the Best practices for database consolidation on exadata database machine, White Paper and follow the guidelines, to have a well performing and  stable database environment.

It would be a good idea to review the performance management screens from Enterprise Manager 12c every day to have a good understanding of the database and server workloads, and/or analyze the oswatcher data to understand the server workload profile.

More about Oracle 12c identity columns

Oracle 12c has introduced the ability to define a column in a database table as a “Identity” Column. This results in the database automatically using a sequence to generate the values that get inserted into this column. The Oracle-Base blog has an article on how to use this functionality and some performance comparisons between this and trigger based methods.

Below is a bit more detail regarding this functionality.

As stated in the Oracle-Base article, oracle auto-creates a sequence and uses it to generate the values. There seems to be a new data dictionary view, USER_TAB_IDENTITY_COLS, which shows you which columns in the table are defined as the Datatype IDENTITY.

The query below can be used to, identify the sequence name, that oracle generated, for a given column, defined as an IDENTITY datatype.

SQL> set long 999999999
SQL> select data_default from user_tab_columns where table_name = 'T1'    
  2  and column_name = 'COL1';

DATA_DEFAULT
--------------------------------------------------------------------------------
"SCOTT"."ISEQ$$_92434".nextval

SEQUENCE_NAME	     INCREMENT_BY CACHE_SIZE
-------------------- ------------ ----------
ISEQ$$_92434			1	  20

The sequence it uses, gets created with a default of 20 for the cache size. This will likely be unacceptable in Real Application Clusters environments. If you try to modify the cache size of the sequence after the creation you will get a ORA-32793 error “cannot alter a system-generated sequence”.

In such cases it would be wise to use the syntax below during table creation, and specify the CACHE_SIZE and other sequence parameters you would want to change.

SQL> create table t1 (col1 number generated as identity (start with 100 increment by 2 cache 1000), col2 varchar2(50));

Table created.

Once the table is re-created, you can check the cache size again to verify.

SQL> select sequence_name,cache_size from user_sequences;

SEQUENCE_NAME	     CACHE_SIZE
-------------------- ----------
ISEQ$$_92436		   1000