Consolidated database replay

The Real Application Testing option has been enhanced with “Consolidated Database Replay” functionality, to help assess the impact of consolidating databases and workloads onto a single database. Consolidated Database Replay allows concurrent replay of multiple captured production workloads, from the same or different systems, against a single database. Customers can use this functionality to validate the consolidation strategy recommended by Consolidation Planner and to accurately predict the capacity and sizing of their database infrastructure before production deployment. Here are more details regarding this announcement:

  • Consolidated Database Replay is supported starting with Oracle Database release 11.2.0.2. Multiple workload captures (including those taken on pre-11.2.0.2 database releases) can be replayed concurrently on database release 11.2.0.2 and above
  • MOS Doc ID 1453789.1 provides more details on the required patches, how to use this functionality, and associated best practices

SQL to extract AWR data for analysis

AWR captures a wealth of database performance statistics and metrics. Enterprise Manager is usually the preferred tool to interpret and analyze this data. However, a lot of the time I resort to using SQL statements to graph and look for trends in the AWR data. Below are some of the SQL statements I use (tested only on 11gR2) to extract this information. Please keep in mind that you need the Oracle Diagnostics Pack license to access the DBA_HIST views.

The full script can be downloaded here.

AWR snapshot info by DBID, by day

select d.dbid, d.db_name,
       to_char(s.begin_interval_time,'DD-MON-RR') begintime,
       min(s.snap_id) minsnap, max(s.snap_id) maxsnap
from   dba_hist_snapshot s, dba_hist_database_instance d
where  s.instance_number = 1
and    s.instance_number = d.instance_number
and    s.dbid = d.dbid
group by d.dbid, d.db_name, to_char(s.begin_interval_time,'DD-MON-RR')
order by 1
/

	       DBID DB_NA BEGINTIME		 MINSNAP   MAXSNAP
------------------- ----- -------------------- --------- ---------
	  220594996 QAD   16-NOV-11		    5205      5217
	  220594996 QAD   17-NOV-11		    5218      5220
	 2085202933 RK01  15-MAY-12		       3	 4

Database version and platform

select distinct version,platform_name from dba_hist_database_instance where dbid=&dbid
/

VERSION           PLATFORM_NAME
----------------- ---------------------------
11.2.0.2.0        Linux x86 64-bit

CPUs/Sockets/Cores/Load Average

select * from (
                  select  stat_name,value
                  from  dba_hist_osstat
                  where dbid = &dbid
                  and instance_number = 1
                  and snap_id = &esnap
                  )
                  pivot (sum(value) for stat_name in ('NUM_CPUS','NUM_CPU_SOCKETS','NUM_CPU_CORES','PHYSICAL_MEMORY_BYTES','LOAD'))
/

'NUM_CPUS' 'NUM_CPU_SOCKETS' 'NUM_CPU_CORES' 'PHYSICAL_MEMORY_BYTES'     'LOAD'
---------- ----------------- --------------- ----------------------- ----------
        24                 2              12              1.0122E+11 .209960938

O/S CPU Usage

select
to_char(begintime,'DD-MON-YY HH24:MI:SS') begintime,
to_char(endtime,'DD-MON-YY HH24:MI:SS') endtime,
inst,
snapid,
round((utdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)  utpct,
round((ntdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)  ntpct,
round((stdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)  stpct,
round((iowtdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)  iowtpct,
(100-
(
 round((utdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)+
 round((ntdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)+
 round((stdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)+
 round((iowtdiff/(utdiff+itdiff+stdiff+iowtdiff+ntdiff))*100)
)) itpct
from
(
select begintime,endtime,(extract(Minute from endtime-begintime)*60+extract(Second from endtime-begintime)) secs,
snapid,inst,
ut-(nvl(lag(ut) over (partition by inst order by inst,snapid),0)) utdiff,
bt-(nvl(lag(bt) over (partition by inst order by inst,snapid),0)) btdiff,
it-(nvl(lag(it) over (partition by inst order by inst,snapid),0)) itdiff,
st-(nvl(lag(st) over (partition by inst order by inst,snapid),0)) stdiff,
iowt-(nvl(lag(iowt) over (partition by inst order by inst,snapid),0)) iowtdiff,
nt-(nvl(lag(nt) over (partition by inst order by inst,snapid),0)) ntdiff,
vin-(nvl(lag(vin) over (partition by inst order by inst,snapid),0)) vindiff,
vout-(nvl(lag(vout) over (partition by inst order by inst,snapid),0)) voutdiff
from
(
select sn.begin_interval_time begintime,
     sn.end_interval_time EndTime,oss.snap_id SnapId,oss.instance_number Inst,
     sum(decode(oss.stat_name,'USER_TIME',value,0)) ut,
     sum(decode(oss.stat_name,'BUSY_TIME',value,0)) bt,
     sum(decode(oss.stat_name,'IDLE_TIME',value,0)) it,
     sum(decode(oss.stat_name,'SYS_TIME',value,0)) st,
     sum(decode(oss.stat_name,'IOWAIT_TIME',value,0)) iowt,
     sum(decode(oss.stat_name,'NICE_TIME',value,0)) nt,
     sum(decode(oss.stat_name,'VM_IN_BYTES',value,0)) vin,
     sum(decode(oss.stat_name,'VM_OUT_BYTES',value,0)) vout
from dba_hist_osstat oss,dba_hist_snapshot sn
where oss.dbid=&dbid
and   oss.dbid = sn.dbid
and   oss.instance_number =  sn.instance_number
and   oss.snap_id = sn.snap_id
and   oss.snap_id between &bsnap - 1 and &esnap
and   oss.stat_name in (
'USER_TIME',
'BUSY_TIME',
'IDLE_TIME',
'SYS_TIME',
'IOWAIT_TIME',
'NICE_TIME',
'VM_IN_BYTES',
'VM_OUT_BYTES'
)
group by sn.begin_interval_time,sn.end_interval_time,oss.snap_id,oss.instance_number
order by oss.instance_number,oss.snap_id
)
)
where snapid between &bsnap and &esnap
order by inst,snapid
/

BEGINTIME            ENDTIME               in     SNAPID  UTPCT  NTPCT  STPCT IOWTPCT  ITPCT
-------------------- -------------------- --- ---------- ------ ------ ------ ------- ------
16-NOV-11 21:00:12   16-NOV-11 21:20:16     1       5209   1.00    .00    .00     .00  99.00
16-NOV-11 21:20:16   16-NOV-11 21:40:12     1       5210   4.00    .00   1.00     .00  95.00
16-NOV-11 21:40:12   16-NOV-11 22:00:03     1       5211  31.00    .00   2.00     .00  67.00
16-NOV-11 22:00:03   16-NOV-11 22:20:05     1       5212  58.00    .00   2.00     .00  40.00
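The percentage arithmetic in the query above can be sketched in Python. This is a minimal illustration with made-up cumulative OSSTAT values (centiseconds), not numbers taken from the output above; note that, exactly as in the SQL, the idle percentage is derived as the remainder after rounding the other four.

```python
# Sketch of the CPU-percentage arithmetic the query performs, using made-up
# cumulative OSSTAT values (centiseconds) from two consecutive snapshots.
def cpu_pcts(prev, curr):
    d = {k: curr[k] - prev[k] for k in prev}           # per-interval deltas
    total = (d['USER_TIME'] + d['IDLE_TIME'] + d['SYS_TIME']
             + d['IOWAIT_TIME'] + d['NICE_TIME'])
    pct = lambda x: round(x / total * 100)
    ut, nt = pct(d['USER_TIME']), pct(d['NICE_TIME'])
    st, iowt = pct(d['SYS_TIME']), pct(d['IOWAIT_TIME'])
    it = 100 - (ut + nt + st + iowt)                   # idle is the remainder
    return {'utpct': ut, 'ntpct': nt, 'stpct': st, 'iowtpct': iowt, 'itpct': it}

prev = {'USER_TIME': 1000, 'IDLE_TIME': 95000, 'SYS_TIME': 500,
        'IOWAIT_TIME': 100, 'NICE_TIME': 0}
curr = {'USER_TIME': 31000, 'IDLE_TIME': 160000, 'SYS_TIME': 2500,
        'IOWAIT_TIME': 200, 'NICE_TIME': 0}
print(cpu_pcts(prev, curr))
```

Deriving idle as 100 minus the rounded components is also why the five columns always add up to exactly 100 in the query output.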

Elapsed Time/DB Time/Concurrent Active Users

select
to_char(begintime,'DD-MON-YY HH24:MI:SS') begintime,
to_char(endtime,'DD-MON-YY HH24:MI:SS') endtime,
inst,
snapid,
round(dbtdiff/(1000000*60),2) dbt,
round(secs/60) mins,
round(dbtdiff/(1000000*60*round(secs/60))) concactive
from
(
select begintime,endtime,(extract(Minute from endtime-begintime)*60+extract(Second from endtime-begintime)) secs,
snapid,inst,
dbt-(nvl(lag(dbt) over (partition by inst order by inst,snapid),0)) dbtdiff
from
(
select sn.begin_interval_time begintime,
     sn.end_interval_time EndTime,tm.snap_id SnapId,tm.instance_number Inst,
     sum(decode(tm.stat_name,'DB time',value,0)) dbt
from dba_hist_sys_time_model tm,dba_hist_snapshot sn
where tm.dbid=&dbid
and   tm.dbid = sn.dbid
and   tm.instance_number =  sn.instance_number
and   tm.snap_id = sn.snap_id
and   tm.snap_id between &bsnap - 1 and &esnap
and   tm.stat_name in (
'DB time'
)
group by sn.begin_interval_time,sn.end_interval_time,tm.snap_id,tm.instance_number
order by tm.instance_number,tm.snap_id
)
)
where snapid between &bsnap and &esnap
order by inst,snapid
/

BEGINTIME            ENDTIME               in     SNAPID       DBT    MINS CONCACTIVE
-------------------- -------------------- --- ---------- --------- ------- ----------
16-NOV-11 21:00:12   16-NOV-11 21:20:16     1       5209         1      20          0
16-NOV-11 21:20:16   16-NOV-11 21:40:12     1       5210       319      20         16
16-NOV-11 21:40:12   16-NOV-11 22:00:03     1       5211       657      20         33
16-NOV-11 22:00:03   16-NOV-11 22:20:05     1       5212       972      20         49
16-NOV-11 22:20:05   16-NOV-11 22:40:06     1       5213       457      20         23
16-NOV-11 22:40:06   16-NOV-11 23:00:08     1       5214       736      20         37
16-NOV-11 23:00:08   16-NOV-11 23:20:25     1       5215         3      20          0
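The “concurrent active users” column is just the DB time accumulated in the interval divided by the wall-clock length of the interval, i.e. the average number of active sessions. A minimal Python sketch of that arithmetic, with hypothetical inputs:

```python
# Average active sessions for one snapshot interval: DB time delta
# (microseconds, from dba_hist_sys_time_model) over wall-clock minutes.
def avg_active_sessions(dbtime_delta_us, interval_secs):
    interval_mins = round(interval_secs / 60)
    dbtime_mins = dbtime_delta_us / (1_000_000 * 60)
    return round(dbtime_mins / interval_mins)

# Hypothetical: 319 minutes of DB time accumulated over a ~20-minute interval.
print(avg_active_sessions(319 * 60 * 1_000_000, 1196))  # 16
```

So roughly 16 sessions were actively working (on CPU or in a wait) on average during that interval.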

Top 5 Foreground Waits

The percentages I get here do not match exactly the percentages you get when you run awrrpt.sql to generate the corresponding AWR report. However, they always seem to be within +/- 3% of the awrrpt.sql values. I think the variation is due to the way awrrpt.sql rounds values, and probably how it accounts for idle events.

 

with se as (
     select sn.begin_interval_time begintime,
        sn.end_interval_time EndTime,se.snap_id SnapId,se.instance_number Inst,
        se.event_name stat,se.time_waited_micro_fg value,
        nvl(lag(se.time_waited_micro_fg) over(partition by se.instance_number,se.event_name
        order by se.instance_number,se.snap_id,se.event_name),0) prevval,
        se.time_waited_micro_fg-
        nvl(lag(se.time_waited_micro_fg) over(partition by se.instance_number,se.event_name
        order by se.instance_number,se.snap_id,se.event_name),0) valuediff
     from dba_hist_system_event se,dba_hist_snapshot sn
     where se.dbid=&dbid
     and   se.dbid = sn.dbid
     and   se.instance_number =  sn.instance_number
     and   se.snap_id = sn.snap_id
     and   se.snap_id between &bsnap-1 and &esnap
     and   se.wait_class != 'Idle'
     order by se.snap_id,se.instance_number,se.event_name
     ) ,
     sdbcpu as (
         select sn.begin_interval_time begintime,sn.end_interval_time EndTime,
         stm.snap_id snapid,stm.instance_number inst,stm.stat_name stat
         ,stm.value value
         ,nvl(lag(stm.value) over(partition by stm.instance_number order by stm.instance_number,stm.snap_id),0) prevval
         ,stm.value-
         nvl(lag(stm.value) over(partition by stm.instance_number order by stm.instance_number,stm.snap_id),0) valuediff
         from dba_hist_sys_time_model stm,dba_hist_snapshot sn
         where
         stm.stat_name = ('DB CPU')
         and stm.dbid = &dbid
         and stm.snap_id between  &bsnap-1 and &esnap
         and stm.dbid = sn.dbid
         and stm.instance_number = sn.instance_number
         and stm.snap_id = sn.snap_id
         order by stm.snap_id,stm.instance_number
     ) ,
     sunion as (
         select begintime,endtime,snapid,inst,stat,valuediff from se
         union all
         select begintime,endtime,snapid,inst,stat,valuediff from sdbcpu
         order by 3,4
     ),
     spct as (
     select begintime,endtime,snapid,inst,stat,valuediff,
     round(ratio_to_report(valuediff) over (partition by snapid,inst),4) as pct
     from sunion
     order by 3,4 asc,7  desc
     )
     select * from (
     select to_char(begintime,'DD-MON-RR HH24:MI:SS') begintime
     ,to_char(endtime,'DD-MON-RR HH24:MI:SS') endtime,snapid,inst,stat,valuediff,round(pct*100,2) pct,
     row_number() over (partition by snapid,inst order by snapid,inst asc,pct desc) as rnum
     from spct
     )
     where rnum < 6 and snapid between &bsnap and &esnap
/

BEGINTIME            ENDTIME                  SNAPID  in STAT                               VALUEDIFF    PCT
-------------------- -------------------- ---------- --- ------------------------------ ------------- ------
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   1 DB CPU                              28856557  83.24
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   1 name-service call wait               2073440   5.98
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   1 control file sequential read          843201   2.43
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   1 enq: PS - contention                  634127   1.83
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   1 undo segment extension                423219   1.22
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   2 DB CPU                              34408715  89.73
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   2 enq: PS - contention                  852207   2.22
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   2 reliable message                      351905    .92
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   2 control file sequential read          323355    .84
16-NOV-11 21:00:12   16-NOV-11 21:20:16         5209   2 row cache lock                        286882    .75
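What ratio_to_report and row_number accomplish can be restated in plain Python: within one (snapshot, instance) bucket, compute each event's share of the total non-idle time plus DB CPU, and keep the top 5. The deltas below are hypothetical microsecond values, not the ones from the output above, and ties are broken arbitrarily here (the SQL's row_number does the same).

```python
# Top-5 waits for one (snapshot, instance) bucket: each event's share of the
# total time delta, highest first. Deltas are hypothetical microsecond values.
def top5_waits(deltas):
    total = sum(deltas.values())
    ranked = sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
    return [(event, round(100 * usec / total, 2)) for event, usec in ranked[:5]]

deltas = {
    'DB CPU': 28_856_557,
    'name-service call wait': 2_073_440,
    'control file sequential read': 843_201,
    'enq: PS - contention': 634_127,
    'undo segment extension': 423_219,
    'reliable message': 180_000,
}
print(top5_waits(deltas))
```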

Physical and Logical I/O

select to_char(begintime,'DD-MON-RR HH24:MI') begintime,to_char(endtime,'DD-MON-RR HH24:MI') endtime
,(extract(Minute from endtime-begintime)*60+extract(Second from endtime-begintime)) secs,
snapid,inst,
prd-nvl(lag(prd) over (partition by inst order by inst,snapid),0) prddiff,
pwrt-nvl(lag(pwrt) over (partition by inst order by inst,snapid),0) pwrtdiff,
iordreq-nvl(lag(iordreq) over (partition by inst order by inst,snapid),0) iorddiff,
iowrtreq-nvl(lag(iowrtreq) over (partition by inst order by inst,snapid),0) iowrtdiff,
prmbr-nvl(lag(prmbr) over (partition by inst order by inst,snapid),0) prmbrdiff,
cgets-nvl(lag(cgets) over (partition by inst order by inst,snapid),0) cgetsdiff,
dbgets-nvl(lag(dbgets) over (partition by inst order by inst,snapid),0) dbgetsdiff
from
(
select sn.begin_interval_time begintime,
     sn.end_interval_time EndTime,ss.snap_id SnapId,ss.instance_number Inst,
     sum(decode(ss.stat_name,'physical read total bytes',value,0)) prd,
     sum(decode(ss.stat_name,'physical write total bytes',value,0)) pwrt,
     sum(decode(ss.stat_name,'physical read total IO requests',value,0)) iordreq,
     sum(decode(ss.stat_name,'physical write total IO requests',value,0)) iowrtreq,
     sum(decode(ss.stat_name,'physical read total multi block requests',value,0)) prmbr,
     sum(decode(ss.stat_name,'consistent gets',value,0)) cgets,
     sum(decode(ss.stat_name,'db block gets',value,0)) dbgets
from dba_hist_sysstat ss,dba_hist_snapshot sn
where ss.dbid=&dbid
and   ss.dbid = sn.dbid
and   ss.instance_number =  sn.instance_number
and   ss.snap_id = sn.snap_id
and   ss.snap_id between &bsnap-1 and &esnap
and   ss.stat_name in (
'physical read total bytes',
'physical write total bytes',
'physical read total IO requests',
'physical write total IO requests',
'physical read total multi block requests',
'consistent gets',
'db block gets'
)
group by sn.begin_interval_time,sn.end_interval_time,ss.snap_id,ss.instance_number
order by ss.instance_number,ss.snap_id
)
where snapid between &bsnap and &esnap
order by 4,5
/

BEGINTIME            ENDTIME                    SECS     SNAPID  in          PRDDIFF         PWRTDIFF         IORDDIFF        IOWRTDIFF        PRMBRDIFF        CGETSDIFF       DBGETSDIFF
-------------------- -------------------- ---------- ---------- --- ---------------- ---------------- ---------------- ---------------- ---------------- ---------------- ----------------
16-NOV-11 20:40      16-NOV-11 21:00        1210.766       5208   1     309967486976     544388304896         17660147         36335142          1139579       4468788730       2328671039
16-NOV-11 20:40      16-NOV-11 21:00        1210.782       5208   2     165472269312     510302864896         16065810         34894618           144948       4168326257       2064355182
16-NOV-11 20:42      16-NOV-11 21:00        1079.081       5208   3       3122675712        153791488            56249             6006            10713          1307859            60272
16-NOV-11 20:43      16-NOV-11 21:00        1028.123       5208   4       1318486016         59018752            26009             5436             7229          1199275            56027
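Since the query returns raw byte and request deltas along with the interval length in seconds, it is easy to turn them into rates. This derivation (read MB/s and read IOPS) is my own add-on for illustration, not part of the script above:

```python
# Turn a physical-read byte delta and read-request delta into MB/s and IOPS
# over the snapshot interval.
def io_rates(bytes_delta, requests_delta, secs):
    mb_per_sec = bytes_delta / (1024 * 1024) / secs
    iops = requests_delta / secs
    return round(mb_per_sec, 1), round(iops)

# Instance 1 in the sample output above read ~310 GB in ~1211 seconds.
print(io_rates(309967486976, 17660147, 1210.766))
```

The same arithmetic applies to the write columns, and dividing the multi-block-request delta by the total request delta gives a rough sense of how much of the read workload was scan-oriented.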

Top 10 SQL statements by elapsed time

select * from (
select ss.snap_id snapid,ss.instance_number inst,ss.sql_id  sqlid
       ,round(sum(ss.elapsed_time_delta)) elapsed
       ,nvl(round(sum(ss.executions_delta)),1) execs
       ,round(sum(ss.buffer_gets_delta)) gets
       ,round(sum(ss.rows_processed_delta)) rowsp
       ,round(sum(ss.disk_reads_delta)) reads
       ,dense_rank() over(partition by snap_id,instance_number order by sum(ss.elapsed_time_delta) desc) sql_rank
from
dba_hist_sqlstat ss
where
ss.dbid = &dbid  and
ss.snap_id between &bsnap and &esnap
group by ss.snap_id,ss.instance_number,ss.sql_id
)
where sql_rank < 11 and snapid between &bsnap and &esnap
/

   SNAPID  in SQLID                                ELAPSED      EXECS           GETS          ROWSP          READS   SQL_RANK
---------- --- ------------------------------ ------------- ---------- -------------- -------------- -------------- ----------
      5209   1 1zr2ujm5sa5gc                        4547685          1         224434              1              0          1
      5209   1 3g2ugdp1af2h8                        4478848         13         221134             13              0          2
      5209   1 5k5v1ah25fb2c                        3965629         39         360232             39              0          3
      5209   1 0r1zf55mxaujd                        2959144          1         237437              1              0          4
      5209   1 8vwv6hx92ymmm                        2633353      51062         360669          51062              0          5
      5209   1 2w1s3wu1zw63a                        1958993          2         153110              4              0          6
      5209   1 f1y8kbhh6v9sv                        1709282          1                                                       7
      5209   1 6pw8uk8k0dv0q                        1699402          1                                                       8
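The dense_rank in the query keeps the top 10 statements per snapshot and instance. The same top-N-per-group logic, sketched in Python with made-up rows (unlike dense_rank, ties are broken arbitrarily here):

```python
# Top-N statements per (snap_id, instance) group, ranked by elapsed time.
from itertools import groupby

def top_n_per_group(rows, n=10):
    # rows: tuples of (snap_id, inst, sql_id, elapsed_us)
    ordered = sorted(rows, key=lambda r: (r[0], r[1], -r[3]))
    result = []
    for _, grp in groupby(ordered, key=lambda r: (r[0], r[1])):
        result.extend(list(grp)[:n])         # keep the n biggest per group
    return result

rows = [
    (5209, 1, '1zr2ujm5sa5gc', 4_547_685),
    (5209, 1, '3g2ugdp1af2h8', 4_478_848),
    (5209, 1, '8vwv6hx92ymmm', 2_633_353),
    (5210, 1, 'f1y8kbhh6v9sv', 1_709_282),
]
print(top_n_per_group(rows, n=2))
```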

Using Python 3

I have been writing some Python scripts for AWR analysis and trending. Since Python 2.7 is no longer being enhanced, I have now switched to using Python 3. A lot of Python applications and frameworks still do not support Python 3 (notably the Django framework). The good news is that cx_Oracle works with Python 3.

The steps to install cx_Oracle with Python 3 are very similar to the steps I had outlined in my previous post on installing cx_Oracle with Python 2.7.

The difference is that

– You have to first install python3 and python3-dev (on Ubuntu, you can just use the Ubuntu Software Center to do this)

– Then download the cx_Oracle 5.1.1 source-code-only tar ball from http://cx-oracle.sourceforge.net/

– Log in as root, untar the tar file, and cd to the cx_Oracle-5.1.1 directory

– Then run /usr/bin/python3 setup.py install

That does it, and Oracle connectivity is now in place.

I’ve also been using the matplotlib library along with Python to plot graphs from the AWR and OSWatcher data files. matplotlib also works with Python 3.

– You have to first install libpng, libpng-dev, libfreetype6, libfreetype6-dev (use the Ubuntu Software Center)

– Download the numpy source code tar ball

– Extract the tar file, log in as root, cd to the directory, and run /usr/bin/python3 setup.py install

To install matplotlib:

– Download the matplotlib source code tar file

– Log in as root, cd to the directory

– /usr/bin/python3 setup.py build

– /usr/bin/python3 setup.py install

Now you should have matplotlib working with Python 3.
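As a quick smoke test, the sketch below renders a simple load-average trend to a PNG without needing a display. The snapshot ids and load values here are made up for illustration; in practice they would come out of dba_hist_osstat via cx_Oracle.

```python
# Minimal matplotlib smoke test: plot a (made-up) O/S load trend per AWR
# snapshot and save it to a PNG file.
import matplotlib
matplotlib.use('Agg')            # render to a file; no X display needed
import matplotlib.pyplot as plt

snaps = [5209, 5210, 5211, 5212]
load = [0.21, 1.4, 6.8, 11.2]    # hypothetical LOAD values from dba_hist_osstat

plt.plot(snaps, load, marker='o')
plt.xlabel('Snapshot id')
plt.ylabel('O/S load average')
plt.title('Load trend from AWR snapshots')
plt.savefig('awr_load.png')
```

If this produces awr_load.png, the whole toolchain (Python 3, numpy, matplotlib) is in working order.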

Enjoy your Python scripting!


Moving AWR data from 10gR2 to 11gR2 using awrextr.sql and awrload.sql

I have been wondering for a while whether we could export AWR data from a 10gR2 database (awrextr.sql) and load it into an 11gR2 database (awrload.sql). I got to test this yesterday, and it works fine. I was able to run awrload.sql with no errors. Afterwards I even ran the 11gR2 awrrpti.sql script to generate AWR reports using this data. It produces AWR reports, but as one would expect, for the portions where the data does not exist in 10gR2 (e.g. wait event histograms), the report just says “no data exists for this section of the report”.

VirtualBox command line start, stop and clone

I find it convenient to install the Oracle 11gR2 RDBMS on Oracle Enterprise Linux 5, running as a virtual host in a VirtualBox VM (that way I don’t have to go through the pains of installing it on Ubuntu Linux, which is unsupported). I use this database on a daily basis for testing different Oracle functionality.

I use the following VirtualBox command-line commands to start and stop the virtual machine. It is much faster than booting up the operating system and then starting Oracle.

To stop the virtual machine (I have named the VM OEL7) in its current state (with the Oracle RDBMS running):

$ VBoxManage controlvm OEL7 savestate

To start it:

$ VBoxManage startvm OEL7

It takes about 10 seconds to start.

To make a clone of the virtual machine in an entirely new location, first shut down the virtual machine and then:

$ VBoxManage clonevm OEL7 --options keepallmacs --name OEL7E --basefolder "/media/ExternalDrive/vimage2"

The --basefolder option can be set to a new location on disk in which to create the new image.

Installing Ruby 1.9.2 And Rails 3.1.1 with Oracle 11.2.0.3 on Ubuntu 11.10 Oneiric

Here are the steps to install and configure Ruby on Rails with Oracle 11.2.0.3 on 32-bit Ubuntu 11.10 Oneiric.

First install the packages needed by Oracle:

sudo apt-get install x11-utils rpm ksh lsb-rpm libaio1
sudo ln -s /usr/include/i386-linux-gnu/sys /usr/include/sys

Download the Oracle Instant Client

Download the following .zip files from the Oracle Instant Client download site.

instantclient-basiclite-linux-11.2.0.3.0.zip
instantclient-sqlplus-linux-11.2.0.3.0.zip
instantclient-sdk-linux-11.2.0.3.0.zip

Install the Oracle Instant Client

Create a directory /u01/11gr2

cd /u01/11gr2

unzip the above 3 .zip files into this directory

You will have a new subdirectory named instantclient_11_2

Create a softlink to libclntsh.so

cd /u01/11gr2/instantclient_11_2
ln -s libclntsh.so.11.1 libclntsh.so

 

Setup the Oracle environment

Add the following to your .bashrc file (and source the file: . ./.bashrc)

export ORACLE_HOME=/u01/11gr2/instantclient_11_2
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
export TNS_ADMIN=$ORACLE_HOME

 

Create the tnsnames.ora file in /u01/11gr2/instantclient_11_2

Add a service name entry for your Oracle database to the tnsnames.ora

RK01 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = burl5vb1)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = rk01)
    )
  )

 

Install Ruby 1.9.2

sudo apt-get install curl
sudo apt-get install git-core

git config --global user.name "YourNameHere"
git config --global user.email YourEmailHere

bash < <(curl -s https://rvm.beginrescueend.com/install/rvm)

sudo apt-get install build-essential bison openssl libreadline6 libreadline6-dev curl git-core zlib1g zlib1g-dev libssl-dev libyaml-dev libsqlite3-0 libsqlite3-dev sqlite3 libxml2-dev libxslt-dev autoconf libc6-dev ncurses-dev

rvm install 1.9.2
rvm --default use 1.9.2

Install Rails 3.1.1

gem install rails

Installing and using the Oracle driver

You can include the download and install of the activerecord-oracle_enhanced-adapter and the ruby-oci8 driver in the Gemfile for your application.

That way, when you run bundle install, it will install those gems for you.

An example is shown below.

rails new testora
cd testora

 

Add the following lines to your Gemfile (in the application base directory)

gem 'activerecord-oracle_enhanced-adapter', :git => 'git://github.com/rsim/oracle-enhanced.git'
gem 'ruby-oci8', '~> 2.0.6'

Save and quit from Gemfile

Run the following command to install all the gems you need for the application

bundle install

Remove all the other entries and add the database connection entry for your database to the database.yml file (under testora/config).

development:
  adapter: oracle_enhanced
  database: rk01
  username: scott
  password: tiger

Create your application and run it

rails generate scaffold purchase name:string cost:float
rake db:migrate
rails server

You can access the application from the following URL.
http://localhost:3000/purchases

 

Now you should be able to run your application.

How to influence the execution plan without modifying the SQL from the application

It is likely that all of us have encountered the following situation: your company runs a packaged application, you encounter some poorly performing SQL, and the root cause happens to be a bad execution plan. You could battle statistics and see if you can get the optimizer to pick a better plan. One option is to put this SQL through the SQL Tuning Advisor and see if it comes up with a SQL profile that improves the execution plan; if it does, you can accept the profile.

Some of these performance issues can be critical and require urgent resolution. In such situations, you might catch yourself thinking, “I wish I could hint this query to get it to pick a better execution plan”. However, this being a packaged application, your hands are tied.

This year at Oracle OpenWorld 2011, Maria Colgan and Mohamed Zait presented a session, “Oracle Database Optimizer: Tips for preventing suboptimal execution plans”. In there, towards the end, was a gem which illustrated how to get a SQL statement to pick a different execution plan (preferably one generated by a modified version of the SQL, hinted to pick a better plan), without actually modifying the application SQL. This technique uses SQL Plan Management.

This blog post reproduces the exact same method, with the exact same example they used, with a little more illustration of the execution plans, hopefully to benefit folks who have not used this procedure before and did not attend the OpenWorld session.

The original sql

SQL> connect sh/sh
Connected.

SQL> variable sup_id number;
SQL> exec :sup_id := 1;

select p.prod_name,sum(s.amount_sold) amt
from sales s,products p
where s.prod_id = p.prod_id
and p.supplier_id = :sup_id
group by p.prod_name
/ 

Envoy External 6X CD-ROM			    645586.12
Model SM26273 Black Ink Cartridge		    617732.28
Model K8822S Cordless Phone Battery		    582640.54
Bounce						    244595.65
Smash up Boxing 				    260436.75
Comic Book Heroes				     101214.6

The original Plan

Note the full table scan on the products table

SQL> select * from table (dbms_xplan.display_cursor(null,null,'TYPICAL'))
/
  2
SQL_ID	fmjmws8askq3j, child number 0
-------------------------------------
select p.prod_name,sum(s.amount_sold) amt from sales s,products p where
s.prod_id = p.prod_id and p.supplier_id = :sup_id group by p.prod_name

Plan hash value: 504757596

----------------------------------------------------------------------------------------------------
| Id  | Operation		| Name	   | Rows  | Bytes | Cost (%CPU)| Time	   | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	|	   |	   |	   |   587 (100)|	   |	   |	   |
|   1 |  HASH GROUP BY		|	   |	71 |  3550 |   587  (12)| 00:00:08 |	   |	   |
|*  2 |   HASH JOIN		|	   |	72 |  3600 |   586  (12)| 00:00:08 |	   |	   |
|   3 |    VIEW 		| VW_GBC_5 |	72 |  1224 |   583  (12)| 00:00:07 |	   |	   |
|   4 |     HASH GROUP BY	|	   |	72 |   648 |   583  (12)| 00:00:07 |	   |	   |
|   5 |      PARTITION RANGE ALL|	   |   918K|  8075K|   533   (3)| 00:00:07 |	 1 |	28 |
|   6 |       TABLE ACCESS FULL | SALES    |   918K|  8075K|   533   (3)| 00:00:07 |	 1 |	28 |
|*  7 |    TABLE ACCESS FULL	| PRODUCTS |	72 |  2376 |	 3   (0)| 00:00:01 |	   |	   |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ITEM_1"="P"."PROD_ID")
   7 - filter("P"."SUPPLIER_ID"=:SUP_ID)

26 rows selected.

Create the plan baseline for this sql

Note that we are using the sql_id we got from the output of dbms_xplan in the previous step.

SQL> variable cnt number;
SQL> execute :cnt := dbms_spm.load_plans_from_cursor_cache(sql_id=>'fmjmws8askq3j');

PL/SQL procedure successfully completed.

Verify that the plan baseline was created by checking dba_sql_plan_baselines (notice that the plan is enabled by default when you create it)

SQL> select sql_handle,sql_text,plan_name,enabled
from
dba_sql_plan_baselines
where
sql_text like '%select p.prod_name%'  2    3    4    5
  6  /

SQL_10ed3803a09c8fe1	       select p.prod_name,sum(s.amount_sold) amt     SQL_PLAN_11v9s0fh9t3z1c47b6be0 YES
			       from sales s,products p
			       where s.prod_i

Disable the original plan baseline from being used (since we’ll be substituting it with a new hinted plan)

Note that we are using the plan_name and sql_handle that we got from the previous query.

SQL> exec :cnt := dbms_spm.alter_sql_plan_baseline(sql_handle =>'SQL_10ed3803a09c8fe1',-
					      plan_name => 'SQL_PLAN_11v9s0fh9t3z1c47b6be0',-
					      attribute_name=>'enabled',-
					      attribute_value=>'NO');
/

Check dba_sql_plan_baselines to ensure the baseline is now disabled.

  1  select sql_handle,sql_text,plan_name,enabled
  2  from
  3  dba_sql_plan_baselines
  4  where
  5* sql_text like '%select p.prod_name%'
SQL> /
SQL_10ed3803a09c8fe1	       select p.prod_name,sum(s.amount_sold) amt     SQL_PLAN_11v9s0fh9t3z1c47b6be0 NO
			       from sales s,products p
			       where s.prod_i

Rerun the sql with an Index Hint

Note that we are using an index hint (/*+ index(p) */) to illustrate the point that the execution plan has changed: it now picks an index, as opposed to the full table scan in the original query. In this specific case there is no index on supplier_id on the products table, so it picks the primary key index and does a full index scan. But you can see how hinting your queries with the proper index names can help your query.

SQL> select /*+ index(p) */ p.prod_name,sum(s.amount_sold) amt
from sales s,products p
where s.prod_id = p.prod_id
and p.supplier_id = :sup_id
group by p.prod_name
/   2    3    4    5    6
Envoy External 6X CD-ROM			    645586.12
Model SM26273 Black Ink Cartridge		    617732.28
Model K8822S Cordless Phone Battery		    582640.54
Bounce						    244595.65
Smash up Boxing 				    260436.75
Comic Book Heroes				     101214.6
Finding Fido					     78881.08

Check the new execution plan

Notice that the new plan uses the index products_pk on the table products.

SQL> select * from table (dbms_xplan.display_cursor(null,null,'TYPICAL'))
/  2
SQL_ID	gtdunv6qmpqqw, child number 0
-------------------------------------
select /*+ index(p) */ p.prod_name,sum(s.amount_sold) amt from sales
s,products p where s.prod_id = p.prod_id and p.supplier_id = :sup_id
group by p.prod_name

Plan hash value: 4089802669

-------------------------------------------------------------------------------------------------------------
| Id  | Operation		      | Name	    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 	    |	    |	    |	587 (100)|	    |	    |	    |
|   1 |  HASH GROUP BY		      | 	    |	 71 |  3550 |	587  (12)| 00:00:08 |	    |	    |
|*  2 |   HASH JOIN		      | 	    |	 72 |  3600 |	586  (12)| 00:00:08 |	    |	    |
|   3 |    VIEW 		      | VW_GBC_5    |	 72 |  1224 |	583  (12)| 00:00:07 |	    |	    |
|   4 |     HASH GROUP BY	      | 	    |	 72 |	648 |	583  (12)| 00:00:07 |	    |	    |
|   5 |      PARTITION RANGE ALL      | 	    |	918K|  8075K|	533   (3)| 00:00:07 |	  1 |	 28 |
|   6 |       TABLE ACCESS FULL       | SALES	    |	918K|  8075K|	533   (3)| 00:00:07 |	  1 |	 28 |
|*  7 |    TABLE ACCESS BY INDEX ROWID| PRODUCTS    |	 72 |  2376 |	  3   (0)| 00:00:01 |	    |	    |
|   8 |     INDEX FULL SCAN	      | PRODUCTS_PK |	 72 |	    |	  1   (0)| 00:00:01 |	    |	    |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ITEM_1"="P"."PROD_ID")
   7 - filter("P"."SUPPLIER_ID"=:SUP_ID)

28 rows selected.

Switch the execution plan for the original, unhinted sql

From the dbms_xplan output above we know the new sql_id (gtdunv6qmpqqw) and the new plan_hash_value (4089802669) (For the plan that is using an index).

We can then use dbms_spm to associate this new execution plan with the sql_handle we created for the original unhinted sql in the sql plan baseline. We use the new sql_id and plan_hash_value from our hinted plan and associate them with the sql_handle from the original query.

Note that the sql_handle we are using here is the sql_handle for the original unhinted sql (we got this value in the step “Verify that the plan baseline was created”).

SQL> execute :cnt := dbms_spm.load_plans_from_cursor_cache(sql_id => 'gtdunv6qmpqqw',-
						      plan_hash_value => 4089802669,-
						      sql_handle=>'SQL_10ed3803a09c8fe1');

PL/SQL procedure successfully completed.

Check that a new plan has been added to the baseline

Note that the new plan is enabled by default.

SQL> select sql_handle,sql_text,plan_name,enabled
from
dba_sql_plan_baselines
where
sql_text like '%select p.prod_name%'
/  

SQL_10ed3803a09c8fe1	       select p.prod_name,sum(s.amount_sold) amt     SQL_PLAN_11v9s0fh9t3z1c47b6be0 NO
			       from sales s,products p
			       where s.prod_i

SQL_10ed3803a09c8fe1	       select p.prod_name,sum(s.amount_sold) amt     SQL_PLAN_11v9s0fh9t3z1d20e849e YES
			       from sales s,products p
			       where s.prod_i
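If you also want to make sure the optimizer can no longer fall back to the original full-scan plan, you can disable that plan explicitly with dbms_spm.alter_sql_plan_baseline. Below is a sketch: the sql_handle and plan_name are the ones from the output above, and the attribute names are those documented for the dbms_spm package.

SQL> variable cnt number
SQL> execute :cnt := dbms_spm.alter_sql_plan_baseline( -
			sql_handle      => 'SQL_10ed3803a09c8fe1', -
			plan_name       => 'SQL_PLAN_11v9s0fh9t3z1c47b6be0', -
			attribute_name  => 'enabled', -
			attribute_value => 'NO');

After this, the ENABLED column for the full-scan plan shows NO in dba_sql_plan_baselines, as in the output above.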

Re-Run the original sql

select p.prod_name,sum(s.amount_sold) amt
from sales s,products p
where s.prod_id = p.prod_id
and p.supplier_id = :sup_id
group by p.prod_name
/ 

Envoy External 6X CD-ROM			    645586.12
Model SM26273 Black Ink Cartridge		    617732.28
Model K8822S Cordless Phone Battery		    582640.54
Bounce						    244595.65
Smash up Boxing 				    260436.75
Comic Book Heroes				     101214.6

Check the new execution plan

You can see that the original unhinted sql statement now uses the plan hash value of the hinted query, and hence uses the primary key index on the products table (as opposed to the full table scan it used earlier).

SQL> select * from table (dbms_xplan.display_cursor(null,null,'TYPICAL'));
SQL_ID	fmjmws8askq3j, child number 1
-------------------------------------
select p.prod_name,sum(s.amount_sold) amt from sales s,products p where
s.prod_id = p.prod_id and p.supplier_id = :sup_id group by p.prod_name

Plan hash value: 4089802669

-------------------------------------------------------------------------------------------------------------
| Id  | Operation		      | Name	    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT	      | 	    |	    |	    |	587 (100)|	    |	    |	    |
|   1 |  HASH GROUP BY		      | 	    |	 71 |  3550 |	587  (12)| 00:00:08 |	    |	    |
|*  2 |   HASH JOIN		      | 	    |	 72 |  3600 |	586  (12)| 00:00:08 |	    |	    |
|   3 |    VIEW 		      | VW_GBC_5    |	 72 |  1224 |	583  (12)| 00:00:07 |	    |	    |
|   4 |     HASH GROUP BY	      | 	    |	 72 |	648 |	583  (12)| 00:00:07 |	    |	    |
|   5 |      PARTITION RANGE ALL      | 	    |	918K|  8075K|	533   (3)| 00:00:07 |	  1 |	 28 |
|   6 |       TABLE ACCESS FULL       | SALES	    |	918K|  8075K|	533   (3)| 00:00:07 |	  1 |	 28 |
|*  7 |    TABLE ACCESS BY INDEX ROWID| PRODUCTS    |	 72 |  2376 |	  3   (0)| 00:00:01 |	    |	    |
|   8 |     INDEX FULL SCAN	      | PRODUCTS_PK |	 72 |	    |	  1   (0)| 00:00:01 |	    |	    |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ITEM_1"="P"."PROD_ID")
   7 - filter("P"."SUPPLIER_ID"=:SUP_ID)

Note
-----
   - SQL plan baseline SQL_PLAN_11v9s0fh9t3z1d20e849e used for this statement

What’s new in Oracle Linux oow2011 a summary

Some significant new features of Oracle Linux were announced during Oracle OpenWorld 2011. Several of them appeared on a single slide in a presentation by Edward Screven, the Chief Corporate Architect at Oracle, and were easy to miss. So here are some details of the new features.

DTrace is now available for Oracle Linux 6

DTrace has been a favorite of Solaris users for ages, and now Oracle brings the coolness and functionality of DTrace to Linux. It provides very granular information about cpu, memory, filesystem and network usage at the individual process level, in addition to system calls, their arguments and so on. It is loosely comparable to strace on Linux, but provides a much richer level of detail, with very little instrumentation overhead.

Wim Coekaerts has written a detailed article on how to enable this functionality.

Kernel upgrades can now be done while the system is running (Ksplice, available since the beginning of September 2011)

This one was not really an OpenWorld announcement (though it was touched upon in Edward Screven’s presentation); it has been available since early September 2011. Oracle acquired a company named Ksplice in July 2011. Oracle Linux users can now perform kernel updates without having to stop or reboot the system.

Wim Coekaerts has written a detailed article on this new functionality, and the article has a link to a white paper on how to enable it.

Unbreakable Enterprise Kernel version 2 now available with major new features

The Unbreakable Enterprise Kernel version 2 has been released and brings with it a lot of new features. A couple of the highlights are below.

 Btrfs is now Production

Btrfs stands for Better file system and/or B-tree file system. Although the existing ext3 and ext4 file systems are solid, they are based on older technology and lack key enterprise features. Btrfs brings new features like the ability to snapshot, online defragmentation, volume growth and shrinking, and checksums of data and metadata.

 Linux Containers

Solaris has had zones and containers for a while, which enable virtualization of physical Sun servers running Solaris. Now Oracle is bringing this functionality to Linux. The key difference between Linux Containers and Linux virtual machines (like Oracle Virtual Machine) is that Linux Containers run instructions natively on the host cpu without any interpretation mechanism, and hence provide good performance for the virtualized hosts.

There are also numerous enhancements in this new release of the kernel to improve the performance of Oracle products on Oracle Linux.

Wim Coekaerts has posted a good article on how to get started with the Unbreakable Enterprise Kernel version 2.

There is also a simple example of how to enable Linux containers.

 

Auto DOP and calibrate_io

I was posed the following question by a colleague recently: “I am using Auto DOP in 11gR2 and I am setting parallel_degree_limit=12. But when I execute my query, it still runs with a parallelism of 48 (i.e. it seems to be ignoring my setting for parallel_degree_limit).” The problem turned out to be that for Auto DOP to work, you need to have run dbms_resource_manager.calibrate_io, so that the table resource_io_calibrate$ is populated.

This requirement is explicitly stated in the Oracle 11gR2 documentation at http://download.oracle.com/docs/cd/E11882_01/server.112/e25523/parallel002.htm#CIHEFJGC

 

“When PARALLEL_DEGREE_POLICY is set to AUTO, Oracle Database determines whether the statement should run in parallel based on the cost of the operations in the execution plan and the hardware characteristics. The hardware characteristics include I/O calibration statistics so these statistics must be gathered otherwise Oracle Database does not use the automatic degree policy feature.”

Automatic DOP, where Oracle figures out the degree of parallelism to use for a given query, is turned on by setting the initialization parameter PARALLEL_DEGREE_POLICY to either AUTO or LIMITED (when set to LIMITED, it does NOT turn on parallel statement queuing and in-memory parallel execution). So the requirement to gather calibrate_io statistics applies to both AUTO and LIMITED.

When you are using Auto DOP and want to limit the maximum parallelism a query can get, you can set the parameter parallel_degree_limit (to CPU, IO, or an integer) to cap the DOP of a query.
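As an illustration, the parameter settings used in the example that follows could be put in place like this (a sketch, assuming you have ALTER SYSTEM privileges; both parameters are dynamic):

SQL> alter system set parallel_degree_policy = LIMITED;

System altered.

SQL> alter system set parallel_degree_limit  = 2;

System altered.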

When you create a new database using a DBCA (Database Configuration Assistant) template, the calibrate_io statistics are not present in the new database. You can verify this by running the queries below.

SQL> select * from resource_io_calibrate$;

no rows selected
SQL> select * from v$io_calibration_status;

STATUS
-------------
CALIBRATION_TIME
---------------------------------------------------------------------------
NOT AVAILABLE

Below is an example.

  • I have created a new table named sales_rk that has 58 million rows in it.
  • There are no indexes, or primary keys on this table.
  • The table is decorated with parallel degree DEFAULT
  • parallel_degree_policy=LIMITED
  • parallel_degree_limit=2
  • cpu_count=2
  • parallel_threads_per_cpu=2
  • At this time there are no rows in resource_io_calibrate$

I ran the query “select count(*) from sales_rk”

I would have expected auto dop to have kicked in and parallel_degree_limit to have limited the DOP of the query to 2.

However that is not what happened.

The query ran with a DOP of 4.

So oracle just picked the default DOP (cpu_count x parallel_threads_per_cpu).

Further evidence that Auto DOP did not kick in can be found by examining the dbms_xplan output of the statement.

 

SQL> select count(*) from sales_rk;

  COUNT(*)
----------
  58805952

SQL> select * from table(dbms_xplan.display_cursor(null,null,'TYPICAL'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID	1bzaqj7scjp7p, child number 0
-------------------------------------
select count(*) from sales_rk

Plan hash value: 2302347944

--------------------------------------------------------------------------------------------------------
| Id  | Operation	       | Name	  | Rows  | Cost (%CPU)| Time	  |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |	  |	  | 21544 (100)|	  |	   |	  |	       |
|   1 |  SORT AGGREGATE        |	  |	1 |	       |	  |	   |	  |	       |
|   2 |   PX COORDINATOR       |	  |	  |	       |	  |	   |	  |	       |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |	1 |	       |	  |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |	  |	1 |	       |	  |  Q1,00 | PCWP |	       |
|   5 |      PX BLOCK ITERATOR |	  |    58M| 21544   (1)| 00:04:19 |  Q1,00 | PCWC |	       |
|*  6 |       TABLE ACCESS FULL| SALES_RK |    58M| 21544   (1)| 00:04:19 |  Q1,00 | PCWP |	       |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access(:Z>=:Z AND :Z)

Note
-----
   - automatic DOP: skipped because of IO calibrate statistics are missing

The Note section explicitly states that Auto DOP was skipped.

Now, to fix the situation, I ran dbms_resource_manager.calibrate_io.

SET SERVEROUTPUT ON
DECLARE
  lat  INTEGER;
  iops INTEGER;
  mbps INTEGER;
BEGIN
  -- arguments: num_physical_disks => 1, max_latency => 10 (milliseconds)
  -- iops, mbps and lat are OUT parameters that receive the measured values
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO (1, 10, iops, mbps, lat);
  DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
  DBMS_OUTPUT.PUT_LINE ('max_mbps = ' || mbps);
  DBMS_OUTPUT.PUT_LINE ('latency  = ' || lat);
END;
/

 

Beware of calibrate_io generating bad numbers for I/O mbps. If it does, follow the instructions in MOS note 1269321.1 to delete the contents of resource_io_calibrate$ and populate the table manually.
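For reference, the manual workaround in that MOS note boils down to deleting and re-inserting a single row. This is only a sketch: the 200 below is an illustrative value for MAX_PMBPS (the per-process maximum megabytes per second, which is the value Auto DOP actually uses); check the note for the exact column semantics and an appropriate value for your storage before running this.

SQL> delete from resource_io_calibrate$;
SQL> insert into resource_io_calibrate$
	values (current_timestamp, current_timestamp, 0, 0, 200, 0, 0);
SQL> commit;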

Bounce the database.

Check whether calibrate_io worked

SQL> l
  1* select * from v$io_calibration_status
SQL> /

STATUS
-------------
CALIBRATION_TIME
---------------------------------------------------------------------------
READY
07-OCT-11 05.56.40.911 PM

 

Now re-run the same query; Auto DOP kicks in, and the query executes with a DOP of 2.

Looking at the execution plan, confirms that Auto DOP did kick in.

 

SQL> select * from table(dbms_xplan.display_cursor(null,null,'TYPICAL'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID	1bzaqj7scjp7p, child number 1
-------------------------------------
select count(*) from sales_rk

Plan hash value: 2302347944

--------------------------------------------------------------------------------------------------------
| Id  | Operation	       | Name	  | Rows  | Cost (%CPU)| Time	  |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |	  |	  | 43087 (100)|	  |	   |	  |	       |
|   1 |  SORT AGGREGATE        |	  |	1 |	       |	  |	   |	  |	       |
|   2 |   PX COORDINATOR       |	  |	  |	       |	  |	   |	  |	       |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |	1 |	       |	  |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |	  |	1 |	       |	  |  Q1,00 | PCWP |	       |
|   5 |      PX BLOCK ITERATOR |	  |    58M| 43087   (1)| 00:00:08 |  Q1,00 | PCWC |	       |
|*  6 |       TABLE ACCESS FULL| SALES_RK |    58M| 43087   (1)| 00:00:08 |  Q1,00 | PCWP |	       |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access(:Z>=:Z AND :Z)

Note
-----
   - automatic DOP: Computed Degree of Parallelism is 2 because of degree limit

Exadata smart flash log

The Exadata development team has now released Exadata cell software version 11.2.2.4.0, which includes a new feature called “Smart Flash Log”.

In a nutshell, this new feature speeds up redo log writes. Exadata Smart Flash Log uses the Exadata Smart Flash Cache as temporary storage to provide low-latency redo log writes. With this new feature enabled, Oracle writes to both the physical disk and the flash cache simultaneously.

So if, for some reason, the flash cache writes are slow, the writes to the physical disk will provide good response times. Similarly, if the physical disk writes are slow, the flash cache writes will complete faster, providing good response times.

You can use the “create flash log” cell command to turn this feature on, and the “drop flash log” cell command to turn it off.
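From a cellcli session on each storage cell, the commands look roughly like this (a sketch based on the commands named above; the LIST FLASHLOG step is just to verify the result, and the default flash log size is used):

CellCLI> CREATE FLASHLOG ALL
CellCLI> LIST FLASHLOG DETAIL
CellCLI> DROP FLASHLOG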

The Exadata Storage Server Software User’s Guide has been updated with this information.

You need Bundle Patch 11 (it actually works from BP9 onwards, but BP11 is recommended) and Exadata cell software 11.2.2.4.0 applied to get this functionality.

Please read the section “Exadata Smart Flash Logging: Flash for database logging” in the Oracle technical white paper Exadata Smart Flash Cache Features and the Oracle Exadata Database Machine, for details.

Here is a video tutorial on this topic, produced by Oracle Education: Smart Flash Log.