
Cloud Networking Basics for the Non-CCNA-Certified – IP Address

As more and more enterprises migrate their IT deployments to the public cloud, IT personnel who are not classically trained in networking theory and practice routinely have to architect, set up, deploy, manage and support these cloud-based implementations. This blog post is aimed at someone, such as a DBA or developer, who has to support such an implementation (say, a migration of on-premises assets to the public cloud), and gives them a basic understanding of the networking terminology involved in a public cloud implementation.

The following discussion focuses on IPv4 addresses.

What is an Octet?

An octet is a unit of digital information that consists of 8 bits.

Eg: 10110001, 11111111

IP Address Basics.

An IP address is a numerical label assigned to each device connected to a computer network. An IPv4 address is usually written in the form xxx.xxx.xxx.xxx.

Eg: 192.168.5.129

IP Address in Binary Octets.

An IP address is made up of 4 octets.

Each bit in the octet represents a power of 2:

2^7, 2^6, 2^5, 2^4, 2^3, 2^2, 2^1, 2^0

128, 64, 32, 16, 8, 4, 2, 1

So if you had the octet 11111111, to convert it to decimal you would add up each bit's place value, as shown below.

128+64+32+16+8+4+2+1 = 255
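You can sanity-check these conversions with Python's built-in int(), which parses a string in a given base:

print(int('11111111', 2))   # 255, all 8 bits set
print(int('10110001', 2))   # 177 = 128 + 32 + 16 + 1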

Let us convert an IP address into its binary octet representation.

Eg: 192.168.5.129
11000000.10101000.00000101.10000001

Let us take a look at the first and second octets:
192 = 11000000 = 2^7 + 2^6 = 128 + 64

168 = 10101000 = 2^7 + 2^5 + 2^3 = 128 + 32 + 8

You can continue the process with the third and fourth octets.
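The whole conversion is also easy to script; here is a short Python sketch:

def ip_to_binary(ip):
    # Render a dotted-decimal IPv4 address as four binary octets.
    return '.'.join(format(int(octet), '08b') for octet in ip.split('.'))

print(ip_to_binary('192.168.5.129'))   # 11000000.10101000.00000101.10000001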

Private and Public IP Address Ranges

A public IP address is one that is routable on the public Internet; a private IP address is one that is not routable on the Internet.

Based on the special-use address blocks documented in RFC 3330, the public IP address ranges are as follows:

1.0.0.0 - 9.255.255.255
11.0.0.0 - 126.255.255.255
129.0.0.0 - 169.253.255.255
169.255.0.0 - 172.15.255.255
172.32.0.0 - 191.0.1.255
192.0.3.0 - 192.88.98.255
192.88.100.0 - 192.167.255.255
192.169.0.0 - 198.17.255.255
198.20.0.0 - 223.255.255.255

RFC 1918 specifies the Private IP address Ranges as follows

10.0.0.1 - 10.255.255.254
172.16.0.1 - 172.31.255.254
192.168.0.1 - 192.168.255.254
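Python's standard-library ipaddress module already knows these ranges, which is handy for quick checks (note that is_private also covers a few other non-routable blocks, such as loopback and link-local, beyond RFC 1918):

import ipaddress

print(ipaddress.ip_address('10.20.30.40').is_private)     # True
print(ipaddress.ip_address('192.168.5.129').is_private)   # True
print(ipaddress.ip_address('8.8.8.8').is_private)         # False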

IP Subnetworks

IP networks may be subdivided into subnetworks (aka subnets). For this purpose, an IP address is logically divided into two parts: the network prefix and the host identifier. The subnet mask, or the CIDR prefix, determines how the IP address is split into the network and host parts.

A detailed explanation of IPv4 subnets can be found here

CIDR Block

CIDR stands for Classless Inter-Domain Routing. CIDR notation is a way to represent a block of IP addresses: a dotted-decimal address followed by a slash and the number of bits in the network prefix.

For the folks who just want to understand what it means to them in the cloud: at the most basic level, a CIDR block is simply a range of IP addresses that represents a subnet. For the curious who want to know exactly how the IP addresses are derived from a CIDR block, read the explanation below.

If you prefer a Video Tutorial, you can find an Excellent YouTube video on this Topic by Joshua Butcher here.

Eg: 172.10.85.60/22

If we are given a CIDR block as shown above, we can derive the following.
1) The network ID of the subnet
2) The broadcast ID of the subnet
3) The netmask
4) The usable IP addresses in this CIDR block

The calculations are shown below.

Here is the decimal value of each bit in the octet as we showed in the previous section

128,64,32,16,8,4,2,1

/22 tells us that the first 22 bits are the network portion, so the netmask has its first 22 bits turned on.

Let us write the /22 netmask in its binary representation:

11111111.11111111.11111100.00000000

The Decimal version of which is

255.255.252.0

So The netmask for 172.10.85.60/22 is 255.255.252.0

The magic number is determined as follows:
– Find the location of the last bit in the network portion.
– In our case this is the 6th bit in the 3rd octet.
– The value of the 6th bit in the octet is 4, which is the magic number.

So the initial network ID is 172.10.X.0, where X has to be determined.

The initial broadcast ID is 172.10.Y.255, where Y has to be determined.

Take the 3rd octet of the IP address (85), convert it to binary, and AND it with the 3rd octet of the netmask:

01010101
11111100
--------
01010100 = 84. This is the value of X.

Add the magic number to this: 84 + 4 = 88 (88 would be the start of the next network, so 88 - 1 is the last value usable in the 3rd octet).

Subtract: 88 - 1 = 87. This is the value of Y.

So the network ID is 172.10.84.0 and the broadcast ID is 172.10.87.255.

The usable IPs are 172.10.84.1 to 172.10.87.254.

So we have learned that the given CIDR block 172.10.85.60/22 represents
– A netmask of 255.255.252.0
– A usable IP address range of 172.10.84.1 to 172.10.87.254
– 1022 usable IPs in the subnet (1024 addresses, minus the network and broadcast IDs)

You can use the webpage https://www.ipaddressguide.com/cidr to translate your CIDR blocks into IP address ranges.
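You can also derive all of these numbers programmatically; here is a short Python check of the worked example above:

import ipaddress

# strict=False lets us pass a host address (172.10.85.60) rather than the
# network address itself; ipaddress works out the enclosing /22 network.
net = ipaddress.ip_network('172.10.85.60/22', strict=False)

print(net.network_address)     # 172.10.84.0
print(net.broadcast_address)   # 172.10.87.255
print(net.netmask)             # 255.255.252.0
print(net.num_addresses - 2)   # 1022 usable addresses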

The reason we use subnets in the public cloud is that we can create separate segments of our cloud network to house different types of hosts (based on security and availability requirements). With different subnets, we have the ability to granularly control the type of network traffic that is allowed in and out of a specific type of host (for example, databases can be placed in an isolated subnet that allows only TCP/IP traffic on port 1521, or a port of your choice).
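As an illustration of that last point, here is a hedged sketch using AWS's boto3 SDK (the security group ID and CIDR below are hypothetical placeholders) that would allow only TCP traffic on port 1521 into a database tier:

import boto3

ec2 = boto3.client('ec2')

# Allow only Oracle Net traffic (TCP 1521) into the database security group,
# and only from the application subnet (hypothetical values throughout).
ec2.authorize_security_group_ingress(
    GroupId='sg-0123456789abcdef0',
    IpPermissions=[{
        'IpProtocol': 'tcp',
        'FromPort': 1521,
        'ToPort': 1521,
        'IpRanges': [{'CidrIp': '172.10.84.0/22'}],
    }],
)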

12c Adaptive Query Optimization – Getting Started

In Oracle Database 12c, adaptive query optimization is a set of capabilities that enables the optimizer to make run-time adjustments to execution plans and discover additional information that can lead to better statistics. Adaptive optimization is helpful when existing statistics are not sufficient to generate an optimal plan.


Below is a series of blog posts that I have written, which explain the basics of how these features work.

Dynamic Statistics – Some Notes

Adaptive Plans – Adaptive Joins

Adaptive Plans – Adaptive Parallel Distribution Methods

AQO – Dynamic Statistics Examples

AQO – ReOptimization

AQO – SQL Plan Directives

Below are some Oracle white papers and MOS notes that you can read to familiarize yourself with these new features.

Understanding Optimizer Statistics with Oracle 12c

Best practices for gathering statistics with Oracle 12c

What to expect with the optimizer in Oracle 12c.

Adaptive Query Optimization – Mos Note : 2031605.1

Automatic Dynamic Statistics – Mos Note : 2002108.1

Adaptive Execution Plans – Mos Note : 1409636.1

How to Approach Issues Related to Adaptive Optimizations – Mos Note : 2068807.1

Optimizer Statistics – Master Note – Mos Note : 1369591.1

Resolving Issues of Library Cache Pin or Cursor Pin S Wait on X – Mos Note : 1476663.1

Troubleshooting Cursor Pin S Wait on X – Mos Note : 1349387.1

Cursor Pin S Wait on X, Finding Blocker – Mos Note : 786507.1

Parameter Recommendations for Oracle 12c – Part 1 (Upgrade PM Blog)

Parameter Recommendations for Oracle 12c – Part 2 (Upgrade PM Blog)

Things to consider before upgrading to 12.1.0.2 (Upgrade PM Blog)

Adaptive Query Optimization configuration : Parameters, Preferences and Fix controls (Christian Antognini Blog)

Adaptive Query Optimization – SQL Plan Directives

SQL plan directives are automatically created based on information learned via automatic reoptimization. The optimizer uses SQL plan directives to generate a more optimal execution plan. For example, when joining two tables that have data skew in their join columns, a SQL plan directive can direct the optimizer to use dynamic statistics to obtain a more accurate join cardinality estimate.

The key thing to note is that SPDs are not created on queries or objects; they are created on query expressions. So the SPDs generated by one SQL statement can be used by multiple SQL statements, and it is possible for one SQL statement to use more than one SPD. SPDs are persisted in the SYSAUX tablespace. Any SPD that is unused for 53 weeks will be automatically purged.

After a SQL plan directive is used, the optimizer decides whether the cardinality misestimate could be resolved with a column group. If so, it will create the column group the next time statistics are gathered on the appropriate table.

Let us take a look at an example.

We are using the same sql statement from the example in my post about Adaptive ReOptimization

SQL_ID  faca0yc2h1cdr, child number 0                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
SELECT /*+ gather_plan_statistics */ count(*)  FROM   customers  WHERE                                                                                                                                  
cust_state_province = 'CA'  AND    country_id=52790                                                                                                                                                     
 
Plan hash value: 296924608                                                                                                                                                                              
 
--------------------------------------------------------------------------------------------------------------------------                                                                              
| Id  | Operation          | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
|   0 | SELECT STATEMENT   |           |      1 |        |       |   405 (100)|          |      1 |00:00:00.01 |    1458 |                                                                              
|   1 |  SORT AGGREGATE    |           |      1 |      1 |    16 |            |          |      1 |00:00:00.01 |    1458 |                                                                              
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |   1115 | 17840 |   405   (1)| 00:00:01 |   3341 |00:00:00.01 |    1458 |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
 
Query Block Name / Object Alias (identified by operation id):                                                                                                                                           
-------------------------------------------------------------                                                                                                                                           
 
   1 - SEL$1                                                                                                                                                                                            
   2 - SEL$1 / CUSTOMERS@SEL$1                                                                                                                                                                          
 
Predicate Information (identified by operation id):                                                                                                                                                     
---------------------------------------------------                                                                                                                                                     
 
   2 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))                                                                                                                                      
 
Column Projection Information (identified by operation id):                                                                                                                                             
-----------------------------------------------------------                                                                                                                                             
 
   1 - (#keys=0) COUNT(*)[22]                                                                                                                                                                           
   2 - (rowset=906)

Next I will flush the SQL plan directives from memory to the SYSAUX tablespace.

BEGIN
DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
END;
/

Let's check the plan directives that have been created (by querying DBA_SQL_PLAN_DIRECTIVES and DBA_SQL_PLAN_DIR_OBJECTS):

DIR_ID                                   OWNER OBJECT_NAME               COL_NAME             OBJECT_TYPE               TYPE                    STATE      INST
---------------------------------------- ----- ------------------------- -------------------- ------------------------- ----------------------- ---------- -------------
REASON                               DC                          LU                         
------------------------------------ --------------------------- ---------------------------
1581183467014446692                      SH    CUSTOMERS                 COUNTRY_ID           COLUMN                    DYNAMIC_SAMPLING        USABLE     NEW           
SINGLE TABLE CARDINALITY MISESTIMATE 20-SEP-16 08:32:22                                     
 
1581183467014446692                      SH    CUSTOMERS                 CUST_STATE_PROVINCE  COLUMN                    DYNAMIC_SAMPLING        USABLE     NEW           
SINGLE TABLE CARDINALITY MISESTIMATE 20-SEP-16 08:32:22                                     
 
1581183467014446692                      SH    CUSTOMERS                                      TABLE                     DYNAMIC_SAMPLING        USABLE     NEW           
SINGLE TABLE CARDINALITY MISESTIMATE 20-SEP-16 08:32:22

Then let us execute a new statement, changing just one of the predicates.

SQL_ID  aj2w592n68s4s, child number 0                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
SELECT /*+ gather_plan_statistics */ count(*)  FROM   customers  WHERE                                                                                                                                  
cust_state_province = 'FL'  AND    country_id=52790                                                                                                                                                     
 
Plan hash value: 296924608                                                                                                                                                                              
 
--------------------------------------------------------------------------------------------------------------------------                                                                              
| Id  | Operation          | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
|   0 | SELECT STATEMENT   |           |      1 |        |       |   405 (100)|          |      1 |00:00:00.01 |    1457 |                                                                              
|   1 |  SORT AGGREGATE    |           |      1 |      1 |    16 |            |          |      1 |00:00:00.01 |    1457 |                                                                              
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |   2549 | 40784 |   405   (1)| 00:00:01 |   2438 |00:00:00.01 |    1457 |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
 
Query Block Name / Object Alias (identified by operation id):                                                                                                                                           
-------------------------------------------------------------                                                                                                                                           
 
   1 - SEL$1                                                                                                                                                                                            
   2 - SEL$1 / CUSTOMERS@SEL$1                                                                                                                                                                          
 
Predicate Information (identified by operation id):                                                                                                                                                     
---------------------------------------------------                                                                                                                                                     
 
   2 - filter(("CUST_STATE_PROVINCE"='FL' AND "COUNTRY_ID"=52790))                                                                                                                                      
 
Column Projection Information (identified by operation id):                                                                                                                                             
-----------------------------------------------------------                                                                                                                                             
 
   1 - (#keys=0) COUNT(*)[22]                                                                                                                                                                           
   2 - (rowset=906)                                                                                                                                                                                     
 
Note                                                                                                                                                                                                    
-----                                                                                                                                                                                                   
   - dynamic statistics used: dynamic sampling (level=2)                                                                                                                                                
   - 1 Sql Plan Directive used for this statement

In the Note section, we can see that the SQL plan directive has been used by this new statement.

Adaptive Query Optimization – Adaptive ReOptimization

Adaptive plans can choose between multiple operations during the execution of a SQL statement, to potentially pick better access methods, but they cannot change join orders. 11gR2 had cardinality feedback, which allowed Oracle to choose a better plan based on information gathered at the end of an execution. In 12c, cardinality feedback has been renamed Statistics Feedback.

Statistics Feedback

During the first execution of a statement, the optimizer generates an execution plan and decides whether it should enable statistics feedback monitoring for the cursor. Statistics feedback is enabled in the following cases:

  • Tables with no statistics
  • Multiple conjunctive or disjunctive predicates
  • Predicates containing complex operators for which the optimizer cannot accurately compute cardinality estimates

At the end of the execution, if the actual cardinalities differ from the estimated cardinalities, the actual cardinalities are stored for subsequent use, and SQL plan directives are created so that other SQL statements can benefit from this information. The statement is marked as reoptimizable and will be re-parsed at the next execution, creating a new child cursor. The actual cardinalities are used during the re-parse to determine a better execution plan.

Let's look at an example.

SQL_ID  faca0yc2h1cdr, child number 0                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
SELECT /*+ gather_plan_statistics */ count(*)  FROM   customers  WHERE                                                                                                                                  
cust_state_province = 'CA'  AND    country_id=52790                                                                                                                                                     
 
Plan hash value: 296924608                                                                                                                                                                              
 
--------------------------------------------------------------------------------------------------------------------------                                                                              
| Id  | Operation          | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
|   0 | SELECT STATEMENT   |           |      1 |        |       |   405 (100)|          |      1 |00:00:00.01 |    1458 |                                                                              
|   1 |  SORT AGGREGATE    |           |      1 |      1 |    16 |            |          |      1 |00:00:00.01 |    1458 |                                                                              
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |   1115 | 17840 |   405   (1)| 00:00:01 |   3341 |00:00:00.01 |    1458 |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
 
Query Block Name / Object Alias (identified by operation id):                                                                                                                                           
-------------------------------------------------------------                                                                                                                                           
 
   1 - SEL$1                                                                                                                                                                                            
   2 - SEL$1 / CUSTOMERS@SEL$1                                                                                                                                                                          
 
Predicate Information (identified by operation id):                                                                                                                                                     
---------------------------------------------------                                                                                                                                                     
 
   2 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))                                                                                                                                      
 
Column Projection Information (identified by operation id):                                                                                                                                             
-----------------------------------------------------------                                                                                                                                             
 
   1 - (#keys=0) COUNT(*)[22]                                                                                                                                                                           
   2 - (rowset=906)

Let us check the details of the cursor

select plan_hash_value phv,child_number,parse_calls,executions,is_bind_sensitive ibs
      ,is_bind_aware iba,is_reoptimizable ropt,is_resolved_adaptive_plan aplan 
from v$sql where sql_id = 'faca0yc2h1cdr'
 
       PHV CHILD_NUMBER PARSE_CALLS EXECUTIONS I I R A
---------- ------------ ----------- ---------- - - - -
 296924608	      0 	  1	     1 N N Y

So we can see that the cursor has been flagged as ReOptimizable.

Let's execute the same SQL again.

SQL_ID  faca0yc2h1cdr, child number 1                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
SELECT /*+ gather_plan_statistics */ count(*)  FROM   customers  WHERE                                                                                                                                  
cust_state_province = 'CA'  AND    country_id=52790                                                                                                                                                     
 
Plan hash value: 296924608                                                                                                                                                                              
 
--------------------------------------------------------------------------------------------------------------------------                                                                              
| Id  | Operation          | Name      | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   | A-Rows |   A-Time   | Buffers |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
|   0 | SELECT STATEMENT   |           |      1 |        |       |   405 (100)|          |      1 |00:00:00.01 |    1457 |                                                                              
|   1 |  SORT AGGREGATE    |           |      1 |      1 |    16 |            |          |      1 |00:00:00.01 |    1457 |                                                                              
|*  2 |   TABLE ACCESS FULL| CUSTOMERS |      1 |   3341 | 53456 |   405   (1)| 00:00:01 |   3341 |00:00:00.01 |    1457 |                                                                              
--------------------------------------------------------------------------------------------------------------------------                                                                              
 
Query Block Name / Object Alias (identified by operation id):                                                                                                                                           
-------------------------------------------------------------                                                                                                                                           
 
   1 - SEL$1                                                                                                                                                                                            
   2 - SEL$1 / CUSTOMERS@SEL$1                                                                                                                                                                          
 
Predicate Information (identified by operation id):                                                                                                                                                     
---------------------------------------------------                                                                                                                                                     
 
   2 - filter(("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790))                                                                                                                                      
 
Column Projection Information (identified by operation id):                                                                                                                                             
-----------------------------------------------------------                                                                                                                                             
 
   1 - (#keys=0) COUNT(*)[22]                                                                                                                                                                           
   2 - (rowset=906)                                                                                                                                                                                     
 
Note                                                                                                                                                                                                    
-----                                                                                                                                                                                                   
   - statistics feedback used for this statement

Let’s check the details of the cursor

       PHV CHILD_NUMBER PARSE_CALLS EXECUTIONS I I R A
---------- ------------ ----------- ---------- - - - -
 296924608	      0 	  1	     1 N N Y
 296924608	      1 	  1	     1 N N N

We can see that a new child cursor has been created.
The Note section indicates that statistics feedback was used for this statement.

Performance Feedback

When AutoDOP is enabled in adaptive mode, during the first execution of a statement the optimizer determines whether the statement should execute in parallel and what DOP should be used. After the first execution, the execution statistics are used to compute a DOP. If the computed DOP is different from the original DOP, the statement is marked for re-optimization and the execution statistics are stored. When the statement is executed again, the stored execution statistics are used to compute a new DOP.

12c Adaptive Query Optimization – Adaptive Plans – Adaptive Parallel Distribution Methods

In Oracle 11gR2, Oracle uses one of the broadcast, hash, round-robin, range or key distribution methods to redistribute data among parallel query servers. If the cardinality estimate at a given step happened to be incorrect (e.g. the estimate was about 100 rows and in reality there were 100,000 rows), Oracle could end up broadcasting a lot of rows to all the receiving parallel query servers. Another issue frequently encountered was skew in the popular values, resulting in hash distribution sending most of the rows to one or a small subset of the receiving PQ servers (causing them to do most of the work).

To address such situations, in 12cR1, as part of the new adaptive query optimization techniques, Oracle introduced the ability to switch the distribution method from hash to broadcast if the actual number of rows was less than a threshold (in 12.1 the threshold is 2x the degree of parallelism). This new functionality is called Adaptive Parallel Distribution Methods.

If you check the execution plan, you will see that a “Statistics Collector” has been inserted before the send operation, and the send operation is called “PX SEND HYBRID HASH”.

Let us look at an example of a simple hybrid hash distribution.

SQL_ID  d1qmm37g288qz, child number 0                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
select /*+ full(emp) full(dept) parallel(4) */                                                                                                                                                          
dept.deptno,dept.dname,emp.empno,emp.ename from dept,emp where                                                                                                                                          
dept.deptno = emp.deptno                                                                                                                                                                                
 
Plan hash value: 717794318                                                                                                                                                                              
 
------------------------------------------------------------------------------------------------------------------------------------------------                                                        
| Id  | Operation                  | Name     | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib |  OMem |  1Mem |  O/1/M   |                                                        
------------------------------------------------------------------------------------------------------------------------------------------------                                                        
|   0 | SELECT STATEMENT           |          |        |       |     4 (100)|          |        |      |            |       |       |          |                                                        
|   1 |  PX COORDINATOR            |          |        |       |            |          |        |      |            | 73728 | 73728 |          |                                                        
|   2 |   PX SEND QC (RANDOM)      | :TQ10002 |     14 |   770 |     4   (0)| 00:00:01 |  Q1,02 | P->S | QC (RAND)  |       |       |          |                                                        
|*  3 |    HASH JOIN BUFFERED      |          |     14 |   770 |     4   (0)| 00:00:01 |  Q1,02 | PCWP |            |  3105K|  2823K|     4/0/0|                                                        
|   4 |     PX RECEIVE             |          |      4 |    88 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |       |       |          |                                                        
|   5 |      PX SEND HYBRID HASH   | :TQ10000 |      4 |    88 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HYBRID HASH|       |       |          |                                                        
|   6 |       STATISTICS COLLECTOR |          |        |       |            |          |  Q1,00 | PCWC |            |       |       |          |                                                        
|   7 |        PX BLOCK ITERATOR   |          |      4 |    88 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |       |       |          |                                                        
|*  8 |         TABLE ACCESS FULL  | DEPT     |      4 |    88 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |       |       |          |                                                        
|   9 |     PX RECEIVE             |          |     14 |   462 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |       |       |          |                                                        
|  10 |      PX SEND HYBRID HASH   | :TQ10001 |     14 |   462 |     2   (0)| 00:00:01 |  Q1,01 | P->P | HYBRID HASH|       |       |          |                                                        
|  11 |       PX BLOCK ITERATOR    |          |     14 |   462 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |       |       |          |                                                        
|* 12 |        TABLE ACCESS FULL   | EMP      |     14 |   462 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |       |       |          |                                                        
------------------------------------------------------------------------------------------------------------------------------------------------

In the execution plan you can see that in steps 5 and 10 the operation is “PX SEND HYBRID HASH”. During execution, the statistics collector in step 6 buffers the rows being retrieved from DEPT. If the number of rows is below the threshold, all rows from DEPT are broadcast to each of the PQ servers that perform the join. If the number of rows exceeds the threshold, the rows from DEPT are sent using a HASH distribution to the PQ servers that perform the join.

Immediately after the query above is executed, from the same session, you can execute the script pqstat to see how the rows were distributed among the producers and consumers.

SQL> @pqstat
 
DFO_NUMBER      TQ_ID SERVER_TYPE                              PROCESS                    NUM_ROWS      BYTES      WAITS   TIMEOUTS AVG_LATENCY
---------- ---------- ---------------------------------------- ------------------------ ---------- ---------- ---------- ---------- -----------
         1          0 Producer                                 P004                              0         96          0          0           0
                                                               P005                             16        324          0          0           0
                                                               P006                              0         96          0          0           0
                                                               P007                              0         96          0          0           0
                      Consumer                                 P000                              4        153        125        119           0
                                                               P001                              4        153        127        119           0
                                                               P002                              4        153        127        119           0
                                                               P003                              4        153        125        119           0
                    2 Producer                                 P000                              0         24          0          0           0
                                                               P001                             14        375          1          0           0
                                                               P002                              0         24          0          0           0
                                                               P003                              0         24          0          0           0
                      Consumer                                 QC                               14        447          2          0           0


Now let us take a look at a variation of the hybrid hash distribution operation, denoted “PX SEND HYBRID HASH (SKEW)”.

SQL_ID  20fky17tasvn9, child number 0                                                                                                                                                                   
-------------------------------------                                                                                                                                                                   
SELECT /*+ parallel(4) full(e) full(d) gather_plan_statistics */                                                                                                                                        
department_name, sum(salary)   FROM   employees e, departments d                                                                                                                                        
WHERE  d.department_id=e.department_id   GROUP BY department_name                                                                                                                                       
 
Plan hash value: 2940813933                                                                                                                                                                             
 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------    
| Id  | Operation                          | Name        | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib | A-Rows |   A-Time   | Buffers |  OMem |  1Mem |  O/1/M   |    
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------    
|   0 | SELECT STATEMENT                   |             |      1 |        |       |     5 (100)|          |        |      |            |     11 |00:00:00.04 |      24 |       |       |          |    
|   1 |  PX COORDINATOR                    |             |      1 |        |       |            |          |        |      |            |     11 |00:00:00.04 |      24 | 73728 | 73728 |          |    
|   2 |   PX SEND QC (RANDOM)              | :TQ10003    |      0 |     27 |   621 |     5  (20)| 00:00:01 |  Q1,03 | P->S | QC (RAND)  |      0 |00:00:00.01 |       0 |       |       |          |    
|   3 |    HASH GROUP BY                   |             |      4 |     27 |   621 |     5  (20)| 00:00:01 |  Q1,03 | PCWP |            |     11 |00:00:00.01 |       0 |  1071K|  1071K|     4/0/0|    
|   4 |     PX RECEIVE                     |             |      4 |     27 |   621 |     5  (20)| 00:00:01 |  Q1,03 | PCWP |            |     12 |00:00:00.01 |       0 |       |       |          |    
|   5 |      PX SEND HASH                  | :TQ10002    |      0 |     27 |   621 |     5  (20)| 00:00:01 |  Q1,02 | P->P | HASH       |      0 |00:00:00.01 |       0 |       |       |          |    
|   6 |       HASH GROUP BY                |             |      4 |     27 |   621 |     5  (20)| 00:00:01 |  Q1,02 | PCWP |            |     12 |00:00:00.03 |      18 |  1063K|  1063K|     4/0/0|    
|*  7 |        HASH JOIN                   |             |      4 |    106 |  2438 |     4   (0)| 00:00:01 |  Q1,02 | PCWP |            |    106 |00:00:00.03 |      18 |  1572K|  1572K|     4/0/0|    
|   8 |         PX RECEIVE                 |             |      4 |     27 |   432 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |    108 |00:00:00.02 |       0 |       |       |          |    
|   9 |          PX SEND HYBRID HASH       | :TQ10000    |      0 |     27 |   432 |     2   (0)| 00:00:01 |  Q1,00 | P->P | HYBRID HASH|      0 |00:00:00.01 |       0 |       |       |          |    
|  10 |           STATISTICS COLLECTOR     |             |      4 |        |       |            |          |  Q1,00 | PCWC |            |     27 |00:00:00.01 |       9 |       |       |          |    
|  11 |            PX BLOCK ITERATOR       |             |      4 |     27 |   432 |     2   (0)| 00:00:01 |  Q1,00 | PCWC |            |     27 |00:00:00.01 |       9 |       |       |          |    
|* 12 |             TABLE ACCESS FULL      | DEPARTMENTS |      1 |     27 |   432 |     2   (0)| 00:00:01 |  Q1,00 | PCWP |            |     27 |00:00:00.01 |       9 |       |       |          |    
|  13 |         PX RECEIVE                 |             |      4 |    107 |   749 |     2   (0)| 00:00:01 |  Q1,02 | PCWP |            |    107 |00:00:00.01 |      18 |       |       |          |    
|  14 |          PX SEND HYBRID HASH (SKEW)| :TQ10001    |      4 |    107 |   749 |     2   (0)| 00:00:01 |  Q1,01 | P->P | HYBRID HASH|    107 |00:00:00.01 |      18 |       |       |          |    
|  15 |           PX BLOCK ITERATOR        |             |      4 |    107 |   749 |     2   (0)| 00:00:01 |  Q1,01 | PCWC |            |    107 |00:00:00.01 |      18 |       |       |          |    
|* 16 |            TABLE ACCESS FULL       | EMPLOYEES   |      2 |    107 |   749 |     2   (0)| 00:00:01 |  Q1,01 | PCWP |            |    107 |00:00:00.01 |      18 |       |       |          |    
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

In the execution plans you can see the steps “PX SEND HYBRID HASH” (Step 9) and “PX SEND HYBRID HASH (SKEW)” (Step 14).

“PX SEND HYBRID HASH (SKEW)” is a technique used by 12c parallel query to avoid hash distribution of skewed popular key values. During optimization of the query, if histograms are present on the join expression, Oracle finds the popular values and keeps them in the cursor. At execution time, when the build table is being read, all values that match the popular values are BROADCAST to all receiving PQ servers, and non-matching values are distributed by HASH. When the probe table is being read, all values that match the popular values are sent ROUND-ROBIN/RANDOM to the receiving PQ servers, and non-matching values are distributed by HASH.

Please read Randolf Geist's excellent articles, linked below, for in-depth information on this topic.

Parallel Execution Skew – 12c Hybrid Hash Distribution with Skew Detection

12c Hybrid Hash Distribution with Skew Detection / Handling.

AWR – Profiling Database I/O

Oracle AWR (Automatic Workload Repository) captures and stores fine-grained information about the file reads and writes (aka I/O) that the database performed during the execution of application-generated database workloads. When analyzing the read and write patterns of the database, it helps a lot to understand what type of activity is generating the reads and writes. With this stored information, we can get an in-depth understanding of the distribution of random and sequential reads and writes.

I use this information to get a better understanding of the I/O profile for my Exadata sizing exercises.

This information can be used to understand clearly how much of the I/O comes from temp activity, datafile reads and writes, archivelog writes and redo log writes, and whether these are small or large reads and writes.

To the best of my understanding the small reads and writes are those < 128k and the large reads and writes are those > 128k.

This information is contained mainly in two AWR views:

Dba_Hist_Iostat_FileType
Dba_Hist_Iostat_Function

Dba_Hist_Iostat_FileType

This view displays the historical I/O statistics by file type. The main file types are the following:

Archive Log
Archive Log Backup
Control File
Data File
Data File Backup
Data File Copy
Data File Incremental Backup
Data Pump Dump File
Flashback Log
Log File
Other
Temp File

Dba_Hist_Iostat_Function

This view displays the historical I/O statistics by I/O function. The main I/O functions are the following:

Recovery
Buffer Cache Reads
Others
RMAN
Streams AQ
Smart Scan
Data Pump
XDB
Direct Writes
DBWR
LGWR
Direct Reads
Archive Manager
ARCH

From everything I have seen so far, these reads and writes can be directly correlated to the “physical read total IO requests” and “physical write total IO requests” system-level statistics.

I have written a script that displays information from the above-mentioned views and gives a detailed breakdown of the I/O generated by different aspects of database activity.
To fit within the screen real estate, I have limited the columns the script displays (so it shows only the file types I am frequently interested in). Please feel free to take the script and modify it to add columns that you want to display.

The full version of the script  awrioftallpct-pub.sql can be found here.

The script accepts the following inputs:
– The begin snap id of the snap id range you want to report on
– The end snap id of the snap id range you want to report on
– The dbid of the database
– The snap interval in seconds (if you have a 30-minute interval, input 1800 seconds)

A description of all the column names in the output, broken down by section, is provided in the header section of the script.

There are 6 sections to this script

1) Total Reads + Writes
2) Total Reads
3) Total Writes
4) Read write breakdown for datafiles
5) Data File – Direct Path v/s Buffered Read Write breakdown
6) Read write breakdown for tempfiles

1) Total Reads + Writes

This section displays the number of reads+writes by file type, and the reads+writes for each file type as a percentage of total reads+writes. The last column displays the total reads+writes for all file types. The column DTDP shows the I/O that by default bypasses flash cache and goes directly to spinning disk on Exadata (temp + archivelog + flashback log I/O).


2) Total Reads

This section displays the number of reads by file type, and the reads for each file type as a percentage of total reads. The last column displays the total reads for all file types.


3) Total Writes

This section displays the number of writes by file type, and the writes for each file type as a percentage of total writes. The last column displays the total writes for all file types.


4) Read write breakdown for datafiles

This section displays the I/O information pertaining only to datafiles. It displays the small and large reads and writes, both as a percentage of the total reads+writes to datafiles and as a percentage of the total reads or writes to datafiles. It also displays the total small and large reads and writes and the percentage they constitute of the total reads+writes to datafiles.


5) Data File – Direct Path v/s Buffered Read Write breakdown

This section provides a breakdown of I/O by function (as opposed to I/O by file type in the previous sections). The output shows columns that display the direct path small and large reads and writes, the buffered small reads and writes, the smart scan small and large reads, and the other small and large reads and writes.


6) Read write breakdown for tempfiles

This section displays the I/O information pertaining only to tempfiles. It displays the small and large reads and writes, both as a percentage of the total reads+writes to tempfiles and as a percentage of the total reads or writes to tempfiles. It also displays the total small and large reads and writes and the percentage they constitute of the total reads+writes to tempfiles.


The full version of the script  awrioftallpct-pub.sql can be found here.

Oracle Active DataGuard – Considerations for the Wide area Network

Oracle customers use Oracle Active Data Guard to create and maintain one or more standby databases that protect their mission-critical primary databases from disaster. Typically, in such deployments, the primary and standby databases are in geographically separate locations connected via a WAN (Wide Area Network). Log Transport Services transfers large volumes of redo from the primary location to the standby using Sql*Net.

We have to ensure that all the components from the source to the target are set up correctly so that the data transfer can be done with the best throughput possible. If sufficient network bandwidth is not available at reasonable latencies, we will start seeing the log transfer and apply lagging on the standby site (which is Oracle speak for: your primary and standby databases are now out of sync from a data perspective).

One key point to keep in mind is that the lower the network round trip time (aka latency), the higher your data transfer throughput; the higher the round trip time, the lower your throughput. So it is very important to maintain low round trip times on your wide area network.

To understand network data transfer throughput, it is important to understand the concepts of TCP window size and bandwidth delay product (aka BDP).

TCP window size is the number of bytes that can be transmitted without receiving an acknowledgement from the other side. Once a window's worth of bytes has been sent, the sender stops sending and waits for an acknowledgement from the receiver.

Bandwidth delay product is the product of the network bandwidth and the network round trip time: bdp = network bandwidth * round trip time. This is the amount of data that has left the sender before the first acknowledgement is received. If the sender's output bandwidth is stable and the bandwidth is fully used, the BDP measures the amount of data in transit. If we set the TCP window size equal to the bandwidth delay product, then in theory we should be able to fully utilize the available bandwidth.

Setup the network

We have to start by setting up the networking components to support the desired/stated bandwidth. So if your WAN is a 10GigE network, all the NICs (network interface cards), ports and switches in the configuration should be configured for 10GigE full duplex. After setup, run the configuration display utilities and ensure that the transfer speed at each of these components is set to 10GigE. Customers often run into trouble when auto-negotiation causes some NICs to drop to 1GigE because of configuration mismatches.

Use tools like Iperf to test the transfer speeds that your network is capable of achieving.

One important aspect to keep in mind is that the WAN is probably shared by other traffic (e-mail, data replication, SAN replication). This has two important implications that we should consider.

  • If there is a lot of bandwidth consumption by some of this miscellaneous traffic, round trip times could be degraded periodically on the network.
  • We should be careful in our calculations that we do not completely consume the entire bandwidth for redo transport. (This could impact other processes)
    • So it is important to figure out (Working with the network admins) what the bandwidth entitlements are for redo transport and base our calculations on those numbers.

Calculate our BDP

Use the following formula to calculate the bandwidth delay product (BDP):

(bandwidth / 8) * (round trip time in seconds)

The network bandwidth is expressed in bits per second, so we divide by 8 to convert to bytes.
The round trip time is usually in milliseconds, so we divide by 1000 to convert to seconds.

So for example, if we have a 10Gbit network bandwidth and a 40ms round trip time

BDP=(10000000000/8)*(40/1000) = 50,000,000 bytes.
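
The same calculation as a small Python sketch (the bandwidth and RTT are the example values above; substitute your own entitlement numbers):

# Bandwidth delay product: the bytes in flight needed to keep the pipe full.
def bdp_bytes(bandwidth_bps, rtt_ms):
    """BDP in bytes, given bandwidth in bits/sec and round trip time in ms."""
    return (bandwidth_bps / 8) * (rtt_ms / 1000.0)

bdp = bdp_bytes(10_000_000_000, 40)   # 10 Gbit/s, 40 ms
print(bdp)        # 50000000.0 bytes
print(bdp * 3)    # 150000000.0 - the 3 x BDP figure used for the SQL*Net buffers below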

Setup SQL*Net Parameters

The current recommendations for Data Guard redo transport are as follows.

Set the SDU size to 65535

  • We can set the SDU on a per-connection basis using the SDU parameter in the local naming configuration file (TNSNAMES.ORA) and the listener configuration file (LISTENER.ORA).
  • We can set the SDU for all Oracle Net connections with the profile parameter DEFAULT_SDU_SIZE in the SQLNET.ORA file.

Set TCP.NODELAY to YES

To preempt delays in buffer flushing in the TCP protocol stack, disable the TCP Nagle algorithm by setting TCP.NODELAY to YES in the SQLNET.ORA file on both the primary and standby systems.
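
As an illustration, the profile-level settings in SQLNET.ORA would look like the sketch below (per-connection SDU settings in TNSNAMES.ORA and LISTENER.ORA follow the same idea):

# sqlnet.ora on both the primary and the standby systems
DEFAULT_SDU_SIZE=65535
TCP.NODELAY=YES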

Setup RECV_BUF_SIZE and SEND_BUF_SIZE

The current recommendation is to set the SEND_BUF_SIZE and RECV_BUF_SIZE parameters (which are the send and receive socket buffer sizes for SQL*Net) to 3 times the BDP.

As per the above example, we would set them to 50,000,000 * 3 = 150,000,000.
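
Continuing the sketch, the corresponding SQLNET.ORA entries would look like this (the value comes from the example above; derive yours from your own BDP):

# sqlnet.ora - socket buffer sizes, set to 3 x BDP (illustrative value)
SEND_BUF_SIZE=150000000
RECV_BUF_SIZE=150000000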

Setup Operating system Kernel Parameters

If you are using the Linux operating system, make sure that the values of the following kernel parameters are set higher than the values used for RECV_BUF_SIZE and SEND_BUF_SIZE.

net.core.rmem_max
net.core.wmem_max
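
On Linux these can be checked and, if needed, raised along the following lines (the values are illustrative and should be sized against your own buffer settings):

# check the current limits
sysctl net.core.rmem_max net.core.wmem_max

# raise them, e.g. in /etc/sysctl.conf, then reload with: sysctl -p
net.core.rmem_max = 150000000
net.core.wmem_max = 150000000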

Once we have configured the network, the operating system and SQL*Net, and redo transport is running, we can monitor how the network bandwidth is being utilized and make appropriate adjustments.

Links to helpful Documents

Iperf

How to calculate TCP throughput for long distance links (blog)

Oracle Net Performance Tuning (MOS)

Setting Send and Receive Buffer Sizes (MOS)

Tuning SQL*Net performance (Oracle Docs)

Configuring Oracle Data Guard (Oracle Docs)

Best Practices for Sync Data Transport (White Paper)

Script to compare tkprof output files

I often use the Oracle 10046 event tracing mechanism to capture the SQLs from a session, to identify why certain transactions run slower in different environments or at different points in time. Oracle does have a mechanism to save the trace information in database tables: one can use the INSERT parameter of tkprof to store the trace information in a database table. Once it is in the table, you can write SQL that compares multiple runs or multiple statements.
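
For reference, a tkprof invocation that generates such a load script looks roughly like this (the trace and output file names are placeholders):

tkprof orcl_ora_12345.trc orcl_ora_12345.prf insert=load_trace.sql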

I wrote a Python program that compares two different tkprof output files. The files are compared, and the following aspects of each of the SQL IDs in the tkprof output files are printed side by side. The output is sorted by the difference in elapsed time, in descending order.

  • Sql text
  • Plan Hash Value
  • Total Elapsed time
  • Total Logical Reads
  • Total Rows processed
  • Difference in Elapsed Time
  • Difference in Number of Rows processed
  • Difference in Logical reads.

Other columns can be added to this, if you desire.
I use this script's output as a quick way to see which SQLs are running slower and are probably candidates for further analysis/tuning.

The SQL IDs from the file provided as the first argument to the script (referred to as the left file) are compared to the same SQL IDs in the file provided as the second argument (referred to as the right file). The following columns are displayed.

sqlid        SQL ID being compared
text         First 20 characters of the SQL text
lplan        Plan hash value from the left file
rplan        Plan hash value from the right file
lela         Total elapsed time from the left file
rela         Total elapsed time from the right file
llreads      Total logical reads (query + current) from the left file
rlreads      Total logical reads (query + current) from the right file
lrows        Total rows processed from the left file
rrows        Total rows processed from the right file
eladiff      lela - rela
rowsdiff     lrows - rrows
lreadsdiff   llreads - rlreads

Here is a sample syntax for running the script. (You need the python pandas package to be installed for this to execute successfully)

python ./difftk.py /u01/tkprofout/Newplans.prf /u01/tkprofout/Stage.prf

Here is a sample output

[Image: sample output of difftk.py]

The full script is below

#Python script to list differences between sql executions in two tkprof output files
#useful if comparing tkprof from prod and dev for example
#Author : rajeev.ramdas
 
import sys
import os
import pandas as pd
from pandas import DataFrame
 
# Define a class to hold the counters for each sqlid
class sqliddet:
    def __init__(self):
        self.sqlid=''
        self.text=''
        self.plan_hash=''
        self.tcount=0
        self.tcpu=0
        self.tela=0
        self.tdisk=0
        self.tquery=0
        self.tcurr=0
        self.trows=0
 
# Define 2 empty dictionaries to store info about each input file
leftsqliddict={}
rightsqliddict={}
 
# Process each file and add one row per sqlid to the dictionary
# We want to add the row to the dictionary only after the SQLID row and the total row has been read
# So the firstsqlid flag is used to make sure that we do not insert before total is read for the first row.
 
def processfile(p_file,p_sqliddict):

    myfile=open(p_file,"r")
    firstsqlid=True
    linespastsqlid=99
    for line in myfile:
        linespastsqlid+=1
        if line.startswith('SQL ID'):
            linespastsqlid=0
            if firstsqlid==True:
                firstsqlid=False
            else:
                # store the totals for the previous sqlid before starting a new one
                p_sqliddict[currsqlid.sqlid]=[currsqlid.plan_hash,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery
                            ,currsqlid.tcurr,currsqlid.trows,currsqlid.text]
            currsqlid=sqliddet()
            currsqlid.sqlid=line.split()[2]
            currsqlid.plan_hash=line.split()[5]
        # the sql text appears two lines after the 'SQL ID' line
        if linespastsqlid==2:
            currsqlid.text=line[0:20]
        # the 'total' line carries count, cpu, elapsed, disk, query, current, rows
        if line.startswith('total'):
            a,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery,currsqlid.tcurr,currsqlid.trows=line.split()
        # the OVERALL section marks the end of the per-statement listing,
        # so store the totals for the last sqlid
        if line.startswith('OVERALL'):
            p_sqliddict[currsqlid.sqlid]=[currsqlid.plan_hash,currsqlid.tcount,currsqlid.tcpu,currsqlid.tela,currsqlid.tdisk,currsqlid.tquery
                       ,currsqlid.tcurr,currsqlid.trows,currsqlid.text]
    myfile.close()
 
# Main portion of script
if len(sys.argv) != 3:
   print('Syntax : python ./difftk.py tkprof1.out tkprof2.out')
   sys.exit()
 
if not os.path.isfile(sys.argv[1]) or not os.path.isfile(sys.argv[2]):
   print ("File Does not Exist")
   sys.exit()
 
processfile(sys.argv[1],leftsqliddict)
processfile(sys.argv[2],rightsqliddict)
 
t_difftk_lst=[]
 
# Match the sqlid's from the file on the left to the file on the right
# Gather up the statistics from both sides, insert into a list
# Transfer the list to a pandas dataframe, add some computed columns
 
for sqlid,stats in leftsqliddict.items():
    l_totlogical=int(stats[5])+int(stats[6])
    if sqlid in rightsqliddict:
       t_difftk_lst.append([sqlid,stats[8].rstrip(),stats[0],rightsqliddict[sqlid][0],float(stats[3])
                            ,float(rightsqliddict[sqlid][3]),float(l_totlogical),float(rightsqliddict[sqlid][5])+float(rightsqliddict[sqlid][6])
                            ,float(stats[7]),float(rightsqliddict[sqlid][7])
                           ])
    else:
       t_difftk_lst.append([sqlid,stats[8].rstrip(),stats[0],0,float(stats[3]),0
                            ,float(l_totlogical),0,float(stats[7]),0
                           ])
 
difftk_df=DataFrame(t_difftk_lst,columns=['sqlid','sqltext','lplan','rplan','lela','rela','llreads','rlreads','lrows','rrows'])
difftk_df['eladiff']=difftk_df['lela']-difftk_df['rela']
difftk_df['rowsdiff']=difftk_df['lrows']-difftk_df['rrows']
difftk_df['lreadsdiff']=difftk_df['llreads']-difftk_df['rlreads']
 
pd.set_option('display.width',1000)
print (difftk_df.sort_values(by='eladiff',ascending=False))

Oracle Database In-Memory an introduction Part 2 – What do I need to do to use the Oracle In-Memory Database?

Step 1) Define the INMEMORY_SIZE

You have to set the correct value for the database initialization parameter INMEMORY_SIZE. This parameter specifies the amount of memory, from the SGA, that is to be used for the In-Memory column store. This is a static pool (i.e. Automatic Memory Management cannot extend or shrink this area), which means that you have to restart the database for any change to this parameter to take effect. The In-Memory area is sub-divided into two pools: a 1MB pool used to store the actual column-formatted data populated into memory, and a 64K pool used to store metadata about the objects that are populated into the IM column store.
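
For example, the parameter can be set in the spfile and picked up at the next restart (the 500M value is simply the one used in the output below):

ALTER SYSTEM SET INMEMORY_SIZE = 500M SCOPE=SPFILE;
-- restart the database for the new size to take effect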

 

sho parameter inmemory_size
 
NAME				     TYPE	 VALUE
------------------------------------ ----------- ------------------------------
inmemory_size			     big integer 500M
 
select pool,alloc_bytes,used_bytes,populate_status from v$inmemory_area;
 
POOL			   ALLOC_BYTES USED_BYTES POPULATE_STATUS
-------------------------- ----------- ---------- --------------------------
1MB POOL		     418381824		0 DONE
64KB POOL		     100663296		0 DONE

Step 2) Mark the performance critical objects in your database with the attribute INMEMORY

select partition_name,bytes/(1024),inmemory,inmemory_compression from dba_segments where
owner = 'SH' and segment_name = 'SALES';
 
PARTITION_NAME		       BYTES/(1024) INMEMORY INMEMORY_COMPRESS
------------------------------ ------------ -------- -----------------
SALES_Q1_1998			       8192 DISABLED
 
ALTER TABLE SH.SALES MODIFY PARTITION SALES_Q1_1998 INMEMORY PRIORITY CRITICAL MEMCOMPRESS FOR QUERY HIGH;
 
Table altered.
 
select partition_name,bytes/(1024),inmemory,inmemory_compression from dba_segments where
owner = 'SH' and segment_name = 'SALES';
 
PARTITION_NAME		       BYTES/(1024) INMEMORY INMEMORY_COMPRESS
------------------------------ ------------ -------- -----------------
SALES_Q1_1998			       8192 ENABLED  FOR QUERY HIGH

Step 3) Populate the In-Memory datastore

Objects are populated into the In-Memory Datastore, in a prioritized list, immediately after the database is opened, or after they are scanned for the first time. There are five levels for the keyword PRIORITY (CRITICAL, HIGH, MEDIUM, LOW, NONE).

The IM column store is populated by a set of background processes referred to as worker processes (e.g. ora_w001_orcl). The database is fully active and accessible while this occurs. Each worker process is given a subset of database blocks from the object to populate into the IM column store. Population is a streaming mechanism, simultaneously columnizing and compressing the data. There is a new IMCO (In-Memory Coordinator) background process, which wakes up every 2 minutes and checks whether there are any population tasks that need to be completed, e.g. a new object has been marked INMEMORY with a PRIORITY other than NONE.

select v.owner,v.segment_name,v.partition_name,v.bytes orig_size,v.inmemory_size in_mem_size
from v$im_segments v;
 
OWNER	   SEGMENT_NA PARTITION_NAME		      ORIG_SIZE IN_MEM_SIZE
---------- ---------- ------------------------------ ---------- -----------
SH	   SALES      SALES_Q1_1998			8388608     1179648
 
select * from 
(
select /*+ full(sales) */ channel_id,count(*)
from sh.sales partition (sales_q1_1998)
group by channel_id
order by count(*) desc
)
where rownum < 6
 
CHANNEL_ID   COUNT(*)
---------- ----------
	 3	32796
	 2	 6602
	 4	 3926
	 9	  363
 
Elapsed: 00:00:00.09
 
select * from table (dbms_xplan.display_cursor());
 
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------
SQL_ID	40pjk921r3jrc, child number 0
-------------------------------------
select * from ( select /*+ full(sales) */ channel_id,count(*) from
sh.sales partition (sales_q1_1998) group by channel_id order by
count(*) desc ) where rownum < 6
 
Plan hash value: 2962696457
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation			| Name	| Rows	| Bytes | Cost (%CPU)| Time	| Pstart| Pstop |
---------------------------------------------------------------------------------------------------------
 
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT		|	|	|	|    12 (100)|		|	|	|
|*  1 |  COUNT STOPKEY			|	|	|	|	     |		|	|	|
|   2 |   VIEW				|	|     4 |   104 |    12  (34)| 00:00:01 |	|	|
|*  3 |    SORT ORDER BY STOPKEY	|	|     4 |    12 |    12  (34)| 00:00:01 |	|	|
|   4 |     HASH GROUP BY		|	|     4 |    12 |    12  (34)| 00:00:01 |	|	|
|   5 |      PARTITION RANGE SINGLE	|	| 43687 |   127K|     9  (12)| 00:00:01 |     5 |     5 |
|   6 |       TABLE ACCESS INMEMORY FULL| SALES | 43687 |   127K|     9  (12)| 00:00:01 |     5 |     5 |
---------------------------------------------------------------------------------------------------------

For much more in-depth technical details of the Oracle Database In-Memory, please see this whitepaper.

Oracle Database In-Memory an introduction Part 1 – What is Oracle Database In-Memory, and how is it different?

On July 22nd, 2014, Oracle Corporation announced Oracle Database 12c's latest patch release, 12.1.0.2. This patch release includes the new Oracle Database In-Memory functionality.
Oracle Database In-Memory enables a single database to efficiently support mixed workloads. It uses a "dual-format" architecture that retains the record-setting OLTP performance of the Oracle database, while simultaneously supporting real-time analytics and reporting. This is achieved by retaining the traditional Oracle memory architecture, but adding a new, purely in-memory column format (automatically created and maintained by Oracle), optimized for analytical processing.

So now you have the data stored in your database files in a row format, and, for any of the objects marked INMEMORY, Oracle creates an In-Memory column store, where the data resides in a column format. The IM column store does not replace the buffer cache, but acts as a supplement, so that data can now be stored in memory in both a row and a column format.

OK, why the dual format, one would ask?

The row format is retained as is, so that there is no compromise or degradation in the OLTP performance of the database. In a row-format database, each row is made up of multiple columns, with each column representing an attribute of that record. A column-format database stores each of the attributes of the transaction in a separate column structure. A column format is ideal for analytics, but is not very efficient at processing DML requests like inserts, updates and deletes (which operate on whole rows). Oracle Database In-Memory provides the best of both worlds by allowing data to be simultaneously populated in both an in-memory row format (the buffer cache) and a new in-memory column format (the In-Memory column store).

[Image: the In-Memory area in the SGA]

[Image: an example of the In-Memory column store]

No changes are necessary to your existing applications in order to take advantage of the Oracle Database In-Memory option. Any query that will benefit from the In-Memory column store is automatically directed there by the optimizer. The In-Memory store is kept transactionally consistent with the buffer cache. Numerous optimizations have been implemented to speed up data access in the In-Memory store. All the database functionality that Oracle has built over the last 30 years continues to work in this new version.

It is natural that folks would compare Oracle Database In-Memory with other in-memory database products available in the market today. So let us explore the differences with some of those products next.

TimesTen

Oracle TimesTen In-Memory Database is a row-oriented, memory-optimized relational database that resides entirely in the physical memory of a server. Oracle TimesTen operates on databases that fit entirely in physical memory, using standard SQL interfaces. In contrast, Oracle Database In-Memory stores only selected objects in memory, in a column format (as opposed to the row format in Oracle TimesTen); the remaining objects continue to exist on the storage subsystem.

SAP HANA

SAP HANA is an in-memory, column-oriented relational database, so the entire database has to reside in the physical memory of the server. As mentioned earlier in this article, OLTP transactions have some disadvantages when operating on column stores.

Today's databases can be tens or hundreds of terabytes in size. Storing all of this data in physical memory can be an expensive proposition, not to mention that it is quite common that only a small subset of this data sees daily usage. This is where Oracle's approach of storing only selected objects in memory has significant benefits.

Here is a link to the Oracle Database In-Memory Launch.