Notes on Oracle Parallel Query – Part 0

Oracle introduced “Parallel Query” in version 7.1 in 1994. A decade and a half later, it still trips some unsuspecting customers up, when they

  • Turn parallelism on via a hint in all queries

Or

  • Turn parallelism on at every table and index level

Doing this, coupled with high concurrent usage of the database, can lead to a lot of query server processes being spawned, leading to very high cpu usage, high load average, high number of processes waiting on the cpu run queue and high amount of i/o requests. Such situations may result in overall degraded performance. Below are some concepts to note when using parallel query. Sometimes Too much of a good thing is indeed a Bad thing.

Parallel query (and/or parallel DML) is a mechanism to levarage the cpu and i/o resources available on an SMP system to break down a database operation (query, insert , update, delete,merge) into multiple small operations, execute the small operations concurrently on the system, completing the database operation significantly faster than if it would have executed in a serial fashion. Database operations that include large table scans, large index scans or large joins usually benefit highly from using parallel query.

Oracle does not turn this feature on by default for all queries. There are specific steps to perform to turn this on as discussed in my previous post.

The total number of parallel query server’s that can be running at any given time is determined by the initialization parameter parallel_max_servers. This parameter defaults to cpu_count*parallel_threads_per_cpu*(2 if pga_aggregate_target > 0, else 1)*5

Oracle’s recommendation is that you set parallel_max_servers to 2 * Degree of Parallelism * Number of concurrent users

The reason for the recommendation 2 * DOP is that, if the operation can use “Inter Operation” parallelism, it could end up using 2 sets of Query Server Processes at the same time.

“Intra Operation” parallelism means that one operation is executed in parallel using multiple Query Server Processes (eg: Full table scan). “Inter Operation” parallelism means that the output of one parallel operation is being pipelined to another parallel operation (eg: When there are two tables being full table scanned followed by a hash join, so here the first set of query servers could be scanning one table and the second set consuming those rows and building the hash table). So there are circumstances where the operation ends up using two sets of Query Server Processes. Hence the recommendation of 2*DOP.

You ideally want to keep the CPU usage under 70% with your load averages equal to your number of cpu’s and average run queues as close to 0 as possible. The idea is to work the server efficiently but not work it to death.

Keep in mind that on systems like sun’s t2000 and t5240 systems, each thread shows up as a cpu in cpu_count .

You can control the use of parallel query server’s by setting the initialization parameter parallel_adaptive_multiuser=true (It defaults to true). When enabled this parameter causes oracle to enable an adaptive algorithm that reduces the degree of parallelism used by queries based on the database load.

One another common mistake is that for statements like insert /*+ append */ ….select…..  users forget to enable parallel dml. When running normally the select part of the insert gets parallelized (as long as dop is set) but the insert runs in a serial fashion. Once you enable parallel dml even the insert is parallelized.

You have to explicitly issue the command ‘alter session enable parallel dml’ to enable parallel dml. You should avoid turning parallel dml ON  for all sessions and all transactions. Turn it selectively on for the sessions you need parallel dml. There are some restrictions for using parallel dml.

Consider increasing parallel_execution_message size to 4k or 8k from the default of 2k, for improved parallel execution performance.

Determining the best degree of parallelism for tables is a tough problem. You need to consider the following factors.

  • What is the required query response time by the end user
  • How many parallel query servers does it take to achieve that query response time
  • Haw many of such large queries will concurrently execute on my server (Consider your application generated queries and ad hoc queries)
  • How many CPU’s do i have on the system.
  • How much i/o volumes can the system handle

Once you have the above information you need to determine a degree of parallelism that gives a reasonable query response time, when your average number of large queries execute concurrently, without clobbering your CPU, memory and i/o subsystems (Alternatively you could just determine the number of cpu’s, amount of memory, i/o channels and physical disks you need to sustain the above workload and go buy and use that hardware :-)).

Also see my previous post on enabling and monitoring parallel query for more parallel query info.

Notes on Oracle Parallel Query – Part I

For a good primer on parallel query in oracle please read the white paper, Oracle Sql Parallel Execution

  • The degree of parallelism for a query can be specified
    • By specifying the parallel clause during table/index creation (Or later using alter)
    • Using a parallel hint in the sql statement
    • Using ‘alter session force parallel query parallel integer’ statement
  • If One of the options in the above statement has not been specified (To enable parallelism) then, irrespective of whether the table or index is partitioned or not, oracle does not use parallelism in the query
  • When using The dedicated server connection model (As opposed to the shared server model), the sessions shadow process acts as the query co-ordinator.
  • The query uses as many “parallel execution servers” as determined by the query optimizer, to execute the query (This is based on a varitey of factors)
  • When there is No parallel degree specified at the table level, in the query hint or at the session level, but parallelism is enabled at the session level (using alter session  force parallel query) then the optimizer uses the default degree of parallelism.
  • The Default Degree of Parellism (DOP) is determined by cpu_count x parallel_threads_per_cpu
  • If you specified parallelism by issueing ‘alter session force parallel query parallel integer’ statement, then the value used for integer is used for the degree of parallelism (If a parallel hint is not specified in the query).
  • All of the above statements assume that there are enough query servers available in the pool and parallel_adaptive_multi_user does not reduce the number of parallel execution servers.
  • If interoperation parallelism can be used then you could end up using double the number of parallel execution servers as the degree of parallelism
  • If for some reason your query cannot be allocated enough parallel execution servers (typically when there are multiple sessions and all the parallel execution servers (gated by parallel_max_servers) are currently in use) and parallel_min_percent > 0 (0 is the default so your query will still get run without parallelism), your query will get an error and you can try again later.

Ok enough basics.

Once a query is running, How do you determine what it is doing ? Things like, is it using parallel query ? What is the degree of parallelism it is using ?

As the query you want to diagnose is still executing, you can run the following query from another session connected with DBA privileges.

SELECT qcsid,
sid,
NVL(server_group,0) server_group,
server_set,
degree,
req_degree
FROM SYS.V_$PX_SESSION
ORDER BY qcsid,
NVL(server_group,0),
server_set;

You will get  an output similar to the table below

QCSID	  SID SERVER_GROUP SERVER_SET	  DEGREE REQ_DEGREE
--------- ---------- ------------ ---------- ---------- ----------
170	  170		 0
170	  136		 1	    1	       5	  5
170	  129		 1	    1	       5	  5
170	  134		 1	    1	       5	  5
170	  130		 1	    1	       5	  5
170	  138		 1	    1	       5	  5
170	  137		 1	    2	       5	  5
170	  133		 1	    2	       5	  5
170	  135		 1	    2	       5	  5
170	  139		 1	    2	       5	  5
170	  131		 1	    2	       5	  5

How do you check how many parallel execution servers are currently in use by running ?

select
statistic,
value
from
v$pq_sysstat
where
statistic like ‘Servers Busy%’;

You will get an output similar to the table below

STATISTIC			    VALUE
------------------------------ ----------
Servers Busy			       10

How do you check, which parallel execution servers are in use, by which session ?

select * from v$px_process;

You will see a result similar to the table below

SERV STATUS	      PID SPID		      SID    SERIAL#
---- --------- ---------- ------------ ---------- ----------
P000 IN USE	       27 9555		      129	  47
P001 IN USE	       29 9557		      131	  14
P003 IN USE	       31 9561		      132	 119
P009 IN USE	       37 9573		      133	  13
P008 IN USE	       36 9571		      134	  10
P004 IN USE	       32 9563		      135	   7
P007 IN USE	       35 9569		      136	   9
P002 IN USE	       30 9559		      137	  10
P005 IN USE	       33 9565		      138	   9
P006 IN USE	       34 9567		      139	   6

How can you monitor the progress of the query ?

SELECT   a.qcsid,
a.sid,
a.server_group,
a.server_set,
substr(b.name,1,20) operation,
a.value
FROM     v$px_sesstat a,
v$statname b
WHERE    a.statistic# = b.statistic#
AND      UPPER(b.name) = ‘PHYSICAL READS’
ORDER BY a.qcsid,
a.server_group,
a.server_set;

You will see results similar to the Table below

     QCSID  SID SERVER_GROUP SERVER_SET OPERATION		  VALUE
---------- ---- ------------ ---------- -------------------- ----------
       170  139 	   1	      1 physical reads		      0
       170  130 	   1	      1 physical reads		      0
       170  134 	   1	      1 physical reads		      0
       170  135 	   1	      1 physical reads		      0
       170  129 	   1	      1 physical reads		      0
       170  131 	   1	      2 physical reads		   6980
       170  136 	   1	      2 physical reads		   7136
       170  137 	   1	      2 physical reads		   6404
       170  133 	   1	      2 physical reads		   6436
       170  138 	   1	      2 physical reads		   5988
       170  170 			physical reads		 766852

How do you check, what the parallel execution servers, are waiting on ?

SELECT px.SID “SID”, p.PID, p.SPID “SPID”, px.INST_ID “Inst”,
px.SERVER_GROUP “Group”, px.SERVER_SET “Set”,
px.DEGREE “Degree”, px.REQ_DEGREE “Req Degree”, w.event “Wait Event”
FROM GV$SESSION s, GV$PX_SESSION px, GV$PROCESS p, GV$SESSION_WAIT w
WHERE s.sid (+) = px.sid AND s.inst_id (+) = px.inst_id AND
s.sid = w.sid (+) AND s.inst_id = w.inst_id (+) AND
s.paddr = p.addr (+) AND s.inst_id = p.inst_id (+)
ORDER BY DECODE(px.QCINST_ID,  NULL, px.INST_ID,  px.QCINST_ID), px.QCSID,
DECODE(px.SERVER_GROUP, NULL, 0, px.SERVER_GROUP), px.SERVER_SET, px.INST_ID;

You will see results like in the table below

 SID  PID SPID	 Inst Group	   Set Degree Req Degree Wait Event
---- ---- ------ ---- ----- ---------- ------ ---------- ------------------------------
 170   24 9473	    1					 PX Deq: Execute Reply
 131   27 10236     1	  1	     1	    5	       5 PX Deq: Execution Msg
 132   29 10238     1	  1	     1	    5	       5 PX Deq: Execution Msg
 129   20 10234     1	  1	     1	    5	       5 PX Deq: Execution Msg
 134   31 10242     1	  1	     1	    5	       5 PX Deq: Execution Msg
 139   30 10240     1	  1	     1	    5	       5 PX Deq: Execution Msg
 135   36 10252     1	  1	     2	    5	       5 resmgr:cpu quantum
 136   34 10248     1	  1	     2	    5	       5 resmgr:cpu quantum
 137   35 10250     1	  1	     2	    5	       5 direct path read
 130   33 10246     1	  1	     2	    5	       5 resmgr:cpu quantum
 133   32 10244     1	  1	     2	    5	       5 direct path read

If you start seeing a lot of waits on the event “PX Deq Credit: send blkd”, you can start troubleshooting this by identifying the consumers who are blocking the send (Once you identify them you can probably drill down into that session using ash reports). Metalink Note 304317.1 has a query that can be used to identify the blocking consumers.

Once a statement has finished execution, how do you check if it used parallel query, and how was the statement parellelized ?

Issue the following statement from the same session that the query executed

break on tq_id on server_type
SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
FROM v$pq_tqstat
ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

You will see results similar to the table below

DFO_NUMBER	TQ_ID SERVER_TYP PROCESS      NUM_ROWS	    BYTES
---------- ---------- ---------- ---------- ---------- ----------
	 1	    0 Producer	 P005		   287	     2582
	 1			 P006		   287	     2582
	 1			 P007		   287	     2582
	 1			 P008		   287	     2582
	 1			 P009		   287	     2582
	 1	      Consumer	 P000		   315	     2825
	 1			 P001		   340	     3035
	 1			 P002		   270	     2450
	 1			 P003		   250	     2280
	 1			 P004		   260	     2320
	 1	    1 Producer	 P000		     1	       32
	 1			 P001		     1	       32
	 1			 P002		     1	       32
	 1			 P003		     1	       32
	 1			 P004		     1	       32
	 1	      Consumer	 QC		     5	      160

How do you review all the parallel activity happening in your environment ?

select name,value from v$sysstat
where upper(name) like ‘%PARALLEL OPERATIONS%’
or upper(name) like ‘%PARALLELIZED%’ or upper(name) like ‘%PX%’;

You will see results similar to the table below

NAME								      VALUE
---------------------------------------------------------------- ----------
queries parallelized							 10
DML statements parallelized						  0
DDL statements parallelized						  0
DFO trees parallelized							 10
Parallel operations not downgraded					 10
Parallel operations downgraded to serial				  0
Parallel operations downgraded 75 to 99 pct				  0
Parallel operations downgraded 50 to 75 pct				  0
Parallel operations downgraded 25 to 50 pct				  0
Parallel operations downgraded 1 to 25 pct				  0
PX local messages sent						       2350
PX local messages recv'd					       2350
PX remote messages sent 						  0
PX remote messages recv'd						  0

This should be a good starting point to start analyzing parallel query behaviour

Controlling Parallel query in 11g Rac

            In a Rac environment, when you execute a parallel query, it is very likely that the parallel execution servers, get executed on all nodes in a Rac cluster. In releases prior to Oracle 11g, if you wanted to ensure that all the parallel execution servers for a single query get executed on the same node (or a group of nodes) you had to set the initialization parameter parallel_instance_groups

           Starting with Oracle 11g, you can just create services which are only active on certain nodes in the Rac cluster . So when you login to the database using such a service name, your parallel query will only spawn parallel execution servers on the nodes where the service is active. Oracle automatically adjusts the values for parallel_instance_groups (Without you having to explicitly set it) based on your service name you used to connect. Simplifies our life.