Tuning High-Volume SQL
Mr and Mrs Oracle at the Supermarket
I was at the supermarket the other day waiting my turn at the checkout behind another guy. The checkout-chick (I'm sure there is a more PC term, I just don't know what it is) had just finished scanning his groceries when he asked her to wait until his wife returned with a few last-minute items. I've done it before, so steam didn't start coming out of my ears - yet. Fifteen seconds later his wife ran up - sorry, sorry - and the last few items were checked out, so I wasn't really inconvenienced.
As it turned out, their technique was efficient. If they had returned to the aisles with the shopping trolley to get the last few items, they would almost certainly have been behind me at the checkout; 30 seconds of extra items could have cost them minutes at the checkout.
How does this technique scale, though? What if he had walked into the supermarket and reserved a place at the checkout whilst his wife ran back and forth with groceries - one aisle at a time? It is inconceivable that she could fetch groceries as fast as the checkout-chick could check them out, so inevitably there would be considerable waiting time between checking out each armful of groceries. Also, the collection itself would take longer because of the extra ground covered. Although I haven't performed a study, logic tells me this technique would be slower. By lots!
So, it makes sense if you only need to go back for one or two more things, but not more. It's hardly surprising that you don't see people doing it, right?
But here's the thing. I DO see it all the time. Not at the supermarket with groceries; but in Oracle with data. The database equivalent of running back and forth to the checkout is called Nested I/O. Nested I/O is really efficient for small volumes, but it doesn't scale. It is the cause of almost every poorly tuned high-volume SQL I have ever seen.
More is always more
Consider a sub-optimal SQL (SQL1) that takes 10 minutes to run and its optimal tuned equivalent (SQL2) that takes 1 minute to run. They perform the same logic and return the same results, but one is faster than the other. There can only be one reason why SQL1 is slower: because it is doing more work, and since they are producing the same results, this extra work is clearly redundant.
So the secret to tuning High-Volume SQL (indeed - tuning any SQL) is to eliminate the redundant work, leaving only the minimum effort required to generate the result. In the context of SQL tuning, "More Work" comes in two basic forms:
- Full table scans - A Full Table Scan reads all of the rows in a table or table partition. If you don't need every row to satisfy your SQL, then some of them have been read redundantly. The solution is generally to create an index so that only the required rows are read - but this is a characteristic of low-volume SQL. Low-volume SQLs read small proportions of large tables, so full scans that discard heaps of data become very expensive. Conversely, high-volume SQLs tend to read large proportions of large tables; a Full Table Scan discards much less data and becomes pretty efficient.
Indexes are not the solution to high-volume SQL performance problems. That is not a mis-print. Generally speaking, a high-volume SQL will produce its optimal performance without using a single index.
- Nested I/O - Nesting is when you perform the same operation for every element in a data set: for each A, do B. Nested I/O is when the nested operation (B) is an I/O operation. Like at the supermarket, it is much more efficient to do all your I/O in one hit rather than bit by bit. There's a whole 'nuther article on that concept alone, but to give you some idea, consider the following:
EXAMPLE
Consider a nested operation where, for each of 1,000,000 SALES transactions, we look up the customer in the CUSTOMER table, which contains 10,000 rows. For each sales transaction, we must:
- Read the root block of the index.
- Navigate to the leaf block of the index, reading intermediate branch blocks along the way. This will vary depending on the size of the indexed column(s) and the size of the table, but an extra 2 blocks is not uncommon.
- Use the ROWID in the leaf block of the index to look up the customer in the table.
That's a total of 4 blocks to process a single sales transaction. Over 1,000,000 sales transactions, that's 4,000,000 block reads just to pick up the customer. Consider that the CUSTOMER table may fit perhaps 50 customers in a single block (probably more, but let's err on the conservative side). This means that the entire table occupies no more than 200 blocks in total. Yes, that's right: we read 4 million blocks from a 200-block table!
By nesting the customer lookup, we have increased the number of blocks read 20,000-fold. In other words, 2 million percent more I/O.
OK, this is a bit of an over-dramatisation, because most of those 4,000,000 block reads will be cached. But make no mistake: even reading cached blocks is not free. This overhead is real and very expensive.
Eliminating Nested I/O is the key to tuning High-Volume SQL.
Sounds simple, but it doesn't really help. If I have a slow SQL, how do I tune it? I never asked it to perform nested I/O, so how do I ask it NOT to? This is where it gets a little complex because there are a number of common problems that cause nested I/O; there is a separate section below for each case.
Nested Loops Joins
Nested Loops Joins are the most common and straightforward type of nesting in Oracle. When joining two tables, for each row in one table Oracle looks up the matching rows in the other table. Consider the following SQL Trace output from TKProf.
SELECT /*+RULE*/ empno
,      ename
,      dname
FROM   emp
JOIN   dept USING (deptno)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         26          0          14
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0         26          0          14

Misses in library cache during parse: 1
Optimizer mode: RULE
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
     14  NESTED LOOPS  (cr=26 pr=0 pw=0 time=0 us)
     14   NESTED LOOPS  (cr=12 pr=0 pw=0 time=15 us)
     14    TABLE ACCESS FULL EMP (cr=8 pr=0 pw=0 time=5 us)
     14    INDEX UNIQUE SCAN SYS_C009725 (cr=4 pr=0 pw=0 time=0 us)(object id 70296)
     14   TABLE ACCESS BY INDEX ROWID DEPT (cr=14 pr=0 pw=0 time=0 us)
Starting with the inner-most step of the plan, we see a full table scan of EMP returning 14 rows, and a unique index lookup on DEPT for each of those rows. So we see some wastage already; we have retrieved 14 rows from DEPT even though the table contains only 4 rows in total.
So how can we stop this happening? Keen observers will note the /*+RULE*/ hint in the above SQL. The Cost Based Optimiser is not stupid; it knows that Nested Loops is a wasteful plan for this SQL. The Rule Based Optimiser, however, is stupid; one of its rules is to use an index if it is available. This problem will fix itself when we remove the hint.
SELECT empno
,      ename
,      dname
FROM   emp
JOIN   dept USING (deptno)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         11          0          14
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0         11          0          14

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
     14  MERGE JOIN  (cr=11 pr=0 pw=0 time=0 us cost=6 size=364 card=14)
      4   TABLE ACCESS BY INDEX ROWID DEPT (cr=4 pr=0 pw=0 time=9 us cost=2 size=52 card=4)
      4    INDEX FULL SCAN SYS_C009725 (cr=2 pr=0 pw=0 time=4 us cost=1 size=0 card=4)(object id 70296)
     14   SORT JOIN (cr=7 pr=0 pw=0 time=2 us cost=4 size=182 card=14)
     14    TABLE ACCESS FULL EMP (cr=7 pr=0 pw=0 time=1 us cost=3 size=182 card=14)
This was actually a bit unexpected: Oracle will usually prefer a HASH join over a MERGE join for equi-joins. This is almost certainly a result of the very small tables involved. Let's see what Oracle would do in a real case of high-volume SQL:
SELECT /*+ CARDINALITY(emp, 1000000) CARDINALITY(dept,10000) */
       empno
,      ename
,      dname
FROM   emp
JOIN   dept USING (deptno)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         15          0          14
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0         15          0          14

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
     14  HASH JOIN  (cr=15 pr=0 pw=0 time=0 us cost=16515 size=65000000000 card=2500000000)
      4   TABLE ACCESS FULL DEPT (cr=7 pr=0 pw=0 time=1 us cost=3 size=130000 card=10000)
     14   TABLE ACCESS FULL EMP (cr=8 pr=0 pw=0 time=0 us cost=14 size=13000000 card=1000000)
Now that it thinks there are a million employees and ten thousand departments, it uses a HASH join. Hash joins are the preferred method for large data volumes because their nested operations are performed in memory, not on disk, and they scale better than MERGE joins. Interestingly, Oracle was right to choose the MERGE join for the small volumes; note that MERGE performed only 11 buffer reads, whilst HASH performed 15 and NESTED LOOPS performed 26 - the worst of all.
Although the optimiser is very reliable on simple queries such as this, the real world contains much more complex queries that are beyond the capability of the CBO to optimise. In some cases the CBO will wrongly believe that a high-volume join is actually low-volume, and wrongly choose a Nested Loops join.
Tuning such queries is a complex process and outside the scope of this article. To find out if there is a problem, take a look at the Explain Plan output for the query and see if there are any NESTED LOOPS or FILTER operations with two child steps (FILTER steps with one child do not nest). Look at the first child step and determine from your knowledge of the data how many rows that step will return; this is the number of times the second step will be executed. If this number is greater than 10% of the total rows in the table of the 2nd child step, then you almost certainly have a nested I/O problem. Even greater than 1% could be a problem, although not so serious.
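If you don't already have a trace, Explain Plan is the quickest way to look. A minimal sketch, using the EMP/DEPT query from above (DBMS_XPLAN.DISPLAY is available from 9iR2 onwards):

EXPLAIN PLAN FOR
SELECT empno, ename, dname
FROM   emp
JOIN   dept USING (deptno);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);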
Oracle generally makes these mistakes because it thinks the first child step will return fewer rows than it really does. Look at the row estimate in the Explain Plan to see if it is accurate - it almost certainly won't be. Use DBMS_STATS.GATHER_TABLE_STATS() if statistics are stale. If that doesn't help, use CARDINALITY hints as shown above to provide a more accurate estimate, or if this fails, use ORDERED, USE_HASH, or NO_USE_NL hints to advise the preferred join order and method.
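As a sketch of how those hints might look on the EMP/DEPT join (the aliases are mine, and the right hints always depend on your data; verify with Explain Plan that they take effect):

SELECT /*+ ORDERED USE_HASH(d) */
       e.empno, e.ename, d.dname
FROM   emp e
JOIN   dept d ON d.deptno = e.deptno;

ORDERED asks Oracle to join the tables in FROM-clause order; USE_HASH(d) asks for a hash join when DEPT is joined.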
Correlated sub-query expressions in the SELECT clause
Personally, I find sub-query expressions in the SELECT clause insidious and annoying. They serve almost no useful purpose and are a guaranteed way to kill the performance of a high-volume SQL. There is simply NO WAY to tune them; they must be avoided at all costs.
Consider the following query and TKProf output. The query selects all employees from EMP and picks up the department name from DEPT in a correlated sub-query expression.
SELECT empno
,      ename
,      (SELECT dname FROM dept WHERE deptno = emp.deptno) AS dname
FROM   emp

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0         13          0          14
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0         13          0          14

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
      3  TABLE ACCESS BY INDEX ROWID DEPT (cr=5 pr=0 pw=0 time=0 us cost=1 size=13 card=1)
      3   INDEX UNIQUE SCAN SYS_C009725 (cr=2 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 70296)
     14  TABLE ACCESS FULL EMP (cr=8 pr=0 pw=0 time=0 us cost=3 size=182 card=14)
But this is extraordinary, n'est-ce pas? Like the Nested Loops plan above, this plan shows that all 14 rows were read from EMP, and that a unique index scan was used to pick up matching rows in DEPT. But look at the row count against DEPT: 3 rows! How is that possible? There were 14 rows in EMP, and therefore 14 lookups to perform.
Something very different to a standard Nested Loops join is going on here. It is definitely nesting (for each EMP, get DEPT), but the buffer count of 13 tells us that it is doing less work than the Nested Loops join. Nested Loops has no cleverness whatsoever; if it tries to look up a row that it retrieved just a moment earlier, it still reads the blocks. Those blocks may well be in Oracle's buffer cache (so you don't need to get them from disk), but Oracle must still access the buffer cache and find the desired row amongst the potentially hundreds in the block. This is cheaper than disk I/O, but it's not free. That's why SQL Trace records the buffer cache reads and reports them in TKProf.
What has happened here is that Oracle has performed 14 nested operations to retrieve a DEPT row, but only 3 of those operations have been recorded. What happened to the other 11? What type of operation did Oracle perform? In my database there are 4 rows in DEPT, of which only 3 are referenced by EMP.DEPTNO, so this kind of makes sense: the three operations that were recorded were the first lookup of each of the three departments. The other 11 lookups were satisfied by some type of cache other than the buffer cache. Oracle is a bit cagey about this; I can find no reference to it in the Performance Tuning manual.
So how magical is this other cache? Is it the same as the Hash Area? Hash Area retrievals in hash joins are not recorded in SQL Trace, so it could be the same thing. This would be awesome, because the Hash Area can be made quite large, and hits against it are much cheaper than buffer-cache hits. Let's up the ante and find out; the following script creates BIGEMP and BIGDEPT, 10 times larger than EMP and DEPT.
CREATE TABLE bigemp (
    empno PRIMARY KEY, ename, job, mgr, hiredate, sal, comm, deptno
) AS
SELECT empno+x AS empno, ename, job, mgr, hiredate, sal, comm, deptno+x AS deptno
FROM   emp
CROSS JOIN (SELECT level*10000 AS x FROM dual CONNECT BY level <= 10)
/

Table created.

CREATE TABLE bigdept (
    deptno PRIMARY KEY, dname, loc
) AS
SELECT deptno+x AS deptno, dname, loc
FROM   dept
CROSS JOIN (SELECT level*10000 AS x FROM dual CONNECT BY level <= 10)
/

Table created.

exec dbms_stats.gather_table_stats(user, 'BIGEMP')

PL/SQL procedure successfully completed.

exec dbms_stats.gather_table_stats(user, 'BIGDEPT')

PL/SQL procedure successfully completed.

SELECT empno
,      ename
,      (SELECT dname FROM bigdept dept WHERE deptno = emp.deptno) AS dname
FROM   bigemp emp

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       11      0.00       0.00          0         54          0         140
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       13      0.00       0.00          0         54          0         140

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
     30  TABLE ACCESS BY INDEX ROWID BIGDEPT (cr=41 pr=0 pw=0 time=0 us cost=1 size=15 card=1)
     30   INDEX UNIQUE SCAN SYS_C0016601 (cr=11 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 84747)
    140  TABLE ACCESS FULL BIGEMP (cr=13 pr=0 pw=0 time=1 us cost=3 size=2240 card=140)
It scales perfectly! Although it has performed 140 nested operations to retrieve department information, only 30 of them were recorded as table lookups. Let's try a 100-times-larger data set:
SELECT empno
,      ename
,      (SELECT dname FROM bigdept dept WHERE deptno = emp.deptno) AS dname
FROM   bigemp emp

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       95      0.00       0.00          0        589          0        1400
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       97      0.00       0.00          0        589          0        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
    388  TABLE ACCESS BY INDEX ROWID BIGDEPT (cr=484 pr=0 pw=0 time=0 us cost=1 size=16 card=1)
    388   INDEX UNIQUE SCAN SYS_C0016605 (cr=96 pr=0 pw=0 time=0 us cost=0 size=0 card=1)(object id 84755)
   1400  TABLE ACCESS FULL BIGEMP (cr=105 pr=0 pw=0 time=10 us cost=5 size=23800 card=1400)
Whoops! We were expecting (hoping?) to see 1400 rows retrieved from BIGEMP resulting in only 300 lookups on BIGDEPT, but it has all gone horribly wrong. Instead of 300 lookups there are 388. Our seemingly magical cache has topped out and cannot hold even 300 rows. At around 30 bytes per row, that's around 10KB at best. The block size on this database is 8KB. Coincidence? Who can tell?
So we don't really know exactly what's going on here. Obviously there is some kind of results cache in action, but it is equally obvious that it is not terribly useful for high-volume SQL (these tests were performed on 11g R1).
So what's the impact? Let's compare to our preferred high-volume join method: HASH:
SELECT /*+ CARDINALITY(bigemp, 1000000) CARDINALITY(bigdept,10000) */
       empno
,      ename
,      dname
FROM   bigemp
JOIN   bigdept USING (deptno)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       95      0.00       0.00          0        109          0        1400
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       97      0.00       0.00          0        109          0        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
   1400  HASH JOIN  (cr=109 pr=0 pw=0 time=30 us cost=184 size=825000000 card=25000000)
    400   TABLE ACCESS FULL BIGDEPT (cr=4 pr=0 pw=0 time=4 us cost=3 size=160000 card=10000)
   1400   TABLE ACCESS FULL BIGEMP (cr=105 pr=0 pw=0 time=8 us cost=16 size=17000000 card=1000000)
Look at that! HASH processed seemingly more rows (400 departments instead of 388), but it read only 109 buffers whereas the subquery expression read 589 - an overhead of more than 400%. Remember that we are still dealing with relatively small volumes here; both tables and the index are in the buffer cache. This effect multiplies even further when the buffer cache fills and the query starts performing some disk I/O.
Lesson: Never, ever, ever use correlated scalar subquery expressions in high-volume SQL.
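The fix is almost always to rewrite the sub-query expression as a join. As discussed in the comments below, the strict equivalent is an outer join (the sub-query returns NULL when there is no matching department), so a sketch of the rewrite would be:

SELECT emp.empno
,      emp.ename
,      dept.dname
FROM   bigemp emp
LEFT OUTER JOIN bigdept dept ON dept.deptno = emp.deptno;

If a missing parent row is impossible (say, an enforced foreign key), an inner join will do; either way, the optimiser is free to choose a hash join.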
One final note: un-correlated subquery expressions do not suffer the same problem. Since they do not reference any columns in the outer query, they can be performed once only, and the results applied to every row.
SELECT empno
,      ename
,      (SELECT MAX(dname) FROM bigdept dept) AS dname
FROM   bigemp emp

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       95      0.00       0.00          0        109          0        1400
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       97      0.00       0.00          0        109          0        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=4 pr=0 pw=0 time=0 us)
    400   TABLE ACCESS FULL BIGDEPT (cr=4 pr=0 pw=0 time=2 us cost=3 size=4000 card=400)
   1400  TABLE ACCESS FULL BIGEMP (cr=105 pr=0 pw=0 time=4 us cost=5 size=15400 card=1400)
UPDATE - subqueries in the SET clause
Consider the following SQL, which gives every employee a pay rise of 4% of their departmental average salary:
UPDATE emp a
SET    sal = (
           SELECT a.sal + trunc(0.04 * avg(b.sal), 2)
           FROM   emp b
           WHERE  b.deptno = a.deptno
       )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0         91         30          14
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0         91         30          14

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  EMP (cr=91 pr=0 pw=0 time=0 us)
     14   TABLE ACCESS FULL EMP (cr=7 pr=0 pw=0 time=2 us cost=3 size=98 card=14)
     12   SORT AGGREGATE (cr=84 pr=0 pw=0 time=0 us)
     59    TABLE ACCESS FULL EMP (cr=84 pr=0 pw=0 time=10 us cost=3 size=35 card=5)
See what's happening? For each of the 14 rows in EMP (line 2 of the plan), Oracle performs a full scan of EMP to get the departmental average salary. Look at the 4th line of the plan: 59 rows were returned from the sub-query, but that does not tell you how many rows were scanned and filtered out - the real number is much higher! Even so, that's 59 rows read from a 14-row table. There's a bit of wastage going on there.
I know what you're thinking: why not index EMP.DEPTNO? That'll fix everything.
create index emp_dept on emp (deptno);

Index created.

exec dbms_stats.gather_table_stats(user, 'EMP');

PL/SQL procedure successfully completed.

UPDATE emp a
SET    sal = (
           SELECT a.sal + trunc(0.04 * avg(b.sal), 2)
           FROM   emp b
           WHERE  b.deptno = a.deptno
       )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0         31         30          14
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0         31         30          14

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 88

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  EMP (cr=31 pr=0 pw=0 time=0 us)
     14   TABLE ACCESS FULL EMP (cr=7 pr=0 pw=0 time=2 us cost=3 size=98 card=14)
     12   SORT AGGREGATE (cr=24 pr=0 pw=0 time=0 us)
     59    TABLE ACCESS BY INDEX ROWID EMP (cr=24 pr=0 pw=0 time=25 us cost=2 size=35 card=5)
     59     INDEX RANGE SCAN EMP_DEPT (cr=12 pr=0 pw=0 time=6 us cost=1 size=0 card=5)(object id 84768)
Yes, it reduced the total buffers read from the buffer-cache (91 down to 31), but it still performed way too much redundant I/O: 59 rows from the index and the table.
The problem with correlated SET clause subqueries - like subquery expressions - is that they always nest; they are UNTUNEABLE. In order to maximise the performance of a high-volume update with a correlated SET sub-query, it is necessary to rewrite the statement. This can be done in two ways:
- Updateable Join Views
- The MERGE statement
Solution 1 - Updateable Join Views
The Oracle UPDATE statement permits updates not only on tables, but also on views and inline views. If we can define a view that presents both the current value of a column and the intended new value, then it permits the following syntax:
UPDATE view_name SET curr_col_val = new_col_val WHERE ....
Note that this syntax does not contain a sub-query expression; if the view can be defined in a way that avoids nesting, then the update can be performed without nesting. As we shall see, this technique is limited by a condition called Key Preservation. Let's look at it in action:
Here is a SQL that gives us the raw data required for the update, without nesting
SELECT old.empno, old.deptno, old.sal, old.sal + new.incr AS new_sal
FROM   emp old
JOIN  (
       SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
       FROM   emp
       GROUP  BY deptno
      ) new
ON     old.deptno = new.deptno

     EMPNO     DEPTNO        SAL    NEW_SAL
---------- ---------- ---------- ----------
      7900         30        950    1012.66
      7844         30       1500    1562.66
      7654         30       1250    1312.66
      7521         30       1250    1312.66
      7499         30       1600    1662.66
      7698         30       2850    2912.66
      7876         20       1100       1187
      7369         20        800        887
      7902         20       3000       3087
      7788         20       3000       3087
      7566         20       2975       3062
      7934         10       1300    1416.66
      7782         10       2450    2566.66
      7839         10       5000    5116.66

14 rows selected.

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |    65 |  4225 |     8  (25)| 00:00:01 |
|*  1 |  HASH JOIN           |      |    65 |  4225 |     8  (25)| 00:00:01 |
|   2 |   TABLE ACCESS FULL  | EMP  |    14 |   546 |     3   (0)| 00:00:01 |
|   3 |   VIEW               |      |    14 |   364 |     4  (25)| 00:00:01 |
|   4 |    HASH GROUP BY     |      |    14 |   364 |     4  (25)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| EMP  |    14 |   364 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------
What we want to do is treat this SQL as a view, set SAL to NEW_SAL, and have the results applied back to the EMP table.
  1  UPDATE (
  2      SELECT old.empno, old.deptno, old.sal, old.sal + new.incr AS new_sal
  3      FROM   emp old
  4      JOIN  (
  5             SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
  6             FROM   emp
  7             GROUP  BY deptno
  8            ) new
  9      ON     old.deptno = new.deptno
 10  )
 11* SET sal = new_sal
SET sal = new_sal
*
ERROR at line 11:
ORA-01779: cannot modify a column which maps to a non key-preserved table
This is one of the limitations of updateable join views: when updating a table via a join view, it may only join to other tables using their primary or unique keys. This concept is discussed in more detail in the Oracle manuals. The problem here, of course, is that the EMP table is joined to an inline view that cannot carry a unique/primary key, even though the GROUP BY guarantees uniqueness of DEPTNO.
One way around this is to use a Global Temporary Table (GTT) instead of an inline view, since a GTT can be created with a primary key.
CREATE GLOBAL TEMPORARY TABLE emp_upd (
    deptno PRIMARY KEY
,   incr
) ON COMMIT PRESERVE ROWS
AS
SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
FROM   emp
GROUP  BY deptno

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.00       0.01          0          7          9           3
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.01          0          8          9           3

UPDATE (
    SELECT old.empno, old.deptno, old.sal, old.sal + new.incr AS new_sal
    FROM   emp old
    JOIN   emp_upd new
    ON     old.deptno = new.deptno
)
SET sal = new_sal

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.00       0.00          0         10         27          14
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0         12         27          14

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  EMP (cr=10 pr=0 pw=0 time=713 us)
     14   HASH JOIN  (cr=10 pr=0 pw=0 time=294 us)
      3    TABLE ACCESS FULL EMP_UPD (cr=3 pr=0 pw=0 time=29 us)
     14    TABLE ACCESS FULL EMP (cr=7 pr=0 pw=0 time=20 us)
There are some additional overheads to this method not shown above. Whilst the creation of the GTT is quite efficient, it must also build an index and sample the results to generate statistics. Overall, the nested subquery expression is more efficient over such small volumes (14 rows). Let's take a look at a sample 100 times larger:
UPDATE bigemp a
SET    sal = (
           SELECT a.sal + trunc(0.04 * avg(b.sal), 2)
           FROM   bigemp b
           WHERE  b.deptno = a.deptno
       )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          2          0           0
Execute      1      0.09       0.09          4       2505       2919        1400
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.10       0.09          4       2507       2919        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  BIGEMP (cr=2521 pr=4 pw=0 time=90826 us)
   1400   TABLE ACCESS FULL BIGEMP (cr=12 pr=0 pw=0 time=2829 us)
   1200   SORT AGGREGATE (cr=2489 pr=4 pw=0 time=34982 us)
   5900    TABLE ACCESS BY INDEX ROWID BIGEMP (cr=2489 pr=4 pw=0 time=26630 us)
   5900     INDEX RANGE SCAN BIGEMP_DEPTNO (cr=1217 pr=4 pw=0 time=14675 us)(object id 73563)

CREATE GLOBAL TEMPORARY TABLE emp_upd (
    deptno PRIMARY KEY
,   incr
) ON COMMIT PRESERVE ROWS
AS
SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
FROM   bigemp
GROUP  BY deptno
/

UPDATE (
    SELECT old.empno, old.deptno, old.sal, old.sal + new.incr AS new_sal
    FROM   bigemp old
    JOIN   emp_upd new
    ON     old.deptno = new.deptno
)
SET sal = new_sal
/

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        3      0.00       0.00          0          3          0           0
Execute      4      0.03       0.06          0         31       1466        1700
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        7      0.04       0.07          0         34       1466        1700

Misses in library cache during parse: 2

OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse       29      0.00       0.00          0          0          0           0
Execute     34      0.01       0.01          1         29        106          12
Fetch       47      0.00       0.00          0         82          0          29
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total      110      0.02       0.02          1        111        106          41
Remember that 1400 rows is still a relatively small sample; real-life updates are much, much larger. The TKProf snippet above for the Global Temporary Table shows the session totals in order to account for all of the overheads. Even so, we can see that it performs substantially less I/O (note the disk, query, and current columns) than the nested version. This effect is magnified as the sample gets larger.
Solution 2 - MERGE
A brilliant solution for unnesting bulk updates was introduced in Oracle 9i: the MERGE statement. The idea of MERGE is to apply a series of transactions to a table, updating rows if they exist and inserting them if they do not. By careful construction of the statement, you can engineer MERGE to only update.
MERGE INTO bigemp old
USING (
    SELECT bigemp.empno, bigemp.sal + incr.incr AS sal
    FROM   bigemp
    JOIN  (
           SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
           FROM   bigemp
           GROUP  BY deptno
          ) incr
    ON     bigemp.deptno = incr.deptno
) new
ON (old.empno = new.empno)
WHEN MATCHED THEN UPDATE SET sal = new.sal
WHEN NOT MATCHED THEN INSERT (empno) VALUES (1/0)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.03       0.05         12        112          0           0
Execute      1      0.03       0.03          0         39       1434        1400
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.06       0.09         12        151       1434        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66

Rows     Row Source Operation
-------  ---------------------------------------------------
      2  MERGE  BIGEMP (cr=39 pr=0 pw=0 time=37806 us)
   1400   VIEW  (cr=39 pr=0 pw=0 time=15085 us)
   1400    HASH JOIN OUTER (cr=39 pr=0 pw=0 time=12280 us)
   1400     HASH JOIN  (cr=26 pr=0 pw=0 time=7658 us)
    300      VIEW  (cr=13 pr=0 pw=0 time=2515 us)
    300       SORT GROUP BY (cr=13 pr=0 pw=0 time=1609 us)
   1400        TABLE ACCESS FULL BIGEMP (cr=13 pr=0 pw=0 time=85 us)
   1400      TABLE ACCESS FULL BIGEMP (cr=13 pr=0 pw=0 time=41 us)
   1400     TABLE ACCESS FULL BIGEMP (cr=13 pr=0 pw=0 time=1431 us)
Note that the disk, query, and current columns are all significantly less than the nested example for 1400 rows shown earlier, and are comparable to the Updateable Join View.
The WHEN NOT MATCHED THEN INSERT clause was made optional in 10g, but is mandatory in 9i. The 1/0 value causes a divide-by-zero error that aborts a badly designed statement and prevents unwanted inserts.
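In 10g and later, then, the same effect can be achieved without the dummy insert branch at all. A sketch - the statement above, minus the WHEN NOT MATCHED clause:

MERGE INTO bigemp old
USING (
    SELECT bigemp.empno, bigemp.sal + incr.incr AS sal
    FROM   bigemp
    JOIN  (
           SELECT deptno, trunc(0.04 * avg(sal), 2) AS incr
           FROM   bigemp
           GROUP  BY deptno
          ) incr
    ON     bigemp.deptno = incr.deptno
) new
ON (old.empno = new.empno)
WHEN MATCHED THEN UPDATE SET sal = new.sal;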
Functions
Extreme care should be taken when using stored functions in high-volume SQL. Functions are often used to encapsulate complex code that can then be re-used in many places. The problem with this approach is that when the function is used in a high-volume SQL, the complex code is executed for every row encountered. An otherwise simple-looking SQL that returns 100,000 rows will - behind the scenes - execute 100,000 (or more!) SQLs.
Not all functions are problematic:
- Functions called in the WHERE clause with no parameters, or with constant values as parameters, will execute only once. The value returned will then be used like a constant or bind variable in the SQL.
- Functions called in the SELECT clause with no parameters, or with constant values as parameters, will execute only once if they have been defined as DETERMINISTIC (see the sketch after this list). Non-deterministic functions (see the Oracle PL/SQL reference) in the SELECT clause will execute once for every row selected.
- Functions that contain only PL/SQL code are much more efficient than those that include SQL statements. Such functions are not ideal, but they do not tend to result in nested I/O.
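As a sketch of the second point, this is how a deterministic function is declared. TAX_RATE is a hypothetical example, not part of the schema used in this article:

CREATE OR REPLACE FUNCTION tax_rate (
    p_year IN NUMBER
) RETURN NUMBER DETERMINISTIC IS
BEGIN
    -- Pure PL/SQL, no SQL: the same input always yields the same output,
    -- so Oracle is free to call it once and re-use the result.
    RETURN CASE WHEN p_year < 2000 THEN 0.30 ELSE 0.325 END;
END;
/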
Importantly, correlated function calls (where a table column value is passed as a parameter) will execute once for each row encountered. This is somewhat mitigated in 11g with its result cache, but it is still worthy of caution.
create table bigemp1 AS select * from bigemp;

CREATE OR REPLACE FUNCTION dept_sal_avg (
    p_deptno IN bigemp.deptno%TYPE
) RETURN bigemp.sal%TYPE IS
    ret bigemp.sal%TYPE;
BEGIN
    SELECT avg(sal)
    INTO   ret
    FROM   bigemp
    WHERE  deptno = p_deptno;

    RETURN (ret);
END;
/

UPDATE bigemp1
SET    sal = sal + (0.04 * trunc(dept_sal_avg(deptno), 2))

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.01          0          1          1           0
Execute      1      0.11       0.12          0         12       2867        1400
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.12       0.13          0         13       2868        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  BIGEMP1 (cr=4353 pr=4 pw=0 time=187756 us)
   1400   TABLE ACCESS FULL BIGEMP1 (cr=12 pr=0 pw=0 time=2820 us)

********************************************************************************

SELECT AVG(SAL)
FROM   BIGEMP
WHERE  DEPTNO = :B1

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   1400      0.04       0.02          0          2          0           0
Fetch     1400      0.02       0.03          0       4302          0        1400
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2801      0.07       0.05          0       4304          0        1400
Keen observers will notice the use of table BIGEMP1 as a copy of BIGEMP, to avoid a mutating-table error. I could have demonstrated the principle with a pure SELECT, but this example ties in nicely with the previous section.
Note the 1400 separate calls to the function and the high query buffer I/O. These figures are nearly identical to the nested update shown in the previous section.
Notwithstanding the exceptions above, functions containing SQL statements should never be called in high-volume SQL. The only alternative is to re-code the logic of the function into the SQL statement itself, whilst avoiding the other nested I/O pitfalls (especially subquery expressions in the SELECT clause!).
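For the DEPT_SAL_AVG example above, a sketch of such a rewrite using the MERGE technique from earlier - the average is taken from BIGEMP while BIGEMP1 is updated, mirroring the function's behaviour:

MERGE INTO bigemp1 old
USING (
    SELECT deptno, trunc(avg(sal), 2) AS avg_sal
    FROM   bigemp
    GROUP  BY deptno
) new
ON (old.deptno = new.deptno)
-- One set-based pass replaces 1400 recursive SELECTs:
WHEN MATCHED THEN UPDATE SET sal = old.sal + (0.04 * new.avg_sal);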
Triggers
Triggers bearing the FOR EACH ROW clause, not surprisingly, execute once for each row inserted, updated, or deleted. The profile of such a statement is exactly the same as that of the correlated function above.
UPDATE bigemp1
SET    sal = trunc(1.04 * sal, 2)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          1          0           0
Execute      1      0.17       0.17          0         13       2880        1400
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.17       0.17          0         14       2880        1400

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  UPDATE  BIGEMP1 (cr=4335 pr=4 pw=0 time=254618 us)
   1400   TABLE ACCESS FULL BIGEMP1 (cr=12 pr=0 pw=0 time=2824 us)

********************************************************************************

SELECT :B1 + TRUNC(0.04 * AVG(SAL), 2)
FROM   BIGEMP
WHERE  DEPTNO = :B2

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   1400      0.04       0.03          0          2          0           0
Fetch     1400      0.03       0.04          0       4302          0        1400
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     2801      0.08       0.08          0       4304          0        1400

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 66     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
   1400  SORT AGGREGATE (cr=4302 pr=0 pw=0 time=50346 us)
   7000   TABLE ACCESS BY INDEX ROWID BIGEMP (cr=4302 pr=0 pw=0 time=40308 us)
   7000    INDEX RANGE SCAN BIGEMP_DEPTNO (cr=2818 pr=0 pw=0 time=21291 us)(object id 73554)
As with functions, avoid triggers containing SQL statements on tables subject to bulk DML. Any SQL performed in triggers must be migrated to the application code, taking care to avoid all the other nested I/O issues described in this article.
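Where the business rules allow it, one pragmatic pattern is to disable the trigger for the duration of the bulk DML and apply its logic in set-based SQL instead. A sketch, assuming a hypothetical trigger named BIGEMP1_SAL_TRG (take care: other sessions performing DML while the trigger is disabled will miss its logic):

ALTER TRIGGER bigemp1_sal_trg DISABLE;

UPDATE bigemp1 SET sal = trunc(1.04 * sal, 2);
-- ...re-apply the trigger's row-level logic here as one set-based statement...

ALTER TRIGGER bigemp1_sal_trg ENABLE;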
Sub-queries
The case of sub-queries and nesting used to be simple: with the exception of uncorrelated IN subqueries, all sub-queries would nest. Period. This was great for low-volume SQL, rubbish for high-volume SQL. Easy: use joins instead for high-volume SQL.
Things started to change in v7.3 with Anti-Joins, which permitted NOT IN sub-queries to be unnested by the optimiser (effectively converting them into an outer join that discards matched rows). In 8i, Semi-Joins allowed correlated EXISTS sub-queries to be unnested; but more importantly, the /*+UNNEST*/ hint internally converted IN subqueries into EXISTS and NOT EXISTS subqueries into NOT IN, so that Anti- and Semi-Joins could unnest almost any sub-query.
The revolution was completed in 9i with automated unnesting; the optimiser determines whether unnesting would be beneficial (any high-volume SQL) and applies it automatically. No more hints! As a result, sub-queries are almost never a problem for high-volume SQL any more. But when they are a problem they catch you completely off-guard.
So what's the problem if Oracle does it all for you? There are two:
- Oracle will sometimes think your high-volume query is low-volume and won't unnest.
- Oracle cannot unnest every subquery. You need to understand the restrictions.
How to tell if your sub-query is nesting
Run your high-volume SQL through Explain Plan. Look for any NESTED LOOPS lines or FILTER lines with two child steps. Nested Loops joins typically relate to join statements, but they can also come from sub-queries re-written by the optimiser; they usually mean that the sub-query is still nesting, even though the optimiser has rewritten (but not unnested) the query. FILTER steps with one child step are usually related to complex queries with views or aggregation, but they are relatively harmless. FILTER steps with two child steps come from sub-queries; they mean that the sub-query is nesting.
This is an example of a nested sub-query. Note the FILTER step.
select *
from   emp
where  deptno = (
           select dept.deptno
           from   dept
           where  dname  = 'RESEARCH'
           and    deptno = emp.deptno
       )

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |     1 |    87 |     5   (0)| 00:00:01 |
|*  1 |  FILTER                      |             |       |       |            |          |
|   2 |   TABLE ACCESS FULL          | EMP         |    14 |  1218 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS BY INDEX ROWID| DEPT        |     1 |    22 |     2   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | SYS_C009534 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("DEPTNO"= (SELECT /*+ */ "DEPT"."DEPTNO" FROM "DEPT" "DEPT"
              WHERE "DEPTNO"=:B1 AND "DNAME"='RESEARCH'))
   3 - filter("DNAME"='RESEARCH')
   4 - access("DEPTNO"=:B1)
The next example shows a SQL with the same functionality that has been automatically unnested by the optimiser. Note the HASH JOIN SEMI.
select *
from   emp
where  exists (
           select 1
           from   dept
           where  dname  = 'RESEARCH'
           and    deptno = emp.deptno
       )

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     5 |   545 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN SEMI    |      |     5 |   545 |     7  (15)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| EMP  |    14 |  1218 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| DEPT |     1 |    22 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("DEPTNO"="EMP"."DEPTNO")
   3 - filter("DNAME"='RESEARCH')
Unnesting subqueries
The optimiser can unnest almost any uncorrelated sub-query, and even most simple IN, NOT IN, EXISTS, and NOT EXISTS sub-queries.
If the subquery of a high-volume SQL is simple or uncorrelated and is still nesting, try:
- Gather statistics on all tables and indexes using DBMS_STATS.GATHER_TABLE_STATS().
- Check you are using the Cost Based Optimiser, not the Rule Based Optimiser.
- Try adding an /*+UNNEST*/ hint to the sub-query (see the sketch after this list).
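A sketch of the third point, re-using the EXISTS query from above - note that the hint goes inside the sub-query block:

SELECT *
FROM   emp
WHERE  EXISTS (
           SELECT /*+ UNNEST */ 1
           FROM   dept
           WHERE  dname  = 'RESEARCH'
           AND    deptno = emp.deptno
       );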
If your Explain Plan shows a NESTED LOOPS (SEMI) step, then the optimiser is unnesting a sub-query into a Nested Loops semi-join; this may not improve the situation. The /*+UNNEST*/ hint will not help because the sub-query is already unnested; it may be necessary to add a /*+HASH_SJ*/ hint (for correlated sub-queries with = predicates) or a /*+MERGE_SJ*/ hint (for correlated sub-queries with > or < predicates).
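Again as a sketch - like UNNEST, the semi-join hints belong inside the sub-query block (note that HASH_SJ and MERGE_SJ are deprecated from 10g onwards, where the optimiser normally makes this decision itself):

SELECT *
FROM   emp
WHERE  EXISTS (
           SELECT /*+ HASH_SJ */ 1
           FROM   dept
           WHERE  dname  = 'RESEARCH'
           AND    deptno = emp.deptno
       );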
If your sub-query is complex and correlated, you will have to code it another way. Perhaps materialise some interim results in a Global Temporary Table.
Un-tunable sub-queries
You should never, ever do any of the following in the correlated sub-query of a high-volume SQL:
- Nest two or more levels deep, and correlate a column not in the immediate parent block.
- Correlate a HAVING clause.
- Use a set operator, analytic function, MODEL clause, or CONNECT BY in the sub-query.
- Correlate a non-scalar sub-query with a comparison operator (WHERE col > (SELECT col ... correlated sub-query ...)). Even = conditions will nest. Use MAX/MIN/AVG to make the sub-query scalar - even if selecting on a unique key (see the sketch below).
These types of correlated sub-queries will always nest.
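To illustrate the last point, a sketch using a hypothetical EMP_HIST table (one row per EMPNO):

-- This version will nest, even though EMPNO is unique in EMP_HIST:
SELECT *
FROM   emp e
WHERE  e.sal > (SELECT h.sal
                FROM   emp_hist h
                WHERE  h.empno = e.empno);

-- Making the sub-query provably scalar gives the optimiser a chance to unnest:
SELECT *
FROM   emp e
WHERE  e.sal > (SELECT MAX(h.sal)
                FROM   emp_hist h
                WHERE  h.empno = e.empno);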
Conclusion
All we've discussed here are the perils of nesting. Is tuning high-volume SQL really that simple? The answer is: "sort of".
It's not quite as cut and dried as I've made out; not all nesting is bad. If you have a query that joins 500,000 rows from a 1,000,000 row table (high volume) to 500,000 rows from a 100,000,000 row table, then nesting might be the way to go. A full scan of 100 million rows is a pretty expensive way of reading the half-million that you need.
Rather than thinking in terms of high- and low-volume queries, you need to evaluate each table on its merits. How many rows are there? How many do I need?
Another exception - as John Watson (OCM) identifies in the comments below - is a nested loop on an index without a table lookup. If the inner loop of a nested loop join can get all the columns it needs from the index, then even in a high-volume scenario, the nested loop will probably out-perform the alternatives.
The simple don'ts in this article are easy to avoid: functions, scalar sub-queries, triggers, etc. It gets much more complex when you eliminate all of these things but are still left with a nested step in the Explain Plan. Like many things, the first step is to recognise that there is a problem; hopefully this article helps in that regard.
Comments
Wow, this is a great article
I was looking for this information. Your descriptions have helped me greatly with some work I must do tomorrow - what a coincidence. Thanks man, Kevin Meade
Never, ever, ever use correlated scalar subqueries?
I'm not sure if one should never use scalar subqueries like:
SELECT empno
, ename
, (SELECT dname FROM bigdept dept WHERE deptno = emp.deptno) AS dname
FROM bigemp emp
Joining may be faster, but what should you choose if you have to outer join?
I think
SELECT empno
, ename
, (SELECT dname FROM bigdept dept WHERE deptno = emp.deptno) AS dname
FROM bigemp emp
is faster than
SELECT empno
, ename
, dname
FROM bigemp emp
, bigdept dept
where emp.deptno = dept.deptno (+)
I repeat - never, ever
Try it and find out. Make sure you trace and use TKProf. Ensure that you get a hash join in the outer join option, and post the results here.
Outer joins attract a type of mystical fear and loathing, as if they are somehow less performant than inner joins. I usually have a reasonable insight into the origin of these myths, but I have no idea how that one came about.
The truth is that Oracle supports four join methods: Nested Loops, Sort-Merge, Hash, and Cluster. All of them work equally well as inner joins and outer joins. Correlated scalar sub-query expressions are the functional equivalent of a Nested Loops join, so their use merely restricts your available toolset.
Never, ever, ever use correlated scalar subqueries!
UPDATE - subqueries in the SET clause
What about using analytics? (Oracle 10gR2)
Good example
Yep, that's nested too. Steps 3-5 of that plan are executed for each row returned by step 2. Increase the volumes and see how it compares to MERGE.
Remarks
1) Correlated sub-query expressions in the SELECT clause
Sorry, but generally the correlated scalar sub-query version is not equivalent to the inner join version, but rather to an outer join.
2) Functions
I can't follow the affirmation about using BIGEMP1 as a copy of BIGEMP to avoid mutating updates - could you explain, please?
3) Conclusion
I think you're rather contradicting yourself. If nesting might sometimes be the way to go, as the conclusion says, that means we can use scalar sub-queries (more precisely, correlated scalar sub-queries) because, as you are saying, nesting is not always bad.
Responses
1) Agreed, a left outer join is the more appropriate comparison. However the principle remains the same: Oracle is able to Hash Left Outer Join from both directions (use either the inner or outer table as the hash table), so performance-wise the example stands.
2) If my function had referenced the same table that was being updated it would have raised a mutating table error, so I made the function and UPDATE reference different tables. Obviously this is not a real-life example, but it does what I wanted, which is to demonstrate the nature of function calls.
3) Look at the example again. I thought I was being clear, but obviously not. Full Table Scans are faster row-for-row than index scans. Of course, we often scan more rows in a Full Table Scan. Is a full table scan of 1,000,000 rows faster than an indexed scan of 500,000 rows? Absolutely yes! Is a full table scan of 100,000,000 rows faster than an indexed scan of 500,000 rows? Probably not. This last example is an index scan of 0.5% of the table; whilst this is not "low-volume SQL" by any stretch of the imagination, it is processing only a very small proportion of the table.
The Secret Cache is ...
The new SQL Result Cache in 11g (this is probably the best article available at this time: http://www.oracle.com/technology/pub/articles/oracle-database-11g-top-features/11g-caching-pooling.html)
In 11g, Oracle will now cache the results of queries and PL/SQL blocks within a session. Which totally invalidates some older rules of thumb.
Interesting article, and one I will have to keep in mind, as a counter to my usual experience. My usual experience is that when a query takes hours when it should take minutes, it is because Oracle is doing a hash join, and the fix is to make it do a nested loops join. Obviously, one size does not fit all.
One size still fits all, but there's more than one "all"
Results cache is interesting - it should narrow the gap for scalar function calls. Even though it will result in fewer function calls, those calls will still be nested and will result in less efficient IO over a large data set. I haven't done any benchmarking, but I don't expect to be changing my position on the issue of Functions.
If Nested Loops is out-performing Hash Join / Full Scan, it's because you are dealing with Low Volume SQL, not High Volume SQL (the subject of this article). I usually draw the line anywhere between 1% and 10% of a table. If your SQL requires more than 10% of a large table to produce its results, then I contend Indexed Nested Loops will never out-perform Hash Join / FTS except in extraordinary cases deliberately designed to confound hash joins. Conversely, if your SQL needs less than 1% - Indexed NL will always out-perform FTS / HJ if the index is well designed. In-between 1 and 10% - benchmark.
This is really a great article!!
thanks .
"Never, ever, ever use correlated scalar subqueries".
Hi,
Many thanks for this fantastic article.
Regards,
Vishwa....
Very useful article
Thanks for sharing such nice information.
Mr and Mrs Oracle at the virtual shopping mall
I came across your article while trying to develop a little thought experiment: "what is the most efficient way to determine which rows are common to two row sets?" You emphasise the importance of avoiding correlated subqueries, as used by the iterative operations nested loop join and filters with two children. Generally, I would agree - but in this case I'm not sure.
My example is the OE demo schema: which products are stocked by both the warehouse in Toronto (warehouse_id=5) and the one in Sydney (warehouse_id=9)? I have five equivalent SQLs, and the results surprised me.
First, the obvious solution, which is to use a compound query (an INTERSECT). It has a cost of 6, which I'll take as my baseline.
Now two versions that use a correlated subquery:
Both have been re-written to a join and given the same cost, which is far lower than my baseline. I do not understand why it is so low: the nested subquery which has to iterate 114 times is costed at zero. I had expected both these to be worse than the compound query.
Then I tried two versions of, in effect, manually coding the INTERSECT. The first is re-written to the join with its correlated subquery and gets the same cost, with an extra 1 for removing duplicates. Fair enough. The second has done exactly what I asked it to do (no transformation), but it is still slightly cheaper than the INTERSECT.
My conclusion: perhaps nested subqueries are not as bad as one might think. The CBO is pretty good at transformations nowadays.
This was using release 12.1.0.2.
--
John Watson
Oracle Certified Master DBA
http://skillbuilders.com
Aha! You found the exception
Hi John. Sorry it's taken me almost a year to reply, but I come here so infrequently now... /excuse
I neglected to mention this exception in the article, although I do make the point in an earlier article: http://www.orafaq.com/node/1403
In your example, the loop is accessing only the index - this is very quick indeed, especially when the entire index is cached. If you conspired to create a high-volume SQL where the inner loop hit the index AND the table, I feel confident that one of your join alternatives would fare better.
For clarity, I'll go back up into the article and note the exception, giving you credit for finding me out.