Thanks to Tomasz Sroka’s pull request for my hint.sql in the TPT repo and blog entries by Sayan 1 and Jonathan 2, we now have a HINT_SCOPE
field in my hint.sql script too!
SQL> @hint merge

NAME                    VERSION     VERSION_OUTLINE INVERSE        HINT_SCOPE
----------------------- ----------- --------------- -------------- --------------
MERGE_CONST_ON          8.0.0                                      STATEMENT
MERGE_AJ                8.1.0       8.1.7                          QBLOCK
MERGE_SJ                8.1.0       8.1.7                          QBLOCK
MV_MERGE                9.0.0                                      QBLOCK
MERGE                   8.1.0       10.1.0          NO_MERGE       QBLOCK OBJECT
NO_MERGE                8.0.0       10.1.0          MERGE          QBLOCK OBJECT
USE_MERGE_CARTESIAN     11.1.0.6    11.1.0.6                       OBJECT
USE_MERGE               8.1.0       8.1.7           NO_USE_MERGE   OBJECT
NO_USE_MERGE            10.1.0.3    10.1.0.3        USE_MERGE      OBJECT
The hint scope shows which kinds of nodes in the parse tree (Abstract Syntax Tree) a hint can be applied to.
For example, the OPT_PARAM
hint applies to the entire statement; you can't somehow make just one branch of your query use different optimizer settings than the rest:
SQL> @hint opt_param
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------ ----------- --------------- -------------- ----------------
OPT_PARAM 10.2.0.1 10.2.0.1 STATEMENT
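To illustrate the statement scope, here's a minimal sketch (the table name t and the parameter value are hypothetical, just a common example of an OPT_PARAM-settable parameter):

```sql
-- OPT_PARAM is statement-scoped: the changed optimizer setting applies
-- to the whole query; you can't scope it to just one UNION ALL branch.
SELECT /*+ OPT_PARAM('optimizer_index_cost_adj' 20) */ COUNT(*) FROM t
UNION ALL
SELECT COUNT(*) FROM t;
```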
The PARALLEL
hint, however, can be specified either at the entire statement level or at the individual object level within the query (I almost always use statement-level parallelism if I need to use a hint):
SQL> @hint parallel
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------- ----------- --------------- ------------- ----------------
NOPARALLEL 8.1.0 SHARED STATEMENT OBJECT
NO_PARALLEL 10.1.0.3 SHARED STATEMENT OBJECT
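Both scopes are visible in how the hint is written. A sketch of the two forms (table names hypothetical):

```sql
-- Statement-level: one degree of parallelism for the whole query
SELECT /*+ PARALLEL(8) */ COUNT(*) FROM t1, t2 WHERE t1.id = t2.id;

-- Object-level: parallelism controlled per table alias
SELECT /*+ PARALLEL(t1 8) NO_PARALLEL(t2) */ COUNT(*)
FROM   t1, t2 WHERE t1.id = t2.id;
```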
An INDEX
hint applies only to a specific table in the query and you can't somehow force the optimizer to use indexes for all tables with just one statement-level hint:
SQL> @hint index_rs_asc
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------- ----------- --------------- ------------- ----------------
INDEX_RS_ASC 11.1.0.6 11.1.0.6 OBJECT
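Since the scope is OBJECT, each hint names one table alias, so hinting every table means writing one hint per table. A sketch (table and index names below are hypothetical):

```sql
-- One object-scoped INDEX hint per table alias; there is no single
-- statement-level "use indexes everywhere" form of this hint.
SELECT /*+ INDEX(t1 t1_id_ix) INDEX(t2 t2_id_ix) */ COUNT(*)
FROM   t1, t2
WHERE  t1.id = t2.id AND t1.id < 100;
```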
When I poked around other hints with the new version of my script, I noticed that the CARDINALITY
hint applies to query blocks and joins too, in addition to individual tables/objects:
SQL> @hint cardinality

NAME                      VERSION      VERSION_OUTLINE INVERSE      HINT_SCOPE
------------------------- ------------ --------------- ------------ ------------------
CARDINALITY               9.0.0                                     QBLOCK OBJECT JOIN
For some reason I had always been under the impression that the CARDINALITY hint could not be used for setting a join cardinality and was meant to be used only for tables and entire query blocks (and that you'd have to use the OPT_ESTIMATE hint to adjust join cardinalities). Time to investigate!
Let’s create two tables with 71k rows:
SQL> CREATE TABLE a AS SELECT * FROM dba_objects;
Table created.
SQL> CREATE TABLE b AS SELECT * FROM dba_objects;
Table created.
Now a simple join without any hints:
SQL> SELECT COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

-----------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes|E-Temp | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |       |  1016 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |       |            |       |       |          |
|*  2 |   HASH JOIN         |      |  69315 |  1759K|  1696K|  1016   (1)|  5199K|  3201K| 4867K (0)|
|   3 |    TABLE ACCESS FULL| B    |  69315 |   879K|       |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| A    |  83554 |  1060K|       |   417   (0)|       |       |          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
The tables A and B are estimated to return 83k and 69k rows in this query, and the HASH JOIN above is estimated to return 69k rows too. Looks like dynamic sampling kicked in as I hadn't gathered stats on my newly created tables; that's why the estimates are somewhat inaccurate. It doesn't matter for my experiment here, as I will override the cardinality estimates with the hint anyway. Let's start by setting a different cardinality estimate at object level for both tables individually:
SQL> SELECT /*+ CARDINALITY(a 1000) CARDINALITY(b 1000) */ COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes| Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |   835 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |            |       |       |          |
|*  2 |   HASH JOIN         |      |    999 | 25974 |   835   (1)|  5199K|  3201K| 3718K (0)|
|   3 |    TABLE ACCESS FULL| A    |   1000 | 13000 |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| B    |   1000 | 13000 |   417   (0)|       |       |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
Indeed, the table-level estimates show 1000 rows each. As a result, the join cardinality estimator, which bases its calculations on the estimates of the child row sources below it, estimates that only 999 rows would survive the join.
Can we now change the cardinality of only the join, independently? The CARDINALITY hint was not documented the last time I checked, so let's see what happens if I put both table aliases in a single hint, followed by a number:
SQL> SELECT /*+ CARDINALITY(a 1000) CARDINALITY(b 1000) CARDINALITY(a b 50000) */ COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes| Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |   835 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |            |       |       |          |
|*  2 |   HASH JOIN         |      |  50000 |  1269K|   835   (1)|  5199K|  3201K| 3703K (0)|
|   3 |    TABLE ACCESS FULL| A    |   1000 | 13000 |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| B    |   1000 | 13000 |   417   (0)|       |       |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
It worked!
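For comparison, the OPT_ESTIMATE approach I mentioned earlier would look roughly like this. OPT_ESTIMATE is also undocumented (its syntax is mostly known from SQL profile internals), so treat this as a sketch rather than an official reference:

```sql
-- Undocumented OPT_ESTIMATE syntax: set the estimate for the join of
-- aliases (a b) directly, without hinting each table's cardinality.
SELECT /*+ OPT_ESTIMATE(join (a b) rows=50000) */ COUNT(*)
FROM   a, b
WHERE  a.object_id = b.object_id;
```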
Now I have another tool in my toolbelt when troubleshooting optimizer behavior! Note that I try to not use hints to fix SQL performance problems, but I definitely use them during experiments to understand the root cause better.
For example, if I know that the optimizer should not use a NESTED LOOPS join to join 10M rows with another 10M rows, I could force a specific join order + method with LEADING() and USE_HASH() hints in a query block and see if that improves the performance and, if it does, what cardinality and cost estimates the optimizer comes up with when "forced to do the right thing" using the hints. Often the cost and/or cardinality estimates come out higher for doing the "right thing" when forced by you, who knows the "real shape" of your data, than for whatever the optimizer mistakenly came up with based only on quick math done on the stats.
But for complex enough queries with many levels of inline views, subqueries and transformations, such manual join order + join method hinting gets quite complex to do right. So if E-Rows vs A-Rows (cardinality misestimate) analysis shows that the problem indeed comes from wrong cardinality estimates, this is the fastest way to do a quick "what if" test: what would the optimizer do if it had correct estimates of the actual row counts passing through the plan?
Another useful thing: when you are hoping to force a specific join method (say, nested loops) with USE_NL(a b)
, this is likely not the correct, intended syntax for using join method hints. Join method hints take only one argument, the method to use for the "next table in the join order", and for completeness they should be combined with a join-order-forcing LEADING hint. 3
SQL> @hint use

NAME                 VERSION      VERSION_OUTLINE INVERSE           HINT_SCOPE
-------------------- ------------ --------------- ----------------- ---------------------------
USE_MERGE            8.1.0        8.1.7           NO_USE_MERGE      OBJECT
NO_USE_MERGE         10.1.0.3     10.1.0.3        USE_MERGE         OBJECT
USE_HASH             8.1.0        8.1.7           NO_USE_HASH       OBJECT
NO_USE_HASH          10.1.0.3     10.1.0.3        USE_HASH          OBJECT
USE_NL               8.1.0        8.1.7           NO_USE_NL         OBJECT
USE_NL_WITH_INDEX    10.1.0.3                     NO_USE_NL         OBJECT
NO_USE_NL            10.1.0.3     10.1.0.3        USE_NL            OBJECT
The hint scope output above confirms that the join method hints have an OBJECT scope, not JOIN scope, so if you need to force nested loop joins for a 4-table join, you would need to use a LEADING hint for the first 3 tables in the join order and 3 (for clarity, separate) USE_NL hints for the last 3 tables in the join order (just like my lotslios.sql demo script does).
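A sketch of what that would look like for a hypothetical 4-table join (t1..t4 and the join columns are made up):

```sql
-- LEADING fixes the join order; one USE_NL hint per "next table in
-- the join order" (t2, t3, t4) forces nested loops for all 3 joins.
-- The driving table t1 gets no join method hint.
SELECT /*+ LEADING(t1 t2 t3) USE_NL(t2) USE_NL(t3) USE_NL(t4) */
       COUNT(*)
FROM   t1, t2, t3, t4
WHERE  t1.id = t2.id AND t2.id = t3.id AND t3.id = t4.id;
```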
Update: I ran my script to list all hints on Oracle 23ai and edited the output to show the hints that apply in the JOIN scope, so far only 3 of them:
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
---------------- ------------ --------------- ----------- ------------------
CARDINALITY 9.0.0 QBLOCK OBJECT JOIN
LEADING 8.1.6 10.1.0.3 JOIN
OPT_ESTIMATE 10.1.0.3 QBLOCK OBJECT JOIN