Thanks to Tomasz Sroka’s pull request for my hint.sql in the TPT repo and blog entries by Sayan 1 and Jonathan 2, we now have a HINT_SCOPE
field in my hint.sql script too!
SQL> @hint merge

NAME                    VERSION     VERSION_OUTLINE INVERSE        HINT_SCOPE
----------------------- ----------- --------------- -------------- --------------
MERGE_CONST_ON          8.0.0                                      STATEMENT
MERGE_AJ                8.1.0       8.1.7                          QBLOCK
MERGE_SJ                8.1.0       8.1.7                          QBLOCK
MV_MERGE                9.0.0                                      QBLOCK
MERGE                   8.1.0       10.1.0          NO_MERGE       QBLOCK OBJECT
NO_MERGE                8.0.0       10.1.0          MERGE          QBLOCK OBJECT
USE_MERGE_CARTESIAN     11.1.0.6    11.1.0.6                       OBJECT
USE_MERGE               8.1.0       8.1.7           NO_USE_MERGE   OBJECT
NO_USE_MERGE            10.1.0.3    10.1.0.3        USE_MERGE      OBJECT
The hint scope shows which kinds of nodes in the parse tree (Abstract Syntax Tree) a hint can be applied to.
For example, the OPT_PARAM
hint applies to the entire statement; you can't somehow make just one branch of your query use different optimizer settings than the rest:
SQL> @hint opt_param
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------ ----------- --------------- -------------- ----------------
OPT_PARAM 10.2.0.1 10.2.0.1 STATEMENT
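To illustrate the statement scope, here's a minimal sketch (the table name t and the parameter value are hypothetical, just a common example of an OPT_PARAM-settable parameter):

```sql
-- OPT_PARAM is statement-scoped: the changed optimizer setting applies
-- to the whole query; you can't scope it to just one UNION ALL branch.
SELECT /*+ OPT_PARAM('optimizer_index_cost_adj' 20) */ COUNT(*) FROM t
UNION ALL
SELECT COUNT(*) FROM t;
```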
The PARALLEL
hint, however, can be specified either at the entire statement level or at the individual object level within the query (I almost always use statement-level parallelism if I need to use a hint):
SQL> @hint parallel
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------- ----------- --------------- ------------- ----------------
NOPARALLEL 8.1.0 SHARED STATEMENT OBJECT
NO_PARALLEL 10.1.0.3 SHARED STATEMENT OBJECT
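Both scopes are visible in how the hint is written. A sketch of the two forms (table names hypothetical):

```sql
-- Statement-level: one degree of parallelism for the whole query
SELECT /*+ PARALLEL(8) */ COUNT(*) FROM t1, t2 WHERE t1.id = t2.id;

-- Object-level: parallelism controlled per table alias
SELECT /*+ PARALLEL(t1 8) NO_PARALLEL(t2) */ COUNT(*)
FROM   t1, t2 WHERE t1.id = t2.id;
```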
An INDEX
hint applies only to a specific table in the query and you can't somehow force the optimizer to use indexes for all tables with just one statement-level hint:
SQL> @hint index_rs_asc
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
------------------------- ----------- --------------- ------------- ----------------
INDEX_RS_ASC 11.1.0.6 11.1.0.6 OBJECT
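Since the scope is OBJECT, each hint names one table alias, so hinting every table means writing one hint per table. A sketch (table and index names below are hypothetical):

```sql
-- One object-scoped INDEX hint per table alias; there is no single
-- statement-level "use indexes everywhere" form of this hint.
SELECT /*+ INDEX(t1 t1_id_ix) INDEX(t2 t2_id_ix) */ COUNT(*)
FROM   t1, t2
WHERE  t1.id = t2.id AND t1.id < 100;
```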
When I poked around other hints with the new version of my script, I noticed that the CARDINALITY
hint applies to query blocks and joins too, in addition to individual tables/objects:
SQL> @hint cardinality

NAME                      VERSION      VERSION_OUTLINE INVERSE      HINT_SCOPE
------------------------- ------------ --------------- ------------ ------------------
CARDINALITY               9.0.0                                     QBLOCK OBJECT JOIN
For some reason I had always been under the impression that the CARDINALITY hint could not be used for setting a join cardinality and was meant to be used only for tables and entire query blocks (and that you'd have to use the OPT_ESTIMATE hint to adjust join cardinalities). Time to investigate!
Let’s create two tables with 71k rows:
SQL> CREATE TABLE a AS SELECT * FROM dba_objects;
Table created.
SQL> CREATE TABLE b AS SELECT * FROM dba_objects;
Table created.
Now a simple join without any hints:
SQL> SELECT COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

-----------------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes|E-Temp | Cost (%CPU)|  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |       |  1016 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |       |            |       |       |          |
|*  2 |   HASH JOIN         |      |  69315 |  1759K|  1696K|  1016   (1)|  5199K|  3201K| 4867K (0)|
|   3 |    TABLE ACCESS FULL| B    |  69315 |   879K|       |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| A    |  83554 |  1060K|       |   417   (0)|       |       |          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
The tables A and B are estimated to return 83k and 69k rows in this query, and the HASH JOIN above is estimated to return 69k rows too. Looks like dynamic sampling kicked in as I hadn't gathered stats on my newly created tables; that's why the estimates are somewhat inaccurate. It doesn't matter for my experiment here, as I will override the cardinality estimates with the hint anyway. Let's start by setting a different cardinality estimate at object level for both tables individually:
SQL> SELECT /*+ CARDINALITY(a 1000) CARDINALITY(b 1000) */ COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes| Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |   835 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |            |       |       |          |
|*  2 |   HASH JOIN         |      |    999 | 25974 |   835   (1)|  5199K|  3201K| 3718K (0)|
|   3 |    TABLE ACCESS FULL| A    |   1000 | 13000 |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| B    |   1000 | 13000 |   417   (0)|       |       |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
Indeed, the table-level estimates show 1000 rows each. As a result, the join cardinality estimator, which bases its calculations on the estimates of the child row sources below it, estimates that only 999 rows would survive the join.
Can we now change the cardinality of only the join, independently? The CARDINALITY hint was not documented the last time I checked, so let's see what happens if I put both table aliases in a single hint, followed by a number:
SQL> SELECT /*+ CARDINALITY(a 1000) CARDINALITY(b 1000) CARDINALITY(a b 50000) */ COUNT(*) FROM a,b WHERE a.object_id=b.object_id;

  COUNT(*)
----------
     71085

SQL> @x
Display execution plan for last statement for this session from library cache...

---------------------------------------------------------------------------------------------
| Id  | Operation           | Name | E-Rows |E-Bytes| Cost (%CPU)|  OMem |  1Mem | Used-Mem |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |        |       |   835 (100)|       |       |          |
|   1 |  SORT AGGREGATE     |      |      1 |    26 |            |       |       |          |
|*  2 |   HASH JOIN         |      |  50000 |  1269K|   835   (1)|  5199K|  3201K| 3703K (0)|
|   3 |    TABLE ACCESS FULL| A    |   1000 | 13000 |   417   (0)|       |       |          |
|   4 |    TABLE ACCESS FULL| B    |   1000 | 13000 |   417   (0)|       |       |          |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")

   - dynamic statistics used: dynamic sampling (level=2)
It worked!
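For comparison, the OPT_ESTIMATE approach I mentioned earlier would look roughly like this. OPT_ESTIMATE is also undocumented (its syntax is mostly known from SQL profile internals), so treat this as a sketch rather than an official reference:

```sql
-- Undocumented OPT_ESTIMATE syntax: set the estimate for the join of
-- aliases (a b) directly, without hinting each table's cardinality.
SELECT /*+ OPT_ESTIMATE(join (a b) rows=50000) */ COUNT(*)
FROM   a, b
WHERE  a.object_id = b.object_id;
```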
Now I have another tool in my toolbelt when troubleshooting optimizer behavior! Note that I try to not use hints to fix SQL performance problems, but I definitely use them during experiments to understand the root cause better.
For example, if I know that the optimizer should not use a NESTED LOOPS join to join 10M rows with another 10M rows, I could force a specific join order + method with LEADING() and USE_HASH() hints in a query block and see if that improves the performance and, if it does, what cardinality and cost estimates the optimizer comes up with when "forced to do the right thing" using the hints. Often the cost and/or cardinality estimates come out higher for doing the "right thing" when forced by you, who knows the "real shape" of your data, than for whatever the optimizer mistakenly came up with based only on quick math done on the stats.
But for complex enough queries with many levels of inline views, subqueries and transformations, such manual join order + join method hinting gets quite complex to do right. So if E-Rows vs A-Rows (cardinality misestimate) analysis shows that the problem indeed comes from wrong cardinality estimates, this is the fastest way to do a quick "what if" test: what would the optimizer do if it had correct estimates of the actual row counts passing through the plan?
Another useful thing: when you are hoping to force a specific join method (say, nested loops) with USE_NL(a b)
, this is likely not the correct, intended syntax for using join method hints. Join method hints take only one argument, the method to use for the "next table in the join order", and for completeness they should be combined with a join-order-forcing LEADING hint. 3
SQL> @hint use

NAME                 VERSION      VERSION_OUTLINE INVERSE           HINT_SCOPE
-------------------- ------------ --------------- ----------------- ---------------------------
USE_MERGE            8.1.0        8.1.7           NO_USE_MERGE      OBJECT
NO_USE_MERGE         10.1.0.3     10.1.0.3        USE_MERGE         OBJECT
USE_HASH             8.1.0        8.1.7           NO_USE_HASH       OBJECT
NO_USE_HASH          10.1.0.3     10.1.0.3        USE_HASH          OBJECT
USE_NL               8.1.0        8.1.7           NO_USE_NL         OBJECT
USE_NL_WITH_INDEX    10.1.0.3                     NO_USE_NL         OBJECT
NO_USE_NL            10.1.0.3     10.1.0.3        USE_NL            OBJECT
The hint scope output above confirms that the join method hints have an OBJECT scope, not JOIN scope, so if you need to force nested loop joins for a 4-table join, you would need to use a LEADING hint for the first 3 tables in the join order and 3 (for clarity, separate) USE_NL hints for the last 3 tables in the join order (just like my lotslios.sql demo script does).
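A sketch of what that would look like for a hypothetical 4-table join (t1..t4 and the join columns are made up):

```sql
-- LEADING fixes the join order; one USE_NL hint per "next table in
-- the join order" (t2, t3, t4) forces nested loops for all 3 joins.
-- The driving table t1 gets no join method hint.
SELECT /*+ LEADING(t1 t2 t3) USE_NL(t2) USE_NL(t3) USE_NL(t4) */
       COUNT(*)
FROM   t1, t2, t3, t4
WHERE  t1.id = t2.id AND t2.id = t3.id AND t3.id = t4.id;
```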
Update: I ran my script to list all hints on Oracle 23ai and edited the output to show the hints that apply in the JOIN scope, so far only 3 of them:
NAME VERSION VERSION_OUTLINE INVERSE HINT_SCOPE
---------------- ------------ --------------- ----------- ------------------
CARDINALITY 9.0.0 QBLOCK OBJECT JOIN
LEADING 8.1.6 10.1.0.3 JOIN
OPT_ESTIMATE 10.1.0.3 QBLOCK OBJECT JOIN