Every now and then I need to generate a large number of rows into some table.
Not that I have any personal urge to do so, but you know, this is needed for some test cases and so on ;-)
It’s quite well known that it’s possible to generate lots of rows using the CONNECT BY LEVEL <= #rows syntax.
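For small row counts that’s all you need; a minimal sketch generating just ten rows:

select level
from   dual
connect by level <= 10;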
However, as the CONNECT BY LEVEL syntax is recursive by nature, it will consume more memory the more rows you query in one shot (I think it was Mark Bobak who once posted this observation to Oracle-L, but I couldn’t find that thread anymore).
So, here’s a test case:
SQL> select count(r)
  2  from (
  3     select rownum r from dual connect by rownum <= 100000000
  4  )
  5  /
select rownum r from dual connect by rownum <= 100000000
                                     *
ERROR at line 3:
ORA-04030: out of process memory when trying to allocate 44 bytes
(kxs-heap-w,cursor work heap)
After running for a while, the server process ran out of private memory, which was used for the CONNECT BY cursor work heap.
Let’s investigate:
The execution plan is the following; note that we have a single recursive CONNECT BY operator:
SQL> @x

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------
SQL_ID  5n2f53tsa3bq3, child number 0
-------------------------------------
select count(r) from ( select rownum r from dual connect by rownum <=
100000000 )

Plan hash value: 2085675642

---------------------------------------------------------
| Id  | Operation                      | Name | E-Rows |
---------------------------------------------------------
|   1 |  SORT AGGREGATE                |      |      1 |
|   2 |   VIEW                         |      |      1 |
|   3 |    COUNT                       |      |        |
|   4 |     CONNECT BY WITHOUT FILTERING|     |        |
|   5 |      FAST DUAL                 |      |      1 |
---------------------------------------------------------
I also ran Snapper on the session performing the CONNECT BY, and we can see the large memory allocations from its output as well:
SQL> @snapper out,gather=s,sinclude=memory 3 10 109

-- Session Snapper v1.07 by Tanel Poder ( )

---------------------------------------------------------------------------------------------------------------------------------------------
 HEAD,  SID, SNAPSHOT START   , SECONDS, TYPE, STATISTIC               ,     DELTA, DELTA/SEC,  HDELTA, HDELTA/SEC
---------------------------------------------------------------------------------------------------------------------------------------------
 DATA,  109, 20080608 04:48:47,       3, STAT, session uga memory      ,  72534112,  24178037,  72.53M,     24.18M
 DATA,  109, 20080608 04:48:47,       3, STAT, session pga memory      ,  72810496,  24270165,  72.81M,     24.27M
 DATA,  109, 20080608 04:48:47,       3, STAT, sorts (memory)          ,         1,         0,       1,        .33
--  End of snap 1

 DATA,  109, 20080608 04:48:50,       4, STAT, session uga memory      ,  95642904,  23910726,  95.64M,     23.91M
 DATA,  109, 20080608 04:48:50,       4, STAT, session pga memory      ,  95748096,  23937024,  95.75M,     23.94M
--  End of snap 2

 DATA,  109, 20080608 04:48:54,       3, STAT, session uga memory      ,  96559400,  32186467,  96.56M,     32.19M
 DATA,  109, 20080608 04:48:54,       3, STAT, session pga memory      ,  96665600,  32221867,  96.67M,     32.22M
--  End of snap 3

[...some lines snipped...]

--  End of snap 8

 DATA,  109, 20080608 04:49:14,       4, STAT, session uga memory      , 100945488,  25236372, 100.95M,     25.24M
 DATA,  109, 20080608 04:49:14,       4, STAT, session uga memory max  , 100880024,  25220006, 100.88M,     25.22M
 DATA,  109, 20080608 04:49:14,       4, STAT, session pga memory      , 101056512,  25264128, 101.06M,     25.26M
 DATA,  109, 20080608 04:49:14,       4, STAT, session pga memory max  , 100990976,  25247744, 100.99M,     25.25M
--  End of snap 9

 DATA,  109, 20080608 04:49:18,       3, STAT, session uga memory      , 100159920,  33386640, 100.16M,     33.39M
 DATA,  109, 20080608 04:49:18,       3, STAT, session uga memory max  , 100159920,  33386640, 100.16M,     33.39M
 DATA,  109, 20080608 04:49:18,       3, STAT, session pga memory      , 100270080,  33423360, 100.27M,     33.42M
 DATA,  109, 20080608 04:49:18,       3, STAT, session pga memory max  , 100270080,  33423360, 100.27M,     33.42M
--  End of snap 10

PL/SQL procedure successfully completed.
As Snapper calculates deltas of session statistic counters, these snapshots show that my process allocated around 30MB of UGA memory per second during the CONNECT BY statement execution. At that rate it takes only about a minute to reach the almost 1.8GB that my process's UGA/PGA size had grown to by the time of the crash:
SQL> @ses 109 memory

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
session uga memory                                               1825306316
session uga memory max                                           1825306316
session pga memory                                               1827832112
session pga memory max                                           1827832112
redo blocks read (memory)                                                 0
redo blocks read (memory) by LNS                                          0
workarea memory allocated                                                 0
sorts (memory)                                                          148

8 rows selected.
Note that the memory used for the recursive CONNECT BY is not reported as workarea memory in its traditional sense (like sort, hash and bitmap merge areas), therefore the normal PGA size restriction mechanisms like PGA_AGGREGATE_TARGET don't apply; this work heap lives outside that domain. In fact, if you query V$SQL_WORKAREA_ACTIVE during the SQL execution, nothing is returned for that SQL.
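You can verify this yourself; a query along these lines (SID 109 is just my test session's ID) returns no rows even while the session is busy allocating hundreds of MB for the CONNECT BY work heap:

-- no rows come back for the CONNECT BY session, as its memory
-- is not managed as a SQL workarea
select sid, operation_type, actual_mem_used, max_mem_used
from   v$sql_workarea_active
where  sid = 109;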
So, how to generate lots of rows fast and safely?
Well, we can still use CONNECT BY, but with a much lower recursion depth, and let cartesian merge joins do the multiplication for us!
SQL> select count(r)
  2  from (
  3     select rownum r
  4     from
  5        (select rownum r from dual connect by rownum <= 1000) a,
  6        (select rownum r from dual connect by rownum <= 1000) b,
  7        (select rownum r from dual connect by rownum <= 1000) c
  8     where rownum <= 100000000
  9  )
 10  /

  COUNT(R)
----------
 100000000
100 million rows were generated this time, without running out of memory.
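The same approach works for actually populating a table, which is usually what you want for test cases. A minimal sketch, assuming a made-up table T and a more modest target of one million rows:

-- two 1000-row sources multiplied by a cartesian join give up to
-- 1,000,000 rows; the ROWNUM predicate caps the result exactly
create table t (n number);

insert /*+ append */ into t
select rownum
from   (select rownum r from dual connect by rownum <= 1000) a,
       (select rownum r from dual connect by rownum <= 1000) b
where  rownum <= 1000000;

commit;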
The execution plan is the following:
SQL> @x

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------
SQL_ID  76prmvktb3r9a, child number 0
-------------------------------------
select count(r) from ( select rownum r from (select rownum r from dual
connect by rownum <= 1000) a, (select rownum r from dual connect by
rownum <= 1000) b, (select rownum r from dual connect by rownum <=
1000) c where rownum <= 100000000 )

Plan hash value: 2999494577

-----------------------------------------------------------------------------------------
| Id  | Operation                            | Name | E-Rows |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE                      |      |      1 |       |       |          |
|   2 |   VIEW                               |      |      1 |       |       |          |
|*  3 |    COUNT STOPKEY                     |      |        |       |       |          |
|   4 |     MERGE JOIN CARTESIAN             |      |      1 |       |       |          |
|   5 |      MERGE JOIN CARTESIAN            |      |      1 |       |       |          |
|   6 |       VIEW                           |      |      1 |       |       |          |
|   7 |        COUNT                         |      |        |       |       |          |
|   8 |         CONNECT BY WITHOUT FILTERING |      |        |       |       |          |
|   9 |          FAST DUAL                   |      |      1 |       |       |          |
|  10 |      BUFFER SORT                     |      |      1 | 15360 | 15360 |14336  (0)|
|  11 |       VIEW                           |      |      1 |       |       |          |
|  12 |        COUNT                         |      |        |       |       |          |
|  13 |         CONNECT BY WITHOUT FILTERING |      |        |       |       |          |
|  14 |          FAST DUAL                   |      |      1 |       |       |          |
|  15 |     BUFFER SORT                      |      |      1 | 15360 | 15360 |14336  (0)|
|  16 |      VIEW                            |      |      1 |       |       |          |
|  17 |       COUNT                          |      |        |       |       |          |
|  18 |        CONNECT BY WITHOUT FILTERING  |      |        |       |       |          |
|  19 |         FAST DUAL                    |      |      1 |       |       |          |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(ROWNUM<=100000000)
During the query execution the PGA size never grew larger than about a megabyte:
SQL> @ses 110 memory

NAME                                                                  VALUE
---------------------------------------------------------------- ----------
session uga memory                                                   497316
session uga memory max                                               751696
session pga memory                                                   885040
session pga memory max                                              1147184
redo blocks read (memory)                                                 0
redo blocks read (memory) by LNS                                          0
workarea memory allocated                                                 0
sorts (memory)                                                           13

8 rows selected.
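The ses script above is essentially just a convenience wrapper for reading session statistics; if you don't have it handy, a roughly equivalent query against V$SESSTAT (110 being my test session's SID) would be:

-- session-level statistics with "memory" in their name
select n.name, s.value
from   v$sesstat  s,
       v$statname n
where  s.statistic# = n.statistic#
and    s.sid        = 110
and    n.name like '%memory%';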
Btw, I’ve just updated my WaitProf script to use a merge join row-generator, to save memory when profiling over longer periods of time (which you probably don’t want to do anyway, as WaitProf never sleeps: it is constantly on CPU, sampling as fast as it can).
If you want to read about the performance of different row generators, Adrian Billington has written a comprehensive article on the topic: http://www.oracle-developer.net/display.php?id=408