Performance of Nested Loops - SCN

This blog is an expanded version of a post that I made to the ABAP forum a while ago. What I wanted to show was
that, when performance tuning, the effect of nested loops can be far worse than that of poorly constructed
SELECT statements.
I wrote a small program which illustrates this point. I've posted the code below, but in a nutshell, what it does is
select a number of FI document headers and line items from BKPF and BSEG. The select statement from BSEG is
very inefficient; I did it that way to prove a point. The program then reads all records from both tables using a
nested loop, then again using a much more efficient method (a binary search followed by indexed reads), and finally
using a different method (a parallel cursor) which may be even more efficient, but may be somewhat more difficult to
program. It keeps track of the number of records and the time taken for each operation.
I ran the program twice in our QA environment, once selecting a small amount of data and again selecting a
moderate amount. The outputs are:
First run:
Time for unindexed select: 00:06:09
    Number of BKPF entries: 5,863
    Number of BSEG entries: 17,683
Time for nested loop: 00:00:53
    Number of BKPF reads: 5,863
    Number of BSEG reads: 17,683
Time for indexed loop: 00:00:00
Time for parallel cursor: 00:00:00

Second run:
Time for unindexed select: 00:21:07
    Number of BKPF entries: 55,777
    Number of BSEG entries: 205,285
Time for nested loop: 02:16:21
Time for indexed loop: 00:00:01
Time for parallel cursor: 00:00:01
So what can we conclude? In the first case, a gain of almost a minute is not much, but in a dialog program
it would be worthwhile. In the second case, either of the other methods saves over two hours, which could allow a
program that has to run in the background to run in the foreground instead. The most striking thing to me, though,
is that in the second run the nested loop takes substantially longer than an extremely inefficient select statement.
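A rough check on why the nested loop degrades so sharply: assume (this is an assumption about the cost model, not something the measurements prove) that the inner LOOP ... WHERE on a standard table scans every BSEG row for each BKPF row, so the work grows with the product of the two row counts. Using only the figures from the two runs above:

```latex
\[
\frac{55{,}777 \times 205{,}285}{5{,}863 \times 17{,}683}
  = \frac{11{,}450{,}181{,}445}{103{,}675{,}429} \approx 110,
\qquad
\frac{\text{02:16:21}}{\text{00:00:53}}
  = \frac{8181\ \text{s}}{53\ \text{s}} \approx 154 .
\]
```

Roughly ten times the data in each table costs over a hundred times the nested-loop run time, which is the same order of magnitude as the product of the row counts suggests. The timer only has one-second resolution, so the 53-second figure carries some rounding error.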
In both cases, the loop using the parallel cursor method did not produce a substantial saving over the loop using a
binary search followed by indexed reads; however, if you run this and extract a very large amount of data, it does run
more quickly. In the program I have provided, the parallel cursor method does appear to be somewhat easier to
program, but I have found that if the outer table does not contain a matching entry for every record in the inner
table, then programming complexity increases, and I don't think it warrants the extra effort. The binary search
method is easy and runs quickly.
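To illustrate the extra complexity mentioned above, here is a minimal sketch of a parallel cursor that tolerates inner-table rows with no matching outer row. It is not the program from this post: the report name, the types ty_head and ty_item, and the generated test data are all hypothetical stand-ins for the BKPF/BSEG keys, and it assumes both tables are sorted by the same key and that the outer keys are unique.

```abap
REPORT z_parallel_cursor_sketch.

* Hypothetical, simplified stand-ins for the BKPF/BSEG keys.
TYPES: BEGIN OF ty_head,
         belnr TYPE n LENGTH 10,
       END OF ty_head.
TYPES: BEGIN OF ty_item,
         belnr TYPE n LENGTH 10,
         buzei TYPE n LENGTH 3,
       END OF ty_item.

DATA: head_tab TYPE STANDARD TABLE OF ty_head,
      head_lin TYPE ty_head,
      item_tab TYPE STANDARD TABLE OF ty_item,
      item_lin TYPE ty_item,
      item_idx TYPE sy-tabix VALUE 1,
      matches  TYPE i.

START-OF-SELECTION.
* Headers 2 and 4 only; items exist for documents 1 to 4,
* so the items of documents 1 and 3 have no header.
  head_lin-belnr = 2.
  APPEND head_lin TO head_tab.
  head_lin-belnr = 4.
  APPEND head_lin TO head_tab.
  DO 4 TIMES.
    item_lin-belnr = sy-index.
    item_lin-buzei = 1.
    APPEND item_lin TO item_tab.
    item_lin-buzei = 2.
    APPEND item_lin TO item_tab.
  ENDDO.

  SORT: head_tab BY belnr,
        item_tab BY belnr buzei.

  LOOP AT head_tab INTO head_lin.
*   FROM is evaluated once, when the inner loop is entered.
    LOOP AT item_tab INTO item_lin FROM item_idx.
      IF item_lin-belnr > head_lin-belnr.
*       This item belongs to a later header; item_idx already
*       points at it, so the next header resumes here.
        EXIT.
      ENDIF.
*     The row is consumed, whether it matches or is an orphan.
      item_idx = sy-tabix + 1.
      IF item_lin-belnr = head_lin-belnr.
        matches = matches + 1.
      ENDIF.
    ENDLOOP.
  ENDLOOP.

  WRITE: / 'Matching items:', matches.
```

With this test data the sketch should count four matching items (two each for documents 2 and 4). The design choice is to advance the cursor past every row whose key is less than or equal to the current header key, so orphan rows are skipped once instead of stalling the cursor.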
Performance of Nested Loops
Posted by Rob Burbank in ABAP Testing and Troubleshooting on Feb 7, 2006 3:23:15 PM
The selection screen for the program is quite standard. The amount of data returned is determined by whatever the
user enters, and the programmer really has no control over it. So I'm suggesting that if you can't guarantee
that the amount of data is small, then you really ought to use the indexed read method.
Also note that the first select statement returns 17,683 rows from BSEG and took 06:09 to run, while the second one
returns 205,285 rows and takes 21:07. The second one retrieves about 11.6 times as much data but takes only
about 3.4 times as long to execute. The two runs of the program were on separate evenings when there
shouldn't be any load, so buffering and workload shouldn't be issues.
So, my conclusion is: when tuning a program that you know will return a small amount of data, tune the select
statement and don't worry too much about loops; however, if the program may return a large amount of data,
avoid nested loops.
And a final thought: once you get a program tuned to a certain point, it doesn't make a lot of sense to spend a lot
of time trying to reduce execution time by a small amount. (More on this later.)
Code follows:
report ztest_nested_loop.

data: bkpf type bkpf,
      bseg type bseg.

select-options: s_bukrs for bseg-bukrs memory id buk obligatory,
                s_gjahr for bseg-gjahr memory id gjr obligatory,
                s_lifnr for bseg-lifnr memory id lif obligatory.

data: bkpf_tab type standard table of bkpf,
      bkpf_lin like line of bkpf_tab,
      bseg_tab type standard table of bseg,
      bseg_lin like line of bseg_tab.

data: start_time   type sy-uzeit,
      end_time     type sy-uzeit,
      difference   type sy-uzeit,
      bkpf_entries type sy-tabix,
      bseg_entries type sy-tabix,
      bkpf_reads   type sy-tabix,
      bseg_reads   type sy-tabix.

start-of-selection.
  perform unindexed_select.
  perform nested_loop.
  perform indexed_loop.
  PERFORM parallel_cursor.

*&---------------------------------------------------------------------*
*&      Form  unindexed_select
*&---------------------------------------------------------------------*
form unindexed_select.

  get time field start_time.

* Deliberately inefficient select (see text).
  select * from bseg into table bseg_tab
    where bukrs in s_bukrs
      and gjahr in s_gjahr
      and lifnr in s_lifnr.

  if sy-subrc <> 0.
    message id '00' type 'E' number '001' with 'No entries selected'.
  endif.

* Document headers for the selected line items.
  select * from bkpf into table bkpf_tab
    for all entries in bseg_tab
    where bukrs = bseg_tab-bukrs
      and belnr = bseg_tab-belnr
      and gjahr = bseg_tab-gjahr
      and bstat = space.

  clear   bseg_tab.
  refresh bseg_tab.

* All line items of those documents.
  select * from bseg into table bseg_tab
    for all entries in bkpf_tab
    where bukrs = bkpf_tab-bukrs
      and belnr = bkpf_tab-belnr
      and gjahr = bkpf_tab-gjahr.

  get time field end_time.
  difference = end_time - start_time.

  describe table bkpf_tab lines bkpf_entries.
  describe table bseg_tab lines bseg_entries.

  write: /001 'Time for unindexed select:', difference,
         /005 'Number of BKPF entries:', bkpf_entries,
         /005 'Number of BSEG entries:', bseg_entries.
  skip 1.

endform.                    " unindexed_select

*&---------------------------------------------------------------------*
*&      Form  nested_loop
*&---------------------------------------------------------------------*
form nested_loop.

  get time field start_time.

* The inner WHERE scans the whole of bseg_tab for every BKPF row.
  loop at bkpf_tab into bkpf_lin.
    bkpf_reads = bkpf_reads + 1.
    loop at bseg_tab into bseg_lin
      where bukrs = bkpf_lin-bukrs
        and belnr = bkpf_lin-belnr
        and gjahr = bkpf_lin-gjahr.
      bseg_reads = bseg_reads + 1.
    endloop.
  endloop.

  get time field end_time.
  difference = end_time - start_time.

  write: /001 'Time for nested loop:', difference,
         /005 'Number of BKPF reads:', bkpf_reads,
         /005 'Number of BSEG reads:', bseg_reads.
  skip 1.

endform.                    " nested_loop

*&---------------------------------------------------------------------*
*&      Form  indexed_loop
*&---------------------------------------------------------------------*
form indexed_loop.

  data: bkpf_index like sy-tabix,
        bseg_index like sy-tabix.

  clear: bkpf_reads,
         bseg_reads.

  get time field start_time.

  sort: bkpf_tab by bukrs belnr gjahr,
        bseg_tab by bukrs belnr gjahr.

  loop at bkpf_tab into bkpf_lin.
*   Find the first line item of the document by binary search ...
    read table bseg_tab into bseg_lin
      with key bukrs = bkpf_lin-bukrs
               belnr = bkpf_lin-belnr
               gjahr = bkpf_lin-gjahr
      binary search.
    bkpf_reads = bkpf_reads + 1.
    bseg_index = sy-tabix.
*   ... then read the following rows by index until the key changes.
    while sy-subrc = 0.
      bseg_index = bseg_index + 1.
      bseg_reads = bseg_reads + 1.
      read table bseg_tab into bseg_lin index bseg_index.
      if bseg_lin-bukrs <> bkpf_lin-bukrs or
         bseg_lin-belnr <> bkpf_lin-belnr or
         bseg_lin-gjahr <> bkpf_lin-gjahr.
        sy-subrc = 99.      " key changed: force the while loop to end
      endif.
    endwhile.
  endloop.

  get time field end_time.
  difference = end_time - start_time.

  write: /001 'Time for indexed loop:', difference,
         /005 'Number of BKPF reads:', bkpf_reads,
         /005 'Number of BSEG reads:', bseg_reads.
  skip 1.

endform.                    " indexed_loop

*&---------------------------------------------------------------------*
*&      Form  parallel_cursor
*&---------------------------------------------------------------------*
FORM parallel_cursor.

  DATA: bkpf_index LIKE sy-tabix,
        bseg_index LIKE sy-tabix.

  CLEAR: bkpf_reads,
         bseg_reads.

  GET TIME FIELD start_time.

  SORT: bkpf_tab BY bukrs belnr gjahr,
        bseg_tab BY bukrs belnr gjahr.

  bseg_index = 1.

  LOOP AT bkpf_tab INTO bkpf_lin.
    bkpf_reads = bkpf_reads + 1.
*   Resume in bseg_tab where the previous document ended.
    LOOP AT bseg_tab INTO bseg_lin FROM bseg_index.
      IF bseg_lin-bukrs <> bkpf_lin-bukrs OR
         bseg_lin-belnr <> bkpf_lin-belnr OR
         bseg_lin-gjahr <> bkpf_lin-gjahr.
        bseg_index = sy-tabix.
        EXIT.
      ELSE.
        bseg_reads = bseg_reads + 1.
      ENDIF.
    ENDLOOP.
  ENDLOOP.

  GET TIME FIELD end_time.
  difference = end_time - start_time.

  WRITE: /001 'Time for parallel cursor :', difference,
         /005 'Number of BKPF reads :', bkpf_reads,
         /005 'Number of BSEG reads :', bseg_reads.

ENDFORM.                    " parallel_cursor

More information can be found in SE30 (Tips and Tricks)
21 Comments
Matthew Gifford Feb 8, 2006 5:45 AM
Your code for the indexed and nested loops is misleading, as these do not retrieve the data from the
database. Including the retrieval would, however, make the nested loop times greater by up to perhaps 20 minutes.
Another thing to look out for is the sequence in which you try each of these processes. The first process
to fill an internal table will take longer than a process that re-fills it, because the memory to store the data
has to be allocated.
I was, however, surprised at how much longer the nested loop took to process the data.
MattG.
Rob Burbank Feb 8, 2006 7:10 AM (in response to Matthew Gifford)
Well, the idea was to compare the performance of nested loops with that of an
inefficient SELECT statement, so I retrieved the data entirely separately from LOOPing
through it.
Rob
A. de Smidt Jun 21, 2007 8:11 AM (in response to Rob Burbank)
As mentioned in my thread, I noticed that the parallel loop fails if the nested table
contains keys which the main table doesn't have.
The very slow standard nested loop doesn't have this problem with additional keys
in the nested table.
Before I do the parallel loop, I delete from the nested itab all the entries which are not
in the main table.
Perhaps there is a nicer alternative :)
Rob Burbank Jun 21, 2007 9:22 AM (in response to A. de Smidt)
In this case you have to use the parallel cursor technique.
Rob
Feb 8, 2006 6:30 AM
Hi Rob,
James Gaddis
Rob Burbank Feb 8, 2006 12:46 PM (in response to )
Thanks for the kind comments. I don't have time today, but I'll have a look at your code as
soon as I can.
Rob
Cem Dedeoglu Feb 10, 2006 7:44 AM (in response to )
Thanks for the good posts.
You may also use the "transporting no fields" addition when finding the index in bseg_tab with a
binary search. This may reduce the assignment overhead.
Just for extreme performance ;)
Shaban
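The suggestion above can be sketched as follows. This is a minimal, hypothetical example rather than code from the blog: the report name, the type ty_item, the variable wanted and the generated test data are assumptions made for illustration. It relies, as the blog's indexed loop does, on the binary search returning the first row of a group of duplicates in a sorted standard table.

```abap
REPORT z_read_no_fields_sketch.

* Hypothetical, simplified stand-in for the BSEG key.
TYPES: BEGIN OF ty_item,
         belnr TYPE n LENGTH 10,
         buzei TYPE n LENGTH 3,
       END OF ty_item.

DATA: item_tab   TYPE STANDARD TABLE OF ty_item,
      item_lin   TYPE ty_item,
      wanted     TYPE ty_item-belnr VALUE '0000000002',
      item_reads TYPE i.

START-OF-SELECTION.
* Three documents with two line items each.
  DO 3 TIMES.
    item_lin-belnr = sy-index.
    item_lin-buzei = 1.
    APPEND item_lin TO item_tab.
    item_lin-buzei = 2.
    APPEND item_lin TO item_tab.
  ENDDO.
  SORT item_tab BY belnr buzei.

* Only sy-subrc and sy-tabix are needed from this read,
* so no row is copied into a work area.
  READ TABLE item_tab
       WITH KEY belnr = wanted
       BINARY SEARCH
       TRANSPORTING NO FIELDS.

  IF sy-subrc = 0.
*   Walk forward from the first hit until the key changes.
    LOOP AT item_tab INTO item_lin FROM sy-tabix.
      IF item_lin-belnr <> wanted.
        EXIT.
      ENDIF.
      item_reads = item_reads + 1.
    ENDLOOP.
  ENDIF.

  WRITE: / 'Items read:', item_reads.
```

Whether the saving is measurable depends on the row width; for a wide structure such as BSEG, avoiding one copy per outer row may matter more than it does for this narrow test type.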
Rob Burbank Feb 15, 2006 3:07 PM (in response to )
I added your code to my program and modified it so that it did some more efficient selecting
(the bad one was just to prove a point). Then I ran the three more efficient loops a number
of times. What I found was that your code ran in about 80% of the time that the
indexed loop took and about 70% of the time of the parallel cursor. So your code is faster.
But then I wasn't really trying to find the fastest way to process two tables, just to show that
nested loops have to be avoided at all costs for large tables.
Incidentally, from the above, the indexed loop method consistently outperforms the parallel
cursor method. I find this surprising because the records in each table will be read only
once using parallel cursors, but the inner table will be read more often using the indexed
loop. The only reason I can think of for this is that the SAP kernel has somehow optimized
the binary search.
Rob
Matthew Gifford Feb 17, 2006 3:09 AM (in response to Rob Burbank)
Rob
Just something to check with these methods is to only sort the data once. Record
the time taken to sort the tables, and add it to the runtime of the methods that need
sorted data.
Sort methods, generally, are quicker at sorting unsorted data than sorted data.
Good point - I agree.
Rob
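One way to follow the suggestion above and record the sort time separately is GET RUN TIME FIELD, which reports microseconds and so avoids the one-second resolution of GET TIME used in the blog's program. The sketch below is hypothetical: the report name, variable names and the generated table of integers are assumptions for illustration only.

```abap
REPORT z_sort_timing_sketch.

DATA: num_tab   TYPE STANDARD TABLE OF i,
      num       TYPE i,
      t_start   TYPE i,
      t_sorted  TYPE i,
      t_end     TYPE i,
      sort_time TYPE i,
      loop_time TYPE i,
      reads     TYPE i.

START-OF-SELECTION.
* Fill the table in descending order so that the sort has work to do.
  DO 100000 TIMES.
    num = 100000 - sy-index.
    APPEND num TO num_tab.
  ENDDO.

  GET RUN TIME FIELD t_start.
  SORT num_tab BY table_line.
  GET RUN TIME FIELD t_sorted.

* The processing that needs sorted data is timed on its own.
  LOOP AT num_tab INTO num.
    reads = reads + 1.
  ENDLOOP.
  GET RUN TIME FIELD t_end.

  sort_time = t_sorted - t_start.
  loop_time = t_end - t_sorted.

  WRITE: / 'Sort time (microseconds):', sort_time,
         / 'Loop time (microseconds):', loop_time.
```

Reporting the two figures separately makes it possible to add the sort cost to each method that depends on sorted data, and to sort only once when several methods are compared in one run.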
Matthew Gifford Feb 9, 2006 7:25 AM
It is interesting to compare the times for a very small number of records. (This is for SCARR and SPFLI on
miniSAP.) My times, in microseconds:
Select: 96439
Nested: 31
Indexed: 59
Other: 31
This was for 6 SCARR and 10 SPFLI records. So this test would show
that a nested loop was the best option. (James's 'Other' logic does not include the required sort.)
I also tried to find out which was the better method to update an internal table, assign or modify. I
found that the overhead of the first assign was 20 times that of subsequent assigns. This first assign
was 6 times longer than the first modify, itself longer than later modifies. What was odd
about the results was that it made a difference to the times whether I ran the modify loop first or second. My code:
*&---------------------------------------------------------------------*
*& Report  Z_MG_TYPE_FS_ITAB
*&---------------------------------------------------------------------*
*& A little test program to see if loop at assigning would work.
*& This also proves that the field-symbol has structure.
*&---------------------------------------------------------------------*
REPORT Z_MG_TYPE_FS_ITAB.

CONSTANTS: C_DONUMBER TYPE I VALUE 10.

TYPES: BEGIN OF TYP_SPFLI.
        INCLUDE STRUCTURE SPFLI.
TYPES:   TABIX1 TYPE SYTABIX,
         TABIX2 TYPE SYTABIX,
       END OF TYP_SPFLI.

FIELD-SYMBOLS: <SPFLI> TYPE TYP_SPFLI.

DATA: T_SPFLI TYPE STANDARD TABLE OF TYP_SPFLI,
      H_SPFLI TYPE TYP_SPFLI.

TYPES: BEGIN OF TYP_RESULT,
         INDEX  TYPE SYINDEX,
         ASSIGN TYPE I,
         AFIRST TYPE I,
         MODIFY TYPE I,
         MFIRST TYPE I,
       END OF TYP_RESULT.

DATA: T_RESULT TYPE STANDARD TABLE OF TYP_RESULT,
      S_RESULT TYPE TYP_RESULT.

DATA: TIMESTAMP1 TYPE I,
      TIMESTAMP2 TYPE I,
      MIN_TIME_A TYPE I VALUE 1000000,
      MIN_TIME_M TYPE I VALUE 1000000,
      MAX_TIME_A TYPE I VALUE 0,
      MAX_TIME_M TYPE I VALUE 0,
      MEANTIME_A TYPE I VALUE 0,
      MEANTIME_M TYPE I VALUE 0.

PARAMETERS: P_DEBUG  AS CHECKBOX,
*           Who goes first.
            P_ASSIGN RADIOBUTTON GROUP GO,
            P_MODIFY RADIOBUTTON GROUP GO,
            P_XTRAAS AS CHECKBOX.

START-OF-SELECTION.

  IF P_DEBUG <> SPACE.
    BREAK-POINT.
  ENDIF.

  SELECT * INTO TABLE T_SPFLI FROM SPFLI.

* Assign field-symbol, so it is not initial.
  IF P_XTRAAS = 'X'.
    ASSIGN H_SPFLI TO <SPFLI>.
  ENDIF.

  DO C_DONUMBER TIMES.
    S_RESULT-INDEX = SY-INDEX.
    IF P_MODIFY = SPACE.
      PERFORM UPDATE_USING_ASSIGN.
      PERFORM UPDATE_USING_MODIFY.
    ELSE.
      PERFORM UPDATE_USING_MODIFY.
      PERFORM UPDATE_USING_ASSIGN.
    ENDIF.
    APPEND S_RESULT TO T_RESULT.
  ENDDO.

  WRITE: / 'Time Assigning:', MIN_TIME_A, MAX_TIME_A, MEANTIME_A,
         / 'Time Modifying:', MIN_TIME_M, MAX_TIME_M, MEANTIME_M.
  ULINE.
  IF P_ASSIGN = 'X'.
    WRITE: / 'Assign processed first.'.
  ELSE.
    WRITE: / 'Modify processed first.'.
  ENDIF.
  ULINE.
  SKIP.

  LOOP AT T_RESULT INTO S_RESULT.
    WRITE: / 'Index :', S_RESULT-INDEX,
             'Assign:', S_RESULT-ASSIGN,
             'First assign:', S_RESULT-AFIRST,
             'Modify:', S_RESULT-MODIFY,
             'First modify:', S_RESULT-MFIRST.
  ENDLOOP.
  ULINE.

*&---------------------------------------------------------------------*
*&      Form  UPDATE_USING_ASSIGN
*&---------------------------------------------------------------------*
FORM UPDATE_USING_ASSIGN.

  DATA: L_FIRSTTIME TYPE I.

  GET RUN TIME FIELD TIMESTAMP1.
  LOOP AT T_SPFLI ASSIGNING <SPFLI>.
    <SPFLI>-TABIX1 = SY-TABIX.
    IF SY-TABIX = 1.
      GET RUN TIME FIELD L_FIRSTTIME.
    ENDIF.
  ENDLOOP.
  GET RUN TIME FIELD TIMESTAMP2.

  TIMESTAMP2 = TIMESTAMP2 - TIMESTAMP1.
  IF TIMESTAMP2 < MIN_TIME_A.
    MIN_TIME_A = TIMESTAMP2.
  ENDIF.
  IF TIMESTAMP2 > MAX_TIME_A.
    MAX_TIME_A = TIMESTAMP2.
  ENDIF.
  ADD TIMESTAMP2 TO MEANTIME_A.
  S_RESULT-ASSIGN = TIMESTAMP2.
  S_RESULT-AFIRST = L_FIRSTTIME - TIMESTAMP1.

ENDFORM.                    " UPDATE_USING_ASSIGN

*&---------------------------------------------------------------------*
*&      Form  UPDATE_USING_MODIFY
*&---------------------------------------------------------------------*
FORM UPDATE_USING_MODIFY.

  DATA: L_FIRSTTIME TYPE I.

  GET RUN TIME FIELD TIMESTAMP1.
  LOOP AT T_SPFLI INTO H_SPFLI.
    H_SPFLI-TABIX2 = SY-TABIX.
    MODIFY T_SPFLI FROM H_SPFLI.
    IF SY-TABIX = 1.
      GET RUN TIME FIELD L_FIRSTTIME.
    ENDIF.
  ENDLOOP.
  GET RUN TIME FIELD TIMESTAMP2.

  TIMESTAMP2 = TIMESTAMP2 - TIMESTAMP1.
  IF TIMESTAMP2 < MIN_TIME_M.
    MIN_TIME_M = TIMESTAMP2.
  ENDIF.
  IF TIMESTAMP2 > MAX_TIME_M.
    MAX_TIME_M = TIMESTAMP2.
  ENDIF.
  ADD TIMESTAMP2 TO MEANTIME_M.
  S_RESULT-MODIFY = TIMESTAMP2.
  S_RESULT-MFIRST = L_FIRSTTIME - TIMESTAMP1.

ENDFORM.                    " UPDATE_USING_MODIFY
Hi Matthew - yes, for small amounts of data, there really is not much to be gained by using
either of the techniques I've pointed out. But I'm really not sure if you can count on nested
loops giving better performance. The time to do a loop is dependent on the number of
records in the table. From F1 on READ: "The runtime required to read a line from a table
with 100 entries using the index is around ca. 7 msn (standard microseconds), to read a
line using a key of 30 bytes using a binary search, around 25 msn, and without binary
search, around 100 msn."
Thanks for taking the time to look at this. I appreciate it.
Rob
Feb 9, 2006 11:29 AM (in response to Matthew Gifford)
RE: "James's 'Other' logic does not include the required sort."
Very true. Thanks for pointing this out. I was originally using a sorted itab (no BINARY
SEARCH), and neglected to add the sort when I morphed my own code to fit the blog's
example.
Regards,
James
Feb 14, 2006 1:05 AM
bkpf_lin like line of bkpf_tab:
Is this the efficient way of declaring an internal table?
Rob Burbank Feb 15, 2006 7:42 AM (in response to )
I don't know if it's more efficient, but according to "Obsolete ABAP Language Constructs",
the use of OCCURS when defining an internal table is obsolete.
https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/5ac31178-0701-0010-469a-b4d7fa2721ca
Rob
Naimish Dayah Feb 14, 2006 4:09 AM
Can you also provide the code for the parallel cursor method, please?
Rob Burbank Feb 15, 2006 7:39 AM (in response to Naimish Dayah)
My mistake - I have added the code.
Rob
Arun Sambargi Apr 11, 2006 11:34 PM
Hi Rob,
As a beginner I was really confused by the performance of the various loop options. I have a clear picture
of these now.
Rob Burbank Apr 12, 2006 6:21 AM (in response to Arun Sambargi)
You're welcome Arun. I'm glad it helped.
Rob
Daniel Grau Jun 26, 2007 3:12 AM
Hi all,
found this and played a little with the coding. Why not just replace, in the data definition,
bseg_tab type standard table of bseg,
by
bseg_tab type sorted table of bseg with non-unique key bukrs belnr gjahr buzei,
(or unique key)?
Result in a test with around 400k entries was near the indexed loop.
Greetings,
Daniel
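A minimal sketch of the sorted-table variant described above, using hypothetical local types and generated data in place of BKPF and BSEG (the report name, ty_head, ty_item and the test data are assumptions for illustration). With a sorted table, a LOOP ... WHERE that compares the leading key fields for equality can be optimized by the runtime, so the nested loop no longer has to scan the whole inner table for each outer row.

```abap
REPORT z_sorted_table_sketch.

* Hypothetical, simplified stand-ins for the BKPF/BSEG keys.
TYPES: BEGIN OF ty_head,
         belnr TYPE n LENGTH 10,
       END OF ty_head.
TYPES: BEGIN OF ty_item,
         belnr TYPE n LENGTH 10,
         buzei TYPE n LENGTH 3,
       END OF ty_item.

DATA: head_tab   TYPE STANDARD TABLE OF ty_head,
      head_lin   TYPE ty_head,
      item_tab   TYPE SORTED TABLE OF ty_item
                 WITH NON-UNIQUE KEY belnr buzei,
      item_lin   TYPE ty_item,
      item_reads TYPE i.

START-OF-SELECTION.
* Three documents with two line items each.
  DO 3 TIMES.
    head_lin-belnr = sy-index.
    APPEND head_lin TO head_tab.
    item_lin-belnr = sy-index.
    item_lin-buzei = 1.
    INSERT item_lin INTO TABLE item_tab.
    item_lin-buzei = 2.
    INSERT item_lin INTO TABLE item_tab.
  ENDDO.

* Same shape as the original nested loop, but the WHERE condition
* covers the leading key field of the sorted table.
  LOOP AT head_tab INTO head_lin.
    LOOP AT item_tab INTO item_lin WHERE belnr = head_lin-belnr.
      item_reads = item_reads + 1.
    ENDLOOP.
  ENDLOOP.

  WRITE: / 'Items read:', item_reads.
```

The trade-off is that a sorted table cannot be re-sorted by other fields or filled with APPEND in arbitrary order, which is one reason to stay with standard tables when they are re-used and re-sorted.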
Rob Burbank Jun 27, 2007 10:52 AM (in response to Daniel Grau)
Yes - that's a good point. There are a number of different techniques that can be used
instead of the one I presented here. I like to use standard tables because I tend to re-use
and re-sort them.
But the main thrust of the blog isn't how to fix the problem, it's recognizing the problem in the
first place. Many questions in the forum ask "How can I improve the performance of this
code...?" The code will have a bunch of selects and then one or more nested loops.
Most of the responses try to improve the selects only and ignore the nested loops. A lot of
people just don't see this as a problem.
Rob