Buffer size and CPU time

One strategy to improve the performance of SAS® is to increase the amount of memory the application uses. Before rushing out and buying more memory, I decided to see whether we could make better use of what we have. This page gives more detail on my recent post on the subject.

The SAS system options bufsize and bufno influence the amount of memory a SAS session will use in writing data to disk. Changes to the default settings may make more memory available to SAS and may therefore boost performance.

The default settings on our server are bufsize=4k and bufno=1.
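To see what a given session is actually using, something like the following should work. It is a minimal sketch: the values in the final options statement (256k and 1024) are illustrative only, not the settings recommended by this post.

```sas
/* Report the current values of the two options in the SAS log */
proc options option=bufsize;
run;

proc options option=bufno;
run;

/* Override the defaults for the rest of the session.
   256k and 1024 are illustrative values, not recommendations. */
options bufsize=256k bufno=1024;
```

Both options can also be given as data set options, for example data test(bufsize=256k bufno=1024), which limits the change to a single output data set.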

Using these settings a simple write process:

data test;
  do i = 1 to 10**9;
    output;
  end;
run;

was used to set a benchmark. This is quite a large dataset: one variable of 8 bytes and a billion observations gives a size just under 8GB. The average CPU time, as recorded in the SAS log with the fullstimer option, for this process was 66.8 seconds.
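The size estimate can be checked with a little arithmetic. This is a rough sketch that ignores the data set header and any per-page overhead, so the real file will be slightly larger and the page count is only an approximate lower bound.

```latex
\[
10^{9}\ \text{observations} \times 8\ \text{bytes}
  = 8 \times 10^{9}\ \text{bytes}
  \approx 7.45\ \text{GiB}
\]
\[
\frac{8 \times 10^{9}\ \text{bytes}}{4096\ \text{bytes per page}}
  \approx 1.95 \times 10^{6}\ \text{pages at bufsize=4k}
\]
```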

This chart shows the impact of increasing the buffer size (bufsize option setting) on CPU time.

[Chart: CPU time declines as buffer size increases, with no further improvement beyond 256KB]

The figures are average times across a range of buffer count (bufno) settings from 1 to 100,000. The 4KB figure is the Windows default, and some of the buffer count values tested with a buffer size of 1024KB will have pushed the memory requirement beyond the 2GB Windows I/O limit.

Based on these findings buffer sizes of 256KB and 512KB were tested in more detail with buffer counts of 256, 512, 1024, 2048 and 4096.
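A test grid like this can be run by wrapping the benchmark step in a macro. The sketch below is not the harness used for these figures; the macro name buftest and its parameters are hypothetical, and the CPU times still have to be read from the log after each run.

```sas
%macro buftest(bufsize=, bufno=);
  /* fullstimer writes CPU time and memory statistics to the log */
  options fullstimer bufsize=&bufsize bufno=&bufno;

  /* Same write-only benchmark as above */
  data test;
    do i = 1 to 10**9;
      output;
    end;
  run;
%mend buftest;

/* Two corners of the grid; repeat for the other combinations */
%buftest(bufsize=256k, bufno=256)
%buftest(bufsize=512k, bufno=4096)
```

Running each combination several times and averaging the logged CPU times helps smooth out run-to-run variation.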

The results, below, show that CPU time tends to fall as total buffer memory (bufsize × bufno) increases.

Results

CPU time (seconds)

                  Buffer Size (KB)
Buffer Count      256      512      All
         256     47.0     44.7     45.8
         512     44.0     42.3     43.2
       1,024     42.3     42.7     42.5
       2,048     42.0     42.0     42.0
       4,096     42.3     41.3     41.8
         All     43.5     42.6     43.1

n=30
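To relate each cell to the total buffer memory it implies, the grid can be tabulated with a short data step. This is only a sketch: the data set name bufgrid and its variable names are made up for illustration, and total memory is taken to be simply bufsize × bufno.

```sas
/* Total buffer memory (MB) for each bufsize/bufno combination tested */
data bufgrid;
  do bufsize_kb = 256, 512;
    do bufno = 256, 512, 1024, 2048, 4096;
      total_mb = bufsize_kb * bufno / 1024;
      output;
    end;
  end;
run;

proc print data=bufgrid noobs;
run;
```

The totals run from 64MB (256KB × 256) up to 2,048MB (512KB × 4,096).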

Conclusions

Before implementing these settings as best practice we need to test whether there is an impact on read performance, whether similar results are obtained when compression (binary or character) is used, and whether the results obtained for this artificial dataset can be replicated with real (wider and shorter) data.

Optimization of the buffer size and buffer count has brought the CPU time required for this process down from 66.8 seconds to 42.3 seconds, a saving of more than a third. This 42.3 seconds could be improved further, but the cost may be disproportionate: a 2% improvement for an eight-fold increase in memory seems like a price that is not worth paying.
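The arithmetic behind these figures can be sketched as follows. The second and third lines rest on an assumption the text does not state explicitly: that the 42.3 seconds refers to the 256KB × 1,024 cell and that the comparison is with the 512KB × 4,096 cell (41.3 seconds).

```latex
\[
\frac{66.8 - 42.3}{66.8} \approx 0.367 \quad \text{(a saving of about 37\%)}
\]
\[
\frac{42.3 - 41.3}{42.3} \approx 0.024 \quad \text{(a further improvement of about 2\%)}
\]
\[
\frac{512\ \text{KB} \times 4096}{256\ \text{KB} \times 1024}
  = \frac{2048\ \text{MB}}{256\ \text{MB}} = 8
\]
```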

Next: Buffer size and reading data
