Buffer size and reading data

So far, in Buffer size and CPU time, we’ve looked at the impact of changing the default buffer size on the time it takes to write data to disk.

However, we want to increase overall performance. This means we need to ensure that read speed is not negatively affected by our optimization for writing. Hopefully, if the optimal values for reading differ from those we found for writing, a good compromise between the two can be reached.

What options does this leave us? The buffer size used to create a dataset is part of the dataset: internally, the dataset is divided into pages of bufsize bytes. This means that when we read the data, the buffer size of the dataset is already fixed and will not be affected by changing the value of the bufsize system option.
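As a minimal sketch of this point, the step below creates a dataset with an explicit page size and then inspects it. The variables and row count are purely illustrative and are not the test data from the earlier article; the bufsize= dataset option is used here in place of the system option so that the setting is visible next to the dataset it applies to.

```sas
/* Sketch only: the variables and row count are illustrative. */
data temp (bufsize=64k);   /* page size is fixed when the dataset is created */
  do i = 1 to 100000;
    x = ranuni(0);
    output;
  end;
run;

proc contents data=temp;   /* "Data Set Page Size" reports the stored value */
run;
```

Changing the bufsize system option after this point should leave the page size reported by proc contents unchanged, because the value is stored with the dataset.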

We can, however, choose the number of buffers we ask SAS® to use when it reads the dataset. By default, SAS will read a single page at a time. To assess what impact changing this setting might have, we need some figures, starting with a benchmark score for the default settings.

The simple dataset created in our earlier article is read using an even simpler data step:

data _null_;  /* By using _null_ no output will be produced so the */
  set temp;   /* timer figures should reflect read performance.    */
run;

Before this data step is run, the number of buffers is set using the bufno system option. Three bufno values were tested: 1 (the default), 256, and 1,024. Each was used with datasets created using buffer sizes of 4KB, 64KB and 256KB.
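One way to set up such a test run is sketched below. The read_test macro is a hypothetical helper, not code from the original test, and the sketch assumes the dataset temp already exists in the WORK library.

```sas
/* Sketch only: assumes the dataset TEMP already exists in WORK. */
options fullstimer;         /* write detailed timing statistics to the log */

%macro read_test(bufno);    /* hypothetical helper macro */
  options bufno=&bufno;     /* number of buffers used for dataset reads */
  data _null_;
    set temp;
  run;
%mend read_test;

%read_test(1)
%read_test(256)
%read_test(1024)
```

Each call writes its own fullstimer statistics to the log, so the CPU time for each bufno value can be read off separately. Repeating the calls against datasets created with different buffer sizes would cover the remaining combinations.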

The table below shows CPU time, in minutes and seconds, gathered using the fullstimer system option.

Results

                 Buffer Size (KB)
Buffer Count     4        64       256      All
1                1:33.3   1:07.7   1:05.0   1:15.3
256              1:07.3   1:08.0   0:59.3   1:04.9
1,024            1:09.7   1:00.0   0:55.0   1:01.6
All              1:16.8   1:05.2   0:59.8   1:07.3
n=27

Conclusions

In general, increasing either the buffer size or the number of buffers offers a performance gain, although not at every step: with 4KB buffers, going from 256 to 1,024 buffers was slightly slower (1:07.3 against 1:09.7). Choosing the settings that optimize writing (256KB, 1,024) also, from the range of data we have examined, optimizes reading. The 38.3-second reduction in CPU time for this process (from 93.3 seconds to 55.0 seconds) represents a 41% improvement: better than the improvement seen for writing data.
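The percentage quoted above can be checked with a short data step. This is only an arithmetic sketch: the two timings are copied from the results table and the variable names are illustrative.

```sas
/* Sketch only: timings copied from the results table. */
data _null_;
  baseline = 1*60 + 33.3;   /* 1:33.3 with 4KB buffers and bufno 1       */
  best     = 0*60 + 55.0;   /* 0:55.0 with 256KB buffers and bufno 1,024 */
  saving   = baseline - best;
  pct      = saving / baseline;
  put saving= 5.1 pct= percent8.1;   /* written to the log */
run;
```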
 
