In the first post of the series I gave an introduction to some general aspects of performance tuning, our reference architecture and our data model, as well as some general SSIS tuning techniques. This second post concentrates on the baseline measurements we need in order to explain our scale-out strategies. So let's have a look at the simplest SSIS package.

Sequential Read


This is the simplest SSIS package you can think of: just an OLE DB Source and a Row Count transformation to measure read performance. By design this is a single-threaded operation, and it gives us a read performance of 35 MB/sec. Not much, considering that our hardware is able to deliver 6.7 GB/sec of sequential read throughput. So let's have a look at the write performance.
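The same single-threaded read baseline can be approximated outside SSIS with a plain scan. This is a sketch; `dbo.FactSales` is a hypothetical placeholder for your own source table:

```sql
-- Measure raw read speed of a single-threaded scan
-- (dbo.FactSales is a hypothetical source table).
SET STATISTICS TIME, IO ON;

SELECT COUNT_BIG(*)
FROM dbo.FactSales
OPTION (MAXDOP 1);  -- force a single-threaded scan, like the SSIS source
```

The elapsed time and logical reads reported by `SET STATISTICS TIME, IO` give a comparable MB/sec figure for the pure read path.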

Sequential Writes

As you can see in the picture below, the package is still very simple. There is just an OLE DB Source that reads data from a SQL Server table and loads it directly into another SQL Server table. In order to achieve good performance this has to happen as a minimally logged operation, as described in my first post.
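For reference, the prerequisites for a minimally logged load can be sketched in T-SQL. Database and table names below are placeholders; in SSIS the table lock corresponds to the "Table lock" option of the OLE DB Destination in fast-load mode:

```sql
-- The database must use the SIMPLE or BULK_LOGGED recovery model.
ALTER DATABASE SalesDW SET RECOVERY SIMPLE;

-- The destination should be a heap (or an empty B-tree) and the
-- insert must take a table lock to qualify for minimal logging.
INSERT INTO dbo.FactSales_Target WITH (TABLOCK)
SELECT *
FROM dbo.FactSales_Source;
```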


Loading data into a heap table gives us a throughput of 23 MB/sec, loading into a clustered index table around 43 MB/sec. At first glance loading into a clustered index looks faster, but if you compare the load durations it actually takes about twice as long. Why?

Because the data has to be written twice (data and index) when loading into a clustered index, and we also have to pay the price for the sort. Again, this is a single-threaded write operation (only the read side is handled in parallel).
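The two destination variants used in this comparison look roughly like this (the column list is a simplified placeholder, not the real fact table):

```sql
-- Heap: rows are appended to data pages in arrival order, no sort.
CREATE TABLE dbo.FactSales_Heap
(
    SaleID   BIGINT        NOT NULL,
    SaleDate DATE          NOT NULL,
    Amount   DECIMAL(18,2) NOT NULL
);

-- Clustered index: every row must also be sorted into the B-tree,
-- so the load pays for the sort plus the extra index writes.
CREATE TABLE dbo.FactSales_CI
(
    SaleID   BIGINT        NOT NULL,
    SaleDate DATE          NOT NULL,
    Amount   DECIMAL(18,2) NOT NULL,
    CONSTRAINT PK_FactSales_CI PRIMARY KEY CLUSTERED (SaleID)
);
```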

Conclusion: Loading into a heap table is the faster option.

Validating DOP impact

As we can read, for example, in the Fast Track reference guide, it is recommended to load data with MAXDOP 1. The reason for this recommendation is to avoid fragmentation of data and indexes. So let's see if this is really true.

In order to test this behavior I took the sequential write package and ran it with different DOP settings. Here are the results:


It shows that the DOP setting has no influence on our SSIS performance, which also means that the BULK API is single threaded (otherwise performance would increase with DOP). As you can also see, we don't get any fragmentation of our data when we increase the DOP setting.
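The fragmentation claim can be verified after each load run with the index physical stats DMV. Database and table names here are placeholders:

```sql
-- avg_fragmentation_in_percent stays low regardless of the DOP setting
SELECT index_id,
       avg_fragmentation_in_percent,
       page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('SalesDW'),
         OBJECT_ID('dbo.FactSales_Target'),
         NULL, NULL, 'LIMITED');
```

For a heap (index_id 0) the fragmentation column reflects extent fragmentation rather than logical B-tree fragmentation.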


The next thing to analyze is compression. Very often we read that compression helps to speed up ETL performance. So let's put that to the test.

Again I took the sequential write SSIS package and tested it by loading data into an uncompressed, a row-compressed and a page-compressed table. Here are the results:


As we can see, uncompressed is the fastest option for data loading: row compression takes 35% longer and page compression 208% longer. Additionally, I tested the impact on relational SQL queries and cube processing. Again, we don't see any benefit from data compression. But why is compression sometimes recommended?
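The row- and page-compressed destination variants for such a test can be set up along these lines (table names are placeholders):

```sql
-- The uncompressed destination is a plain heap; the other two
-- variants are rebuilt with ROW and PAGE compression before the load.
ALTER TABLE dbo.FactSales_Row  REBUILD WITH (DATA_COMPRESSION = ROW);
ALTER TABLE dbo.FactSales_Page REBUILD WITH (DATA_COMPRESSION = PAGE);
```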

It is recommended when we need to save storage space or when the I/O subsystem is the bottleneck in the architecture. In that case we can reduce the number of I/O operations at the expense of CPU power. But here we have a balanced architecture with enough I/O power, so compression does not help in our case.
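If storage savings are the goal, SQL Server can estimate the gain up front, which helps weigh the space saved against the CPU cost measured above (object names are placeholders):

```sql
-- Estimate the space saved by PAGE compression without rebuilding the table
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'FactSales',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';
```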


Very often we see SSIS packages that look like the ones shown above. In my case I removed the business-logic part between the source and the destination, but the result is the same: packages designed like this are bound by a single-threaded write operation with a maximum write speed of around 20 MB/sec.

And to make it clear: with this design there is nothing we can do about it. We could buy hardware with cores that have higher clock speeds, but we see the opposite happening in the market. New servers have many more cores, but the speed of a single core is decreasing. So buying new hardware can even make it worse.

Additionally, we have seen that loading into a heap table is the fastest option, and that neither DOP nor compression helps us to increase performance. The only way to go is parallelism:


There are different ways to design parallelism in SSIS. I will show some techniques in my next post.