SSIS Performance Tuning–Monitoring & Data Collection

In my first post about my tuning series I wrote about the general methodology how tuning works: SSIS Performance Tuning – Methodology and general approach

Now I’d like to talk about what we need to collect, what the tools are and how they work. Before I start I would like to say a special thank you to my colleague Christian Jurjut, who invested a lot of time in analyzing this area and put together this fantastic guideline.

In order to achieve effective and goal-oriented performance tuning it is mandatory to know your system’s bottlenecks. In most cases the root causes for those bottlenecks lie deep in the system’s architecture and are not visible at first sight. So the identification can be very time consuming and, in worst case, goes round in circles and comes to no result. Fortunately there are a few tools which help us monitoring different parts of the system. The most important and helpful will be explained in the following.

Performance Counter

The Windows integrated performance counters measure and display graphically the performance of the system’s hardware and other components, like installed software or services. So it is for example possible only to monitor events and performance with regard to the SQL Server or the distributed transaction coordinator. At first sight the multitude of different counters can be very confusing. So in this article we will only deal with those which are most helpful for our purposes.

Usually performance counter are displayed and managed by the performance monitor which is also a windows integrated tool. You can open it by typing “perfmon” in the “Start Search” box. Performance monitor delivers many settings and properties which allow you modifying the illustration. To get a more detailed overview of the functionality please have a look at the following site:

http://technet.microsoft.com/en-us/library/cc749115.aspx

(Using Performance Monitor)

Performance Monitor allows you to analyze all performance counters in a single window. Depending on the number of counters this can be very confusing.

But you have also the possibility to open many windows in the same context of the execution. So it is possible to have only one or even a few counters in one graphical survey. If many counters have to be analyzed simultaneously, this is the recommended way.

As alternative it is also possible to use “Performance Counter Charts“, provided by Henk van der Valk. This tool only allows you to display one performance counter per window, but it also provides more useful diagram types, like bars, bubbles, pies and points.

Who will prefer which tool, depends on personal preferences. So no recommendations will be made here.

To analyze and identify potential bottlenecks, mainly the following performance counters are used:

Processor
- %Processor Time: Indicated how much the processor actually spends on productive threads.
Logical Disks
- Disk Read Bytes/sec: Indicated how fast data can be read from the volume.
- Avg. Disk Bytes/Read: Indicated how fast data can be read from the volume in average.
- Disk Writes Bytes/sec: Indicated how fast data can be written from the volume in average.
- Avg. Disk Bytes/Write: Indicated how fast data can be written from the volume in average.
- Disk Queue Length:This counter provides a primary measure of disk overload. The disk queue is an indication of the number of transactions that are waiting to be processed.
Memory
- Process / Private Bytes (DTEXEC.exe): Total amount of memory currently used by Integration Services.
- Process / Working Set (DTEXEC.exe): Total amount of memory allocated by Integration Services.
- SQL Server: Memory Manager / Total Server Memory: Total amount of memory allocated by SQL Server.
- Memory / Page Reads / sec: Total amount of memory pressure on the Server.
SSIS Pipeline
- Buffers in use: Number of buffers in use by the dataflow engine.
- Buffer memory: Total amount of memory (physical and virtual memory) in use by the dataflow engine.
- Flat buffers in use: Number of flat buffers in use by the dataflow engine. Flat buffers are blocks of memory that a component uses to store data. A flat buffer is a large block of bytes that is accessed byte by byte.
- Flat buffer memory: Total amount of memory that all flat buffers use.
- Private buffers in use: The number of buffers that transformations use. A private buffer is a buffer that a transformation used for temporary work only, for example a sort transformation.
- Private buffer memory: Total amount of memory used by private buffers.
- Buffers spooled: Number of buffers currently written to disk. If the dataflow engine runs slow in physical memory, buffers not currently used are written to disk and reloaded when needed.
- Rows read: The number of rows a data source produces.
- Rows written: The number of rows written to a destination.

For more details see: http://msdn.microsoft.com/en-us/library/ms137622.aspx.

Wait Stats

As an application, SQL Server may request system resources as it executes a query and waits for its request to be completed. These waits are represented by SQL Server wait statistics. SQL Server tracks wait information any time that a user connection or session_id is waiting. This wait information is summarized and categorized across all connections so that a performance profile can be obtained for a given work load. First of all it is important to understand the three milestones of the query life-cycle, or in other words, the status a query can be assigned.

Running: A query which is being executed on a CPU is called a running query. This query is responsible for CPU time.
Runnable: A query which is ready to be executed and waits for its turn to run is called a runnable query. This query is responsible for Signal Wait time. So the query is ready to run but CPU is servicing another query.
Suspended: A query which is waiting due, to any reason, to be converted to runnable is suspended. This query is responsible for wait times. This is the time we are trying to reduce.

The query execution time is a summation of the query Executing CPU Time (Running) + Query Wait Time (Suspended) + Query Signal Wait Time (Runnable). It may be possible that a query goes to all these stats multiple times.

In SQL Server the dynamic management view (DMV) “sys.dm_os_wait_stats” provides a valuable source of wait information from an application perspective. The result of this view can look like this:

SELECT * 
FROM sys.dm_os_wait_stats 
WHERE max_wait_time_ms > 0

This dynamic management view collects all the information since the time when the SQL Server services have been (re)started. You can also manually clear the wait stats using the following command:

DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);

This can be helpful if you only want to track the wait times for the current workload.

The table contains the following information:

Wait Type: Contains the name of the wait Type. There can be three different kinds of wait types – resource, queue and external.
Waiting Task Count: Shows the total number of waits on this wait type. This incremental counter is a good indicator for the frequency of the particular wait time. If this number is very high, it is a good indication for us to investigate on this particular wait type. It is quite possible that the wait time is considerably low, but the frequency of the wait is much high.
Wait Time ms: This is the total wait time accumulated for any type of wait. It also includes singal_wait_time_ms.
Max Wait Time ms: It indicates the maximum wait time ever occurred for that particular wait type. Using this, you can estimate the intensity of the wait type in past. It is not necessary that this max wait time will occur every time, so do not over invest yourself here.
Signal Wait Time ms: Shows the cumulated wait time during the thread is marked as runnable and it gets to the running state. If the runnable queue is very long, you will find that this wait time becomes high.

For a detailed overview of all available wait types please have a look at the following site: http://msdn.microsoft.com/en-us/library/ms179984.aspx

You can easily identify waiting tasks or sessions by using the SQL Server Activity Monitor or the dynamic management view sys.dm_os_waiting_tasks. Sessions with the status “Suspended” are currently waiting.

These waiting tasks can be an indicator for performance problems. The combination of wait types and stats with the analysis of performance counters is an efficient and favored methodology to identify and resolve bottlenecks.

In the following the most common wait types will be explained:

CXPACKET

When a parallel operation is created for a SQL query, multiple threads for a single query are used. Each query usually deals with a different set of data or rows. Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.

Depending on the server workload type, there are several ways to reduce or avoid this wait type. Each way uses the properties “Maximum Degree of Parallelism” and “Cost Threshold for Parallelism”. If you want to know more about these properties please have a look at the following sites:

Maximum Degree of Parallelism http://msdn.microsoft.com/en-us/library/ms181007.aspx
Cost Threshold for Parallelism http://msdn.microsoft.com/en-us/library/ms188603.aspx

Both can be configured in SQL Server properties:

On pure OLTP systems, where the transactions are smaller and queries are not long but very quick usually, set the “Maximum Degree of Parallelism” to 1. This way you make sure that the query never goes for parallelism and does not incur more engine overhead.

On data-warehousing or reporting server systems most queries will be running for long time, so it is advisable to set the “Maximum Degree of Parallelism” to the total number of physical CPUs. This way most of the queries will utilize the parallel processor, and long running queries get a boost in their performance due to multiple processors.

On a mixed System, running OLAP and OLTP for example, the challenge is to find the right balance between the “Maximum Degree of Parallelism” and the “Cost Threshold for Parallelism”. The “Cost Threshold for Parallelism” property determines which queries qualify for parallel processing. A recommended way is to start for example with a “Maximum Degree of Parallelism” of 2, which means that queries still use parallelism but only on 2 CPUs, and to keep “Cost Threshold for Parallelism” quite high. This way, not all the queries will qualify for parallelism but only the query with higher cost will go for parallelism.

IO_COMPLETION and ASYNC_IO_COMPLETION

The IO_COMPLETION wait type occurs while SQL Server is waiting for I/O operations to complete. This wait type generally represents non-data page I/Os. Data page I/O completion waits appear as PAGEIOLATCH_* waits, which will be explained later.

The ASYNC_IO_COMPLETION wait type occurs, if by any means your application that’s connected to SQL Server is processing the data very slowly. Several long-running database operations like backup, create database, alter database or other operations can also create this wait type. In order to reduce or avoid the appearance of these wait types, please have a look at the following recommendations:

Proper placing of files is very important. We should check the file system for proper placement of files, which means LDF and MDF on separate drives, the TempDB on a separate drive, huge tables on separate filegroups and on separate disks.
Check the File Statistics and see if there is higher IO Read and IO Write Stall by using the virtual file stats.
Check event log and error log for any errors or warnings related to IO.
If you are using SAN (Storage Area Network), check the throughput of the SAN system as well as the configuration of the HBA Queue Depth.
It is very possible that there are no proper indexes in the system and there are lots of table scans and heap scans. Creating proper indexes can reduce the IO bandwidth considerably. If SQL Server can use appropriate indexes, it can effectively reduce lots of CPU, Memory and IO.

PAGEIOLATCH_XX and PAGELATCH_XX:

This particular wait type occurs when any of the tasks is waiting for data from the disk to move to the buffer cache. Latches will be explained in detail in another chapter.

LCK_M_XX

If any task is waiting to acquire a lock on any resource, this particular wait type occurs. The common reason for the task to be waiting is that the resource is already locked and some other operations may be going on within it. So this wait indicates that resources are not available or are occupied at the moment due to some reasons. There is a good chance that the waiting queries start to time out if this wait type is very high. Beside the above mentioned ways, there are also some other methods to find blocking queries related to this wait type:

EXEC sp_who2
DMV sys.dm_tran_locks
DMV sys.dm_os_waiting_tasks

In order to reduce or avoid the appearance of these wait types, please have a look at the following recommendations:

Check the Explicit Transactions. If transactions are very long, this wait type can start building up because of other waiting transactions. Keep the transactions small.
Serialization Isolation can build up this wait type. If that is an acceptable isolation for your business, this wait type may be natural. The default isolation of SQL Server is ‘Read Committed’.
Identify blocking queries mentioned using various methods described above, and then optimize them.
Partition can be one of the options to consider because this will allow transactions to execute concurrently on different partitions.
If there are runaway queries, use timeout. (Please discuss this solution with your database architect first as timeout can work against you).
Check if there is no memory and IO-related issue using the described performance counters in the according chapter.

WRITELOG and LOGBUFFER

The WRITELOG wait type occurs while an operation is waiting for a log flush to complete. Common operations that cause log flushes are checkpoints and transaction commits. This wait type is usually seen in a heavy transactional database. When data is modified, it is written both on the log cache and buffer cache. This wait type occurs when data in the log cache is flushing to the disk. During this time, the session has to wait due to WRITELOG.

The LOGBUFFER wait type occurs when a task is waiting for space in the log buffer to store a log record. Consistently high values may indicate that the log devices cannot keep up with the amount of log being generated by the server.

As both wait types are related to logs the resolution of them is quite similar. So there are several suggestions to reduce these wait stats:

Move Transaction Log to a separate disk from mdf and other files.
Avoid cursor-like coding methodology and frequent committing of statements.
Find the most active file based on IO stall time, based on the following script based on virtual file stats:

SELECT    DB_NAME(vfs.DbId) DatabaseName, 
            mf.name,
            mf.physical_name, 
            vfs.BytesRead, 
            vfs.BytesWritten, 
            vfs.IoStallMS, 
            vfs.IoStallReadMS, 
            vfs.IoStallWriteMS, 
            vfs.NumberReads, 
            vfs.NumberWrites, 
            (Size*8)/1024 Size_MB
     FROM        ::fn_virtualfilestats(NULL,NULL) vfs
     INNER JOIN     sys.master_files mf 
     ON        mf.database_id = vfs.DbId
AND        mf.FILE_ID = vfs.FileId

(For a better understanding please have a look at the virtual file stats chapter)

Check the IO-related counters

Latches

Latches are lightweight internal structures used by SQL Server database engine to ensure physical data integrity. Most latches occur when a data page is moved from the storage engine to the data cache but there are also some other types of latches.

The following list describes the different types of latches:

Non-buffer (Non-BUF) latch: The non-buffer latches provide synchronization services to in-memory data structures or provide reentrancy protection for concurrency-sensitive code lines. These latches can be used for a variety of things, but they are not used to synchronize access to buffer pages.
Buffer (BUF) latch: The buffer latches are used to synchronize access to BUF structures and their associated database pages. The typical buffer latching occurs during operations that require serialization on a buffer page, during a page split or during the allocation of a new page, for example. These latches are not held for the duration of a transaction. These are indicated in the “sysprocesses” table by the PAGELATCH wait types.
IO latch: The IO latches are a subset of BUF latches that are used when the buffer and associated data page or the index page is in the middle of an IO operation. PAGEIOLATCH wait types are used for disk-to-memory transfers and a significant wait time for these wait types suggests disk I/O subsystem issues.

Locks and latches are similar objects but they have a slightly different purpose. Locks ensure that the same data element cannot be modified by two different connections at the same time. Latches, on the other hand, simply ensure that the physical data page on which the data element resides is readable and writable. A latch is only maintained on the data page while the data is being changed, as opposed to locks which are maintained for the duration of the entire transaction.

Each data page in the buffer cache is associated with a structure that tracks the status information, including whether the page is dirty or not (“dirty” means modified through INSERT, UPDATE or DELETE operation). A data page can only be latched while it is in memory. After a data page is modified in memory it must be written to disk. Before writing the data page to disk SQL Server latches it to prevent further modifications. Once the page is successfully written to disk, the latch is released. Latching is also used when the data is read. SQL Server retrieves data pages that are likely to be used and places them in the buffer cache. This functionality is called “read-ahead”. During read-ahead, SQL Server acquires latches on data pages to protect them from further access and from modification. Once the pages have been moved to the buffer cache, SQL Server double-checks each page for a valid page number and torn page errors. After these checks are complete the latches are released.

Latches can be monitored in three different ways:

Performance Counter: SQL Server provides three counters for monitoring latches:

Average Latch Wait Time counter: Shows average number of milliseconds it takes to obtain a latch.
Latch Waits: Shows the total number of latch requests that could not be immediately satisfied and had to wait.
Total Latch Wait Time: Shows total wait time for acquiring latches for the last second. This counter also applies only to those requests that couldn’t be immediately satisfied due to resource contention.

The first counter applies to those instances for which SQL Server could not obtain required latches immediately. If there is no resource contention, SQL Server will obtain a latch without any wait. Since this counter displays the average wait time it typically remains relatively static. The other two counters can vary widely depending on the system’s activity. If the values of the two counters below are consistently high, you should investigate disk I/O and memory counters to see if those components cause a bottleneck.

Latch Wait Types

As latches can also cause waits on the SQL Server, they can be monitored by using the data management view “sys.dm_os_wait_stats” (see also Wait Stats). The result of this view can look like this:

SELECT * 
FROM sys.dm_os_wait_stats
WHERE max_wait_time_ms > 0
AND wait_type LIKE '%LATCH%'

The following latch wait types can occur:

PAGEIOLATCH_DT: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Destroy mode.
PAGEIOLATCH_EX: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Exclusive mode.
PAGEIOLATCH_KP: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Keep mode.
PAGEIOLATCH_SH: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Shared mode.
PAGEIOLATCH_UP: Occurs when a task is waiting on a latch for a buffer that is in an I/O request. The latch request is in Update mode.
PAGELATCH_DT: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Destroy mode.
PAGELATCH_EX: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Exclusive mode.
PAGELATCH_KP: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Keep mode.
PAGELATCH_SH: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Shared mode.
PAGELATCH_UP: Occurs when a task is waiting on a latch for a buffer that is not in an I/O request. The latch request is in Update mode.

The main difference between PAGEIOLATCH and PAGELATCH is that PAGEIOLATCH is related to IO issues, while PAGELATCH is not related to IO issues but is oftentimes linked to a buffer issue. Long waits for PAGEIOLATCH wait types may indicate problems with the disk subsystem. Please investigate on performance counters concerning disk I/O. The PAGELATCH wait types can indicate problem with the memory. When there is a contention of access of the in-memory pages, this wait type shows up. It is quite possible that some of the pages in the memory are of very high demand for the SQL Server to access them and put a latch on the pages. Additionally, it can also occur when the TempDB has higher contention as well. If there are indexes that are heavily used, contention can be created as well, leading to this wait type.

Dynamic Management Functions

SQL Server provides different dynamic management functions to obtain latch information. The most common will explained here.

To find the information about locks and latches you can examine the contents of dynamic management function “sys.dm_db_index_operational_stats”. The syntax for calling the function is as follows:

SELECT * FROM sys.dm_db_index_operational_stats 
(<Database ID>, 
 <Object ID>, 
 <Index ID>, 
 <Partition Number>)

The first argument is a valid database id. If you specify NULL, 0 or DEFAULT instead you’ll get information for all databases on the current instance. Similarly the second argument is an object identifier. If you specify NULL, 0 or DEFAULT you’ll get all objects within the specified database (or all databases if you didn’t specify a database ID). You can also specify index id to get information about all indexes for a given object; index id of 0 specifies that the object is a heap. If you want to examine latches for a given partitioned table, you can also specify the partition number to limit the output. As this dynamic management object contains a lot of columns, it is helpful to limit the result set. The following columns are relevant for latches:

page_latch_wait_count
page_latch_wait_in_ms
page_io_latch_wait_count
page_io_latch_wait_in_ms

You can also examine the contents of the “sys.dm_os_latch_stats” dynamic management view. This view can help you focus on wait numbers and wait types for each latch class. The data in sys.dm_os_latch_stats is cumulative as of the last time SQL Server service was started. You can clear the contents of this view by executing “DBCC SQLPERF(‘sys.dm_os_latch_stats’, CLEAR)” command or by restarting the service.This view does not track the latches that were granted immediately or that failed without waiting. The following columns are returned by the “sys.dm_os_latch_stats” view:

Latch_Class: Contains the requested latch class. For a detailed overview of all available latch classes, please have a look at the following site: http://msdn.microsoft.com/en-us/library/ms175066.aspx
Waiting_Request_Count: Total number of waits in this class of latches. The value of this column is updated at the start of each latch wait.
Wait_Time_ms: Total number of milliseconds the database engine had to wait for latches of this class. The value of this column is updated every five minutes during the latch wait. This value is also refreshed after the latch wait is over.
Max_Wait_Time_ms: Maximum time a memory object has waited for a latch in this latch class.

You can examine the contents of this view by executing a simple SELECT statement against it to view statistics for each latch class:

SELECT * FROM sys.dm_os_latch_stats

The result of this query could look like this:

You can also query “sysprocesses” system table to obtain information about the resources that SQL Server is waiting for to become available, including latches. If SQL Server is indeed waiting on latches, it will display the latch type. If you’re only interested in user initiated processes that are waiting on latches you can execute a query similar to the following:

SELECT * FROM sysprocesses  
WHERE LastWaitType LIKE '%latch%'  
AND WaitTime > 0

A detailed column list with explanations for this table can be found here:

http://msdn.microsoft.com/en-us/library/ms179881(v=sql.90).aspx

In order to reduce these wait types, you can experiment on the following:

Improve your IO subsystem speed.
This type of wait stats can also happen due to memory pressure or any other memory issues. Putting aside the issue of a faulty IO subsystem, this wait type warrants proper analysis of the memory counters. If due to any reasons, the memory is not optimal and unable to receive the IO data.
Proper placing of files is very important. We should check the file system for proper placement of files, which means LDF and MDF on separate drives, the TempDB on a separate drive, huge tables on separate filegroups and on separate disks.
Check the File Statistics and see if there is higher IO Read and IO Write Stall by using the virtual file stats.
It is also possible that there are no proper indexes on the system and there are lots of table scans and heap scans. Creating proper indexes can reduce the IO bandwidth considerably. If SQL Server can use appropriate indexes, it can significantly reduce lots of CPU, Memory and IO pressure.

Locks

Locking is a major part of every RDBMS. It is a database functionality which supports the multi-user environment. It mainly is a logical and not physical problem. This means that no amount of hardware will help you in the end. You might cut execution times by adding hardware but this is only a virtual fix. In a heavy multi-user environment any logical problems will appear sooner or later. So it is mandatory to fix locking issues also by logical changes, for example in your ETL or the database structure.

Lock information can be retrieved by the dynamic management view “sys.dm_tran_locks”. It can be queried by the following statement:

SELECT * FROM sys.dm_tran_locks
WHERE resource_type <> 'DATABASE'

The filter on resource_type just means that we don’t want to see general shared locks taken on databases. These are always present. A detailed description of this view containing all columns and their meanings can be found here: http://msdn.microsoft.com/en-us/library/ms190345(v=SQL.90).aspx

In the following only the three most important columns will be described:

resource_type: Tells us what resource in the database the locks are being taken on. It can be one of these values: database, file, object, page, key, extent, RID, application, metadata, HOBT, allocation unit.
request_mode: Tells us the mode of our lock.
resource_description: Shows a brief description of the resource. Usually holds the id of the page, object, file, row, etc. It isn’t populated for every lock type.

SQL Server contains different lock modes which are used in different situations and also have a different purpose. In the following these lock modes will be explained briefly.

Shared Locks (S): Shared locks are held on data being read under the pessimistic concurrency model. While a shared lock is being held, other transactions can read but can’t modify locked data. After the locked data has been read the shared lock is released, unless the transaction is being run with the locking hint (READCOMMITTED, READCOMMITTEDLOCK) or under the isolation level equal or more restrictive than Repeatable Read.

Update Lock (U): Update locks are a mix of shared and exclusive locks. When a DML statement is executed, SQL Server has to find the data it wants to modify. To avoid lock conversion deadlocks an update lock is used. Only one update lock can be held on the data at one time, similar to an exclusive lock. But the difference here is that the update lock itself can’t modify the underlying data. It has to be converted to an exclusive lock before the modification takes place.

Exclusive Locks (X): Exclusive locks are used to lock data being modified by one transaction thus preventing modifications by other concurrent transactions. You can read data held by exclusive lock only by specifying a NOLOCK hint or using a read uncommitted isolation level. Because DML statements first need to read the data they want to modify you’ll always find Exclusive locks accompanied by shared locks on that same data.

Intent Locks (I): Intent locks are used to notify other transactions that their own transaction is intending to lock the data. Their purpose is to assure proper data modification by preventing other transactions to acquire a lock on the object higher in lock hierarchy. What this means is that before you obtain a lock on the page or the row level an intent lock is set on the table. This prevents other transactions from putting exclusive locks on the table that would try to cancel the row/page lock.

Schema Locks (Sch): There are two types of schema locks:

Schema stability locks (Sch-S): Used while generating execution plans. These locks don’t block access to the object data.
Schema modification locks (Sch-M): Used while executing a DDL statement. Blocks access to the object data since its structure is being changed.

Bulk Update Locks (BU): Bulk Update locks are used by bulk operations when TABLOCK hint is used by the import. This allows for multiple fast concurrent inserts by disallowing data reading to other transactions.

Conversion Locks: Conversion locks are locks resulting from converting one type of lock to another. There are 3 types of conversion locks:

Shared with Intent Exclusive (SIX). A transaction that holds a Shared lock also has some pages/rows locked with an Exclusive lock
Shared with Intent Update (SIU). A transaction that holds a Shared lock also has some pages/rows locked with an Update lock.
Update with Intent Exclusive (UIX). A transaction that holds an Update lock also has some pages/rows locked with an Exclusive lock.

Key Range Locks: Key-range locks protect a range of rows implicitly included in a record set being read by a Transact-SQL statement while using the serializable transaction isolation level. Key-range locking prevents phantom reads. By protecting the ranges of keys between rows, it also prevents phantom insertions or deletions into a record set accessed by a transaction. There are two types of key-range locks:

RangeX-X – exclusive lock on the interval between the keys and exclusive lock on the last key in the range
RangeS-U – shared lock on the interval between the keys and update lock on the last key in the range

Spin Locks: Spinlocks are a mutual exclusion mechanism to reduce context switching between threads in SQL Server. They are not used to lock data. The requester of a spinlock repeats its request if the lock is not immediately available (it “spins” on the lock until it is free). Spinlocks are often used within SQL Server for resources that are usually not busy. If a resource is busy, the duration of a spinlock is short enough that retrying is better than waiting and then being rescheduled by the operating system, which results in context switching between threads. Spinlocks are used for situations in which the wait for a resource is expected to be brief or if no wait is expected. The “sys.dm_os_tasks” and the “sys.dm_os_spinlock_stats” dynamic management views show a status and additional information for any task that is currently using a spinlock.

The result of the “sys.dm_os_spinlock_stats” dynamic management view looks like this:

SELECT * 
FROM sys.dm_os_spinlock_stats
WHERE collisions <> 0
OR spins <> 0

This view represents some server wide statistics for all spin lock types. SQL Server contains about 175 spinlock objects to protect the various internal structures during multi-threaded access. The goal of the spinlock is to be fast and efficient. So the ultimate goal is to see 0 collisions and 0 spins.

VFS

Due to performance issues it is always good to know which database is very busy and which database is bit slow because of IO issues. By using virtual file stats we get to know which MDF or LDF is busiest and doing most of the work. Following query gets the results very quickly.

SELECT    DB_NAME(vfs.DbId) DatabaseName, 
        mf.name,
        mf.physical_name, 
        vfs.BytesRead, 
        vfs.BytesWritten, 
        vfs.IoStallMS, 
        vfs.IoStallReadMS, 
        vfs.IoStallWriteMS, 
        vfs.NumberReads, 
        vfs.NumberWrites, 
        (Size*8)/1024 Size_MB
 FROM    ::fn_virtualfilestats(NULL,NULL) vfs
 INNER JOIN sys.master_files mf 
 ON        mf.database_id = vfs.DbId
 AND    mf.FILE_ID = vfs.FileId

When you run above query you will get many valuable information like what is the size of the file as well how many times the reads and writes are done for each file. It also displays the read/write data in bytes. Due to IO if there has been any stall (delay) in read or write, you can know that as well. The DMV sys.dm_io_virtual_file_stats() represents a commonly used alternative to the above used function. In fact, since SQL Server 2005 it was introduced to replace this function. The query pointing on the DMV looks like this:

SELECT dbs.name as Databasename
      ,dbf.name as FileName
      ,dbf.type_desc as FileType
      ,[num_of_bytes_read]
      ,[num_of_bytes_written]
      ,[io_stall]
      ,[io_stall_read_ms]
      ,[io_stall_write_ms]
      ,[num_of_reads]      
      ,[num_of_writes]     
      ,([size_on_disk_bytes])/1024/1024 Size_MB
FROM [sys].dm_io_virtual_file_stats(NULL,NULL) as vfs
INNER JOIN [sys].[databases] dbs ON vfs.database_id = dbs.database_id
INNER JOIN [sys].[database_files] dbf ON vfs.file_id = dbf.file_id

Top Queries

The top query statistics can be used to identify long running and highly resource consuming SQL statements. Tuning those statements can be an attempt in order to gain more performance or to reduce execution times. Query execution information can be retrieved by the following systems views and functions:

sys.dm_exec_query_stats: This view returns aggregate performance statistics for cached query plans. The view contains one row per query statement within the cached plan, and the lifetime of the rows are tied to the plan itself. When a plan is removed from the cache, the corresponding rows are eliminated from this view. For a detailed overview please have a look at the following site: http://msdn.microsoft.com/en-us/library/ms189741.aspx
sys.dm_exec_sql_text (sql_handle): This function returns the text of the SQL statement that is identified by the sql_handle parameter, which can be retrieved from the sys.dm_exec_query_stats view. For a detailed overview please have a look at the following site: http://msdn.microsoft.com/en-us/library/ms181929.aspx
sys.dm_exec_text_query_plan (plan_handle): This function returns the query execution plan in text format for a specific SQL statement within the batch. The query plan specified by the plan handle can either be cached or currently executing. The corresponding execution plan will be identified by the plan_handle parameter which can be retrieved from the sys.dm_exec_query_stats view. For a detailed overview please have a look at the following site: http://msdn.microsoft.com/en-us/library/bb326654.aspx

By combining these components, you get a good overview about query execution statistics. The query selecting the DMV looks like this:

SELECT    qt.text as SQLStatement,
        db_name(qt.dbid) AS database_name,    
        qs.total_elapsed_time,
        qs.total_worker_time,
CAST(qs.total_elapsed_time / 1000 AS FLOAT) / 
CAST(qs.execution_count as FLOAT) as avg_cpu_time_ms,
        qs.execution_count,
        CAST(qp.query_plan AS XML) query_plan
FROM         sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) as qt
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle,
statement_start_offset, statement_end_offset) as qp
WHERE     qs.execution_count>0
ORDER BY     total_elapsed_time desc

The result set provides the following information:

SQL Statement: The SQL statement for which the statistics have been collected.
Database Name: The database on which the query was executed.
Total Elapsed Time: Total elapsed time, reported in microseconds, for completed executions of this plan.
Total Worker Item: Total amount of CPU time, reported in microseconds, that was consumed by executions of this plan since it was compiled.
Average CPU Time: Ratio between the Total Elapsed Time and the Execution Count.
Execution Count: Number of times that the plan has been executed since it was last compiled.
· Query Plan: The link to the query execution plan of the executed SQL statement. The result of the query looks like this:

References

4 thoughts on “SSIS Performance Tuning–Monitoring & Data Collection”

Add yours

Piotr Kundu (@piotrkundu)
July 21, 2014 at 12:23 pm


SSIS is to my understanding a lot about abstracting complexity and to let non-developers “click-and-choose”. I’m a bit surpriced (or actually I’m not) that there is no way to have an easy performance monitoring GUI with a typical performance counters. I would also like to set a limitation on cpu usage for a SSIS package in terms of number of cores, cpu % etc

- dwjunkie
  July 23, 2014 at 10:31 pm
  
  
  Hi Piotr, yes there are Performance Counters also specific to SSIS, as I listed the in my post. Your tool is then Perfmon in order to vizualize the results. But you also need to monitor your database in order to find bottlenecks on the database side.

Data Analytics Junkie

Analytics and AI on the Microsoft platform – Marcel Franke –