Introduction
A checkpoint is a background process that writes dirty pages to disk. It performs a full scan of the pages in the buffer pool, builds a list of the dirty pages that have not yet been written to disk, and then writes those pages out. On SQL Server instances that do not have many dirty pages in the buffer pool, this is a trivial operation. However, on instances that host OLTP databases, use large amounts of memory, or must sequentially scan every page, this process can impact system performance.
SQL Server 2012 introduced indirect checkpoints. With an indirect checkpoint, a dirty page manager maintains the dirty page list and keeps track of all the dirty page modifiers of the database. By default it runs every 60 seconds and tracks the dirty pages that need to be flushed.
For example, if the indirect checkpoint target is set to 120 seconds, individual pages are flushed around 120 seconds after they were made dirty. This affects I/O and may lead to I/O spikes; the I/O pattern also depends on the number of dirty page lists in the database. You may see a significant spike at the set target interval, or only small hits as the pages change. Either way, indirect checkpoints eliminate the overhead of tracing the buffer pool to find and flush dirty pages, which results in a significant improvement in checkpoint performance.
To view your current target recovery time, you can query sys.databases:
SELECT name, target_recovery_time_in_seconds
FROM sys.databases;
On a database using automatic checkpoints, the entire buffer pool must be traced to evaluate every page for possible dirty status. With an indirect checkpoint, on the other hand, only the buffer pool pages that are actually dirty need to be dealt with.
For example, let’s consider a system with 2 TB of memory, which holds approximately 250 million buffer units, and assume 100 of those pages are dirty. If it requires 5 CPU cycles to trace and identify the status of each buffer unit, traversing 250 million buffer units costs roughly 1.25 billion CPU cycles. And for what? Simply to find 100 dirty pages in automatic checkpoint mode. With indirect checkpoints, 100 dirty pages are hardly anything: the buffer pool scanning time is eliminated entirely.
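If you want to see how many dirty pages each database currently holds in the buffer pool, you can count them with sys.dm_os_buffer_descriptors. A minimal sketch for occasional use only: this query itself scans the buffer descriptors, so on large-memory systems it is exactly the kind of expensive traversal described above.
-- Count dirty pages per database currently held in the buffer pool.
-- Run sparingly on large-memory systems; it scans the buffer descriptors.
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS dirty_page_count
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id
ORDER BY dirty_page_count DESC;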
How it Works
When target_recovery_time is set at the database level, indirect checkpoint is activated and a dirty page manager is created for the database. The dirty page manager maintains the dirty page lists (DPLists) for the database. Partitioned DPLists help improve concurrency and avoid contention, with each worker thread running on a scheduler writing to its corresponding dirty page list. Because of this, when the transactional activity on the database is low (fewer than roughly 3,000–5,000 transactions per second), the overhead of tracking dirty pages in the DPList is minimal and doesn’t impact the overall throughput of the workload running on a database with indirect checkpoint enabled.
When the total number of dirty pages exceeds the dirty page threshold (calculated internally from target_recovery_time), the recovery writer wakes up, swaps the lists, and starts iterating through the DPLists in a loop, beginning with the longest one. It collects the page IDs, sorts the pages in each list to optimize and bundle writes together, and finally posts asynchronous write requests to write the pages.
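To confirm that the recovery writer background task is running on an instance with indirect checkpoints enabled, you can look for it in sys.dm_exec_requests. A quick sketch, assuming the task's command name appears as 'RECOVERY WRITER' on recent builds:
-- Look for the recovery writer background task (present when at least one
-- database on the instance uses indirect checkpoints).
SELECT session_id, command, status, wait_type, last_wait_type
FROM sys.dm_exec_requests
WHERE command LIKE '%RECOVERY WRITER%';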
These are good improvements for the most part; however, on a busy system you can still see contention in this process. When transactional activity is high, contention can arise between the recovery writer (which organizes, sorts, and posts the writes) and the I/O threads, leading to slower processing. To keep meeting the recovery time target, worker threads may step in to assist the recovery writer so dirty pages are still handled efficiently despite the increased workload. While this strategy helps meet recovery goals, it can temporarily impact overall system performance.
This performance impact may occur because, in some situations, indirect checkpoint can cause an uneven distribution of dirty pages, especially in databases like tempdb, leading to long dirty page lists. When these lists grow too large, the recovery writer struggles to handle them efficiently, causing contention with the I/O threads and consuming additional CPU resources. In severe cases, this can lead to non-yielding scheduler errors, where the system runs into excessive spinlock contention and delays in processing.
If this causes non-yielding schedulers in your environment, you will see symptoms like the following in the error log.
Process 0:0:0 (0x34c0) Worker 0x000000FCD89BC160 appears to be non-yielding on Scheduler 9. Thread creation time: 13140343858116. Approx Thread CPU Used: kernel 0 ms, user 60968 ms. Process Utilization 13%. System Idle 86%. Interval: 70234 ms.
A time-out occurred while waiting for buffer latch -- type 3, bp 0000010C1900C6C0, page 4:323520, stat 0x10f, database id: 2, allocation unit Id: 6488064, task 0x000000FE6076D088 : 0, waittime 300 seconds, flags 0x19, owning task 0x00000102ED0C8108. Not continuing to wait.
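To check whether messages like these have already been logged, you can search the current error log with xp_readerrorlog, an undocumented but commonly used procedure. A sketch; the parameters are log number (0 = current), log type (1 = SQL Server error log), and a search string:
-- Search the current SQL Server error log for the two symptoms shown above.
EXEC master.dbo.xp_readerrorlog 0, 1, N'non-yielding';
EXEC master.dbo.xp_readerrorlog 0, 1, N'buffer latch';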
To detect if your SQL Server instance is experiencing excessive spinlock contention on DPList—even if there is no non-yielding scheduler detected—you can use the following diagnostics available in SQL Server.
Monitoring
Use the sys.dm_os_spinlock_stats DMV to look for a high number of spins and backoff events over time. Run the following query multiple times at one-minute intervals and calculate the difference between snapshots to identify the number of spins and backoffs between consecutive snapshots.
Note: Some spins and backoffs are expected due to healthy spinlock contention between the recovery writer and I/O threads, so compare successive snapshots to see whether the exponential backoff continues to increase.
SELECT GETDATE() AS [Date Captured], *
FROM sys.dm_os_spinlock_stats
WHERE name = 'DP_LIST';
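As a convenience, here is a minimal sketch that automates the comparison: it captures two snapshots of the DP_LIST row one minute apart and reports how much each counter grew in between.
-- Capture two snapshots of DP_LIST spinlock statistics one minute apart
-- and report the deltas between them.
DECLARE @snap TABLE (
    capture_time DATETIME2,
    collisions   BIGINT,
    spins        BIGINT,
    backoffs     BIGINT
);

INSERT INTO @snap (capture_time, collisions, spins, backoffs)
SELECT SYSDATETIME(), collisions, spins, backoffs
FROM sys.dm_os_spinlock_stats
WHERE name = 'DP_LIST';

WAITFOR DELAY '00:01:00';

INSERT INTO @snap (capture_time, collisions, spins, backoffs)
SELECT SYSDATETIME(), collisions, spins, backoffs
FROM sys.dm_os_spinlock_stats
WHERE name = 'DP_LIST';

SELECT MAX(collisions) - MIN(collisions) AS collisions_delta,
       MAX(spins)      - MIN(spins)      AS spins_delta,
       MAX(backoffs)   - MIN(backoffs)   AS backoffs_delta
FROM @snap;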
You can use the following script to capture an Extended Event for spinlock backoff for DPList in your environment. The call stack captured from the XEvents can be analyzed using SQLCallStackResolver developed by Arvind Shyamsundar.
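A minimal sketch of such a session is shown below. It assumes the sqlos.spinlock_backoff event with the package0.callstack action and a package0.histogram target; the numeric ID of the DP_LIST spinlock varies by build, so look it up first and substitute it in the predicate. The session name DPList_Backoff is just an example.
-- Step 1: find the numeric ID of the DP_LIST spinlock for this build.
SELECT name, map_key, map_value
FROM sys.dm_xe_map_values
WHERE map_value = 'DP_LIST';

-- Step 2: create a session that buckets spinlock_backoff call stacks.
CREATE EVENT SESSION [DPList_Backoff] ON SERVER
ADD EVENT sqlos.spinlock_backoff (
    ACTION (package0.callstack)
    WHERE type = 143  -- placeholder: replace with the map_key returned above
)
ADD TARGET package0.histogram (
    SET filtering_event_name = N'sqlos.spinlock_backoff',
        source               = N'package0.callstack',
        source_type          = 1  -- 1 = bucket on the action value (the call stack)
)
WITH (MAX_MEMORY = 8192 KB, MAX_DISPATCH_LATENCY = 5 SECONDS);

ALTER EVENT SESSION [DPList_Backoff] ON SERVER STATE = START;
-- Reproduce the workload, query the histogram target, then stop and drop the session.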
Interpretation Guidelines
Metric | What it means | When to worry |
---|---|---|
collisions | Number of times threads tried to acquire the spinlock simultaneously | High values are normal under load |
spins | Total spin attempts | High with low collisions = inefficiency |
spins_per_collision | Avg spins per collision | >1000 may indicate contention |
backoffs | Times a thread gave up spinning | High and increasing = bad contention |
Azure SQL
All of this is applicable to Azure SQL: SQL Server on Azure VM, Azure SQL Managed Instance (SQL MI), and Azure SQL Database (SQL DB).
Azure SQL Database does support indirect checkpoints, but with some important distinctions compared to SQL Server and Azure SQL Managed Instance.
- You cannot directly configure target_recovery_time in Azure SQL Database.
- Microsoft manages checkpoint behavior internally to optimize for high availability and fast recovery, especially in geo-replicated and Hyperscale environments.
- You don’t have access to all system-level DMVs, such as sys.dm_os_spinlock_stats, or the ability to run custom Extended Events sessions that require elevated privileges.
- This limits your ability to diagnose spinlock contention or recovery writer behavior in the same way as on SQL Server or Managed Instance.
- If you’re using Azure SQL Database Hyperscale, the checkpointing mechanism is completely different. It uses page servers and a log service to manage durability and recovery, not traditional buffer pool flushing.
- tempdb is shared and abstracted in Azure SQL Database. You don’t manage it directly, so issues like skewed DPLists or spinlock contention in tempdb are not something you can monitor or tune.
Adjusting the Target Recovery Time
Raising or lowering target_recovery_time in SQL Server (or Azure SQL Managed Instance) directly affects how aggressively SQL Server flushes dirty pages to disk, which in turn impacts I/O performance and recovery time after a crash. The default is 60 seconds for most modern SQL Server versions and Azure SQL MI, and that default is appropriate for most workloads.
Best Practices for Adjusting target_recovery_time
1. Generally, leave it alone
The default is 60 seconds and works well for most modern SQL Server versions and Azure SQL MI.
If you have a value of 0, this is likely because the database was migrated from an older version of SQL Server and retained the original value. We generally recommend setting this to 60, then checking for performance improvement, performance degradation, or spinlock contention as described earlier. Larger memory systems (say, 500 GB) benefit more from moving from 0 to 60.
If you have a value other than 0 or 60, then it is not the default value from older or newer versions of SQL Server and was explicitly set. Investigate why it was set to the current value.
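To find user databases still running with the legacy value of 0, a sketch like the following lists them alongside the corresponding ALTER statement. It only generates text; review the output before executing anything.
-- List user databases still using automatic checkpoints (value 0) and
-- generate the statement that would move them to the 60-second default.
SELECT name,
       target_recovery_time_in_seconds,
       'ALTER DATABASE ' + QUOTENAME(name) +
       ' SET TARGET_RECOVERY_TIME = 60 SECONDS;' AS suggested_statement
FROM sys.databases
WHERE target_recovery_time_in_seconds = 0
  AND database_id > 4;  -- skip the system databases in this pass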
2. Consider the Workload
If you must adjust because you have identified one of the symptoms discussed earlier, consider the following suggestions based on your workload type. Always test with a production-like workload before adjusting target_recovery_time in production.
Workload Type | Suggested Setting | Reason |
---|---|---|
OLTP (high transaction rate) | Higher (e.g., 120–300 s) | Reduces I/O pressure and spinlock contention |
Reporting / Read-heavy | Higher | Less frequent writes; recovery time is less critical |
Mission-critical / HA | Lower (e.g., 20–60 s) | Ensures fast recovery and minimal data loss |
Batch / ETL workloads | Higher | Checkpoints can be scheduled around batch windows |
3. Monitor and Adjust Iteratively
Make adjustments in small increments. Use sys.dm_db_log_stats to monitor actual recovery time and checkpoint behavior (a sketch follows the list below).
- Watch for:
  - I/O spikes
  - Spinlock contention (e.g., DP_LIST)
  - Non-yielding schedulers
- Adjust in small increments (e.g., from 60 s to 120 s) and observe the impact.
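A minimal monitoring sketch using sys.dm_db_log_stats; the columns log_since_last_checkpoint_mb and recovery_vlf_count are assumed to be available (the DMF exists on SQL Server 2016 SP2 and later):
-- Show how much log has accumulated since the last checkpoint and how many
-- VLFs recovery would need to process, per database.
SELECT d.name,
       d.target_recovery_time_in_seconds,
       ls.recovery_vlf_count,
       ls.log_since_last_checkpoint_mb
FROM sys.databases AS d
CROSS APPLY sys.dm_db_log_stats(d.database_id) AS ls
ORDER BY ls.log_since_last_checkpoint_mb DESC;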
4. Avoid Overly Aggressive Settings
Setting it too low (e.g., 10 s) can cause:
- Frequent checkpoints
- High I/O load
- CPU overhead from recovery writer activity
5. Use Different Settings per Database
You can set target_recovery_time per database, so tune each database individually based on its role and workload. Again, the default value of 60 seconds works in most cases; only adjust it if you have detected the symptoms described above.
ALTER DATABASE [MyDB] SET TARGET_RECOVERY_TIME = 120 SECONDS;
The scripts in this article are for demonstration purposes only. Please test all scripts in a test environment.
Conclusion
Indirect checkpoints improve how dirty pages are handled, particularly on large-memory machines. Setting TARGET_RECOVERY_TIME to 60 seconds works for most databases. Review your error log for non-yielding scheduler entries and monitor the DP_LIST spinlock backoffs over time.
Additional Reading
- Database Checkpoints (SQL Server) – SQL Server | Microsoft Learn
- Why Enable SQL Server Indirect Checkpoints
- “0 to 60”: Switching to indirect checkpoints – SQLPerformance.com
- sys.dm_os_spinlock_stats (Transact-SQL) – SQL Server | Microsoft Learn
Thank you to Yvonne Vanslageren for contributing to this article. You are greatly missed.