Optimizing Batch Deletes for Large SharePoint Lists

The Problem of Slow Batch Deletes

Large SharePoint lists with over 100,000 items often perform poorly when running batch delete operations. There are several reasons why deleting a large number of items from SharePoint becomes dramatically slower as the list grows.

SharePoint stores list items in a SQL Server content database, where all lists share a single table (AllUserData). As that table grows to hundreds of thousands of rows, the queries behind batch deletes become increasingly expensive to execute: the more items a delete query matches, the longer it takes to find and remove those records from the database.

In addition, SharePoint enforces internal thresholds and limits to protect performance at scale, and these can severely throttle batch operations. The default List View Threshold blocks queries that touch more than 5,000 items. Column indexes improve read performance, but batch writes still degrade once a list grows past these thresholds.
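
As a quick check, the configured threshold can be read from the web application in PowerShell; the URL below is a placeholder:

  # Read the List View Threshold for a web application (URL is a placeholder)
  $webApp = Get-SPWebApplication http://sharepoint
  $webApp.MaxItemsPerThrottledOperation   # 5000 by default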

Furthermore, enabling versioning on large lists compounds the performance impact of batch deletes. Every deleted item may carry multiple version rows that must be removed along with it, multiplying the database writes, not to mention the storage bloat the versions add in the meantime.

Best Practices for Structuring Lists

When architecting SharePoint solutions, it is a best practice to avoid massive lists in favor of smaller, more optimized structures designed for bulk operations. Try limiting lists to fewer than 5,000 items whenever possible to stay under the default throttling threshold. If batches of more than 5,000 items need to be deleted regularly, the list should be redesigned.

Versioning should be avoided on any list requiring frequent batch deletes, since each item's version history must be removed along with it, multiplying the delete work. Versioning is great for audit trails but terrible for batch operations, and should only be enabled where necessary.
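
If a delete-heavy list does not need an audit trail, versioning can be switched off through the object model before running the batches; a minimal sketch, with the site URL and list name as placeholders:

  # Disable versioning on a delete-heavy list (URL and list name are placeholders)
  $web = Get-SPWeb http://sharepoint/site
  $list = $web.Lists["Large List"]
  $list.EnableVersioning = $false
  $list.Update()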

For use cases requiring hundreds of thousands of list items, it is better to break them down into multiple smaller lists that represent logical groupings based on metadata or content types. Separating items by type, department, or other attributes allows batch operations to scale across lists more efficiently.

Optimizing Batch Delete Code

To optimize code that performs batch deletes, use server-side PowerShell whenever possible rather than the client-side object models. PowerShell scripts running against the server object model can send CAML queries that filter the result set down to just the items to delete; query performance degrades substantially when no filtering is applied.

An example PowerShell script for deleting list items in batches of 500 at a time would look like:

  $web = Get-SPWeb http://sharepoint/site
  $list = $web.Lists["Large List"]
  $batchSize = 500

  do {
    # Row-limited CAML query fetches only the next page of items
    $query = New-Object Microsoft.SharePoint.SPQuery
    $query.RowLimit = $batchSize
    $query.ViewFields = "<FieldRef Name='ID' />"

    $listItems = $list.GetItems($query)
    $count = $listItems.Count

    # Delete from the end of the collection so the indexes remain valid
    for ($i = $count - 1; $i -ge 0; $i--) {
      $listItems[$i].Delete()
    }
  } while ($count -eq $batchSize)

  $web.Dispose()

This script fetches items in pages of 500 using a row-limited query, deletes each page, and loops until a page comes back smaller than the batch size. Appropriately filtering items and deleting in paginated batches avoids timeouts and throttling limits.
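
Where per-item Delete() calls are still too slow, the server object model's SPWeb.ProcessBatchData method can submit an entire page of deletes in a single call. The following is a minimal sketch of that pattern, assuming the same placeholder site URL and list name as above:

  $web = Get-SPWeb http://sharepoint/site
  $list = $web.Lists["Large List"]

  # Fetch one page of item IDs to delete
  $query = New-Object Microsoft.SharePoint.SPQuery
  $query.RowLimit = 500
  $query.ViewFields = "<FieldRef Name='ID' />"
  $listItems = $list.GetItems($query)

  # Build one Batch XML document with a Delete command per item
  $sb = New-Object System.Text.StringBuilder
  [void]$sb.Append('<?xml version="1.0" encoding="UTF-8"?><Batch OnError="Continue">')
  foreach ($item in $listItems) {
    [void]$sb.Append("<Method ID=`"$($item.ID)`"><SetList>$($list.ID)</SetList>")
    [void]$sb.Append("<SetVar Name=`"ID`">$($item.ID)</SetVar><SetVar Name=`"Cmd`">Delete</SetVar></Method>")
  }
  [void]$sb.Append('</Batch>')

  # One round trip to the server deletes the whole page
  $web.ProcessBatchData($sb.ToString()) | Out-Null
  $web.Dispose()

Wrapped in the same do/while loop as above, this typically deletes large lists considerably faster than item-by-item calls, since all the deletes in a page travel in one request.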

Overcoming List Throttling Limits

SharePoint implements throttling limits on list operations to ensure stability at scale, but these limits often halt large batch operations mid-process. When a delete query touches more than the 5,000-item threshold, SharePoint blocks the query, disrupting scripts mid-run.

Solutions include adding column indexes to boost query performance, raising or disabling list throttling through PowerShell, or moving the data to an alternative non-SharePoint store. The server object model also exposes a query throttle override (SPQuery.QueryThrottleMode) that lets suitably privileged accounts bypass the limit during batched operations.
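
As a sketch of the first two options, the snippet below adds a column index and then exempts the list from throttling; both assume farm-administrator rights, and the field and list names are placeholders:

  # Index a frequently filtered column (field and list names are placeholders)
  $list = (Get-SPWeb http://sharepoint/site).Lists["Large List"]
  $list.FieldIndexes.Add($list.Fields["Department"])
  $list.Update()

  # Exempt this one list from the List View Threshold (farm admin only)
  $list.EnableThrottling = $false
  $list.Update()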

For lists approaching millions of records, though, SharePoint may simply not be equipped to handle frequent large-scale batching efficiently. At that inflection point, migrate the storage to SQL Server tables or a NoSQL database, where records can be deleted in bulk at scale.

Monitoring Batch Delete Performance

To properly monitor and optimize batch delete processes, track key metrics like the number of items deleted per batch, overall runtime, and storage consumption. Log these metrics over time to identify trends.
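
A lightweight way to capture these numbers is to wrap each batch in Measure-Command and append the result to a log file; in this sketch, Delete-Batch.ps1 and the log path are hypothetical stand-ins for the actual batch routine:

  # Time one delete batch and log batch size plus runtime (paths are placeholders)
  $elapsed = Measure-Command { .\Delete-Batch.ps1 }
  "$(Get-Date -Format s),500,$($elapsed.TotalSeconds)" |
    Add-Content C:\Logs\batch-delete-metrics.csv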

Analyze the SharePoint ULS logs while running deletes to check for errors related to exceeded list view thresholds and recycled application pool processes. If certain delete operations consistently fail after a certain size or time limit, throttling limits are likely kicking in.
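
Rather than reading the ULS files by hand, Get-SPLogEvent can filter recent entries from PowerShell; a sketch that scans the last half hour for threshold-related messages:

  # Scan recent ULS entries for list view threshold errors
  Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-30) |
    Where-Object { $_.Message -like "*threshold*" } |
    Select-Object Timestamp, Category, Message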

Tuning batch sizes, filtering queries, and paging through the deletes differently can help avoid throttling, based on the captured metrics. The sweet spot for batch delete performance varies greatly with list size, the number of items deleted, and other environmental factors.
