Loading Folders, Subfolders and Files with CAML Queries in SharePoint

The Problem: Managing Large Amounts of Data

SharePoint offers robust options for organizing and managing content, but loading large amounts of data can be cumbersome without the right approach. CAML queries retrieve folders, subfolders and files in pages small enough to avoid throttling limits, and CAML batch commands create, update, or delete items in bulk.

Querying Folders and Subfolders

Overview of folder structure and hierarchy in SharePoint

SharePoint stores content in a hierarchy. At the top is the site, which contains document libraries and lists; these act as containers for folders, files and items. Folders within libraries can contain subfolders to arbitrary depth. CAML queries can retrieve data about folders and subfolders for bulk operations.

Constructing CAML queries to retrieve folder and subfolder data

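CAML (Collaborative Application Markup Language) is the XML schema SharePoint uses for queries and batch operations. There is no dedicated folder query element. To retrieve folders, filter on the FSObjType field, which is 1 for folders and 0 for files. The view scope controls how deep the query reaches: the default scope returns only the files and subfolders of the current folder, while RecursiveAll returns files and folders at every depth. In the Lists web service, the <Folder> element inside <QueryOptions> sets the folder the query starts from. The web service returns XML which must be parsed, while the object models return item collections.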
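To make the shape of such a query concrete, here is a minimal Python sketch that builds view XML selecting folders at every depth. The function name `build_folder_view` and the default row limit are illustrative assumptions; the resulting string is the kind of value a client would hand to a CAML query object (for example, the `ViewXml` property of `CamlQuery` in the client object model).

```python
import xml.etree.ElementTree as ET


def build_folder_view(row_limit=100):
    """Return CAML view XML that selects folders at every depth.

    build_folder_view is a hypothetical helper name, not a SharePoint API.
    """
    # RecursiveAll makes the query cover files and folders in all subfolders.
    view = ET.Element("View", Scope="RecursiveAll")
    query = ET.SubElement(view, "Query")
    where = ET.SubElement(query, "Where")
    eq = ET.SubElement(where, "Eq")
    ET.SubElement(eq, "FieldRef", Name="FSObjType")
    # FSObjType is 1 for folders and 0 for files.
    ET.SubElement(eq, "Value", Type="Integer").text = "1"
    # Cap the page size so a single request stays small.
    ET.SubElement(view, "RowLimit").text = str(row_limit)
    return ET.tostring(view, encoding="unicode")


print(build_folder_view())
```

Building the XML with an element tree rather than string concatenation keeps the markup well-formed as the query grows.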

Filtering, sorting, and managing return sets

CAML includes elements to filter, sort, group, and limit return sets. The <Where> clause returns only the folders that match criteria on fields such as the name (FileLeafRef) or the Created date. The <OrderBy> element sorts by one or more fields, either ascending or descending, and <GroupBy> groups the results. <RowLimit> caps the number of rows per page, the view scope restricts folder depth, and paging walks through the remaining rows.
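As a sketch of how filtering, sorting and row limits combine, the Python below builds a view that returns folders whose name starts with a given prefix, newest first. The helper name `build_filtered_view`, the prefix, and the choice of `Type="Text"` for the name comparison are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_filtered_view(name_prefix, row_limit=200):
    """Return CAML view XML: folders whose name begins with name_prefix.

    build_filtered_view is a hypothetical helper name, not a SharePoint API.
    """
    view = ET.Element("View", Scope="RecursiveAll")
    query = ET.SubElement(view, "Query")
    where = ET.SubElement(query, "Where")
    both = ET.SubElement(where, "And")  # <And> takes exactly two conditions

    # Condition 1: the item is a folder.
    is_folder = ET.SubElement(both, "Eq")
    ET.SubElement(is_folder, "FieldRef", Name="FSObjType")
    ET.SubElement(is_folder, "Value", Type="Integer").text = "1"

    # Condition 2: the name (FileLeafRef) starts with the prefix.
    starts = ET.SubElement(both, "BeginsWith")
    ET.SubElement(starts, "FieldRef", Name="FileLeafRef")
    ET.SubElement(starts, "Value", Type="Text").text = name_prefix

    # Sort by creation date, newest first.
    order = ET.SubElement(query, "OrderBy")
    ET.SubElement(order, "FieldRef", Name="Created", Ascending="FALSE")

    # Paged row limit so the caller can request further pages.
    ET.SubElement(view, "RowLimit", Paged="TRUE").text = str(row_limit)
    return ET.tostring(view, encoding="unicode")


print(build_filtered_view("Reports"))
```

Because the element tree escapes text for us, a prefix containing characters such as `&` or `<` still produces valid XML.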

Loading Files into Folders

Uploading files with CAML batch commands

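Batch commands apply many operations in a single request. The <Batch> element contains one or more <Method> nodes, each with a Cmd of New, Update or Delete, and the <Field> nodes inside each method set column values. In the Lists web service, a batch can create folders and set file metadata, but it does not carry file content: the file bytes are uploaded through a separate API, and the batch then sets properties on the uploaded files. The RootFolder attribute of <Batch> sets the target folder for new items. Batches simplify multi-file loads.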

Referencing folders paths and setting properties

The folder URL forms the basis for referencing its location. In the Lists web service, the list name passed to UpdateListItems identifies the document library or list, and the RootFolder attribute of <Batch> holds the server-relative path of the folder in which new items are created. Use <Field> nodes such as <Field Name="FileLeafRef"> to give the file name and extension, along with other metadata properties.
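The following Python sketch shows one way to generate a batch that creates several subfolders under a parent folder. It assumes the Lists web service's UpdateListItems operation, where the RootFolder attribute of <Batch> names the parent folder and a new folder is described by FSObjType set to 1 plus a BaseName. The helper name `build_folder_batch` and the example path are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_folder_batch(root_folder, folder_names):
    """Return <Batch> XML that creates one folder per name under root_folder.

    build_folder_batch is a hypothetical helper; root_folder is a
    server-relative path supplied by the caller.
    """
    # OnError="Continue" lets later methods run even if one fails.
    batch = ET.Element("Batch", OnError="Continue", RootFolder=root_folder)
    for method_id, name in enumerate(folder_names, start=1):
        method = ET.SubElement(batch, "Method", ID=str(method_id), Cmd="New")
        # FSObjType 1 marks the new item as a folder.
        ET.SubElement(method, "Field", Name="FSObjType").text = "1"
        ET.SubElement(method, "Field", Name="BaseName").text = name
    return ET.tostring(batch, encoding="unicode")


# Example path and names are placeholders, not values from a real site.
print(build_folder_batch("/sites/team/Shared Documents", ["2023", "2024"]))
```

Each method gets its own ID, which is what lets the response be matched back to the folder that was requested.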

Handling errors and troubleshooting file uploads

Failed batch commands produce error codes for diagnostics. Each <Result> in the response carries the ID of the <Method> it answers, which matches responses to operations, along with an <ErrorCode>. Review the error codes and messages to determine causes such as invalid folders, invalid properties, or throttling limits. Try smaller batches to isolate failing files. Use the correlation ID of a failed request to find the matching entries in the ULS logs, and turn on verbose tracing where needed. Handle errors programmatically in code.
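Handling errors programmatically might look like the Python sketch below, which scans a batch response and lists the methods that failed. It assumes the Lists web service response shape, in which each <Result> has an ID of the form "methodId,command" and an <ErrorCode> that is zero on success. The sample response, including its error code and error text, is made up for illustration, and `failed_methods` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SP_NS = "http://schemas.microsoft.com/sharepoint/soap/"

# Illustrative response only: the failing code and text are placeholders.
SAMPLE_RESPONSE = f"""
<Results xmlns="{SP_NS}">
  <Result ID="1,New"><ErrorCode>0x00000000</ErrorCode></Result>
  <Result ID="2,New">
    <ErrorCode>0x80004005</ErrorCode>
    <ErrorText>Sample error text</ErrorText>
  </Result>
</Results>
"""


def failed_methods(results_xml):
    """Return (method_id, command, error_code, error_text) for each failure."""
    root = ET.fromstring(results_xml.strip())
    failures = []
    # iter() also finds <Result> nodes nested inside a larger envelope.
    for result in root.iter(f"{{{SP_NS}}}Result"):
        method_id, _, command = result.get("ID", "").partition(",")
        code = result.findtext(f"{{{SP_NS}}}ErrorCode", default="0x00000000")
        if int(code, 16) != 0:  # any non-zero code is treated as a failure
            text = result.findtext(f"{{{SP_NS}}}ErrorText", default="")
            failures.append((method_id, command, code, text))
    return failures


for failure in failed_methods(SAMPLE_RESPONSE):
    print(failure)
```

The method IDs returned here can then drive a retry of only the failing operations in a smaller batch.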

Updating Metadata and Properties

Modifying file properties in bulk

Batch commands can update metadata and properties for multiple existing files. Target the list and add one <Method ID="1" Cmd="Update"> node per item, giving each method its own ID. Inside each method, include a <Field Name="ID"> node to identify the item, then set each property as a key-value pair in the form <Field Name="PropertyName">Value</Field>.
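A bulk update like this can be generated from a simple mapping of item IDs to new values. The Python sketch below is one way to do it; `build_update_batch` is a hypothetical helper name and the field names in the usage line are examples only.

```python
import xml.etree.ElementTree as ET


def build_update_batch(updates):
    """Return <Batch> XML with one Update method per item.

    updates maps a list item ID to a dict of {field name: new value}.
    build_update_batch is a hypothetical helper, not a SharePoint API.
    """
    batch = ET.Element("Batch", OnError="Continue")
    for method_id, (item_id, fields) in enumerate(updates.items(), start=1):
        method = ET.SubElement(batch, "Method", ID=str(method_id), Cmd="Update")
        # The ID field tells SharePoint which existing item to update.
        ET.SubElement(method, "Field", Name="ID").text = str(item_id)
        for name, value in fields.items():
            ET.SubElement(method, "Field", Name=name).text = str(value)
    return ET.tostring(batch, encoding="unicode")


# Example values are placeholders.
print(build_update_batch({12: {"Title": "Q1 report"}, 15: {"Title": "Q2 report"}}))
```

Keeping the method ID separate from the item ID matters: the first identifies the operation within the batch, the second identifies the list item.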

Setting metadata, content types, and other attributes

In addition to basic properties, <Field> nodes can set SharePoint columns, site metadata, content types and other attributes. Common update scenarios include modifying Title, Keywords or Created By, and applying custom site columns and content types in bulk. Refer to files by URL, or run a query first to find the items to target.

Versioning and maintaining update history

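Batch updating files can create new versions to preserve history. Version behavior comes from the versioning settings of the library rather than from an element in the batch XML, so enable versioning on the library before running the update. Major versions increment the whole number (1.0, 2.0), while minor versions increment the decimal part (1.1, 1.2). Where content approval is enabled, an item must also be approved before its major version is formally published.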

Example CAML Code Snippets

Basic folder query

<Query>
   <Where>
     <Eq>
       <FieldRef Name="FSObjType"/>
       <Value Type="Integer">1</Value>
     </Eq>
   </Where>
</Query>

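Folder creation batch for file uploads

<!-- Creates the target folder. File content is uploaded through a separate API.
     The RootFolder path and folder name are placeholders. -->
<Batch OnError="Continue" RootFolder="/sites/team/Shared Documents">
   <Method ID="1" Cmd="New">
     <Field Name="FSObjType">1</Field>
     <Field Name="BaseName">Reports</Field>
   </Method>
</Batch>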

Set content type on multiple items

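<Batch OnError="Continue">
  <Method ID="1" Cmd="Update">
    <Field Name="ID">1</Field>
    <Field Name="ContentTypeId">0x010042</Field>
  </Method>
  <Method ID="2" Cmd="Update">
    <Field Name="ID">2</Field>
    <Field Name="ContentTypeId">0x010042</Field>
  </Method>
</Batch>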

Optimizing Queries for Large Data Sets

Using batches and continuation tokens

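Large result sets can be split across requests, with a continuation token tracking progress. Each response includes a paging position (ListItemCollectionPosition in the object models, the ListItemCollectionPositionNext attribute in the Lists web service) which is passed back with the next request so the server resumes where the previous page ended. Reading in pages helps avoid timeouts and throttling.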
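The paging loop itself is independent of the transport. The Python sketch below shows the pattern, assuming a caller-supplied `fetch_page` function that takes the previous position (or `None` for the first page) and returns the items plus the next position. Both `iter_all_items` and `make_fake_fetch` are hypothetical names; the fake fetcher stands in for a real SharePoint call so the example runs on its own.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[dict], Optional[str]]


def iter_all_items(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every item, requesting pages until no position is returned."""
    position = None  # None means "start from the first page"
    while True:
        items, position = fetch_page(position)
        yield from items
        if not position:  # no continuation token: this was the last page
            break


def make_fake_fetch(rows: List[dict], page_size: int) -> Callable[[Optional[str]], Page]:
    """Build an in-memory stand-in for a paged SharePoint query."""
    def fetch(position: Optional[str]) -> Page:
        start = int(position) if position else 0
        end = start + page_size
        next_position = str(end) if end < len(rows) else None
        return rows[start:end], next_position
    return fetch


rows = [{"ID": i} for i in range(1, 8)]
for item in iter_all_items(make_fake_fetch(rows, page_size=3)):
    print(item["ID"])
```

In real use the position is an opaque value: pass it back unchanged rather than parsing it as the fake fetcher does.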

Understanding throttling limits

SharePoint enforces thresholds to stay stable under load, and queries that exceed them are throttled. The best known is the list view threshold, which is 5,000 items by default. Factors such as the number of objects, property depth, and filter complexity affect whether a query hits a limit. Realistic batch sizes with simple property scopes help balance throughput.
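Keeping batch sizes realistic usually means splitting the work list into fixed-size chunks before building each batch. A minimal Python sketch follows; `chunked` is a hypothetical helper, and the chunk size of 3 is only for the demonstration, since the right size depends on the server and workload.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each with at most size entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


item_ids = list(range(1, 8))
for batch_ids in chunked(item_ids, 3):
    # Each chunk would become one <Batch> request.
    print(batch_ids)
```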

Tips for handling large loads efficiently

Follow SharePoint bulk import best practices: separate read operations from write operations, choose deliberately between sequential and parallel batches, and retry failed requests after a delay. Tune CAML for the specific server version. Optionally, distribute load across multiple front-end servers for scale.
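A retry mechanism can be as small as the Python sketch below, which retries an operation with exponentially growing delays. `ThrottledError` is a hypothetical exception standing in for whatever the client library raises when a request is throttled, and the attempt count and base delay are illustrative defaults rather than recommended values.

```python
import time


class ThrottledError(Exception):
    """Hypothetical error representing a throttled or rejected request."""


def with_retries(operation, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with exponential backoff.

    The delay doubles after each failure: base_delay, 2x, 4x, and so on.
    The last failure is re-raised so the caller can log or abort.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Demonstration with an operation that fails twice, then succeeds.
state = {"calls": 0}


def flaky_upload():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ThrottledError("try again later")
    return "done"


print(with_retries(flaky_upload, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the function easy to test, since a test can record the delays instead of waiting for them.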
