At Hedgehog Development, we have worked closely with Sitecore serialization since it was released with Sitecore 6.0 in 2008. Our particular interest has been in helping developers bring their Sitecore items into Visual Studio so that they can treat those items as code. We do this with our Team Development for Sitecore product.
I was having a discussion the other day and was asked to elaborate on the serialization format that Sitecore uses. I assumed I would have some notes, or that there would be official documentation, on the format, but I couldn't find any! So this is as good a place as any to describe the serialization format that Sitecore uses.
Serialized files are UTF-8 encoded, and the first line of a serialized item must be “----item----”.
Immediately following the item declaration are the properties of the item that are the same regardless of version or language. The properties are:
Version: always 1
Id: The Sitecore ID of the item
Database: The database this item came from
Path: The path where the item should be
Parent: The ID of its parent
Name: The name of the item
Master: The branch template used for creation
Template: The ID of the template that this item is based on
TemplateKey: The name of the template
- Version is always 1
- The Path property isn’t used for reverting.
- The Parent ID property is used for placing the item in the tree.
- The Name property is used for naming the item.
- The TemplateKey isn’t used for reverting.
- The Template ID property specifies the template to be used.
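To make the header layout concrete, here is a minimal Python sketch that reads the item-level properties into a dictionary. It assumes each property sits on its own line as "key: value" with lowercase keys, and that a blank line or the next "----" marker ends the header. The function name parse_item_header and the placeholder IDs are hypothetical, used only for illustration.

```python
def parse_item_header(text):
    """Return the item-level properties of a serialized item as a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "----item----":
        raise ValueError("a serialized item must start with ----item----")
    props = {}
    for line in lines[1:]:
        # Assumption: a blank line or the next ---- marker ends the header.
        if not line.strip() or line.startswith("----"):
            break
        key, _, value = line.partition(":")
        props[key.strip().lower()] = value.strip()
    return props


# The IDs below are hypothetical placeholders, not real Sitecore IDs.
sample = (
    "----item----\n"
    "version: 1\n"
    "id: {ITEM-ID}\n"
    "database: master\n"
    "path: /sitecore/content/Home/Item Name\n"
    "parent: {PARENT-ID}\n"
    "name: Item Name\n"
    "master: {BRANCH-ID}\n"
    "template: {TEMPLATE-ID}\n"
    "templatekey: Sample Template\n"
    "\n"
    "----field----\n"
)

header = parse_item_header(sample)
# Per the notes above, these are the values that drive a revert.
print(header["parent"], header["name"], header["template"])
```

Since Path and TemplateKey are not used for reverting, a tool built on this sketch could treat them as informational only.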
Immediately following the item properties, but before the first “----version----” line, come any ‘shared’ fields that may exist. Shared fields are special in that the value of a shared field is shared amongst all languages and versions of an item.
The field serialization format is rather simple. All text values are written to the serialization file exactly as they are stored in Sitecore. In the case of binary fields, the data is Base64 encoded. The field definition begins with a line containing “----field----”, followed by these properties:
Field: The ID of the field
Name: The name of the field
Key: The lowercase name of the field
Content Length: The number of Unicode characters in the field value
Immediately following the content length property we have an empty line (\n\n). The subsequent line(s) contain the content of the specified length. After the field value we must have another empty line (\n\n). There may then be any number of empty lines before the next section.
- The Name and Key properties aren’t used with update/revert.
- The Field property (the field's ID) is used to identify the field for setting the value.
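Because the value is located by its length rather than by a terminator, a reader has to count characters instead of scanning for the next marker. The Python sketch below shows one way to do that. It assumes the text has already been decoded from UTF-8, that lines end with "\n", and that the length property is spelled "content-length" in the file; the function name read_field and the placeholder ID are hypothetical.

```python
def read_field(text, pos=0):
    """Read one field block starting at pos; return (field dict, next position)."""
    marker = "----field----\n"
    if not text.startswith(marker, pos):
        raise ValueError("expected ----field---- at position %d" % pos)
    pos += len(marker)
    props = {}
    while True:
        end = text.index("\n", pos)
        line = text[pos:end]
        pos = end + 1
        if line == "":
            break  # the empty line that separates the properties from the value
        key, _, value = line.partition(":")
        props[key.strip().lower()] = value.strip()
    # Assumption: the length property is spelled "content-length".
    length = int(props["content-length"])
    # Take exactly `length` characters, even if the value spans several lines.
    props["value"] = text[pos:pos + length]
    pos += length
    # Skip the empty line after the value and any further empty lines.
    while pos < len(text) and text[pos] == "\n":
        pos += 1
    return props, pos


sample = (
    "----field----\n"
    "field: {FIELD-ID}\n"  # hypothetical placeholder ID
    "name: Title\n"
    "key: title\n"
    "content-length: 11\n"
    "\n"
    "Hello\nWorld\n"
    "\n"
    "----version----\n"
)

field, next_pos = read_field(sample)
print(repr(field["value"]))
```

Note that Python's len() and slicing count code points, so this sketch may disagree with the stored length for characters outside the Basic Multilingual Plane if the writer counted UTF-16 code units.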
Once we have defined the shared fields, we come to the definition of a specific language and version. This is signified by a “----version----” line. The version definition simply defines the following properties:
Language: The language of the version
Version: The version number
Revision: The ID of the version
Following the version definition are any field definitions specific to that version of the item. Field definitions are repeated until a new version definition, or the end of the file, appears.
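Putting the sections together, the overall ordering (item header, shared fields, then each version followed by its own fields) can be sketched as a small writer in Python. This is an illustration under assumptions, not Sitecore's own serializer: the lowercase keys, the "content-length" spelling, the blank line after each property block, and all IDs are assumed or hypothetical, as are the helper names write_field and write_item.

```python
def write_field(field_id, name, value):
    """Format one field block; the length is counted in characters."""
    return (
        "----field----\n"
        f"field: {field_id}\n"
        f"name: {name}\n"
        f"key: {name.lower()}\n"
        f"content-length: {len(value)}\n"  # assumed spelling of the key
        "\n"
        f"{value}\n"
        "\n"
    )


def write_item(header, shared_fields, versions):
    """Assemble header, shared fields, then each version and its fields."""
    out = ["----item----\n"]
    for key, value in header.items():
        out.append(f"{key}: {value}\n")
    out.append("\n")  # assumption: a blank line separates the sections
    for field in shared_fields:
        out.append(write_field(*field))
    for version in versions:
        out.append("----version----\n")
        out.append(f"language: {version['language']}\n")
        out.append(f"version: {version['version']}\n")
        out.append(f"revision: {version['revision']}\n")
        out.append("\n")
        for field in version["fields"]:
            out.append(write_field(*field))
    return "".join(out)


# All IDs are hypothetical placeholders.
text = write_item(
    {"version": 1, "id": "{ITEM-ID}", "name": "Item Name"},
    [("{SHARED-FIELD-ID}", "Icon", "icon.png")],
    [{
        "language": "en",
        "version": 1,
        "revision": "{REVISION-ID}",
        "fields": [("{FIELD-ID}", "Title", "Hello")],
    }],
)
data = text.encode("utf-8")  # serialized files are UTF-8 encoded
print(text)
```

The header here is abbreviated to three properties to keep the example short; a real file would carry the full set listed earlier.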
path: /sitecore/content/Home/Item Name
name: Item Name
templatekey: Sample Template
value of shared field
value of unversioned shared field
name: Field 1
key: field 1
name: Field 1
key: field 1