Serialising F#'s Option type using DataContractSerializer

Data contracts are a quick and relatively easy way of serialising objects to XML (or JSON, but we’ll stick to XML in this post) – e.g. for use as DTOs.

Various attributes can be used to control aspects of the generated XML, meaning you can produce clean, simple XML quite easily. Using these attributes it is possible for the DataContractSerializer to serialise and deserialise F# record types, which can be really handy when writing web services. However, a few wrinkles arise when using F#’s Option type.

For example, take this simple model of a person:

open System
open System.Runtime.Serialization

[<DataContract (Name = "person", Namespace = "")>]
type Person = {
    [<field: DataMember (Name = "name")>] Name : String;
}

The default XML produced will look like this:

<person xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
</person>

Notice that the DataMember attribute is applied using the field modifier. This tells .NET to apply the attribute to the property's backing field rather than to the property itself, because the serialiser requires a setter and the record property has none.

So far, so good. However, problems start to appear when we want to represent optional parts of our model. For example, take this model:

[<DataContract (Name = "department", Namespace = "")>]
type Department = {
    [<field: DataMember (Name = "name")>] Name : String;
}

[<DataContract (Name = "team", Namespace = "")>]
type Team = {
    [<field: DataMember (Name = "name")>] Name : String;
    [<field: DataMember (Name = "dept")>] Department : Department option;
}

[<DataContract (Name = "developer", Namespace = "")>]
type Developer = {
    [<field: DataMember (Name = "name")>] Name : String;
    [<field: DataMember (Name = "team")>] Team : Team option;
}

Here a developer can optionally be associated with a team, which in turn can optionally be associated with a department. The problem is that the DataContractSerializer does not know how to serialise Option, so you end up with ugly XML containing platform-specific information. Here’s an example:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team xmlns:a="http://schemas.datacontract.org/2004/07/Microsoft.FSharp.Core">
    <a:value>
      <dept>
        <a:value>
          <name>Product Development</name>
        </a:value>
      </dept>
      <name>Red Team</name>
    </a:value>
  </team>
</developer>

If we wanted to use this XML for a web service or similar we would ideally want Some X to be serialised as X and None to be omitted or empty, e.g.:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team>
    <dept>
      <name>Product Development</name>
    </dept>
    <name>Red Team</name>
  </team>
</developer>

Luckily there are a couple of tricks at our disposal.

Using IDataContractSurrogate

The first thing we can do is use a data contract surrogate to tell the serialiser that Some X should be serialised as X and None should be treated as null. The interface defines several methods but we only need to implement three of them:

open System
open System.Reflection
open System.Runtime.Serialization

///getOptionalType : Type -> Type option
///isOptionalType : Type -> bool

type OptionSurrogate () =
    interface IDataContractSurrogate with

        ///Tell the serialiser to treat Option<'T> as 'T
        member this.GetDataContractType (currentType : Type) =
            match (getOptionalType currentType) with
            | Some baseType -> baseType
            | _ -> currentType

        ///When deserialising, convert null to None and 'T to Some<'T>
        member this.GetDeserializedObject (current, currentType) =
            if (isOptionalType currentType) then
                if (current <> null) then
                    Activator.CreateInstance (currentType, [| current |])
                else
                    let noneProperty = currentType.GetProperty ("None")
                    noneProperty.GetValue (null)
            else
                current

        ///When serialising, convert Some<'T> to 'T and None to null
        member this.GetObjectToSerialize (current, _) =
            if (current <> null) then
                let currentType = current.GetType ()

                if (isOptionalType currentType) then
                    let isSomeProperty = currentType.GetProperty "IsSome"
                    let isSome = isSomeProperty.GetValue (null, [| current |]) :?> bool

                    if isSome then
                        let valueProperty = currentType.GetProperty "Value"
                        valueProperty.GetValue (current)
                    else
                        null
                else
                    current
            else
                current

        ///The remaining members are not needed here, so they return defaults
        member this.GetCustomDataToExport (_ : MemberInfo, _ : Type) = box null
        member this.GetCustomDataToExport (_ : Type, _ : Type) = box null
        member this.GetKnownCustomDataTypes _ = ()
        member this.ProcessImportedType (decl, _) = decl
        member this.GetReferencedTypeOnImport (_, _, _) = null
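
The surrogate calls two helpers, getOptionalType and isOptionalType, which are not defined in the listing. The following is only a sketch of how they might look, assuming they do nothing more than check whether a type is a constructed Option<'T>; the real definitions may differ.

```fsharp
open System

/// Returns Some 'T when the given type is Option<'T>, otherwise None.
/// (Assumed behaviour, inferred from how the surrogate uses the result.)
let getOptionalType (t : Type) : Type option =
    if t.IsGenericType && t.GetGenericTypeDefinition () = typedefof<option<_>> then
        Some (t.GetGenericArguments ().[0])
    else
        None

/// True when the given type is Option<'T> for some 'T.
let isOptionalType (t : Type) : bool =
    Option.isSome (getOptionalType t)
```

With these definitions, getOptionalType typeof<Team option> yields Some typeof<Team>, which is the type the surrogate hands back to the serialiser.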

Now when you create your DataContractSerializer you need to give it an instance of the surrogate:

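let surrogate = OptionSurrogate ()

let serialiser = DataContractSerializer (
    typeof<Developer>,  // root type
    Seq.empty,          // known types
    Int32.MaxValue,     // maxItemsInObjectGraph
    true,               // ignoreExtensionDataObject
    false,              // preserveObjectReferences
    surrogate           // dataContractSurrogate
)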
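
To see the effect of the surrogate it helps to turn values into XML text and back again. The helpers below are a minimal sketch; the names toXmlString and fromXmlString are hypothetical and not part of the original code. They work with any XmlObjectSerializer, including the DataContractSerializer created above.

```fsharp
open System.IO
open System.Runtime.Serialization
open System.Text

/// Hypothetical helper: serialise a value and return the XML as a string.
let toXmlString (serialiser : XmlObjectSerializer) (value : obj) : string =
    use stream = new MemoryStream ()
    serialiser.WriteObject (stream, value)
    // The serialiser writes UTF-8 to the stream
    Encoding.UTF8.GetString (stream.ToArray ())

/// Hypothetical helper: deserialise a value of type 'T from an XML string.
let fromXmlString<'T> (serialiser : XmlObjectSerializer) (xml : string) : 'T =
    use stream = new MemoryStream (Encoding.UTF8.GetBytes (xml))
    serialiser.ReadObject (stream) :?> 'T
```

For example, toXmlString serialiser (box developer) would produce the XML for a Developer value named developer, and fromXmlString<Developer> serialiser xml would read it back.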

When we serialise our model, optional types will be slightly cleaner. For example:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team xmlns:a="http://schemas.datacontract.org/2004/07/Microsoft.FSharp.Core">
    <dept>
      <name>Product Development</name>
    </dept>
    <name>Red Team</name>
  </team>
</developer>

Or when Team is None:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team i:nil="true" xmlns:a="http://schemas.datacontract.org/2004/07/Microsoft.FSharp.Core"/>
</developer>

Setting the EmitDefaultValue property to false on the DataMember attribute for the Team property will cause the element to be omitted entirely as None is considered to be the default value of Option.
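
As a sketch of what that looks like, the Developer type could be declared as follows. Team is reduced to a single field here to keep the example short.

```fsharp
open System
open System.Runtime.Serialization

[<DataContract (Name = "team", Namespace = "")>]
type Team = {
    [<field: DataMember (Name = "name")>] Name : String
}

[<DataContract (Name = "developer", Namespace = "")>]
type Developer = {
    [<field: DataMember (Name = "name")>] Name : String
    // EmitDefaultValue = false: leave out <team> when the field holds None
    [<field: DataMember (Name = "team", EmitDefaultValue = false)>] Team : Team option
}
```

With this declaration a developer without a team should serialise with only the name element, rather than an empty team element marked i:nil="true".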

Although the XML structure is cleaner the serialiser has now started to decorate elements with attributes relating to the Option type, despite it no longer being serialised.

Cleaning up

Luckily, removing these unnecessary attributes is fairly easy. In fact, the only attributes that are of any use to us are the ones that reside in the XML Schema instance namespace (prefixed “i” in the examples), so we can take the stream written to by the data contract serialiser and tweak the generated XML so that all attributes outside that namespace are removed.

For example, a simple function which cleans up a stream of XML is shown below:

///castMany<'T> : IEnumerable -> 'T list
///getXml : Stream -> XmlDocument

open System.IO
open System.Xml

let [<Literal>] XmlnsNamespace = "http://www.w3.org/2000/xmlns/"
let [<Literal>] SchemaNamespace = "http://www.w3.org/2001/XMLSchema-instance"

let cleanUp (input : Stream) =

    // Recursively remove every attribute that is not in the schema instance namespace
    let rec removeAttrs (node : XmlNode) =

        if (node.Attributes <> null && node.Attributes.Count > 0) then
            node.Attributes
            |> castMany<XmlAttribute>
            |> List.filter (fun attr -> not (attr.NamespaceURI = SchemaNamespace))
            |> List.iter (fun attr -> node.Attributes.Remove (attr) |> ignore)

        if node.HasChildNodes then
            node.ChildNodes
            |> castMany<XmlNode>
            |> List.iter removeAttrs

    // Re-declare the "i" prefix on the root element, as removeAttrs strips it
    let appendXmlns (xml : XmlDocument) =

        let attr = xml.CreateAttribute ("xmlns", "i", XmlnsNamespace)
        attr.Value <- SchemaNamespace

        xml.DocumentElement.Attributes.Append (attr)
        |> ignore

    let xml = getXml input

    removeAttrs xml.DocumentElement
    appendXmlns xml

    let settings = XmlWriterSettings ()
    settings.OmitXmlDeclaration <- true
    settings.NamespaceHandling <- NamespaceHandling.OmitDuplicates

    let output = new MemoryStream ()
    use writer = XmlWriter.Create (output, settings)

    xml.WriteTo (writer)

    // The returned stream is positioned at its end; rewind it before reading
    output
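
Only the signatures of castMany and getXml are given in the comments at the top of the listing. The following sketch shows one way they might be written to match those signatures. Rewinding the stream before loading is an assumption, made because the stream will usually just have been written to by the serialiser.

```fsharp
open System.Collections
open System.IO
open System.Xml

/// Copies a non-generic collection into an F# list of 'T.
let castMany<'T> (items : IEnumerable) : 'T list =
    items |> Seq.cast<'T> |> List.ofSeq

/// Loads the contents of a stream into an XmlDocument.
let getXml (input : Stream) : XmlDocument =
    // Assumption: read seekable streams from the start
    if input.CanSeek then input.Position <- 0L
    let xml = XmlDocument ()
    xml.Load (input)
    xml
```

Copying into a list matters in cleanUp: attributes are removed from node.Attributes while the copied list is being iterated, which would not be safe when enumerating the live collection directly.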

When the previous examples are passed through this function we end up with:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team>
    <dept>
      <name>Product Development</name>
    </dept>
    <name>Red Team</name>
  </team>
</developer>

And:

<developer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <name>John Smith</name>
  <team i:nil="true" />
</developer>

Much better!

Handily, the additional attributes added by the serialiser are not needed to deserialise the XML, so both of the above examples will be deserialised correctly.

Caveat

An unfortunate caveat is that the surrogate’s GetDeserializedObject, which converts T to Some T during deserialisation, only appears to be called if T is a data contract. That means you cannot use this approach for optional properties whose type is not a data contract, e.g. String option.

Example code

An example application which demonstrates default serialisation, serialisation using the surrogate, and serialisation using the surrogate with clean-up can be found here on GitHub.
