Modeling historical dates

As part of a side project that I continually fail to actually start, I need the ability to sort events by date. Sounds simple, right? And so it would be if all of the dates were CE (or AD if you prefer), or if .NET's DateTime type supported BCE dates. Unfortunately these dates could be BCE or CE, and System.DateTime's minimum value is 1st Jan 1 CE. Further complicating the matter is the fact that these dates come from historical sources which can be extremely vague (e.g. "before January 17 BCE").

In this post I will look at modeling historical dates using F#. I'll look at creating a model that captures the vagueness of historical dates and a method for sorting them which is further complicated by the surprisingly chequered history of leap years.

Setting the scene

To put the issues with historical dates into some context, consider the following list of dates taken from this list of Roman battles:

Date                 Description
June 58 BCE          Battle of Arar
9th August 48 BCE    Battle of Pharsalus
43 CE                Battle of the Medway

As you can see, the dates range from complete to just the year. In addition you will frequently see events described as occurring "some time before July 77 CE" or "some time between March and May 1872 CE".

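There are three challenges here:

1. Dates can fall in either era, BCE or CE.
2. Dates can be partial, e.g. just a month and year, or just a year.
3. An event's relation to a date can be vague, e.g. before, after or between dates.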

There are some .NET libraries out there that look to address the first point (e.g. NodaTime), but on balance it would seem that a custom date model is in order.

Modeling dates

Our first task is to create a model which captures both the partial nature of some dates - e.g. Jun 865 CE - and the vagueness of an event's relation to a date - e.g. before Jun 865 CE. We can do this by splitting these concepts between two model types.

Firstly, let's tackle partial dates by creating a FuzzyDate model. I am broadly using the single case union approach to models in F#.

///Discriminated union for the two eras
type Era = 
    | BCE
    | CE

///Contains functions and types for working with "fuzzy" historical dates
[<RequireQualifiedAccess>]
module FuzzyDate = 

    ///Describes a basic fuzzy date
    type Data = {
        Day : int option;
        Month : int option;
        Year : int;
        Era : Era;
    }
    with
        static member Empty = 
            {
                Day = None;
                Month = None;
                Year = 1;
                Era = Era.CE;
            }

    ///Instance union type
    type Instance = FuzzyDate of Data

All dates using this model must have an era and year, but the month and day are optional. We can define some functions to act as constructors for the patterns we want to support. I have created an Outcome<T> type to represent success/failure conditions and will use it here to return either a FuzzyDate.Instance or an error message.

///Validate a given date's components
let private _validate era year month day = 

    //Option.defaultValue is the FSharp.Core function for supplying a fallback value
    let month' = Option.defaultValue 1 month
    let day' = Option.defaultValue 1 day

    if (year < 1) then 
        (Some "The year cannot be less than 1.")
    else if (month' < 1) || (month' > 12) then 
        (Some "The month must be between 1 and 12.")
    else if (day' < 1) || (day' > (Month.getDayCount era year month')) then 
        (Some "The day is not valid for the given month and year.")
    else None

///Validate a date's components and then create a success result if valid
let private _create era year month day = 
    match (_validate era year month day) with
    | Some message -> Error message
    | _ -> 
        let date = 
            FuzzyDate (
                { Data.Empty 
                    with 
                        Day = day; 
                        Month = month; 
                        Year = year; 
                        Era = era; 
                }
            )
        in (Success date)

///Create a day, or fully specified date - e.g. 31st Jan 42 BCE
let createDay era year month day = _create era year (Some month) (Some day)

///Create a month - e.g. Jan 42 BCE
let createMonth era year month = _create era year (Some month) None

///Create a year - e.g. 42 BCE
let createYear era year = _create era year None None

Now we can create full or partial dates in the three basic formats:

//Jun 20 1832 CE
let complete = FuzzyDate.createDay CE 1832 6 20

//May 7 BCE
let quiteVague = FuzzyDate.createMonth BCE 7 5

//766 BCE
let reallyVague = FuzzyDate.createYear BCE 766

Now that we have support for partial dates we can start thinking about how to model on/before/after/between type dates.

In this context between means that an event occurred at some point between two dates, but depending on your use case it could also be taken to mean that an event started and finished on the dates given.

To model on/before/after/between event dates I created an EventDate model which builds upon FuzzyDate.

///Contains functions and types for historical event dates
[<RequireQualifiedAccess>]
module EventDate =

    ///Union describing various types of event date
    type Data = 
        | Specific of FuzzyDate.Instance
        | Before of FuzzyDate.Instance
        | After of FuzzyDate.Instance
        | Between of (FuzzyDate.Instance * FuzzyDate.Instance)

    ///Instance union type
    type Instance = EventDate of Data

Using this model we can represent events that occurred on a specific date, before a date, after a date or at some point between two dates. Again we can add some constructor functions.

///Creates a single specific date - e.g. 18th Apr 1472 CE
let createSpecific date = EventDate (Specific date)

///Creates a "some time before" date - e.g. < 13 BCE
let createBefore date = EventDate (Before date)

///Creates a "some time after" date - e.g. > 13 BCE
let createAfter date = EventDate (After date)

///Creates a date range - e.g. 13BCE - 14th Jun 34 CE
let createBetween first last = 

    let firstValue = FuzzyDate.getSortValue first
    let secondValue = FuzzyDate.getSortValue last

    if (firstValue > secondValue) then
        Error "The second date cannot come before the first."
    else        
        Success (EventDate (Between (first, last)))

We can now model a variety of vague historical dates.

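//Note: the FuzzyDate constructors return an Outcome rather than a bare FuzzyDate.Instance,
//so real code must unwrap each result first; that step is omitted here for brevity.

//On 25th Dec 800 CE
let exact = EventDate.createSpecific (FuzzyDate.createDay CE 800 12 25)

//Between Jan and 3rd Feb 1701 CE
let quiteVague = 
    EventDate.createBetween 
    <| (FuzzyDate.createMonth CE 1701 1) 
    <| (FuzzyDate.createDay CE 1701 2 3)

//Before 700 BCE
let reallyVague = EventDate.createBefore (FuzzyDate.createYear BCE 700)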
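
Because the FuzzyDate constructors return an Outcome, their results need unwrapping before they reach the EventDate constructors. The Outcome type is not shown in this post, so what follows is only a minimal sketch of how that plumbing might look. The definition of Outcome, the mapOutcome and bindOutcome helpers, and the simplified createYear and createBefore stand-ins are all assumptions made for illustration, not the project's actual code.

```fsharp
//Hypothetical stand-in for the Outcome type; the real definition is not shown in this post
type Outcome<'T> =
    | Success of 'T
    | Error of string

//Apply a function to the value inside a Success, passing any error through untouched
let mapOutcome f outcome =
    match outcome with
    | Success value -> Success (f value)
    | Error message -> Error message

//Chain a function that itself returns an Outcome (e.g. a validating constructor)
let bindOutcome f outcome =
    match outcome with
    | Success value -> f value
    | Error message -> Error message

//Simplified stand-ins for FuzzyDate.createYear and EventDate.createBefore
let createYear year =
    if year < 1 then Error "The year cannot be less than 1."
    else Success year

let createBefore year = sprintf "Before %d" year

//The "event date" is only built if the "fuzzy date" was valid
let valid = createYear 700 |> mapOutcome createBefore
let invalid = createYear 0 |> mapOutcome createBefore
```

Here valid holds a Success and invalid carries the validation message forward unchanged. bindOutcome would play the same role for a validating constructor such as createBetween, which itself returns an Outcome.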

I also created a module for parsing both FuzzyDate and EventDate from strings - you can see the source code here on GitHub.

let date = Parser.readEventDate "19 Oct 1691 CE" //Outcome<EventDate>

Sorting dates

As I mentioned in the introduction my hypothetical app would need to be able to sort a series of events into chronological order. The vague nature of the dates and the mix of BCE and CE dates complicate sorting somewhat.

In order to support sorting of FuzzyDate I added the getSortValue function, which returns an integer that can be used when sorting - e.g. using Seq.sortBy. The value returned is the difference, in days, between the date provided and 1st Jan 1 BCE (the beginning of astronomical year zero). When dealing with partial dates a default value of 1 is simply used for the missing part(s) of the date, so both Jan 1 BCE and 1 BCE become 1st Jan 1 BCE. The table below contains examples of the result of FuzzyDate.getSortValue.

Date              Sort value
1st Jan 2 BCE     -365
31st Dec 2 BCE    -1
1st Jan 1 BCE     0
2nd Jan 1 BCE     1
1st Jan 1 CE      365

The method for calculating the number of days since 1st Jan 1 BCE for a date differs slightly depending on whether the date is BCE or CE. For CE dates we can simply sum the number of days since 1st Jan 1 CE and then add the number of days in 1 BCE.

In the code below data is of type FuzzyDate.Data.

//The sum of all days to date in the CE era plus the number of days in 1 BCE
let dayCountCe = 
    [ 1 .. data.Year ]
    |> List.sumBy (fun year ->
            if (year = data.Year) then
                (Year.getDayNumber CE year month day) - 1
            else
                Year.getDayCount CE year
        )
in (dayCountCe + (Year.getDayCount BCE 1))

For BCE dates things are slightly different because the dates run "backwards" - i.e. 3 BCE comes before 2 BCE. For dates in 1 BCE itself we simply count the days as per the CE method. For any other year, Y, we sum the number of days in the years between 2 and (Y - 1), if appropriate, add the number of days remaining in the year Y and then add 1 (because there are 0 days remaining in the year on 31st Dec, but we want the number of days to 1st Jan of the next year).

//Remember that BCE dates are backwards - e.g. 31st Dec 3BCE moves to 1st Jan 2BCE. 
[ 2 .. data.Year ]
|> List.sumBy (fun year ->
        if(year = data.Year) then
            (Year.getDaysRemaining BCE year month day) + 1
        else
            Year.getDayCount BCE year
    )

The result is then negated for dates prior to 1st Jan 1 BCE.
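
The two snippets above rely on Year helpers that are not shown in this post. As a rough, self-contained sketch of how the pieces might fit together, the code below supplies hypothetical versions of getDayCount, getDayNumber and getDaysRemaining, and takes the leap year rule as a parameter (isLeap) so that any rule can be plugged in. None of this is the project's actual code; only the two summations follow the snippets above.

```fsharp
type Era = BCE | CE

//Month lengths for a non-leap year
let monthLengths = [ 31; 28; 31; 30; 31; 30; 31; 31; 30; 31; 30; 31 ]

//Hypothetical helpers standing in for the Year module used above
let getDayCount isLeap era year =
    if isLeap era year then 366 else 365

//Day of the year, where 1st Jan is day 1
let getDayNumber isLeap era year month day =
    let leapDay = if isLeap era year && month > 2 then 1 else 0
    (monthLengths |> List.take (month - 1) |> List.sum) + leapDay + day

let getDaysRemaining isLeap era year month day =
    (getDayCount isLeap era year) - (getDayNumber isLeap era year month day)

//Days between the given date and 1st Jan 1 BCE (negative for earlier dates)
let getSortValue isLeap era year month day =
    match era, year with
    | CE, _ ->
        let dayCountCe =
            [ 1 .. year ]
            |> List.sumBy (fun y ->
                if y = year then (getDayNumber isLeap CE y month day) - 1
                else getDayCount isLeap CE y)
        dayCountCe + (getDayCount isLeap BCE 1)
    | BCE, 1 -> (getDayNumber isLeap BCE 1 month day) - 1
    | BCE, _ ->
        let daysBefore =
            [ 2 .. year ]
            |> List.sumBy (fun y ->
                if y = year then (getDaysRemaining isLeap BCE y month day) + 1
                else getDayCount isLeap BCE y)
        -daysBefore

//Stub rule for illustration only: treats every year as having 365 days
let noLeapYears (_ : Era) (_ : int) = false

let firstDayOfCe = getSortValue noLeapYears CE 1 1 1
let lastDayOf2Bce = getSortValue noLeapYears BCE 2 12 31
```

With the stub rule, firstDayOfCe is 365 and lastDayOf2Bce is -1, which agrees with the table of sort values. Passing the rule in as a function keeps the day counting separate from the question of which years are leap years.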

This value allows us to chronologically sort a series of FuzzyDate instances. To support chronological sorting of EventDate instances I added a corresponding getSortValue function which simply uses FuzzyDate.getSortValue to get a baseline value, converts it to a decimal and then tweaks it slightly so that the before, after and between cases are placed appropriately - e.g. "before 18 BCE" should appear in the sorted list before "18 BCE". The table below contains examples of the result of EventDate.getSortValue.

Date                                       Sort value
Before 2nd Jan 1 BCE                       0.9
On 2nd Jan 1 BCE                           1.0
After 2nd Jan 1 BCE                        1.1
Between 2nd Jan 1 BCE and 2nd Feb 1 BCE    1.1

In the code below data is of type EventDate.Data.

let getSortValue = _apply (fun data ->
    let date, adjustment = 
        match data with
        | Specific date -> (date, 0.0M)                                 //Absolute date
        | Before date -> (date, -0.1M)                                  //Just before events on date
        | Between (date, _) | After date -> (date, 0.1M)                //Just after events on date
    in (decimal (FuzzyDate.getSortValue date)) + adjustment
)
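
To see the adjustment in action, here is a small sketch that sorts a handful of events with List.sortBy. Relation is a hypothetical, simplified stand-in for EventDate.Data that carries the FuzzyDate sort value (an int) directly instead of a FuzzyDate.Instance; the baseline values are taken from the tables above.

```fsharp
//Simplified stand-in for EventDate.Data: each case holds a FuzzyDate sort value
type Relation =
    | Specific of int
    | Before of int
    | After of int
    | Between of int * int

//Same adjustment scheme as the snippet above, applied to the stand-in type
let getSortValue relation =
    let baseline, adjustment =
        match relation with
        | Specific days -> (days, 0.0M)
        | Before days -> (days, -0.1M)
        | Between (days, _) | After days -> (days, 0.1M)
    (decimal baseline) + adjustment

//Deliberately out of order
let events =
    [ ("After 2nd Jan 1 BCE", After 1)
      ("1st Jan 1 CE", Specific 365)
      ("Before 2nd Jan 1 BCE", Before 1)
      ("On 2nd Jan 1 BCE", Specific 1) ]

let sorted =
    events
    |> List.sortBy (fun (_, relation) -> getSortValue relation)
    |> List.map fst
```

The sorted list runs "Before 2nd Jan 1 BCE", "On 2nd Jan 1 BCE", "After 2nd Jan 1 BCE", "1st Jan 1 CE". Note that an "after" event and a "between" event starting on the same date share a sort value, so this scheme alone does not order them relative to each other.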

We now have a way of sorting event dates chronologically, but there's a moderately sized elephant in the room: leap years. Several functions included in the snippets above require us to be able to calculate leap years. So how are we doing that?

Leap years

Leap years exist because the astronomical year (i.e. how long it takes the Earth to do one orbit of the Sun) is not completely synchronised with the calendar year.

Leap years were first introduced in the Julian calendar in 46 BCE by Julius Caesar. An extra day was to be added to the year every 4 years to realign the calendar year with the astronomical year, but unfortunately the priests charged with maintaining the calendar erroneously added a day to the year every 3 years. To mitigate this error there were no leap years between 8 BCE and 4 CE.

According to the latest works this means that the leap years between the introduction of the Julian calendar in 46 BCE and 4 CE are 44, 41, 38, 35, 32, 29, 26, 23, 20, 17, 14, 11 and 8 BCE. The next leap year is then 4 CE and every 4 years after that until the introduction of the Gregorian calendar in 1582 CE.

The calculation of leap years in the Gregorian calendar is slightly more complicated than in the Julian calendar as it had been noticed that the astronomical year was not quite exactly 365.25 days long and so the Julian method was actually over-compensating slightly.

In the Gregorian calendar a year is a leap year if it is divisible by 4, unless it is also divisible by 100, in which case it must be divisible by 400 as well. So, for example, 1800 CE is not a leap year despite being divisible by 4, because it is divisible by 100 but not by 400. On the other hand, 2000 CE is a leap year because it is divisible by 4, 100 and 400.

The two getSortValue functions we have already looked at are based on the proleptic Gregorian calendar which simply extends our current calendar back into the past prior to its introduction. When calculating the leap year we also use the astronomical year which allows us to easily calculate leap years for BCE dates (where 1 BCE is year 0).

When researching leap years most sources seemed to suggest that your choice of leap year calculation would depend on what period you were working with and whether seasonal dates were important. As the events we are sorting could be from any time period I have decided on a hybrid approach which I have no doubt has all manner of issues for serious historians and horologists, but seems good enough for my purposes. Essentially my approach was to consider a year a leap year if the people alive at the time would have done so. This means no leap years prior to 46 BCE and "fudged" dates between 46 BCE and 4 CE.

To calculate leap years we first need to convert our era-based year into an astronomical year. This allows us to easily calculate leap years using the modulo operation.

let getAstronomicalYear year = function
    | CE -> year
    | BCE -> (year - 1) * -1

Next we need a way of deciding which leap year system to use. I implement this as an active pattern, one of F#'s handiest features.

///Contains the starting years for various leap year calculations
[<RequireQualifiedAccess>]
module StartYears = 
    let [<Literal>] Triennial = -45
    let [<Literal>] Julian = 4
    let [<Literal>] Gregorian = 1582

///Active pattern used to decide on the leap year calculation to be used
let (|None|Triennial|Julian|Gregorian|) astronomicalYear = 
    if (astronomicalYear >= StartYears.Gregorian) then
        Gregorian
    else if (astronomicalYear >= StartYears.Julian) then
        Julian
    else if (astronomicalYear >= StartYears.Triennial) then
        Triennial
    else
        None

Now that we know which method we want to use we need some functions to actually determine whether a given astronomical year is a leap year.

///True if a year is an erroneous triennial leap year in the period between 46 BCE and 4 CE
let isTriennialLeapYear = 

    let triennialLeapYears = [ -43 .. 3 .. -7 ] //Every 3 years between 44 BCE and 8 BCE

    fun astronomicalYear ->
        List.exists ((=) astronomicalYear) triennialLeapYears

///True if a year is a calculated leap year according to the Julian method - i.e. divisible by 4
let isJulianLeapYear astronomicalYear = ((astronomicalYear % 4) = 0)

///True if a year is a calculated leap year according to the Gregorian method - i.e. divisible by 4, and by 400 if also divisible by 100
let isGregorianLeapYear astronomicalYear = 
    match ((astronomicalYear % 4), (astronomicalYear % 100)) with
    | (0, 0) -> (astronomicalYear % 400) = 0
    | (0, _) -> true                
    | _ -> false   

Finally, we can combine everything to create a single function that determines whether a year is a leap year.

///True if a year is a leap year
let isLeapYear era year = 

    let astronomicalYear = getAstronomicalYear year era

    match astronomicalYear with
    | None -> false
    | Triennial -> isTriennialLeapYear astronomicalYear
    | Julian -> isJulianLeapYear astronomicalYear
    | Gregorian -> isGregorianLeapYear astronomicalYear  

We can now use this function when calculating things like the day of the year of any given date, the number of days in a month or the days in a year. As I said I'm sure that there are a multitude of issues with this approach that I am not sufficiently qualified to comment on, but as a naive compromise I think it works rather well.
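
As a quick sanity check on the hybrid rules, the sketch below restates them in one compact, self-contained function and lists what it returns for a few sample years. It is a condensed restatement for illustration, using if/elif in place of the active pattern and inlining the start years, rather than the code from the project itself.

```fsharp
type Era = BCE | CE

//Condensed restatement of the rules above (start years inlined for brevity)
let isLeapYear era year =
    //Convert to an astronomical year, where 1 BCE is year 0
    let astronomicalYear =
        match era with
        | CE -> year
        | BCE -> -(year - 1)
    if astronomicalYear >= 1582 then
        //Gregorian: divisible by 4, and by 400 if also divisible by 100
        (astronomicalYear % 4 = 0) && ((astronomicalYear % 100 <> 0) || (astronomicalYear % 400 = 0))
    elif astronomicalYear >= 4 then
        //Julian: every 4th year
        astronomicalYear % 4 = 0
    elif astronomicalYear >= -45 then
        //Triennial: every 3rd year from 44 BCE to 8 BCE
        List.contains astronomicalYear [ -43 .. 3 .. -7 ]
    else
        //No leap years before the Julian calendar
        false

let samples =
    [ (BCE, 47); (BCE, 44); (BCE, 8); (BCE, 5); (BCE, 1)
      (CE, 4); (CE, 1500); (CE, 1700); (CE, 2000) ]
    |> List.map (fun (era, year) -> (era, year, isLeapYear era year))
```

Under these rules 44 BCE, 8 BCE, 4 CE, 1500 CE and 2000 CE come out as leap years, while 47 BCE, 5 BCE, 1 BCE and 1700 CE do not. 1500 CE is a leap year here because it falls before the switch to the Gregorian rule in 1582 CE.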

Putting it all together

By way of an example I took the dates of Roman battles in the first century BCE and the first century CE from this list and wrote a small console application which randomises them and then sorts them into chronological order before writing a simple HTML page listing them. I also added a couple of "before" and "after" events for variety.

The source code for the application can be found here on GitHub.

Conclusion

In this post I have shown how partial and vague historical dates can be modeled in F#. I have also outlined an approach to leap year calculations which, while probably questionable from a purely academic standpoint, works quite well as a simple compromise to what is a potentially extremely tricky problem.

In terms of possible enhancements, being able to choose the method of leap year calculation might be useful depending on the events being sorted (e.g. if they are all from a particular period of history then one method may be preferable to another). Similarly, being able to choose the calendar used to represent the dates might be useful - if dates were stored as the number of days since 1st Jan 1 BCE it should be fairly straightforward to convert to a date in a specific calendar when required.

The source for the project is available on GitHub.
