Using DBSCAN for fun and (mostly) profit in Elite:Dangerous

Elite:Dangerous is the reboot of the classic 1984 space trading, combat and exploration game Elite and it has been out for a few months now. After some initial aimless wandering I, like a lot of people, settled on a career trading rare goods. These goods, as their name suggests, are rare and available in limited quantities and can bring in huge profits if sold sufficiently far away from their system of origin.

In order to maximise profits it is desirable to circulate as much as possible between systems that supply rare goods, buying and selling at each stop. One popular strategy is to follow a circular route, periodically selling and buying. Another, and the one I favour, is to ferry cargo between clusters of systems that supply rare goods - zipping around one cluster buying goods and then travelling to another to sell before repeating the process.

The Elite:Dangerous community has done a good job of exploring the game's trade network and documenting what rare goods are available where, but it can be hard to visualise the systems relative to each other, even using the in-game galaxy map. So I thought it would be interesting to see if I could take that community generated data and group the systems together into clusters programmatically.

This post details how I used an immutable DBSCAN implementation to identify rare goods clusters in Elite:Dangerous.

DBSCAN

I pondered approaches to clustering rare good systems on and off for a few days, coming up with various approaches before realising that the abstract problem of clustering points in 3D space must have already been thoroughly addressed.

Some cursory internet research lead me to the DBSCAN algorithm which is one of the most commonly used clustering algorithms in the scientific literature with the added bonus that it is both compact and elegant. The Wikipedia article also handily included pseudo-code.

The algorithm takes three parameters - a data set, the maximum distance between points in a cluster and the minimum number of points required to form a cluster. It then iterates over all points in the data set and uses neighbouring points within the maximum distance to create clusters.

Modelling rare goods in Elite:Dangerous

Before I could implement any clustering algorithm I would need something to cluster. So I wrote a simple model that represents rare goods in Elite:Dangerous. The item that would eventually be clustered was the rare good Manifest of a aystem:

///Represents rare goods information for a system
type Manifest = {
    System : System;
    Commodities : (Station * Commodity) list;
}

You can see the rest of the model here, but to summarise:

I then took local copies of the systems and rare goods data from Phil Hibbs' excellent Google spreadsheet, added the CSV files to my project and loaded the data into my model. If you are interested in the raw data you can see the systems data here and the rare goods data here, while the data parsing is here.

Immutable DBSCAN implementation

After writing a model and settling on DBSCAN as the clustering algorithm I had a quick look around to see what existing implementations I could find. I tend to avoid mutable state in F# as much as possible so wanted an immutable solution. However, the only implementation I could find was mutable. So, armed with the pseudo-code from Wikipedia I set out to create an immutable version.

I knew the function was going to be recursive, so the first thing I did was create a simple model of the state of the function that I could easily modify between calls:

///Represents the current state of the algorithm
type State = {
    ClusterNo : Int32;
    Clusters : (Int32 * Manifest) list;
    Noise : Manifest list;
    Visited : Int32 list;
}
with
    static member Empty = 
        {
            ClusterNo = 0;
            Clusters = [];
            Noise = [];
            Visited = [];
        }

This allowed me to keep track of the current cluster, what clusters manifests had already been assigned to, what manifests had been marked as noise (potential outliers) and what manifests had already been visited.

I created a separate model for the algorithm settings which would be the input for any code calling the algorithm:

///Represents the settings that can be used for this DBSCAN implementation
type Settings = {
    Manifests : Manifest list;
    MaxDistance : Decimal;
    MinSize : Int32;
}

Various utility functions aside, my implementation closely followed the pseudo-code example. The function which does the majority of the heavy lifting is expandCluster:

//Expand a cluster by recursively moving out from core points
let expandCluster settings =

    //Partially apply some functions with settings where appropriate
    let getLocalGroup' = getLocalGroup settings
    let isCore' = isCore settings

    let union x y = 
        List.append x y
        |> Seq.distinctBy (fun manifest -> manifest.System.Id)
        |> Seq.toList

    let rec expandLocalGroup state = function
        | [] -> state
        | manifest :: manifests when (visited manifest state) -> 

            //Manifest has already been visited; simply add to the current cluster if 
            //it has not already been clustered.

            let state' = 
                if (clustered manifest state) then
                    state
                else
                    addToCluster manifest state

            expandLocalGroup state' manifests

        | manifest :: manifests ->

            //Manifest has not been visited, so add to the visited list. As it has not been visited
            //it also cannot be clustered, so add to the current cluster. Expand the local group and
            //if this manifest represents a core point then add the local group to the list of
            //manifests to consider and recurse.

            let state' =                     
                state
                |> recordVisit manifest
                |> addToCluster manifest                      

            let localGroup = getLocalGroup' manifest //Includes self

            let manifests' = 
                if (isCore' localGroup) then
                    union manifests localGroup
                else
                    manifests

            expandLocalGroup state' manifests'

    fun manifest localGroup state ->

        //Increment the cluster, add the current manifest and expand

        let state' = startNewCluster manifest state
        in expandLocalGroup state' localGroup

This function accepts a Settings input, defines some internal helper functions and returns another function with signature Manifest -> Manifest list -> State -> State where the parameters are the manifest being considered, the list of manifests within the maximum distance (the local group) and the initial state and returning an updated state. This gives the function an effective signature of Settings -> Manifest -> Manifest list -> State -> State.

The function is used to create and expand a cluster, and works by first creating the new cluster by incrementing the ClusterNo of the state and and assigning the current manifest to that cluster, then an internal recursive function, expandLocalGroup, is used to evaluate each manifest in the local group - adding and expanding as appropriate.

You will notice at several points in this algorithm that it is possible for the same manifest to be considered multiple times. In these scenarios the visited and clustered functions are used to decide whether the manifest can be safely ignored.

Once I had an implementation of expandCluster the rest was fairly straight forward:

///Runs the DBSCAN algorithm against a list of manifests using the given settings
let rec run settings =

    //Partially apply some functions with settings where appropriate
    let getLocalGroup' = getLocalGroup settings
    let expandCluster' = expandCluster settings
    let isCore' = isCore settings

    fun state -> function
        | [] -> state
        | manifest :: manifests ->

            let state' = 
                if (visited manifest state) then

                    //Point has already been visited, so it can be ignored 
                    state
                else

                    let localGroup = getLocalGroup' manifest

                    if (isCore' localGroup) then

                        //Point is core, so it can be used to create a new cluster

                        state
                        |> recordVisit manifest
                        |> expandCluster' manifest localGroup

                    else

                        //Point is not core, so mark as noise. 

                        state
                        |> recordVisit manifest
                        |> markAsNoise manifest

            run settings state' manifests

Again, this function accepts a Settings and uses it to partially apply some utility functions before returning a function with signature State -> Manifest list -> State, giving the function the effective signature Settings -> Manifest list -> State -> State.

Code that calls the DBSCAN implementation is only likely to care about the clustered manifests, and possibly the list of manifests marked as noise. So I wrote another function to keep everything tidy. This function is the only publicly visible function in the DBSCAN module.

///Apply the DBSCAN to the given settings, returning the clusters and noise 
let apply settings = 

    let state = run settings State.Empty settings.Manifests

    let clusters = 
        state.Clusters
        |> Seq.groupBy fst
        |> Seq.map (fun (clusterNo, members) ->
                let manifests = 
                    members
                    |> Seq.map snd 
                    |> Seq.sortBy (fun manifest -> manifest.System.Name)
                    |> Seq.toList
                in (clusterNo, manifests)
            )
        |> Seq.sortBy fst
        |> Seq.toList

    (clusters, state.Noise)

The function has the signature Settings -> ((Int32 * Manifest list) list * Manifest list).

Putting it all together

Now that I had the raw data and the algorithm all I needed to do was put it all together into a usable format. For the sake of brevity I chose a simple console app which prompts the user for the maximum distance between systems in a cluster and the minimum number of systems required to make a cluster and then uses the algorithm to generate a list.

I used the centroid of each cluster to calculate a ballbark figure of the distance between them and then displayed which clusters were further than 100 LY away for each.

The list of clusters is then formatted and printed to the screen. The user can optionally save to a file that they can refer to later. You can see the source of the console application here.

If you are interested in seeing the results without building from source then here are some examples:

Max. distance Min. size Clusters Avg. size Report
40 LY 3 7 8.14 Click here
30 LY 3 5 6.2 Click here
30 LY 2 13 3.62 Click here
25 LY 3 4 5 Click here
25 LY 2 10 3.2 Click here
20 LY 2 10 2.8 Click here

Conclusions

In this post I have described how I created an immutable F# implementation of the DBSCAN algorithm, primarily for the purposes of making money in Elite:Dangerous. I enjoyed the combination of programming and gaming as it allowed me to play around with concepts that I would not usually encounter during my every day work and was fun to think about, implement and test.

In the future I may expand the console application to read its data directly from the various community spreadsheets using the Google API, maybe add in some route planning or convert it to a web API but for now I have another 4M credits to earn before I can afford an Asp, so if you'll excuse me...

(You can view the project here on GitHub.)

Comments