googlePolylines - the workhorse of mapping

One of the biggest hurdles I faced when building googleway was plotting sf objects, especially given the constraints I set myself:

  1. it had to be quick (I found leaflet's implementation too slow)
    • to extract the coordinates
    • to transfer the data from R to javascript
  2. there was to be no dependency on library(sf)

I tried several implementations, but could never achieve the speed I wanted.

R -> javascript

Luckily Google developed the Encoded polyline algorithm to reduce a series of coordinates into a string. This gave me a way to transfer compressed data to javascript. All I had to do was extract the coordinates...
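
To give a feel for what the algorithm does, here is a minimal base-R sketch of the encoding step. It is only an illustration under my own assumptions, not the code inside googlePolylines (which does this in C++), and the function names encode_value() and encode_polyline_sketch() are made up for this example. Each coordinate is scaled by 1e5 and rounded, stored as the difference from the previous point, and then written out in 5-bit chunks as printable characters.

```r
## Encode one signed integer delta as a run of printable characters.
## (hypothetical helper, written for illustration only)
encode_value <- function(v) {
  ## arithmetic equivalent of "shift left by one, invert if negative"
  e <- if (v < 0) -2 * v - 1 else 2 * v
  chars <- numeric(0)
  while (e >= 32) {
    ## low 5 bits, plus the continuation flag (32), plus the ASCII offset (63)
    chars <- c(chars, (e %% 32) + 32 + 63)
    e <- e %/% 32
  }
  chars <- c(chars, e + 63)  ## final chunk carries no continuation flag
  rawToChar(as.raw(chars))
}

## Encode vectors of latitudes and longitudes into a single string.
encode_polyline_sketch <- function(lat, lon) {
  lat_i <- round(lat * 1e5)
  lon_i <- round(lon * 1e5)
  ## the first point is stored relative to (0, 0), the rest relative
  ## to the previous point
  d_lat <- diff(c(0, lat_i))
  d_lon <- diff(c(0, lon_i))
  ## interleave as lat1, lon1, lat2, lon2, ...
  deltas <- as.vector(rbind(d_lat, d_lon))
  paste0(vapply(deltas, encode_value, character(1)), collapse = "")
}

encode_polyline_sketch(
  lat = c(38.5, 40.7, 43.252),
  lon = c(-120.2, -120.95, -126.453)
)
# [1] "_p~iF~ps|U_ulLnnqC_mqNvxq`@"
```

Three points collapse into a 27-character string, which is why this format is so much cheaper to send to the browser than a nested array of coordinates.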

sf coordinates

The sf library has a handy function sf::st_coordinates() for extracting coordinates. But I couldn't use it (see constraint 2).

So I wrote my own extraction code, in C++, which also did the encoding in one step. This gave me the speed, the compressed data, and no dependency on sf.

This led to the development of a helper package, googlePolylines.

I've talked about this decision before (from my googlePolylines post):

I could have kept this new functionality inside googleway, but I thought it would be best suited to its own package. That way other mapping libraries could make use of it if they had support for polylines, without taking a dependency on the whole of googleway.

but in this post I want to revisit the decision and talk a bit more about the process of deciding whether to create new packages or slot additional functionality into current packages.

How many packages do you need to build?

Photo by Simson Petrol on Unsplash

Our team doesn't use strict code rules, but there are a few agreed guidelines we try to stick to. One of these is "one function per function", which keeps each function in a package maintainable and testable, and saves our sanity when we need to adjust things 18 months later.

This principle is a good one to apply to packages too - "One purpose per package". Simpler, focussed packages are easier to explain, maintain and ....another clever word that ends in "ain"...

So when deciding whether to split googlePolylines into a separate package, my thinking went along these lines:

Cons

  • One more package to maintain.
  • Niggling worry we are adding to an already bloated open source universe (after all, googlePolylines is only a bunch of methods to encode and decode data into polylines).
  • With such a focussed package theme, would any other package ever need this functionality?

Pros

  • The conceptual "neatness" of keeping one purpose per package.
  • I might be able to convince the boss to buy new hex stickers.

So I packaged up the encoding/decoding capabilities, and used these functions to give the googleway maps the speed they needed. You can check out the implementation within googleway in source code like this in our GitHub repo.
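
For the curious, decoding is just the encoding run in reverse. The base-R sketch below is a rough illustration rather than the package's actual (C++) code, and decode_polyline_sketch() is a name invented for this example. It reads the string in 5-bit chunks, rebuilds each signed delta, and accumulates the deltas back into coordinates.

```r
## Decode an encoded polyline string into a data.frame of lat / lon.
## (hypothetical function, written for illustration only)
decode_polyline_sketch <- function(encoded) {
  codes <- utf8ToInt(encoded) - 63   ## undo the ASCII offset
  values <- numeric(0)
  acc <- 0
  shift <- 0
  for (code in codes) {
    ## the low 5 bits carry data; add them at the current bit position
    acc <- acc + (code %% 32) * 2^shift
    shift <- shift + 5
    if (code < 32) {
      ## no continuation flag: this value is complete
      ## an odd accumulator means the original delta was negative
      v <- if (acc %% 2 == 1) -(acc + 1) / 2 else acc / 2
      values <- c(values, v)
      acc <- 0
      shift <- 0
    }
  }
  ## values arrive interleaved as lat1, lon1, lat2, lon2, ...
  m <- matrix(values, ncol = 2, byrow = TRUE)
  data.frame(
    lat = cumsum(m[, 1]) / 1e5,
    lon = cumsum(m[, 2]) / 1e5
  )
}

decode_polyline_sketch("_p~iF~ps|U_ulLnnqC_mqNvxq`@")
#      lat      lon
# 1 38.500 -120.200
# 2 40.700 -120.950
# 3 43.252 -126.453
```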

How did it work out?

Recently I have found this decision is paying off.

Photo by Ales Me on Unsplash

Back in January I hadn't conceived of writing another mapping library. But in July at useR 2018 someone mentioned Deck.gl, and how cool it would be if it were made into an R library.

So I did, and released mapdeck in August. And having googlePolylines as its own package made plotting sf objects trivial, since it does all the hard work of transforming sf objects into javascript-friendly objects.

And as it's used in both mapdeck and googleway, the benefit is clear when you compare them to other packages like leaflet:

library(microbenchmark)
library(sf)
library(geojsonsf)
library(leaflet)
library(googleway)
library(mapdeck)

sf <- geojsonsf::geojson_sf("https://raw.githubusercontent.com/SymbolixAU/data/master/geojson/SA1_2016_VIC.json")

microbenchmark(

  google = {

    ## you need a Google Map API key to use this function
    google_map(key = mapKey) %>%
      add_polygons(data = sf)
  },

  mapdeck = {
    ## mapdeck expects a Mapbox access token here
    mapdeck(token = mapKey) %>%
      add_polygon(data = sf)
  },

  leaflet = {
    leaflet(sf) %>%
      addTiles() %>%
      addPolygons()
  },
  times = 25
)

# Unit: milliseconds
#     expr       min        lq      mean    median        uq       max neval
#   google  530.4193  578.3035  644.9472  606.3328  726.4577  897.9064    25
#  mapdeck  527.7255  577.2322  628.5800  600.7449  682.2697  792.8950    25
#  leaflet 3247.3318 3445.6265 3554.7433 3521.6720 3654.1177 4109.6708    25

and most importantly

Yes, we got the stickers :-)
