googlePolylines - the workhorse of mapping
One of the biggest hurdles I faced when building googleway was plotting sf objects, especially given the constraints I set myself:
- it had to be quick (I found leaflet's implementation too slow)
  - to extract the coordinates
  - to transfer the data from R to javascript
- there was to be no dependency on library(sf)
I tried several implementations, but could never achieve the speed I wanted.
R -> javascript
Luckily Google developed the Encoded polyline algorithm to reduce a series of coordinates into a string. This gave me a way to transfer compressed data to javascript. All I had to do was extract the coordinates...
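To give a feel for what the algorithm does, here is a minimal pure-R sketch of the encoding step. It is not the C++ code inside googlePolylines, and the function names encode_value() and encode_polyline() are made up for illustration. It follows the published steps: scale each coordinate by 1e5, take the difference from the previous point, fold the sign into the lowest bit, then write the number out in 5-bit chunks as printable characters.

```r
# Encode one signed integer (a coordinate delta already scaled by 1e5).
encode_value <- function(v) {
  # Fold the sign into the lowest bit: equivalent to (v << 1), inverted when negative
  v <- if (v < 0) -2 * v - 1 else 2 * v
  out <- numeric(0)
  # Emit 5-bit chunks, lowest first; 32 (0x20) flags "more chunks follow"
  while (v >= 32) {
    out <- c(out, (v %% 32) + 32 + 63)
    v <- v %/% 32
  }
  out <- c(out, v + 63)
  rawToChar(as.raw(out))
}

# Encode vectors of latitudes and longitudes into a single string.
encode_polyline <- function(lat, lon) {
  stopifnot(length(lat) == length(lon))
  # Round first, then difference, so rounding errors do not accumulate
  d_lat <- diff(c(0, round(lat * 1e5)))
  d_lon <- diff(c(0, round(lon * 1e5)))
  # Interleave as lat1, lon1, lat2, lon2, ...
  deltas <- as.vector(rbind(d_lat, d_lon))
  paste(vapply(deltas, encode_value, character(1)), collapse = "")
}

# The worked example from Google's description of the algorithm
encode_polyline(c(38.5, 40.7, 43.252), c(-120.2, -120.95, -126.453))
# "_p~iF~ps|U_ulLnnqC_mqNvxq`@"
```

Three coordinate pairs collapse into a 27-character string, which is far cheaper to hand to javascript than a nested array of numbers.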
sf coordinates
The sf library has a handy function, sf::st_coordinates(), for extracting coordinates. But I couldn't use it (see my second constraint: no dependency on sf).
So I wrote my own extraction code, in C++, which also did the encoding in one step. This gave me the speed, the compressed data, and no dependency on sf.
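The reason this is possible without loading sf is that an sf geometry is an ordinary R list or matrix with some class attributes attached. The sketch below shows the idea in base R. It is an illustration only, not the C++ implementation, and collect_coords() is a hypothetical helper name. The polygon is built by hand with structure() so the example runs without sf installed.

```r
# A stand-in for one sf POLYGON: a list of coordinate matrices (one per ring)
ring <- matrix(c(144.9, -37.8,
                 145.0, -37.8,
                 145.0, -37.7,
                 144.9, -37.8), ncol = 2, byrow = TRUE)
polygon <- structure(list(ring), class = c("XY", "POLYGON", "sfg"))

# A MULTIPOLYGON nests one level deeper: a list of polygons
multi <- structure(list(list(ring), list(ring)),
                   class = c("XY", "MULTIPOLYGON", "sfg"))

# Hypothetical helper: walk a geometry and return its coordinate matrices
collect_coords <- function(geom) {
  if (is.matrix(geom)) {
    return(list(unclass(geom)))            # a ring or a line
  }
  if (is.list(geom)) {
    return(do.call(c, lapply(unclass(geom), collect_coords)))
  }
  list(matrix(unclass(geom), nrow = 1))    # a bare numeric vector is a point
}

length(collect_coords(polygon))  # 1 ring
length(collect_coords(multi))    # 2 rings
```

Each matrix that comes back can then be encoded on its own, which is roughly the extract-then-encode flow described above.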
This led to the development of a helper package, googlePolylines.
I've talked about this decision before (from my googlePolylines post):
"I could have kept this new functionality inside googleway, but I thought it would be best suited to its own package. That way other mapping libraries could make use of it if they had support for polylines, without taking a dependency on the whole of googleway."
But in this post I want to revisit the decision and talk a bit more about the process of deciding whether to create a new package or slot additional functionality into an existing one.
How many packages do you need to build?
Our team doesn't use strict code rules, but there are a few agreed guidelines we try to stick to. One of these is "one function per function": it keeps each function in a package maintainable and testable, and saves our sanity when we need to adjust things 18 months later.
This principle is a good one to apply to packages too - "One purpose per package". Simpler, focussed packages are easier to explain, maintain and ....another clever word that ends in "ain"...
So when deciding to split googlePolylines into a separate package, my thinking went along these lines:
Cons
- One more package to maintain.
- Niggling worry we are adding to an already bloated open source universe (after all, googlePolylines is only a bunch of methods to encode and decode data into polylines).
- With such a focussed package theme, would any other package ever need this functionality?
Pros
- The conceptual "neatness" of keeping one purpose per package.
- I might be able to convince the boss to buy new hex stickers.
So I packaged up the encoding/decoding capabilities, and used these functions to give the googleway maps the speed they needed. You can check out the implementation within googleway in source code like this in our GitHub repo.
How did it work out?
Recently I have found this decision is paying off.
Back in January I hadn't conceived of writing another mapping library. But in July, at useR 2018, someone mentioned Deck.gl and how cool it would be if it were made into an R library.
So I did, and released mapdeck in August. And having googlePolylines as its own package made plotting sf objects trivial, since it does all the hard work transforming sf objects into javascript-friendly objects.
And because it's used in both mapdeck and googleway, the benefit is clear when you compare them to other packages like leaflet.
library(microbenchmark)
library(sf)
library(geojsonsf)
library(leaflet)
library(googleway)
library(mapdeck)
## mapKey must already hold your API key / access token
sf <- geojsonsf::geojson_sf("https://raw.githubusercontent.com/SymbolixAU/data/master/geojson/SA1_2016_VIC.json")

microbenchmark(
  google = {
    ## you need a Google Map API key to use this function
    google_map(key = mapKey) %>%
      add_polygons(data = sf)
  },
  mapdeck = {
    ## mapdeck expects a Mapbox access token here
    mapdeck(token = mapKey) %>%
      add_polygon(data = sf)
  },
  leaflet = {
    leaflet(sf) %>%
      addTiles() %>%
      addPolygons()
  },
  times = 25
)
# Unit: milliseconds
#    expr       min        lq      mean    median        uq       max neval
#  google  530.4193  578.3035  644.9472  606.3328  726.4577  897.9064    25
# mapdeck  527.7255  577.2322  628.5800  600.7449  682.2697  792.8950    25
# leaflet 3247.3318 3445.6265 3554.7433 3521.6720 3654.1177 4109.6708    25
And most importantly...
Yes, we got the stickers :-)