Geospatial Queries - R and MongoDB

In my never ending (and often it feels like I'm barely scratching the surface) quest to know everything there is to know about geo-spatial analysis, I've been delving into the world of geospatial databases (through R #obvs).

Now, if I'm brutally honest I haven't delved very far: I've used MongoDB for a few years, and was already aware of its geospatial capabilities. My hold-up has just been getting on and doing it!

The data

Helpfully the guys at MongoDB have supplied some clean and formatted GeoJSON data (in bold because in the real-world of data analysis, this is rare!), which is a set of restaurants (points) and neighbourhoods (polygons) in New York. So all we have to do is download and insert it into MongoDB.

(taken from the example given in the MongoDB tutorial)

## I've omitted the connection details to my MongoDB instance
library(mongolite)
mNeighbourhood <- mongo( ... )
url <- "https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/neighborhoods.json"
mNeighbourhood$import(curl::curl(url))


mRestaurant <- mongo( ... )
url <- "https://raw.githubusercontent.com/mongodb/docs-assets/geospatial/restaurants.json"
mRestaurant$import(curl::curl(url))

And there we have it

restaurants.png

Which we can also get into R

mRestaurant$find( limit = 1 )
#   location.coordinates location.type                  name
# 1  -73.85608, 40.84845         Point Morris Park Bake Shop

So what's the point?

One of the benefits of geospatial databases are the geospatial query operations.

For example, imagine you're standing on the banks of the Harlem River...

library(googleway)
mapKey <- "your_api_key"

df_me <- data.frame(lat = 40.82302903, lon = -73.93414657, colour = "blue")
google_map(key = mapKey) %>%
  add_markers(data = df_me, colour = "colour")
googleRiver.png

...and you're hungry. You might want to know all the restaurants near by.

Well to help you out, MongoDB has a set of geospatial queries, and the one to find things 'near by' is $geoNear

qry <- '[
  {
    "$geoNear" : { 
      "near" : { "type" : "Point", "coordinates" : [ -73.93414657, 40.82302903 ] },
      "distanceField" : "dist.calculated",
      "maxDistance" : 1000,
      "spherical" : true
    }
  }
] '

df <- mRestaurant$aggregate(pipeline = qry)
df_coords  <- setNames(as.data.frame(do.call(rbind, df$location$coordinates)), c("lon", "lat"))


google_map(key = mapKey) %>%
  add_markers(data = df_me, colour = "colour") %>%
  add_circles(data = df_me, radius = 1000) %>%
  add_markers(data = df_coords)
googleRestaurants.png

Now imagine you want to know which neighbourhoods are bordering the one in which you're standing. The first step is to find out the neighbourd that you're in

## get your neighbourhood
qry <- '
  {
    "geometry" : {
      "$geoIntersects" : {
        "$geometry" : {
          "type" : "Point", 
          "coordinates" : [-73.93414657, 40.82302903]
        }
      }
    }
  }'

myNeighbourhood <- mNeighbourhood$find(query = qry, fields = '{ "geometry.coordinates" : 1, "name" : 1 }' )
myNeighbourhood$name
# "Central Harlem North-Polo Grounds"

Ok, so we're standing in Central Harlem North-Polo Grounds. To find the neighbourhoods bordering us we need find all those that share a border. All this needs is another call to $geoIntersects, but this time on a Polygon, rather than a point

## create a polygon-string to use in the query
lons <- myNeighbourhood$geometry$coordinates[[1]][, , 1]
lats <- myNeighbourhood$geometry$coordinates[[1]][, , 2]
poly <- paste0("[ [", paste0("[", paste0(lons, ", ", lats, collapse = "], ["), "]"), "] ]")

qry <- paste0('
{
    "geometry" : {
      "$geoIntersects" : {
        "$geometry" : {
          "type" : "Polygon", 
          "coordinates" : ', poly, '
        }
      }
    }
  }
')

mNeighbourhood$find(query = qry, fields = '{ "name" : 1, "_id" : 0}')

#                                name
# 1       park-cemetery-etc-Manhattan
# 2 Central Harlem North-Polo Grounds
# 3                 East Harlem North
# 4              Central Harlem South
# 5                    Manhattanville
# 6                  Hamilton Heights