How can I do a spatial join with the sf package using st_join()

Question

Here's a toy example I've been wrestling with

# Make points
point1 <- c(.5, .5)
point2 <- c(.6, .6)
point3 <- c(3, 3)
mpt <- st_multipoint(rbind(point1, point2, point3))  # create multipoint

# Make polygons
square1 <- rbind(c(0, 0), c(1, 0), c(1,1), c(0, 1), c(0, 0))
square2 <- rbind(c(0, 0), c(2, 0), c(2,2), c(0, 2), c(0, 0))
square3 <- rbind(c(0, 0), c(-1, 0), c(-1,-1), c(0, -1), c(0, 0))
mpol <- st_multipolygon(list(list(square1), list(square2), list(square2)))  # create multipolygon

# Convert to class' sf'
pts <- st_sf(st_sfc(mpt))
polys <- st_sf(st_sfc(mpol))

# Determine which points fall inside which polygons
st_join(pts, polys, join = st_contains)

The last line produces

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : 
  cannot coerce class "c("sfc_MULTIPOINT", "sfc")" to a data.frame

How can I do a spatial join to determine which points fall inside which polygons?

Could you clarify what you mean by "spatial join" ? What would be the expected result ? — lbusett, May 09 '17 at 06:16
Given a set of polygons and a set of points, create the mapping (PointId, PolygonId) that states which points are contained by which polygons. — Ben, May 09 '17 at 18:13
I recently wrote [this tutorial](https://gormanalysis.com/spatial-data-analysis-in-r/) for the [sf package](https://github.com/r-spatial/sf) to help myself and others understand the basic concepts. Understanding the fundamentals is the key to solving specific problems like the one I had here. — Ben, Aug 02 '17 at 15:51

lbusett · Accepted Answer · 2017-05-11T19:58:28.860

I'm also working my way around the features of the sf package, so apologies if this is not correct or there are better ways. I think one problem here is that if building the geometries like in your example you are not obtaining what you think:

> pts
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
                     st_sfc.mpt.
1 MULTIPOINT(0.5 0.5, 0.6 0.6...

> polys
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOLYGON
dimension:      XY
bbox:           xmin: 0 ymin: 0 xmax: 2 ymax: 2
epsg (SRID):    NA
proj4string:    NA
                    st_sfc.mpol.
1 MULTIPOLYGON(((0 0, 1 0, 1 ...

You can see that you have only one "feature" both in pts and in polys. This means that you are building one "multipolygon" feature (that is, a polygon constituted by 3 parts), instead thatn three different polygons. The same goes for the points.

After digging a bit, I found this different (and in my opinion easier) way to build the geometries, using WKT notation:

polys <- st_as_sfc(c("POLYGON((0 0 , 0 1 , 1 1 , 1 0, 0 0))",
                     "POLYGON((0 0 , 0 2 , 2 2 , 2 0, 0 0 ))", 
                     "POLYGON((0 0 , 0 -1 , -1 -1 , -1 0, 0 0))")) %>% 
  st_sf(ID = paste0("poly", 1:3))    

pts <- st_as_sfc(c("POINT(0.5 0.5)",
                   "POINT(0.6 0.6)",
                   "POINT(3 3)")) %>%
  st_sf(ID = paste0("point", 1:3))

> polys
Simple feature collection with 3 features and 1 field
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: -1 ymin: -1 xmax: 2 ymax: 2
epsg (SRID):    NA
proj4string:    NA
     ID                              .
1 poly1 POLYGON((0 0, 0 1, 1 1, 1 0...
2 poly2 POLYGON((0 0, 0 2, 2 2, 2 0...
3 poly3 POLYGON((0 0, 0 -1, -1 -1, ...

> pts
Simple feature collection with 3 features and 1 field
geometry type:  POINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
      ID              .
1 point1 POINT(0.5 0.5)
2 point2 POINT(0.6 0.6)
3 point3     POINT(3 3)

you can see that now both polys and pts have three features.

We can now find the "intersection matrix" using:

# Determine which points fall inside which polygons
pi <- st_contains(polys,pts, sparse = F) %>% 
  as.data.frame() %>% 
  mutate(polys = polys$ID) %>% 
  select(dim(pi)[2],1:dim(pi)[1])
colnames(pi)[2:dim(pi)[2]] = levels(pts$ID)

> pi
  polys point1 point2 point3
1 poly1   TRUE   TRUE  FALSE
2 poly2   TRUE   TRUE  FALSE
3 poly3  FALSE  FALSE  FALSE

meaning (as pointed out @symbolixau in the comments) that polygons 1 and 2 contain points 1 and 2, while polygon 3 doesn't contain any points. Point 3 is instead not contained in any polygon.

HTH.

Very interesting. Hopefully others with chime in on this. I wish the `sf` documentation had better examples for `st_join` and building simple features from scratch. — Ben, May 05 '17 at 14:13
I too find "building" the features as lists of lists a bit cumbersome. However, it's true that I rarely need to build "geometries" from scratch, and the speed increase with respect to `sp` is really great ! — lbusett, May 05 '17 at 14:32
I **think** the result is saying points 1 and 2 are in both polygons 1 and 2 (they overlap), and point 3 isn't in any polygon, and polygon 3 doesn't contain any points. (I tried the same question using a different approach other than `sf` to help interpret the results) — SymbolixAU, May 09 '17 at 11:11
If you are looking for better documentation on what the spatial functions, you can look here https://postgis.net/docs/reference.html — troh, May 13 '17 at 14:10

score 4 · Answer 2 · answered Jun 02 '17 at 14:56

4

I see a different output:

> # Determine which points fall inside which polygons
> st_join(pts, polys, join = st_contains)
Simple feature collection with 1 feature and 0 fields
geometry type:  MULTIPOINT
dimension:      XY
bbox:           xmin: 0.5 ymin: 0.5 xmax: 3 ymax: 3
epsg (SRID):    NA
proj4string:    NA
                        geometry
1 MULTIPOINT(0.5 0.5, 0.6 0.6...

was this with the most recent CRAN version of sf?

answered Jun 02 '17 at 14:56

Edzer Pebesma

3,474
14
23

Thanks for the reply and awesome package. At the time I wrote this question, it was the most recent version on CRAN. Now I'm on 0.4-3 and I still get an error in the same spot (although a different message). The culprit appears to be this call `pts – Ben Jun 02 '17 at 15:21
1

Thanks - this has been corrected in the development version a while ago. – Edzer Pebesma Jun 02 '17 at 21:36

score -1 · Answer 3 · answered Mar 01 '18 at 18:44

Note, the original set of multipoint and multipolygon can be 'cast' to point and polygon, without creating new objects:

st_contains(polys %>% st_cast("POLYGON"), pts %>% st_cast("POINT"), sparse = F)
#      [,1]  [,2]  [,3]
#[1,]  TRUE  TRUE FALSE
#[2,]  TRUE  TRUE FALSE
#[3,] FALSE FALSE FALSE

How can I do a spatial join with the sf package using st_join()

3 Answers3