
I have a 2-dimensional array of data, with some missing values. There are three columns:

  • x
  • y
  • intensity

I can plot x against y in ggplot2, with intensity as a colour scale.

[Figure: x plotted against y in ggplot2, coloured by intensity]

I’d like to smooth the transitions between the colours, and have come across the idw function from the gstat package. idw aims to interpolate NAs in two dimensions. It shouldn’t extrapolate, and whilst it technically does respect the limits of the data (±20 in both directions), it also attempts to fill the gaps at the edge of the plot, as seen below:

[Figure: the same plot after idw interpolation, with the gaps at the edge of the plot filled in]

I’d like to avoid any extrapolation occurring outside the limits of the data I have, including the bottom-right of the data shown in the first figure.

How might I achieve this?

Edit: Here is an example dataset. This isn't exactly the same dataset as shown above, but it again contains a large missing region in the lower-right corner.

structure(list(x = c(10L, 15L, -10L, 0L, -5L, -10L, -15L, 0L, 
-15L, 15L, 5L, 10L, -20L, -5L, -15L, -15L, -5L, 5L, 20L, -20L, 
-15L, 20L, -15L, 5L, -5L, -20L, -5L, 15L, 0L, 0L, 15L, 10L, 0L, 
20L, -10L, 5L, 5L, 0L, 20L, 5L, -15L, 5L, -5L, -5L, -15L, -10L, 
-10L, -10L, -5L, -10L, 15L, 20L, 0L, 20L, -15L, 20L, -20L, -15L, 
10L, 15L, 15L, -5L, 5L, 15L, 20L, 20L, -10L, -20L, -20L, 15L, 
-10L, 10L, 5L, -20L, 20L, 10L, 0L, 10L, -10L, 0L, 10L, 10L, 10L, 
-20L, 15L, -20L, 0L, -20L, -5L, 5L), y = c(0L, -10L, 0L, 20L, 
0L, -10L, 0L, 0L, -20L, 20L, 0L, -10L, -10L, -10L, -10L, 20L, 
10L, -10L, -20L, -20L, -10L, -10L, 0L, 10L, -20L, 20L, 0L, 0L, 
0L, -20L, 0L, 0L, 10L, 10L, -20L, -20L, -10L, 20L, 10L, 20L, 
10L, -20L, 20L, -10L, 20L, 20L, 10L, 10L, -20L, -10L, -10L, 20L, 
-10L, -10L, -20L, 0L, -10L, 10L, -10L, 10L, -20L, 10L, 20L, 20L, 
-20L, 20L, 0L, 10L, 10L, -20L, 20L, -20L, 10L, 0L, 0L, 10L, 10L, 
-20L, -20L, -20L, 20L, 20L, 10L, 20L, 10L, -20L, -10L, 0L, 20L, 
0L), intensity = c(12.9662, NA, 24.4379, 26.3923, 26.9449, 16.7372, 
13.7691, 8.029, 11.922, 11.1967, 15.2792, NA, 14.4159, 20.6542, 
22.0509, 17.356, 14.3841, NA, NA, 10.326, 6.0451, NA, 12.9515, 
3.6745, NA, 18.1552, 9.9532, 9.9361, 7.0392, NA, 10.9814, 10.8351, 
4.9017, 5.7864, 14.098, NA, NA, 6.3305, 6.4405, 49.2791, 19.9774, 
NA, 25.1955, 28.5234, 20.2077, 20.3224, 12.688, 22.1371, NA, 
17.5108, NA, 7.9351, NA, NA, 11.0975, 8.2349, 12.1194, 21.865, 
NA, 10.7178, NA, 21.8222, 13.5971, 6.9751, NA, 8.8046, 22.0709, 
14.2043, 27.8561, NA, 17.4329, NA, 7.4057, 15.2797, 1.0122, 11.1874, 
35.5814, NA, 27.5919, NA, 11.8159, 15.8433, 12.297, 29.1978, 
20.4151, 22.6336, NA, 16.0019, 16.9746, 10.8613)), .Names = c("x", 
"y", "intensity"), row.names = c(NA, -90L), class = "data.frame")
CaptainProg
  • You could just resample the original data to the grid size you want, leaving NAs where you have no data. Then, use that resampled data to mask your IDW output. – Forrest R. Stevens May 25 '16 at 12:47
  • I find it curious that you find a point of high intensity in your extrapolated region (around 5, -20). As far as I understand idw, this should not happen. – S van Balen Dec 28 '16 at 12:49
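The masking idea from the first comment can be sketched as follows. This is only an illustration under stated assumptions: the data are a made-up toy lattice (not the dataset above), `idw_predict` and `snap` are hypothetical helpers, and the hand-rolled inverse distance weighting merely stands in for `gstat::idw` so that the example runs with base R alone.

```r
# Toy data (an assumption, not the question's dataset): a lattice with
# 5-unit spacing in x and 10-unit spacing in y, missing its lower-right corner.
set.seed(1)
obs <- expand.grid(x = seq(-20, 20, by = 5), y = seq(-20, 20, by = 10))
obs$intensity <- runif(nrow(obs), 1, 30)
obs$intensity[obs$x >= 10 & obs$y <= -10] <- NA

# Hypothetical stand-in for gstat::idw: plain inverse distance weighting.
idw_predict <- function(known, px, py, power = 2) {
  mapply(function(x0, y0) {
    d <- sqrt((known$x - x0)^2 + (known$y - y0)^2)
    # At an observed location, return the observed value directly.
    if (any(d == 0)) return(mean(known$intensity[d == 0]))
    w <- 1 / d^power
    sum(w * known$intensity) / sum(w)
  }, px, py)
}

# Interpolate onto a finer 1-unit grid using only the non-missing points.
known <- obs[!is.na(obs$intensity), ]
fine <- expand.grid(x = seq(-20, 20, by = 1), y = seq(-20, 20, by = 1))
fine$pred <- idw_predict(known, fine$x, fine$y)

# Resample: snap every fine cell to its nearest original lattice node.
snap <- function(v, step) round(v / step) * step
key_fine <- paste(snap(fine$x, 5), snap(fine$y, 10))
key_known <- paste(known$x, known$y)

# Mask: blank out fine cells whose nearest original node had no data.
fine$pred[!(key_fine %in% key_known)] <- NA
```

Plotting `fine[!is.na(fine$pred), ]` with `geom_raster` then leaves the masked corner empty. Note that this masks every cell whose nearest original node is NA, including isolated gaps in the interior, so it is stricter than pure "no extrapolation".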

1 Answer


If I understand what you are trying to do, you can do this with ggplot2 alone by removing the NA rows before interpolating.

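 library(ggplot2)

 # `data` is the data frame from the question (columns x, y, intensity)
 data <- data.frame(data)
 # Drop rows with missing intensity so that no cell is drawn for them
 data_NA.rm <- data[!is.na(data$intensity), ]

 # interpolate = TRUE smooths the colours between the remaining cells
 ggplot(data = data_NA.rm, aes(x = x, y = y)) +
   geom_raster(aes(fill = intensity), interpolate = TRUE)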

Results in:

[Figure: the resulting raster plot with smoothed colours]
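One detail worth checking before plotting: the example dataset has 90 rows but only 45 distinct (x, y) positions (nine x values, five y values), so some positions are measured more than once and their raster cells would be drawn on top of each other. A minimal sketch of one way to handle this, assuming that averaging the repeats is acceptable, is shown below on a made-up miniature data frame (`dat` and `cell_means` are hypothetical names):

```r
# Hypothetical miniature of the question's data: repeated (x, y) pairs, some NA.
dat <- data.frame(
  x = c(0, 0, 5, 5, 10, 10),
  y = c(0, 0, 0, 0, 0, 0),
  intensity = c(10, 20, NA, 8, NA, NA)
)

# The formula method of aggregate() omits NA rows by default
# (na.action = na.omit), so positions with no data at all are dropped
# and the remaining positions get the mean of their non-missing values.
cell_means <- aggregate(intensity ~ x + y, data = dat, FUN = mean)
```

`cell_means` then has one row per position that has at least one measurement, and can be passed to the `ggplot` call in place of `data_NA.rm`.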

user20650
SeldomSeenSlim