My current dataset is structured as follows:

> dput(head(MovementAnalysis, 3))
structure(list(Quarter = c(1L, 1L, 1L), Name = c("Greg", "Greg", 
"Greg"), Sample = 1:3, Position = c("Back", "Back", "Back"), X = c(26.6627, 
26.6564, 26.6497), Y = c(-12.3782, -12.3711, -12.3635), Time = c(0.01, 
0.02, 0.03), Timemin = c(0.000166666666666667, 0.000333333333333333, 
5e-04), Distance = structure(c(NA, 0.0094921019800667, 0.0101316336293823
), Size = 2L, Diag = FALSE, Upper = FALSE, method = "euclidean", call = dist(x = matrix(c(26.6564, 
    -12.3711, 26.6627, -12.3782), nrow = 2, byrow = TRUE)), class = "dist"), 
    Velocity = c(NA, 0.94921019800667, 1.01316336293823), Acceleration = c(NA, 
    NA, 6.39531649315564), AngularVelocity = c(NA_real_, NA_real_, 
    NA_real_)), .Names = c("Quarter", "Name", "Sample", "Position", 
"X", "Y", "Time", "Timemin", "Distance", "Velocity", "Acceleration", 
"AngularVelocity"), class = c("data.table", "data.frame"), row.names = c(NA, 
-3L))

I use the following code to find occurrences where `Velocity` is over 10.000 for one participant (`Name == "Greg"`):

require(dplyr)
my.data.frame <- filter(MovementAnalysis, Name == "Greg" | Velocity > 10.000)

However, the data frame below is returned, which still contains rows where `Velocity` is well below 10.000:

> dput(head(my.data.frame, 3))
structure(list(Quarter = c(1L, 1L, 1L), Name = c("Greg", "Greg", 
"Greg"), Sample = 1:3, Position = c("Back", "Back", "Back"), X = c(26.6627, 
26.6564, 26.6497), Y = c(-12.3782, -12.3711, -12.3635), Time = c(0.01, 
0.02, 0.03), Timemin = c(0.000166666666666667, 0.000333333333333333, 
5e-04), Distance = structure(c(NA, 0.0094921019800667, 0.0101316336293823
), Size = 2L, Diag = FALSE, Upper = FALSE, method = "euclidean", call = dist(x = matrix(c(26.6564, 
    -12.3711, 26.6627, -12.3782), nrow = 2, byrow = TRUE)), class = "dist"), 
    Velocity = c(NA, 0.94921019800667, 1.01316336293823), Acceleration = c(NA, 
    NA, 6.39531649315564), AngularVelocity = c(NA_real_, NA_real_, 
    NA_real_)), .Names = c("Quarter", "Name", "Sample", "Position", 
"X", "Y", "Time", "Timemin", "Distance", "Velocity", "Acceleration", 
"AngularVelocity"), class = c("data.table", "data.frame"), row.names = c(NA, 
-3L))

This seems like a silly mistake on my part, but I have been unable to solve it. Where am I going wrong?

user2716568
    Your code finds rows where `Name == "Greg"` _or_ `Velocity > 10.000`. If you want them both to be true, use `&` or just `,` instead of `|`. – alistaire Feb 18 '16 at 05:16
  • Why do you need to filter by `Name` here? Please provide a more realistic example that reflects your real use case. – Pascal Feb 18 '16 at 05:16
  • I filter by `Name` because my data frame comprises more than 50 participants and 3,068,944 observations of 12 variables. I wish to identify `Velocity` outliers for each participant; a grouped sketch also follows below these comments. My example is realistic and is a real case! – user2716568 Feb 18 '16 at 05:19
  • Thank you @alistaire, I was using the last `dplyr` answer from this question: http://stackoverflow.com/questions/4935479/how-to-combine-multiple-conditions-to-subset-a-data-frame-using-or – user2716568 Feb 18 '16 at 05:20
  • No, it is not. Use at least 2 different names. – Pascal Feb 18 '16 at 05:21
  • @Pascal I only wanted to print the first 3 rows as it is a large data frame. I tried to make my example as simple as possible to understand and to answer my specific (real!) question. I don't know why you are requiring me to use at least 2 different names, when I want to identify outliers for one participant. – user2716568 Feb 18 '16 at 05:24
  • Because in your example, filtering by `Name` doesn't make sense. – Pascal Feb 18 '16 at 05:25
  • @Pascal But it does make sense for what I want to search for, using my real dataset. I was also unsure if there was something wrong with my code when trying to filter by both conditions, which the first person to comment kindly pointed out. I am sorry that my question didn't meet your expectations and will seek to improve on this when asking questions in the future. I simply wanted to see where I was going wrong when running the code through my dataset. – user2716568 Feb 18 '16 at 05:33
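
Building on alistaire's comment, a minimal sketch of the corrected filter, assuming the goal is rows where both conditions are true; the result name `greg.fast` is only illustrative:

library(dplyr)

# Both conditions must hold, so combine them with & (or pass them as
# separate arguments, which filter() joins with AND) instead of |.
# filter() also drops rows where the condition evaluates to NA, so samples
# with a missing Velocity are excluded automatically.
greg.fast <- filter(MovementAnalysis, Name == "Greg" & Velocity > 10.000)

# Equivalent form using a comma:
greg.fast <- filter(MovementAnalysis, Name == "Greg", Velocity > 10.000)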
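
For the broader goal described in the comments, flagging `Velocity` outliers for each of the 50+ participants rather than only Greg, a grouped filter is one option. This is a sketch under assumptions: the outlier rule (three standard deviations above each participant's mean) and the object name `fast.samples` are not from the question:

library(dplyr)

# Flag unusually fast samples separately for each participant.
# The "mean + 3 * sd" cut-off is an illustrative assumption.
fast.samples <- MovementAnalysis %>%
  group_by(Name) %>%
  filter(Velocity > mean(Velocity, na.rm = TRUE) + 3 * sd(Velocity, na.rm = TRUE)) %>%
  ungroup()

# With a fixed cut-off such as 10.000, no grouping is needed:
fast.samples <- filter(MovementAnalysis, Velocity > 10.000)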

0 Answers