0

Got my hand dirty on GPS trajectory dataset. This data set consists of sequence of the GPS points of trips for users, until the length of trip:

SELECT * FROM gps_track;
+---------+------------------+------------------+
| user_id |       lat        |       lon        |
+---------+------------------+------------------+
| 1       | 39.984702        | 116.318417       |
| 1       | 39.984683        | 116.31845        |
| 1       | 39.984611        | 116.318026       |
| .       | .                | .                |
| 2       | 26.162202        | 119.943787       |
| 2       | 26.161528        | 119.943234       |
| 2       | 26.1619          | 119.943228       |
| .       | .                | .                |
| 3       | 22.8143366666667 | 108.332281666667 |
| 3       | 22.81429         | 108.332256666667 |
| 3       | 22.81432         | 108.332258333333 |
| .       | .                | .                |
| 4       | 32.9239666666667 | 117.386683333333 |
| 4       | 32.9235166666667 | 117.386616666667 |
| 4       | 32.9232833333333 | 117.386683333333 |
| .       | .                | .                |
+---------+------------------+------------------+

I can get the COUNT of GPS points for each user_id 1, 2,3,.. etc.

SELECT distinct user_id
    , COUNT(lat) AS lat_count
FROM gps_track
GROUP BY user_id

How do I then get the median of the number of GPS points in all the trips? Not the median point for each user. Here's the fiddle for sample points from my dataset.

arilwan
  • 1,952
  • 2
  • 16
  • 29

1 Answers1

1

Maybe:

SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY lat_count)
FROM (SELECT user_id
    , COUNT(lat) AS lat_count
FROM gps_track
GROUP BY user_id) du;
pifor
  • 6,358
  • 2
  • 5
  • 14