1

I work for a healthcare company and have information on patient's location (address, city, state, zip). I am trying to determine what percentage of patients live closest to 5 specific locations. The answer I'm looking for would be something like "25% of patients live closest to location #1, 10% closest to #2, etc." I'm hoping someone can point me in the right direction (app to use/approach to take).

The data is stored in a Tableau extract but work with the data in Excel.

Applications I have at my disposal in order of comfort are: - Tableau - Excel - Alteryx - Python - R

anddon
  • 15
  • 5

3 Answers3

2

There are multiple tasks to tackle this problem. You might want to divide the responsibilities up among different tools at different stages, say using Python or R to do some data prep and enrichment work up front, followed by Tableau doing some interactive visualization further along in the process.

In addition to the tools you mention, I recommend spending some time getting familiar with spatial databases such as the open source PostGIS. Oracle and Microsoft also have spatial extensions to their databases. With a spatial database you can store spatial data such as points, polygons and paths along with relational data, and they have spatial functions built-in as well.

The main tasks to consider:

  • Geocoding - convert addresses to spatial coordinates. You're going to want to use a geocoding web service for this, probably calling it from Python or R. This is something you'd want to do once up front when preparing new data for analysis and then store the result for reuse by later stages.

  • Computing voronoi polygons. This algorithm tiles the plane with polygons so that all the points within a polygon are closest a particular point. So if you want to know which areas are closest to each of 5 hospitals, a voronoi diagram will divide the map into non-overlapping regions, with one polygon for each hospital containing the areas that are closer to that hospital than any other. So if your hospitals don't move, you should compute the voronoi polygons once and store the result for reuse. There are libraries in Python and R that can do this. PostGIS has a function for computing voronoi polygons also.

  • Hit testing. This step tests whether a point falls within a polygon. You can either perform hit testing by calling dedicated R or Python functions, but I recommend using a spatial database like PostGIS instead. Then you can perform hit testing, either by calling a function or by using the INTERSECTS spatial join operator. You can optimize performance by building a spatial index for your polygons after loading them into the database.

  • Presentation, Summarization. This is where Tableau is helpful. You can display spatial data like points and polygons, directly from a spatial database if you are using one, and also easily compute summary statistics like the percentages you mentioned.

The more recent versions of Tableau have even more support for spatial data and can compute distances and buffers although it may still be faster to push that work into the spatial database.

This point is probably obvious to many people, but just to be clear ... The approach described above performs well when the points you are measuring against, say the hospitals, have static locations so it is the worth the time to precompute the Voronoi polygons and to create any spatial indices - and also when you have many points that you wish to test. In those cases, the precomputed polygons and spatial indices can pay off with performance gains.

For smaller data sets, you can of course just compute the distance between every possible source and every possible destination and then choose the connection with the smallest distance. That simple, but brute force approach, gets less feasible as the number of data points grows, in which case its more worth the effort to go through the steps above.

Alex Blakemore
  • 10,521
  • 2
  • 22
  • 46
1

There are several ways you can do this in Alteryx by using their Spatial tools. Some general methods you could consider are calculating the distance each "address" is to each "location" using the distance tool, rank each "address"' distance from each "location", then using the summarize tool you can group them by rank and then "location" and see various statistics.

From that data set you can visualize this in Tableau in multiple ways, get creative!

You could calculate these distances in Excel with a lot more hassle. If you have latitudes and longitudes, the formula for distance between two points is:

=ACOS(COS(RADIANS(90-Lat1)) *COS(RADIANS(90-Lat2)) +SIN(RADIANS(90-Lat1)) *SIN(RADIANS(90-Lat2)) *COS(RADIANS(Long1-Long2))) *6371
Dan
  • 434
  • 2
  • 12
  • The latest versions of Tableau have a function called Distance() that implements the Haversine function you cite. – Alex Blakemore Apr 18 '20 at 19:45
  • That is true and makes it much easier to do those calculations in Tableau than before. I'd still suggest doing all calculations in Alteryx as it would be a fraction of the time and complexity as it would be to do it in Tableau. – Dan Apr 19 '20 at 20:52
  • Unfortunately my version of Tableau doesn't have this function. Believe the tools in Alteryx will prove useful. Thank you! – anddon Apr 21 '20 at 19:25
0

I think you could write this in Alteryx, Python and possibly R. Then visualise in Tableau. I did something similar to that with UK schools using Alteryx. Check this blog post for details around the Alteryx tools I used and how it worked. https://tarsolutions.co.uk/blog/alteryx-tableau-display-uk-schools-data/

Andy TAR Sols
  • 1,592
  • 2
  • 3
  • 11