I have a location search field that should accept pretty much anything (city, state, and zip). Examples:

  • Los Angeles, California
  • California
  • 90210
  • Orange CA

And any combination thereof.

From that I split up the words into an array with

$inputs = preg_split("/[\s,\/-]+/", $input); // hyphen last, so it is a literal and not a range

Which gives me something like

array(5) {
    [0]=> string(4) "Some"
    [1]=> string(4) "City"
    [2]=> string(3) "New"
    [3]=> string(4) "York"
    [4]=> string(5) "88888"
}

Then I pick out the zip code first

foreach ($inputs as $key => $value) {

    // Exactly five digits; is_numeric() would also accept strings like "1e100" or "12.50"
    if (preg_match('/^\d{5}$/', $value)) {

        $zip = $value;
        unset($inputs[$key]);
    }
}

Notice the unset()

Now I need to match the state name to my database of states. The dilemma is that some states have multiple words in the name (North Carolina, New York).

How can I match my $inputs to state names and abbreviations, then remove the matched terms from my array (I have to do the same thing for cities next)?


I was thinking of trying

$inputString = "'" . implode("','", $inputs) . "'";

$result = mysql_query("SELECT state_name
                      FROM states
                      WHERE state_name IN ({$inputString})
                      OR state_abbrev IN ({$inputString})");

But that doesn't tell me which terms matched, and it doesn't work for multi-word states.

Edit:

To the haters, I would rather not have 3 separate fields. I think this complicates the user experience. I would rather have the server do the thinking instead of them, to best guess the location they were trying to convey. I'll have an "advanced" search as well, which will have these fields, but all those fields take up too much space for the site design.

Examples:

Steve Robbins
  • You say you want to break up the words into an array, right? Doesn't that remove valuable information from the location? If I enter, say, "New York-New York, Nevada" (a real place), then how can I differentiate between cities and states if I can pick it off at any order? – Rey Gonzales Jun 28 '12 at 21:58
  • Google probably has a team of 50 engineers working on this problem. It will be very hard to solve for one person with a bit of MySQL. At the very least, it will be a lot of work – sandradev Jun 28 '12 at 21:58
  • What about doing something with AJAX? Search based on the available data and return a list of options for the user to pick from – Paul Dessert Jun 28 '12 at 22:01
  • You like how Google does it? Well, [get them to do it for you](https://developers.google.com/maps/documentation/geocoding/) ;-) – DaveRandom Jun 28 '12 at 22:01
  • Zillow is a good (non-Google) example – Paul Dessert Jun 28 '12 at 22:02
  • @DaveRandom: I like it, but: _"the Geocoding API may only be used in conjunction with a Google map"_.... but it indeed gets abused a lot :) – Wrikken Jun 28 '12 at 22:09
  • @sandradev Well, out of the ~1,000,000 SO users I'm sure we can come up with something – Steve Robbins Jun 28 '12 at 22:12
  • @Wrikken There is a simple workaround to that clause: give people the *option* to view it on a map, to "verify a correct entry" or some such invention. – DaveRandom Jun 28 '12 at 22:14
  • @DaveRandom: hehe, I was thinking the exact same thing ("we are going to deliver here [tiny view on map link]") indeed ;) – Wrikken Jun 28 '12 at 22:16
  • @stevether you're trying to perform something which is kind of a semantic search with one `foreach` loop? – Nir Alfasi Jun 28 '12 at 22:50
  • @Paul try to run Rey's example with Zillow: "New York-New York, Nevada" and see what you'll get... you can also see that they write: "Address or neighborhood or city or zip" – Nir Alfasi Jun 28 '12 at 23:07
  • @alfasin "New York-New York" isn't a city there but the name of a hotel, something I'm not worried about – Steve Robbins Jun 28 '12 at 23:21
  • @stevether this might help: http://stackoverflow.com/questions/16413/parse-usable-street-address-city-state-zip-from-a-string – Nir Alfasi Jun 28 '12 at 23:24
  • @alfasin - what stevether said ;) It's the name of a resort on the Vegas strip – Paul Dessert Jun 28 '12 at 23:33
  • @Paul okay, that's a bad example. But Zillow is not so good either. For example, I searched 2 different streets in the same town and it returned the exact same results. Then I removed the street and ran the same search with the town name only, and guess what? Yes, I got the same result for the third time... – Nir Alfasi Jun 28 '12 at 23:41
  • http://stackoverflow.com/a/16475/1057429 – Nir Alfasi Jun 28 '12 at 23:44
  • @alfasin - Yep, not perfect, but overall, it's good – Paul Dessert Jun 29 '12 at 00:36

4 Answers

You could add a column to your address table that contains the concatenation of City name, State name, Zip code, and so on. Then set a FULLTEXT index on it and run a full text search of your whole input string on it.

Not sure how well this performs, though.
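A rough sketch of what that could look like from PHP, assuming a hypothetical `locations` table with a `search_text` column (the concatenated city, state, and zip) that carries a FULLTEXT index. The table, column, and function names are made up for illustration. One caveat: MySQL's full-text index ignores very short words by default, so two-letter state abbreviations such as "CA" would probably need a separate lookup or a lowered minimum word length.

```php
<?php
// Assumed schema (hypothetical names):
//   ALTER TABLE locations ADD FULLTEXT INDEX ft_search (search_text);

// Build the query and its bound parameters for a free-form location string.
function buildLocationSearch(string $input): array
{
    // Collapse punctuation and repeated whitespace so only words reach AGAINST().
    $terms = trim(preg_replace('/[^A-Za-z0-9]+/', ' ', $input));

    $sql = 'SELECT city, state_abbrev, zip,
                   MATCH(search_text) AGAINST (? IN NATURAL LANGUAGE MODE) AS score
            FROM locations
            WHERE MATCH(search_text) AGAINST (? IN NATURAL LANGUAGE MODE)
            ORDER BY score DESC
            LIMIT 10';

    // The same search string is bound to both placeholders.
    return [$sql, [$terms, $terms]];
}

// Run the search on an existing PDO connection to MySQL.
function searchLocations(PDO $pdo, string $input): array
{
    [$sql, $params] = buildLocationSearch($input);
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The best-scoring row is then the "best guess", and the remaining rows could be offered as alternatives.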

RandomSeed

This is what I'm using currently, but there are so many loops and queries that I doubt it's efficient or "guesses" very accurately:

function getLocations($input) {

    $state = NULL;
    $zip = NULL;

    $input = strtoupper(trim($input));

    $inputs = preg_split("/[^a-zA-Z0-9]+/", $input);

    // Resolve zip code
    foreach ($inputs as $key => $value) {

        if (preg_match('/^\d{5}$/', $value)) {

            $zip = $value;
            unset($inputs[$key]);
        }
    }

    $inputs = array_reverse($inputs);

    $result = mysql_query("SELECT state_name, state_abbrev FROM states");

    // Resolve state (one worded)
    while ($row = mysql_fetch_assoc($result)) {

        foreach ($inputs as $key => $value) {

            if ($row['state_abbrev'] == $value || $row['state_name'] == $value) {

                $state = $row['state_abbrev'];
                unset($inputs[$key]);

                return array(
                    'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                    'state' => "'" . $state . "'",
                    'zip' => "'" . $zip . "'"
                );
            }
        }
    }

    // Resolve state (2/3 worded)
    for ($i = 0; $i < count($inputs) - 1; $i++) {

        $duoValue = @$inputs[$i + 1] . " " . @$inputs[$i];

        if (count($inputs) > $i + 2) {

            $trioValue = $inputs[$i + 2] . " " . $duoValue;
        }

        $result2 = mysql_query("SELECT state_name, state_abbrev FROM states") or die (mysql_error());

        while ($row = mysql_fetch_assoc($result2)) {

            if ($row['state_abbrev'] == $duoValue || $row['state_name'] == $duoValue) {

                $state = $row['state_abbrev'];
                unset($inputs[$i], $inputs[$i + 1]);

                return array(
                    'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                    'state' => "'" . $state . "'",
                    'zip' => "'" . $zip . "'"
                );
            }
            else if ($i < count($inputs) - 2) {

                if ($row['state_abbrev'] == $trioValue || $row['state_name'] == $trioValue) {

                    $state = $row['state_abbrev'];
                    unset($inputs[$i], $inputs[$i + 1], $inputs[$i + 2]);

                    return array(
                        'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                        'state' => "'" . $state . "'",
                        'zip' => "'" . $zip . "'"
                    );
                }
            }
        }
    }

    return array(
        'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
        'state' => "'" . $state . "'",
        'zip' => "'" . $zip . "'"
    );
}
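One way to cut down on the loops and queries, sketched under some assumptions: load the states once into a lookup array and try the longest phrase first (three words, then two, then one). Because the function above checks single words before multi-word names, an input like "West Virginia" can be picked up as "Virginia"; longest-first avoids that. The `$states` map is a hypothetical stand-in for the `states` table, and `parseLocation()` is an illustrative name, not part of the code above.

```php
<?php
// Hypothetical stand-in for the `states` table: UPPERCASE name or abbreviation => abbreviation.
// In practice this would be built from a single SELECT, run once.
$states = [
    'CA' => 'CA', 'CALIFORNIA' => 'CA',
    'NY' => 'NY', 'NEW YORK' => 'NY',
    'NC' => 'NC', 'NORTH CAROLINA' => 'NC',
    'VA' => 'VA', 'VIRGINIA' => 'VA',
    'WV' => 'WV', 'WEST VIRGINIA' => 'WV',
];

function parseLocation(string $input, array $states): array
{
    // Split on anything that is not a letter or digit, dropping empty pieces.
    $tokens = preg_split('/[^A-Z0-9]+/', strtoupper(trim($input)), -1, PREG_SPLIT_NO_EMPTY);

    // Pull out the first 5-digit token as the zip code.
    $zip = null;
    foreach ($tokens as $i => $token) {
        if (preg_match('/^\d{5}$/', $token)) {
            $zip = $token;
            array_splice($tokens, $i, 1);
            break;
        }
    }

    // Try the longest phrase first, and prefer phrases nearest the end of the input.
    $state = null;
    $count = count($tokens);
    for ($len = min(3, $count); $len >= 1 && $state === null; $len--) {
        for ($start = $count - $len; $start >= 0; $start--) {
            $phrase = implode(' ', array_slice($tokens, $start, $len));
            if (isset($states[$phrase])) {
                $state = $states[$phrase];
                array_splice($tokens, $start, $len); // remove the matched words
                break;
            }
        }
    }

    // Whatever is left is treated as the city.
    return ['city' => implode(' ', $tokens), 'state' => $state, 'zip' => $zip];
}
```

Calling `parseLocation('Some City New York 88888', $states)` leaves "SOME CITY" as the city after the zip and the two-word state have been removed. The same longest-phrase-first idea could be reused for city names, and ambiguous abbreviations (words like "OR" or "IN") would still need extra handling.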
Steve Robbins

I completely agree with your idea of making it easy for the user and having all address info in a single input box. However, each user may enter the information somewhat differently, and it will be very hard to come up with an algorithm that covers every case. The best bet is to see if someone has done this already, and as you mention, Google has. Luckily, they have an API for just such a problem.

If you use the Google Maps Geocoder (https://developers.google.com/maps/documentation/geocoding/#GeocodingRequests), you can basically pass it anything that reasonably looks like an address, and it will return a well-structured address result.

Google's example: https://google-developers.appspot.com/maps/documentation/javascript/examples/geocoding-simple

Another example, looking up the White House. Put this URL in your browser: http://maps.googleapis.com/maps/api/geocode/json?address=1600%20pennsylvania%20ave%20washongton%20dc&sensor=false (note that I intentionally misspelled "Washington" to show the API's tolerance).

The API call returns a very useful JSON object:

{
   "results" : [
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20502",
               "short_name" : "20502",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20502, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89767770,
               "lng" : -77.03651700000002
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.89902668029149,
                  "lng" : -77.03516801970852
               },
               "southwest" : {
                  "lat" : 38.89632871970850,
                  "lng" : -77.03786598029153
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20500",
               "short_name" : "20500",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20500, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89871490,
               "lng" : -77.03765550
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.90006388029150,
                  "lng" : -77.03630651970849
               },
               "southwest" : {
                  "lat" : 38.89736591970851,
                  "lng" : -77.03900448029150
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20004",
               "short_name" : "20004",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20004, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89549710,
               "lng" : -77.03008090000002
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.89684608029150,
                  "lng" : -77.02873191970852
               },
               "southwest" : {
                  "lat" : 38.89414811970850,
                  "lng" : -77.03142988029153
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave SE",
               "short_name" : "Pennsylvania Ave SE",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Hill East",
               "short_name" : "Hill East",
               "types" : [ "neighborhood", "political" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20003",
               "short_name" : "20003",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave SE, Washington, DC 20003, USA",
         "geometry" : {
            "bounds" : {
               "northeast" : {
                  "lat" : 38.87865290,
                  "lng" : -76.98170180
               },
               "southwest" : {
                  "lat" : 38.87865220,
                  "lng" : -76.98170229999999
               }
            },
            "location" : {
               "lat" : 38.87865290,
               "lng" : -76.98170180
            },
            "location_type" : "RANGE_INTERPOLATED",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.88000153029150,
                  "lng" : -76.98035306970850
               },
               "southwest" : {
                  "lat" : 38.87730356970850,
                  "lng" : -76.98305103029151
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      }
   ],
   "status" : "OK"
}    
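Pulling the city, state, and zip out of a response shaped like the one above could look roughly like this in PHP. It is only a sketch: `extractLocation()` is a hypothetical helper, it looks at the first result only, and it assumes the JSON has already been fetched into a string.

```php
<?php
// Map a geocoding response (JSON string) to city/state/zip using the "types" of each component.
function extractLocation(string $json): ?array
{
    $data = json_decode($json, true);
    if (!is_array($data) || ($data['status'] ?? '') !== 'OK' || empty($data['results'])) {
        return null; // no usable result
    }

    $wanted = [
        'locality'                    => 'city',
        'administrative_area_level_1' => 'state',
        'postal_code'                 => 'zip',
    ];
    $location = ['city' => null, 'state' => null, 'zip' => null];

    foreach ($data['results'][0]['address_components'] as $component) {
        foreach ($component['types'] as $type) {
            if (isset($wanted[$type])) {
                // short_name gives "DC" rather than "District of Columbia".
                $location[$wanted[$type]] = $component['short_name'];
            }
        }
    }
    return $location;
}

// Trimmed-down sample in the same shape as the response above.
$sample = '{"results":[{"address_components":[
    {"long_name":"Washington","short_name":"Washington","types":["locality","political"]},
    {"long_name":"District of Columbia","short_name":"DC","types":["administrative_area_level_1","political"]},
    {"long_name":"20502","short_name":"20502","types":["postal_code"]}
]}],"status":"OK"}';

print_r(extractLocation($sample));
```

Since a single query can return several candidates, as in the example, it may be worth showing the `formatted_address` of each result and letting the user pick when there is more than one.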
davesnitty
  • I'd say that is heading into the right direction, but according to what the OP (had) posted, it looks a bit like he/she wants to learn how to write such a service. – hakre Jun 29 '12 at 00:17
  • So Steve, how 'bout it? Are you looking to code this yourself or just need to get the job done (use the Google code)? Teasing apart a single string free form address field, as others have said, is a huge job, and Google has devoted a lot of effort to it. You can do it, but be prepared to spend a LOT of time on it, and handle lots of errors when real people start using it. I would split on space/comma, coalesce R to L compound state names, expand state abbreviations, expand street abbrevs, etc. Assume state names are to the right end, streets to the left. – Phil Perry Jul 31 '13 at 23:58

A possible solution would be to just request a zip code from the user and use the http://www.zippopotam.us/ API to get the state and city. I'm not sure whether this fits the UX design you're after, but I've done this with jQuery, using their API to fill in two fields with the returned values:

    $("#text-4edcd39ecca23").keyup(function (event) {
        if (this.value.length === 5) {
            var $citywrap = $("#fm-item-text-4edcd393cb50f");
            var $city = $("#text-4edcd38744891");
            var $statewrap = $("#fm-item-text-4edcd38744891");
            var $state = $("#text-4edcd393cb50f");
            var $zip = $('#text-4edcd39ecca23');

            $.ajax({
                url:"http://zippo-zippopotamus.dotcloud.com/us/" + $zip.val(),
                cache:false,
                dataType:"json",
                type:"GET",
                data:"us/" + $zip.val(),
                success:function (result, success) {
                    // Remove the error message if one is present
                    $zip.parent().find('small').remove();
                    // US Zip Code Records Officially Map to only 1 Primary Location
                    var places = result['places'][0];
                    $city.val(places['place name']);
                    $state.val(places['state']);
                    $citywrap.slideDown();
                    $statewrap.slideDown();
                },
                error:function (result, success) {
                    $citywrap.slideUp();
                    $statewrap.slideUp();
                    $city.val('');
                    $state.val('');
                    $zip.parent().find('br').remove();
                    $zip.parent().find('small').remove();
                    $zip.after('<br /><small class="error">Sorry, your zip code was not recognized. Please try again.</small>');
                }
            });
        }
    });
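If the lookup should happen on the server instead, the same response could be unpacked in PHP along these lines. This is a sketch that relies only on the keys the jQuery code above reads (`places`, `place name`, `state`); `cityStateFromZipResponse()` is a hypothetical helper, and the sample JSON is hand-written for illustration rather than a captured API response.

```php
<?php
// Turn a zip-lookup JSON response into city and state.
// Only the keys used by the jQuery code above are relied on.
function cityStateFromZipResponse(string $json): ?array
{
    $result = json_decode($json, true);
    if (!isset($result['places'][0])) {
        return null; // unknown zip or malformed response
    }

    $place = $result['places'][0];
    return [
        'city'  => $place['place name'] ?? null,
        'state' => $place['state'] ?? null,
    ];
}

// Hand-written sample for illustration, not a captured API response.
$sample = '{"places":[{"place name":"Beverly Hills","state":"California"}]}';

print_r(cityStateFromZipResponse($sample));
```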
Clark T.