I have many different strings, such as:

$names = [
"David England Manchester",
"David France Paris ",
"David Spain",
"Roger Spain",
"Trevor England",
"Trevor Russia Moscow",
"Lucy Russia",
"Richard J. Russia",
"Richard J. England Blyth",
"Richard M. England",
];

What I need is a way to return the leading parts, matched from the start of each string, that appear in more than one string. For the list above, the result would be:

$result = ["David", "Trevor", "Richard J"];

I thought about splitting off the first word of each string and returning any word with a count greater than 1, but my issue is that the shared part could be more than the first word.
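
A minimal sketch of that first-word idea, for illustration only (the variable names `$firstWords`, `$counts` and `$repeated` are made up here). On the sample list it yields David, Trevor and Richard, so it cannot tell "Richard J." apart from "Richard M.":

```php
<?php
// Sample data mirroring the list above.
$names = [
    "David England Manchester",
    "David France Paris ",
    "David Spain",
    "Roger Spain",
    "Trevor England",
    "Trevor Russia Moscow",
    "Lucy Russia",
    "Richard J. Russia",
    "Richard J. England Blyth",
    "Richard M. England",
];

// Take the first space-separated word of each string.
$firstWords = array_map(function (string $name): string {
    return explode(' ', trim($name))[0];
}, $names);

// Count how often each first word occurs, then keep those seen more than once.
$counts = array_count_values($firstWords);
$repeated = array_keys(array_filter($counts, function (int $n): bool {
    return $n > 1;
}));

print_r($repeated);
```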

This is a made-up sample; the real data will have around 400 strings, so it shouldn't be an issue if the solution is computationally expensive. It will only run once or twice in the app's life cycle.

Can anyone help at all with this?

Serving Quarantine period
gclark18

1 Answer

Here is a simple way to count the name part of each element of the array. To separate names from locations, it assumes that when a name has more than one word, its last word is either a single character or contains a ".". Otherwise, the last word of the string is treated as the location and dropped:

<?php
$names = [
    "David England",
    "David France",
    "David Spain",
    "Roger Spain",
    "Trevor England",
    "Trevor Russia",
    "Lucy Russia",
    "Richard J. Russia",
    "Richard J. England",
    "Richard M. England",
];
$counts = [];
$countsWithMoreThanOneElement = [];

foreach ($names as $i => $name) {
    if (trim($name) === '') {
        continue;
    }

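    // Trim first and split on runs of whitespace, so a trailing or doubled
    // space does not produce an empty "word" that gets popped as the location
    $tmp = preg_split('/\s+/', trim($name));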

    if (count($tmp) > 1) {
        // Let's check for a dot or a single letter to separate location and names

        $wordWithNames = [];
        $foundMoreThanOneName = false;

        foreach ($tmp as $j => $word) {
            $wordWithNames[] = str_replace('.', '', $word);

            if (strpos($word, '.') !== false || strlen($word) === 1) {
                $foundMoreThanOneName = true;

                break;
            }
        }

        if (!$foundMoreThanOneName) {
            array_pop($wordWithNames);
        }

        $name = implode(' ', $wordWithNames);
    } else {
        $name = $tmp[0];
    }

    if (!isset($counts[$name])) {
        $counts[$name] = 0;
    }

    ++$counts[$name];

    if ($counts[$name] > 1 && !in_array($name, $countsWithMoreThanOneElement)) {
        $countsWithMoreThanOneElement[] = $name;
    }
}

print_r($countsWithMoreThanOneElement);

Result:

Array
(
    [0] => David
    [1] => Trevor
    [2] => Richard J
)

Runnable example: http://sandbox.onlinephpfunctions.com/code/0e384e24f5ec350b4b7b27397ec8aab20515570c

EDIT 1: I'm sorry, I didn't read the "from the start of the string" part the first time.

EDIT 2: And I misunderstood another part of the question again :P Now it should work!
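
The code above relies on initials to tell where a name ends, and it expects at most one location word per string. As a hedged alternative, here is a rough sketch that skips the heuristic and counts every word prefix instead, then keeps the longest prefixes shared by at least two strings. The function name `repeatedPrefixes` is hypothetical, and stripping the trailing dot is only there to match the "Richard J" form in the question. With the question's original data (including the city names) it should give David, Trevor and Richard J:

```php
<?php
// Sample data from the question, including a trailing space on one entry.
$names = [
    "David England Manchester",
    "David France Paris ",
    "David Spain",
    "Roger Spain",
    "Trevor England",
    "Trevor Russia Moscow",
    "Lucy Russia",
    "Richard J. Russia",
    "Richard J. England Blyth",
    "Richard M. England",
];

// Hypothetical helper: longest word prefixes shared by two or more strings.
function repeatedPrefixes(array $names): array
{
    $counts = [];

    foreach ($names as $name) {
        $words = preg_split('/\s+/', trim($name), -1, PREG_SPLIT_NO_EMPTY);
        $prefix = '';

        // Count every word prefix: "David", "David England", ...
        foreach ($words as $word) {
            $prefix = $prefix === '' ? $word : $prefix . ' ' . $word;
            $counts[$prefix] = ($counts[$prefix] ?? 0) + 1;
        }
    }

    // Keep only the prefixes found in more than one string.
    $shared = array_keys(array_filter($counts, function (int $n): bool {
        return $n > 1;
    }));

    // Drop a prefix when a longer shared prefix extends it,
    // so "Richard" gives way to "Richard J.".
    $longest = array_filter($shared, function (string $candidate) use ($shared): bool {
        foreach ($shared as $other) {
            if (strpos($other, $candidate . ' ') === 0) {
                return false;
            }
        }
        return true;
    });

    // Strip trailing dots and reindex the result.
    return array_values(array_map(function (string $p): string {
        return rtrim($p, '.');
    }, $longest));
}

print_r(repeatedPrefixes($names));
```

One design note: if the same full string appears twice, the whole string counts as the longest shared prefix, which may or may not be what you want. With around 400 strings the nested loop in the filter step is cheap enough not to matter.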

Serving Quarantine period