Given a string, find the longest substring with the same number of vowels and consonants?

Question

Given a string, find the longest substring with the same number of vowels and consonants.

CLARIFICATION: I am unsure whether we can generate a new string, or whether the substring has to be part of the original string. So far I have this:

Code Snippet :

    Scanner scanner = new Scanner(System.in);
    String string = new String(scanner.next());
    int lengthOfString = string.length();
    int vowelCount = 0;
    int consCount = 0;

    for (int i = 0; i < lengthOfString; i++) {
        if (string.charAt(i) == 'u' || string.charAt(i) == 'e'  || string.charAt(i) == 'i'
                || string.charAt(i) == 'o' || string.charAt(i) == 'a' ) {
            vowelCount++;


        } else {

            consCount++;
        }

    }

    System.out.println(vowelCount);

EDIT I got the count working, but how do I create a substring?

`Given a string ... find the longest substring` ... so are there two strings in this problem, or only one? — Tim Biegeleisen, Sep 21 '16 at 04:23
For finding if a character is a vowel, you can use [this](http://stackoverflow.com/a/19161184/5743988) for simpler code. — 4castle, Sep 21 '16 at 04:23
@TimBiegeleisen I'm pretty sure there is only 1 string? The task is to find the longest substring where `consonantCount == vowelCount`. — 4castle, Sep 21 '16 at 04:26
Is it safe for you to assume all non-vowels are consonants? What if there are spaces, or numbers? — 4castle, Sep 21 '16 at 04:45
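On the question of spaces and digits raised in the last comment: one possible convention, sketched below, is to count only English letters and give every other character a net value of 0. The class name `CharKind` and the choice to ignore non-letters are assumptions made for illustration; the problem statement does not say how such characters should be treated.

```java
public class CharKind {
    // Assumption: only the five English vowels count, in either case.
    public static boolean isVowel(char c) {
        return "aeiouAEIOU".indexOf(c) != -1;
    }

    // Assumption: a consonant is any ASCII letter that is not a vowel.
    public static boolean isConsonant(char c) {
        boolean asciiLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
        return asciiLetter && !isVowel(c);
    }

    // +1 for a vowel, -1 for a consonant, 0 for anything else (space, digit, ...).
    public static int net(char c) {
        if (isVowel(c)) return 1;
        if (isConsonant(c)) return -1;
        return 0;
    }
}
```

With this convention a substring such as "a b" would count as balanced, since the space contributes nothing to either side.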

score 3 · Answer 1 · edited May 23 '17 at 12:15

This can be solved in O(n) time and space using the "net" values computed by this answer in combination with the following observation:

A substring s[i .. j] has the same number of consonants and vowels if and only if net[1 .. i-1] = net[1 .. j], where net[i .. j] is the sum of the "net" values (1 for a vowel, -1 for a consonant) for each character between positions i and j, inclusive.

To see this, observe that the condition that tells us that a substring s[i .. j] is the kind we're looking for is that

net[i .. j] = 0.

Adding net[1 .. i-1] to both sides of this equation gives

net[1 .. i-1] + net[i .. j] = net[1 .. i-1]

with the LHS then simplifying to just

net[1 .. j] = net[1 .. i-1]
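As a small illustration of this identity, the sketch below computes the running net totals for a sample string. The class name `NetPrefixDemo`, the helper names and the input "abcde" are made up for illustration, and every non-vowel is assumed to be a consonant.

```java
import java.util.Arrays;

public class NetPrefixDemo {
    public static boolean isVowel(char c) {
        return "aeiouAEIOU".indexOf(c) != -1;
    }

    // prefix[k] = net[1 .. k]: the sum of +1 (vowel) / -1 (consonant)
    // over the first k characters; prefix[0] is the empty prefix.
    public static int[] prefixNet(String s) {
        int[] prefix = new int[s.length() + 1];
        for (int k = 1; k <= s.length(); k++) {
            prefix[k] = prefix[k - 1] + (isVowel(s.charAt(k - 1)) ? 1 : -1);
        }
        return prefix;
    }

    public static void main(String[] args) {
        // prints [0, 1, 0, -1, -2, -1]
        System.out.println(Arrays.toString(prefixNet("abcde")));
    }
}
```

Here prefix[0] = prefix[2] = 0, so the substring at positions 1..2 ("ab") is balanced, and prefix[3] = prefix[5] = -1, so the substring at positions 4..5 ("de") is balanced too.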

Algorithm

That means we can create a table containing two entries (first position seen and last position seen) for each distinct value that the running sum of net values can take. This running total can be as low as -n (if every character is a consonant) or as high as n (if every character is a vowel), so there are at most 2n+1 distinct sums and the table needs that many rows. We then march through the string from left to right, calculating the running total and updating the pair in the table that corresponds to it, noting whenever an update produces a new maximum-length substring. In pseudocode, with zero-based array indices and separate arrays holding the two elements of each pair:

Create 2 arrays of length 2n+1, first[] and last[], initially containing all -2s, except for first[n] which is -1. (Need to use -2 as a sentinel since -1 is actually a valid value!)
Set bestFirst = bestLast = bestLen = -1.
Set the running total t = n. (n "means" 0; using this bias just means we can use the running total directly as a nonnegative-valued index into the arrays without having to repeatedly add an offset to it.)
For i from 0 to n-1:
- If s[i] is a vowel, increment t, otherwise decrement t.
- If first[t] is -2:
  - Set first[t] = i.
- Otherwise:
  - Set last[t] = i.
  - If last[t] - first[t] > bestLen:
    - Set bestLen = last[t] - first[t].
    - Set bestFirst = first[t] + 1.
    - Set bestLast = last[t].

A maximum-length range will be returned in (bestFirst, bestLast), or if no such range exists, these variables will both be -1.
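The pseudocode above could be translated to Java roughly as follows. This is a sketch, not the answerer's own code: the class and method names are invented, every non-vowel is assumed to be a consonant, and the `last[]` array is dropped because only the current index `i` is ever read from it.

```java
import java.util.Arrays;

public class BalancedSubstring {
    public static boolean isVowel(char c) {
        return "aeiouAEIOU".indexOf(c) != -1;
    }

    // Returns {bestFirst, bestLast} (zero-based, inclusive), or {-1, -1} if
    // no substring has equally many vowels and consonants.
    public static int[] longestBalancedRange(String s) {
        int n = s.length();
        int[] first = new int[2 * n + 1];
        Arrays.fill(first, -2);      // -2 = running total not seen yet
        first[n] = -1;               // total "0" is seen before the first character
        int bestFirst = -1, bestLast = -1, bestLen = -1;
        int t = n;                   // biased running total: n stands for 0
        for (int i = 0; i < n; i++) {
            t += isVowel(s.charAt(i)) ? 1 : -1;
            if (first[t] == -2) {
                first[t] = i;        // first time this total occurs
            } else if (i - first[t] > bestLen) {
                bestLen = i - first[t];
                bestFirst = first[t] + 1;
                bestLast = i;
            }
        }
        return new int[] {bestFirst, bestLast};
    }

    // Convenience wrapper: the substring itself, or "" if none exists.
    public static String longestBalanced(String s) {
        int[] r = longestBalancedRange(s);
        return r[0] < 0 ? "" : s.substring(r[0], r[1] + 1);
    }
}
```

For example, `longestBalanced("xaab")` returns the whole string "xaab", and `longestBalanced("aaa")` returns "". When several substrings share the maximum length, this version keeps the one found first.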

I remember seeing this solution, or one very similar to it, somewhere on SO a while back -- if anyone can find it, I'll gladly link to it.

4castle · Answer 2 · score 2 · 2016-09-21T13:39:07.480

To find the longest substring in which the numbers of consonants and vowels are equal, start with the largest possible length and steadily decrease it until you find a substring that meets the criterion.

This will allow you to short-circuit the operation.

public static String findLongestSubstring(String str) {
    for (int len = str.length(); len >= 2; len--) {
        for (int i = 0; i <= str.length() - len; i++) {
            String substr = str.substring(i, i + len);
            int vowels = countVowels(substr);
            int consonants = len - vowels;
            if (vowels == consonants) {
                return substr;
            }
        }
    }
    return "";
}

private static int countVowels(String str) {
    return str.replaceAll("[^AEIOUaeiou]+", "").length(); 
}

edited Sep 21 '16 at 13:39

answered Sep 21 '16 at 04:39

4castle


Upvoted because you figured out how to loop in such a way to check longest strings _first_, thus eliminating unnecessary work. – Tim Biegeleisen Sep 21 '16 at 05:04
@TimBiegeleisen Thanks, though I'm still not super happy with this, because it's O(n^3) just like everyone else's, but maybe someone will think of something less brute-force. – 4castle Sep 21 '16 at 05:10
@4castle, how did you find that it was O(n^3)? – YOGIYO Sep 21 '16 at 05:15
@YOGIYO `N^2` to iterate over every substring, plus another `O(N)` operation to count vowels and consonants. – Tim Biegeleisen Sep 21 '16 at 05:16
More efficient algorithm for generating all substrings discussed [here](http://stackoverflow.com/questions/2560262/generate-all-unique-substrings-for-given-string) – mv200580 Sep 21 '16 at 05:19
@TimBiegeleisen, what is N here? – YOGIYO Sep 21 '16 at 05:19
@YOGIYO You can think of `N` as the worst-case number of comparison operations having to be done to determine if a character is a vowel. You can read more about Big-O notation [here](http://stackoverflow.com/q/487258/5743988), and there are plenty of articles online. – 4castle Sep 21 '16 at 05:28
@4castle I found a way to do this in `O(n^2)`, q.v. my updated answer. It is possible to avoid iterating over each substring to check for vowels, by cleverly keeping a running tally of number of vowels and consonants. – Tim Biegeleisen Sep 21 '16 at 06:44
@4castle, can you help me out in calculating? The first loop has (str.length - 1) rotations. The second loop depends on the first loop, so it goes like: (0), [0, 1], [0, 1, 2], ..., [0, 1, ..., str.length - 2]. Thus this is a total of (second loop only) 1 + 2 + ... + N - 2 = (2N-3)^2/8 - 1/8 ~ (2N)^2, if we let N = str.length. In the first loop, we have (N-1) ~ N, thus a total of ~N^3. But then we have to assume that inside both the loops, it is O(1), otherwise we have > O(N^3)? – YOGIYO Sep 21 '16 at 14:55
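One way to check the O(n^3) estimate discussed in these comments for the `findLongestSubstring` loops in this answer: take N = str.length(), assume the worst case in which no balanced substring is found, and assume that `substring` plus `countVowels` cost time proportional to `len`. The inner loop then runs N - len + 1 times for each value of `len`, which gives the following counts.

```latex
% Number of substrings examined (loop iterations only):
\sum_{len=2}^{N} (N - len + 1) \;=\; \sum_{k=1}^{N-1} k \;=\; \frac{N(N-1)}{2}

% Total work when each iteration costs about len steps:
\sum_{len=2}^{N} (N - len + 1)\, len \;=\; \frac{N(N+1)(N+2)}{6} - N \;=\; \Theta(N^3)
```

So the two loops alone contribute a quadratic number of iterations, and the cubic bound comes from the linear-time vowel count inside them.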

Tim Biegeleisen · Answer 3 · 2016-09-21T07:10:32.550

Here is an updated version of my original answer which runs in O(n^2) time. It achieves this by employing a trick, namely keeping track of a single variable (called 'net') which tracks the difference between the number of vowels and consonants. When this number is zero, a given substring is balanced.

It takes O(n^2) to iterate over every possible substring in the worst case, but it doesn't take any additional time to check each substring for consonants and vowels, because it keeps the net up to date as the window slides to the next substring. Hence, it reduces the complexity from O(n^3) to O(n^2).

public String findLongestSubstring(String input) {
    String longest = "";

    for (int window = input.length(); window >= 2; --window) {
        // count vowels and consonants in the first window, input[0 .. window-1]
        String first = input.substring(0, window);
        String consonants = first.toLowerCase().replaceAll("[aeiou]", "");
        int consonantCount = consonants.length();
        int vowelCount = window - consonantCount;

        // net > 0 means the current window has more vowels than consonants
        int net = vowelCount - consonantCount;

        for (int i=window; i <= input.length(); ++i) {
            if (net == 0) {
                longest = input.substring(i-window, i);
                return longest;
            }

            // no-op for last window
            if (i == input.length()) break;

            // slide the window right: the character at index i enters
            if ("aeiouAEIOU".indexOf(input.charAt(i)) != -1) {
                ++net;
            }
            else {
                --net;
            }
            // ... and the character at index i-window leaves
            if ("aeiouAEIOU".indexOf(input.charAt(i-window)) != -1) {
                --net;
            }
            else {
                ++net;
            }
        }
    }

    return longest;
}

score 0 · Answer 4 · answered Sep 21 '16 at 04:40

I think this could be a solution for your task (for input strings that are not too long):

import org.junit.Test;

/**
 * Created by smv on 19/09/16.
 */
public class MainTest {

    public static boolean isVowel(char c) {
        return "AEIOUaeiou".indexOf(c) != -1;
    }

    public int getVowelCount(String str) {
        int res = 0;
        for(int i=0; i < str.length(); i++){
            if(isVowel(str.charAt(i))) {
                res++;
            }
        }
        return res;
    }

    public int getConsonantCount(String str) {
        int res = 0;
        for(int i=0; i < str.length(); i++){
            if(!isVowel(str.charAt(i))) {
                res++;
            }
        }
        return res;
    }

    @Test
    public void run() throws Exception {
        String string = "aasdaasggsertcwevwertwe";
        int lengthOfString = string.length();
        String maxSub = "";
        int maxSubLength = 0;

        // find all substrings of given string
        for( int c = 0 ; c < lengthOfString ; c++ )
        {
            for( int i = 1 ; i <= lengthOfString - c ; i++ )
            {
                String sub = string.substring(c, c+i);

                // comparing count vowels and consonants 
                if (getVowelCount(sub) == getConsonantCount(sub)) {
                    if (sub.length() > maxSubLength) {
                        maxSub = sub;
                        maxSubLength = sub.length();
                    }
                }
            }
        }
        System.out.println(maxSub);
    }
}

score 0 · Answer 5 · answered Sep 21 '16 at 05:41

Well, the requirements are very vague here, of course. They do not say whether digits or other characters can appear in the input. I assumed the substring starts at index zero, since the counts are equal (both zero) at that point.

    Scanner scanner = new Scanner(System.in);
    String string = new String(scanner.next());
    int lengthOfString = string.length();
    int vowelCount = 0;
    int consCount = 0;
    int maxIndex = -1;

    for(int i = 0; i < lengthOfString; i++) 
    {
        System.out.println("Char: " + string.charAt(i));

        if(string.charAt(i) == 'u' || string.charAt(i) == 'e' || string.charAt(i) == 'i'
            || string.charAt(i) == 'o' || string.charAt(i) == 'a') 
        {
            vowelCount++;
        } 
        else 
        {
            consCount++;
        }

        if(vowelCount == consCount)
        {
            System.out.println("count equal with: " + string.substring(0, (i + 1)));
            maxIndex = i + 1;
        }
    }

    if(maxIndex > 0)
    {
        System.out.println("Longest sub string with equal count of vowels and consonants is: " 
            + string.substring(0, maxIndex));
    }
    else
    {
        System.out.println("No substring existed with equal count of vowels and consonants.");
    }
