24

I have the following problem, which states:

Replace all characters in a string with the + symbol, except instances of the given string.

So, for example, if the given string was abc123efg and I had to replace every character except every instance of 123, it would become +++123+++.

I figured a regular expression is probably best for this, and I came up with this:

str.replaceAll("[^str]","+") 

where str is a variable, but it's not letting me use the method without putting it in quotation marks. If I just want to use the contents of the variable str, how can I do that? I ran it with the string typed in manually and it worked, but can I just pass in a variable?

As of right now, I believe it's looking for the string "str" and not the variable.

Here is the output; it's right for most cases, but wrong for two :(


List of open test cases:

plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"
Rich
fsdff
  • try `var.replaceAll("[^" + str + "]","+") ` – Scary Wombat Sep 13 '18 at 05:02
  • 1
    when you write [^str] in quotes, it takes the literal characters in that string instead of your variable so it will match to anything that isn't the letter s, t, or r – faris Sep 13 '18 at 05:44
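
The two comments above can be checked with a small sketch. The class and method names here (CharClassDemo, replaceOutsideClass) are made up for illustration, and the word is spliced in unescaped exactly as the first comment suggests, so this is only meant for plain letters and digits. It also shows why a negated character class is not enough even once the variable is used.

```java
final class CharClassDemo {
    // Hypothetical helper: replaces every character that is NOT one of the
    // individual characters of `word`. The word is concatenated unescaped,
    // so this sketch assumes it contains no regex metacharacters.
    static String replaceOutsideClass(String input, String word) {
        return input.replaceAll("[^" + word + "]", "+");
    }

    public static void main(String[] args) {
        // "[^str]" is a literal class: anything except 's', 't' or 'r'.
        System.out.println("abc123efg".replaceAll("[^str]", "+"));     // +++++++++

        // Concatenating the variable gives the class [^123].
        System.out.println(replaceOutsideClass("abc123efg", "123"));   // +++123+++

        // A class keeps single characters, not the sequence: the leading
        // "ab" survives even though only "abc" should be kept.
        System.out.println(replaceOutsideClass("abXYabcXYZ", "abc"));  // ab++abc+++
    }
}
```

The last call is one of the listed test cases, where the expected result is ++++abc+++.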

7 Answers

15

Looks like this is the plusOut problem on CodingBat.

I had 3 solutions to this problem, and wrote a new streaming solution just for fun.

Solution 1: Loop and check

Create a StringBuilder out of the input string, and check for the word at every position. Replace the character if it doesn't match, and skip the length of the word if it does.

public String plusOut(String str, String word) {
  StringBuilder out = new StringBuilder(str);

  for (int i = 0; i < out.length(); ) {
    if (!str.startsWith(word, i))
      out.setCharAt(i++, '+');
    else
      i += word.length();
  }

  return out.toString();
}

This is probably the expected answer for a beginner programmer, though it assumes that the string doesn't contain any astral-plane characters, each of which is represented by 2 chars instead of 1.
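
If astral-plane characters did matter, one possible variation is to advance by code point rather than by char. This is only a sketch under assumptions that are not part of the original problem: that each code point outside the word should become a single +, and that an empty word keeps nothing (the guard is added here to avoid an endless loop). The class name is made up.

```java
final class PlusOutCodePoints {
    static String plusOut(String str, String word) {
        StringBuilder out = new StringBuilder(str.length());
        int i = 0;
        while (i < str.length()) {
            if (!word.isEmpty() && str.startsWith(word, i)) {
                // Keep the word and skip past it.
                out.append(word);
                i += word.length();
            } else {
                // One '+' per code point; a surrogate pair advances by 2 chars.
                out.append('+');
                i += Character.charCount(str.codePointAt(i));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34", "xy"));            // ++xy++
        // 'a' plus U+1F600 (two chars, one code point) plus "xy".
        System.out.println(plusOut("a\uD83D\uDE00xy", "xy"));   // ++xy
    }
}
```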

Solution 2: Replace the word with a marker, replace the rest, then restore the word

public String plusOut(String str, String word) {
    return str.replaceAll(java.util.regex.Pattern.quote(word), "@").replaceAll("[^@]", "+").replaceAll("@", word);
}

Not a proper solution, since it assumes that a certain character or sequence of characters doesn't appear in the string.

Note the use of Pattern.quote to prevent the word being interpreted as regex syntax by replaceAll method.
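
To illustrate what Pattern.quote protects against, here is a small sketch. The helper names (compiles, plusOutWithMarker) are made up. It also wraps the restored word in Matcher.quoteReplacement, which is an addition not present in the solution above; it matters only if the word contains $ or \, which the replacement string would otherwise interpret. The marker assumption (no @ in the input) still applies.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class QuoteDemo {
    // Hypothetical helper: reports whether a string is a valid regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Solution 2 with the replacement side escaped as well.
    // Still assumes '@' never occurs in str.
    static String plusOutWithMarker(String str, String word) {
        return str.replaceAll(Pattern.quote(word), "@")
                  .replaceAll("[^@]", "+")
                  .replaceAll("@", Matcher.quoteReplacement(word));
    }

    public static void main(String[] args) {
        System.out.println(compiles("++"));                 // false: dangling '+'
        System.out.println(compiles(Pattern.quote("++")));  // true
        System.out.println(plusOutWithMarker("--++ab", "++")); // ++++++
        System.out.println(plusOutWithMarker("a$1b", "$1"));   // +$1+
    }
}
```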

Solution 3: Regex with \G

public String plusOut(String str, String word) {
  word = java.util.regex.Pattern.quote(word);
  return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
}

Construct regex \G((?:word)*+)., which does more or less what solution 1 is doing:

  • \G makes sure the match starts from where the previous match left off.
  • ((?:word)*+) picks out 0 or more instances of word, if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep every instance of the word it finds. Otherwise, the regex will not work correctly when the word appears at the end of the string, as the regex backtracks to match the trailing . instead.
  • . will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallows backtracking. We will replace this with +.
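
The effect of the possessive quantifier can be seen by swapping it for a plain greedy * and running one of the listed test cases. This is a sketch for comparison only; the boolean parameter and the class name are made up.

```java
import java.util.regex.Pattern;

final class PossessiveDemo {
    // Builds the \G-based regex with either the possessive "*+" or the
    // ordinary greedy "*" quantifier, so the two can be compared.
    static String plusOut(String str, String word, boolean possessive) {
        String quantifier = possessive ? "*+" : "*";
        String regex = "\\G((?:" + Pattern.quote(word) + ")" + quantifier + ").";
        return str.replaceAll(regex, "$1+");
    }

    public static void main(String[] args) {
        // Possessive: the trailing XYZ is consumed and never given back,
        // so the final match attempt fails and XYZ is left alone.
        System.out.println(plusOut("abXYxyzXYZ", "XYZ", true));   // +++++++XYZ

        // Greedy: the group backtracks to zero repetitions and "." then
        // eats X, Y and Z one at a time.
        System.out.println(plusOut("abXYxyzXYZ", "XYZ", false));  // ++++++++++
    }
}
```

When the word is followed by at least one more character, both variants agree; the difference only shows when the word sits at the very end of the string.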

Solution 4: Streaming

import java.util.Arrays;
import java.util.stream.Collectors;

public String plusOut(String str, String word) {
  return String.join(word, 
    Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
      .map((String s) -> s.replaceAll("(?s:.)", "+"))
      .collect(Collectors.toList()));
}

The idea is to split the string by word, do the replacement on the rest, and join them back with word using String.join method.

  • Same as above, we need Pattern.quote to avoid split interpreting the word as regex. Since split by default removes empty strings at the end of the array, we need to pass -1 as the second parameter to make split leave those empty strings alone.
  • Then we create a stream out of the array and replace each remaining piece with a string of +. In Java 11, we can use s -> "+".repeat(s.length()) instead.
  • The rest is just converting the Stream to an Iterable (List in this case) and joining the pieces for the result.
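
For reference, a Java 11 flavour of the same idea might look like the sketch below. It assumes Java 11 or later for String.repeat, and it uses Collectors.joining in place of String.join, which is a substitution made here rather than part of the solution above. The class name is made up.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class PlusOutStream {
    static String plusOut(String str, String word) {
        // limit -1 keeps trailing empty strings, so a word at the very end
        // of the input still gets joined back in.
        return Arrays.stream(str.split(Pattern.quote(word), -1))
            .map(s -> "+".repeat(s.length()))      // Java 11+
            .collect(Collectors.joining(word));
    }

    public static void main(String[] args) {
        // Without -1 the trailing empty string is dropped (3 pieces, not 4).
        System.out.println("12xy34xyabcxy".split("xy").length);      // 3
        System.out.println("12xy34xyabcxy".split("xy", -1).length);  // 4

        System.out.println(plusOut("12xy34xyabcxy", "xy"));  // ++xy++xy+++xy
        System.out.println(plusOut("aaxxxxbb", "xx"));       // ++xxxx++
    }
}
```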
nhahtdh
  • Smart idea your `\G` based solution, love it. – bobble bubble Sep 13 '18 at 09:24
  • @nhahtdh I really wanted to post your first solution, but since it was already occupied, [one using CharBuffer instead](https://stackoverflow.com/a/52315819/1059372) – Eugene Sep 13 '18 at 14:24
  • My first idea was your solution 4 (assuming split will "correctly" split when the word is at the beginning/end of str, leaving an empty string as the first/last element of the resulting array). My second was (basically) your solution 1, except that I would fill the output with pluses and then insert the word at the matching positions. But my VERY FIRST thought was that, as much as I like using regex, this wouldn't be one of the cases where I would use it. – iPirat Sep 17 '18 at 07:12
  • @iPirat: Solution 4 is correct in such cases, since empty strings are correctly preserved. And yeah, believe it or not, the regex is actually derived from solution 1. – nhahtdh Sep 17 '18 at 07:44
  • An alternative \G solution without backreferences, just fyi: `"(?<=\\G|" + word + ")(?!" + word + ")."` – jaytea Sep 18 '18 at 11:47
  • I also want to do something like this. Only difference is that I only want to replace the ' and which I do not want to get changed. I was trying to use your solution 3. However I was still unable to catch the '|)*+)(?: – Jagath01234 Mar 10 '21 at 10:11
  • Got an answer for my issue using negative lookahead. Posting for someone in need: ` input.replaceAll("(?:(?!
    |)[
    – Jagath01234 Mar 10 '21 at 14:07
6

This is a bit trickier than you might initially think, because you don't just need to match characters, but the absence of a specific phrase; a negated character set is not enough. If the string is 123, you would need:

(?<=^|123)(?!123).*?(?=123|$)

https://regex101.com/r/EZWMqM/1/

That is - lookbehind for the start of the string or "123", make sure the current position is not followed by 123, then lazy-repeat any character until lookahead matches "123" or the end of the string. This will match all characters which are not in a "123" substring. Then, you need to replace each character with a +, after which you can use appendReplacement and a StringBuffer to create the result string:

String inputPhrase = "123";
String inputStr = "abc123efg123123hij";
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("(?<=^|" + inputPhrase + ")(?!" + inputPhrase + ").*?(?=" + inputPhrase + "|$)");
Matcher m = regex.matcher(inputStr);
while (m.find()) {
    String replacement = m.group(0).replaceAll(".", "+");
    m.appendReplacement(resultString, replacement);
}
m.appendTail(resultString);
System.out.println(resultString.toString());

Output:

+++123+++123123+++

Note that if the inputPhrase can contain characters with a special meaning in a regular expression, you'll have to escape them before concatenating them into the pattern.
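
The escaping mentioned in the note can be done with Pattern.quote. The sketch below wraps the code above in a method (the class name LookaroundPlusOut is made up) and quotes the phrase before building the pattern. As nhahtdh's comment on this answer points out, escaping alone does not make the approach pass every listed case: repeated words such as "xx" in "aaxxxxbb" still come out wrong.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaroundPlusOut {
    static String plusOut(String inputStr, String inputPhrase) {
        // Quote the phrase so characters like '+' are taken literally.
        String p = Pattern.quote(inputPhrase);
        Pattern regex = Pattern.compile(
            "(?<=^|" + p + ")(?!" + p + ").*?(?=" + p + "|$)");
        Matcher m = regex.matcher(inputStr);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            // Turn each matched run of non-phrase characters into '+'.
            m.appendReplacement(result, m.group(0).replaceAll(".", "+"));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abc123efg123123hij", "123")); // +++123+++123123+++
        System.out.println(plusOut("--++ab", "++"));              // ++++++
        // Does not produce the expected ++xxxx++ for this input.
        System.out.println(plusOut("aaxxxxbb", "xx"));
    }
}
```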

CertainPerformance
  • I updated the post with the result of my old regular expression and you're correct that it does fail in two test cases. I will review your code and try to understand it. – fsdff Sep 13 '18 at 05:31
  • 1
    From the test cases, the input string can contain special character, so you need to escape it. With that modification, this fails on the test case `plusOut("aaxxxxbb", "xx") → "++xxxx++"`, since the regex matches from the 4th `x` to the end of the string – nhahtdh Sep 13 '18 at 06:36
  • 1
    if the string is `43134ababab6544` and the word is `abab`, what is the expected behaviour? `+++++abab++++++` or `+++++++abab++++` or `+++++ababab++++` – iPirat Sep 17 '18 at 08:07
2

You can do it in one line:

input = input.replaceAll("((?:" + str + ")+)?(?!" + str + ").((?:" + str + ")+)?", "$1+$2");

This optionally captures "123" on either side of each character and puts them back (a blank if there's no "123").
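
Wrapped in a method, with the word passed through Pattern.quote so that inputs like "++" are taken literally, the one-liner might be exercised as below. The quoting and the class name are additions made for this sketch. By my reading it handles the listed cases tried here, but an input that consists only of the word has no other character to anchor the match on and is not handled.

```java
import java.util.regex.Pattern;

final class OneLinePlusOut {
    static String plusOut(String input, String str) {
        String w = Pattern.quote(str);  // added here; not in the one-liner above
        return input.replaceAll(
            "((?:" + w + ")+)?(?!" + w + ").((?:" + w + ")+)?", "$1+$2");
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abc123efg", "123"));  // +++123+++
        System.out.println(plusOut("aaxxxxbb", "xx"));    // ++xxxx++
        System.out.println(plusOut("--++ab", "++"));      // ++++++

        // Edge case outside the listed tests: the whole input is the word.
        System.out.println(plusOut("xy", "xy"));          // x+
    }
}
```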

Bohemian
1

Instead of coming up with a regular expression that matches the absence of a string, we might as well just match the selected phrase and append one + for each skipped character.

StringBuilder sb = new StringBuilder();
Matcher m = Pattern.compile(Pattern.quote(str)).matcher(input);
while (m.find()) {
    // pad with '+' for the characters skipped since the previous match
    for (int i = sb.length(); i < m.start(); i++) sb.append('+');
    sb.append(str);
}
// pad with '+' for whatever follows the last match
int remaining = input.length() - sb.length();
for (int i = 0; i < remaining; i++) {
    sb.append('+');
}
xiaofeng.li
1

Absolutely just for the fun of it, a solution using CharBuffer (unexpectedly, it took a lot more than I initially hoped for):

private static String plusOutCharBuffer(String input, String match) {
    int size = match.length();
    CharBuffer cb = CharBuffer.wrap(input.toCharArray());
    CharBuffer word = CharBuffer.wrap(match);

    int x = 0;
    for (; cb.remaining() > 0;) {
        if (!cb.subSequence(0, size < cb.remaining() ? size : cb.remaining()).equals(word)) {
            cb.put(x, '+');
            cb.clear().position(++x);
        } else {
            cb.clear().position(x = x + size);
        }
    }

    return cb.clear().toString();
}
Eugene
1

To make this work you need a beast of a pattern. Let's say you are operating on the following test case as an example:

plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"

What you need to do is build a series of clauses in your pattern to match a single character at a time:

  • Any character that is NOT "X", "Y" or "Z" -- [^XYZ]
  • Any "X" not followed by "YZ" -- X(?!YZ)
  • Any "Y" not preceded by "X" -- (?<!X)Y
  • Any "Y" not followed by "Z" -- Y(?!Z)
  • Any "Z" not preceded by "XY" -- (?<!XY)Z

An example of this replacement can be found here: https://regex101.com/r/jK5wU3/4

Here is an example of how this might work (most certainly not optimized, but it works):

import java.util.regex.Pattern;

public class Test {

    public static void plusOut(String text, String exclude) {

        StringBuilder pattern = new StringBuilder("");
        for (int i=0; i<exclude.length(); i++) {

            Character target    = exclude.charAt(i);
            String prefix       = (i > 0) ? exclude.substring(0, i) : "";
            String postfix      = (i < exclude.length() - 1) ? exclude.substring(i+1) : "";

            // add the look-behind (?<!X)Y
            if (!prefix.isEmpty()) {
                pattern.append("(?<!").append(Pattern.quote(prefix)).append(")")
                        .append(Pattern.quote(target.toString())).append("|");
            }

            // add the look-ahead X(?!YZ)
            if (!postfix.isEmpty()) {
                pattern.append(Pattern.quote(target.toString()))
                        .append("(?!").append(Pattern.quote(postfix)).append(")|");
            }

        }

        // add in the other character exclusion
        pattern.append("[^" + Pattern.quote(exclude) + "]");

        System.out.println(text.replaceAll(pattern.toString(), "+"));

    }

    public static void main(String  [] args) {

        plusOut("12xy34", "xy");
        plusOut("12xy34", "1");
        plusOut("12xy34xyabcxy", "xy");
        plusOut("abXYabcXYZ", "ab");
        plusOut("abXYabcXYZ", "abc");
        plusOut("abXYabcXYZ", "XY");
        plusOut("abXYxyzXYZ", "XYZ");
        plusOut("--++ab", "++");
        plusOut("aaxxxxbb", "xx");
        plusOut("123123", "3");

    }

}

UPDATE: Even this doesn't quite work because it can't deal with exclusions that are just repeated characters, like "xx". Regular expressions are most definitely not the right tool for this, but I thought it might be possible. After poking around, I'm not so sure a pattern even exists that might make this work.
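
The limitation described in the update can be reproduced with two hand-written patterns. These are written out by hand rather than produced by the generator above; the second one is, by my reading, equivalent to what the generator would build for "xx". The class and constant names are made up.

```java
final class ClausePatternDemo {
    // The clauses listed earlier for the word "XYZ".
    static final String XYZ_PATTERN =
        "[^XYZ]|X(?!YZ)|(?<!X)Y|Y(?!Z)|(?<!XY)Z";

    // The same construction applied to the word "xx".
    static final String XX_PATTERN = "x(?!x)|(?<!x)x|[^x]";

    public static void main(String[] args) {
        // Works: no character repeats inside the word.
        System.out.println("abXYxyzXYZ".replaceAll(XYZ_PATTERN, "+")); // +++++++XYZ

        // Fails: the first x of a run has no x before it and the last x
        // has no x after it, so both get replaced (expected ++xxxx++).
        System.out.println("aaxxxxbb".replaceAll(XX_PATTERN, "+"));    // +++xx+++
    }
}
```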

Daedalus
1

The problem with your solution is that str.replaceAll("[^str]","+") uses a character set: it excludes each individual character listed in the set from replacement, not the sequence as a whole, so it will not solve your problem.

Example: str.replaceAll("[^XYZ]","+") leaves every X, Y, and Z untouched wherever they occur, so you will get "++XY+++XYZ".

What you actually need is to exclude a sequence of characters in str.replaceAll.

You can do this by using a group such as (XYZ) together with a negative lookahead to match a string that does not contain the character sequence: ^((?!XYZ).)*$

Check this solution for more info about this problem, but be aware that it may be complicated to find a regular expression that does this directly.

I have found two simple solutions for this problem :

Solution 1:

You can implement a method to replace all characters with '+' except the instance of given string:

String exWord = "XYZ";
String str = "abXYxyzXYZ";

for(int i = 0; i < str.length(); i++){
    // exclude any instance string of exWord from replacing process in str
    if(str.substring(i, str.length()).indexOf(exWord) + i == i){
        i = i + exWord.length()-1;
    }
    else{
        str = str.substring(0,i) + "+" + str.substring(i+1);//replace each character with '+' symbol
    }
}             

Note: the condition str.substring(i, str.length()).indexOf(exWord) + i == i is true when an instance of exWord starts at index i, in which case that instance is skipped instead of being replaced.

Output:

+++++++XYZ

Solution 2:

You can try this approach using the replaceAll method; it doesn't need any complex regular expression:

String exWord = "XYZ";
String str = "abXYxyzXYZ";

str = str.replaceAll(exWord,"*"); // replace instance string with * symbol
str = str.replaceAll("[^*]","+"); // replace all characters with + symbol except * 
str = str.replaceAll("\\*",exWord); // replace * symbol with instance string

Note: This solution will work only if your input string str doesn't contain any * symbol.

You should also escape any characters in exWord that have a special meaning in a regular expression, for example when exWord = "++".

Oghli