
Given an arbitrary text, I'd like to locate a specific piece of text inside it. I have been working on some code, but I'm struggling to find a way to solve this.

One important point: the text to search for is taken from a first text with a certain amount of whitespace, but it is then compared against a second text in which the amount of whitespace differs. That's why the number of spaces must not matter.

Example of the first text:

Here, bla bla bla bla

() => console.log()

end

Example of the text to be selected in the second text, taken from the first text:

() => console.log()

Example of the second text, in which I want to select that value:

Here, bla     bla     bla bla

() => console.    log()

en d

To recap the criteria:

  • The number of spaces between words can vary, and there can be many more line breaks
  • I need to match a sequence of characters inside another text that is formatted differently, no matter how many spaces or line breaks sit between them

My code so far:

let firstCharCode = mainText.replace(/ /g,'').indexOf(textToBeSelected.replace(/ /g, ''))
let lastCharCode = firstCharCode + textToBeSelected.replace(/ /g, '').length - 1
let numberOfCharsToSelect = lastCharCode - firstCharCode

for (let i = 0; i < mainText.length; i++) {
  // iterate over the characters of mainText to find where the match starts and ends?
}
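One possible way to finish that loop, as a rough sketch rather than a definitive solution: while stripping whitespace from the main text, remember the original index of every character that is kept, then translate the indexOf result back to positions in the unstripped text. The function name findIgnoringWhitespace and the returned { start, end } shape are assumptions made for illustration.

```javascript
// Hypothetical helper: locate `needle` inside `haystack`, ignoring all whitespace.
// Returns { start, end } (end is exclusive) in the ORIGINAL haystack, or null.
function findIgnoringWhitespace(haystack, needle) {
  const positions = []; // positions[k] = original index of the k-th non-whitespace char
  let stripped = '';
  for (let i = 0; i < haystack.length; i++) {
    if (!/\s/.test(haystack[i])) {
      stripped += haystack[i];
      positions.push(i);
    }
  }

  const strippedNeedle = needle.replace(/\s/g, '');
  if (strippedNeedle.length === 0) return null; // nothing to search for

  const at = stripped.indexOf(strippedNeedle);
  if (at === -1) return null; // no match

  return {
    start: positions[at],
    end: positions[at + strippedNeedle.length - 1] + 1,
  };
}

const mainText = 'Here, bla     bla     bla bla\n\n() => console.    log()\n\nen d';
const textToBeSelected = '() => console.log()';
const range = findIgnoringWhitespace(mainText, textToBeSelected);
if (range) {
  // Logs the matched slice exactly as it appears in mainText.
  console.log(JSON.stringify(mainText.slice(range.start, range.end)));
}
```

The slice covers everything between the first and last matched character, including any whitespace inside the match.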
Felipe Augusto
  • What's the criteria for your search exactly? – Zenoo Jul 24 '18 at 12:04
  • @Zenoo I need to find where a group of letters is, no matter how many spaces and line breaks there are between them – Felipe Augusto Jul 24 '18 at 12:06
  • @FelipeAugusto If I understand you correctly then let's say we have a string "abc def ghi jkl drs sda" and you want to find where "def" lies. Am I correct? Is it a hardcoded string variable or are you taking it via user input? – R.D Jul 24 '18 at 12:08
  • Provide a [mcve] – Ason Jul 24 '18 at 12:09
  • You could create a Regex by looping on every char of your search, and add a `\s*?` after each char. Be careful about specific character escaping, though. [Here's a quick example for you](https://regex101.com/r/srVWEg/1) – Zenoo Jul 24 '18 at 12:10
  • @R.D it's a value taken from the first text but searched for in the second one; I've updated my example – Felipe Augusto Jul 24 '18 at 12:15
  • @LGSon thank you! I noticed it should be improved and updated my code – Felipe Augusto Jul 24 '18 at 12:16

2 Answers

This function should solve your issue.

searchWithoutBlanks(text, search) returns true if search is found inside text, ignoring any whitespace in either of them.

Here's the rundown:

- Remove all whitespace from the search

- Loop through the characters of your search, escape each one, and append \s*? after it

- Test your initial text against this freshly generated RegExp.

const text = `
Here, bla bla bla bla

() => console.log()

end
`;

//Utility function to escape a String for RegExp use
const escapeRegExp = str => str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");

const searchWithoutBlanks = (string, search) => {
  let sanitizedSearch = search.replace(/\s+/g, ''); //Removes any whitespace
  let regexString = '';
  for(let i = 0; i < sanitizedSearch.length; i++){  //Loop on your search
    regexString += escapeRegExp(sanitizedSearch[i]) + '\\s*?';  //Add \s*? after each sanitized char
  }
  console.log(regexString); //Here is the resulting RegExp
  return new RegExp(regexString).test(string);
}

console.log(searchWithoutBlanks(text,'()=>console.log()'));
console.log(searchWithoutBlanks(text,'this shouldn\'t match'));
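If you need to know where the match sits rather than just whether it exists, the same generated pattern can be run with exec, whose result carries an index property. The following is only a sketch under that assumption; locateWithoutBlanks, escapeForRegExp and the returned object shape are hypothetical names, not part of the function above.

```javascript
// Escapes the characters that have a special meaning in a RegExp.
const escapeForRegExp = str => str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical variant: returns the location of the match instead of a boolean.
const locateWithoutBlanks = (string, search) => {
  const chars = search.replace(/\s+/g, '').split(''); // drop whitespace, one entry per char
  if (chars.length === 0) return null;                 // nothing to look for

  // Allow any amount of whitespace between two consecutive characters.
  const pattern = chars.map(escapeForRegExp).join('\\s*');
  const match = new RegExp(pattern).exec(string);
  if (!match) return null;

  return {
    matched: match[0],                  // the text as it appears in `string`
    start: match.index,                 // index of the first matched character
    end: match.index + match[0].length, // index just past the last matched character
  };
};

const spacedText = 'Here, bla     bla     bla bla\n\n() => console.    log()\n\nen d';
console.log(locateWithoutBlanks(spacedText, '() => console.log()'));
```

Joining with \s* instead of appending it after every character avoids swallowing trailing whitespace after the last character, so start and end delimit exactly the selected text.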
Zenoo

Here is an example, though I'm not sure it performs very well.

It builds a RegExp from the text to look for: first it removes the whitespace, then it escapes RegExp special characters (watch out: the escape list below has to be completed with every RegExp special character), then it inserts between each pair of characters a pattern that accepts any number of spaces or line breaks.

Then, it's easy to fetch the index of the match with indexOf.

const text = `Here, bla     bla     bla bla

() => console.    log()

en d`;
const target = '() => console.log()';

const escapeRegexp = c => c.replace(/[)(.]/g, c => `\\${c}`); // Protect every regexp char here.

const regexp = new RegExp(target.replace(/\s/g, '').split('').map(escapeRegexp).join('\\s*') ,'g');
const results = regexp.exec(text); // null when nothing matches

if (results) {
  const match = results[0]; // the full matched text
  console.log(`Found match '${match}' starting at index ${text.indexOf(match)} and ending at index ${text.indexOf(match) + match.length}`);
}

Note: to properly escape the string for RegExp syntax, see "Escape string for use in Javascript regex".
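As a hedged sketch of how this could be extended to list every occurrence, the pattern can be built with a complete escape and run through String.prototype.matchAll, which needs the g flag and gives each match its own index. The names escapeAll and findAllIgnoringSpaces are made up for this illustration, and the character class is the commonly used one for escaping RegExp special characters.

```javascript
// Escape every character that is special in a RegExp.
const escapeAll = s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: every occurrence of `target` in `text`, ignoring whitespace.
const findAllIgnoringSpaces = (text, target) => {
  const chars = target.replace(/\s/g, '').split('');
  if (chars.length === 0) return []; // an empty pattern would match everywhere

  const regexp = new RegExp(chars.map(escapeAll).join('\\s*'), 'g');
  return Array.from(text.matchAll(regexp), m => ({
    match: m[0],
    start: m.index,
    end: m.index + m[0].length, // exclusive
  }));
};

const sample = 'a . b then a.b and a\n.\nb';
for (const { match, start, end } of findAllIgnoringSpaces(sample, 'a.b')) {
  console.log(`Found ${JSON.stringify(match)} from index ${start} to ${end}`);
}
```

Reading the position from m.index avoids a second indexOf scan, which would always report the first occurrence when the same text appears more than once.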

sjahan