Here's is one way to get all matches.
Code
def all_matches_with_spacers(word, str)
word_size = word.size
word_arr = word.chars
str_arr = str.chars
(0..(str.size - word_size)/(word_size-1)).each_with_object([]) do |n, arr|
regex = Regexp.new(word_arr.join(".{#{n}}"))
str_arr.each_cons(word_size + n * (word_size - 1))
.map(&:join)
.each { |substring| arr << substring if substring =~ regex }
end
end
This requires word.size > 1
.
Example
all_matches_with_spacers('bar', 'bar') #=> ["bar"]
all_matches_with_spacers('bar', 'beayr') #=> ["beayr"]
all_matches_with_spacers('bar', 'qbowarprr') #=> ["bowarpr"]
all_matches_with_spacers('bar', 'wbxxxxxayyyyyrzzz') #=> ["bxxxxxayyyyyr"]
all_matches_with_spacers('bobo', 'bobobocbcbocbcobcodbddoddbddobddoddbddob')
#=> ["bobo", "bobo", "bddoddbddo", "bddoddbddo"]
Explanation
Suppose
word = 'bobo'
str = 'bobobocbcbocbcobcodbddoddbddobddoddbddob'
then
word_size = word.size #=> 4
word_arr = word.chars #=> ["b", "o", "b", "o"]
str_arr = str.chars
#=> ["b", "o", "b", "o", "b", "o", "c", "b", "c", "b", "o", "c", "b", "c",
# "o", "b", "c", "o", "d", "b", "d", "d", "o", "d", "d", "b", "d", "d",
# "o", "b", "d", "d", "o", "d", "d", "b", "d", "d", "o", "b"]
If n
is the number of spacers between each letter of word
, we require
word.size + n * (word.size - 1) <= str.size
Hence (since str.size => 40
),
n <= (str.size - word_size)/(word_size-1) #=> (40-4)/(4-1) => 12
We therefore will iterate over zero to 12 spacers:
(0..12).each_with_object([]) do |n, arr| .. end
Enumerable#each_with_object creates an initially-empty array denoted by the block variable arr
. The first value passed to block is zero (spacers), assigned to the block variable n
.
We then have
regex = Regexp.new(word_arr.join(".{#{0}}")) #=> /b.{0}o.{0}b.{0}o/
which is the same as /bar/
. word
with n
spacers has length
word_size + n * (word_size - 1) #=> 19
To extract all sub-arrays of str_arr
with this length, we invoke:
str_arr.each_cons(word_size + n * (word_size - 1))
Here, with n = 0
, this is:
enum = str_arr.each_cons(4)
#=> #<Enumerator: ["b", "o", "b", "o", "b", "o",...,"b"]:each_cons(4)>
This enumerator will pass the following into its block:
enum.to_a
#=> [["b", "o", "b", "o"], ["o", "b", "o", "b"], ["b", "o", "b", "o"],
# ["o", "b", "o", "c"], ["b", "o", "c", "b"], ["o", "c", "b", "c"],
# ["c", "b", "c", "b"], ["b", "c", "b", "o"], ["c", "b", "o", "c"],
# ["b", "o", "c", "b"], ["o", "c", "b", "c"], ["c", "b", "c", "o"],
# ["b", "c", "o", "b"], ["c", "o", "b", "c"], ["o", "b", "c", "o"]]
We next convert these to strings:
ar = enum.map(&:join)
#=> ["bobo", "obob", "bobo", "oboc", "bocb", "ocbc", "cbcb", "bcbo",
# "cboc", "bocb", "ocbc", "cbco", "bcob", "cobc", "obco"]
and add each (assigned to the block variable substring
) to the array arr
for which:
substring =~ regex
ar.each { |substring| arr << substring if substring =~ regex }
arr => ["bobo", "bobo"]
Next we increment the number of spacers to n = 1
. This has the following effect:
regex = Regexp.new(word_arr.join(".{#{1}}")) #=> /b.{1}o.{1}b.{1}o/
str_arr.each_cons(4 + 1 * (4 - 1)) #=> str_arr.each_cons(7)
so we now examine the strings
ar = str_arr.each_cons(7).map(&:join)
#=> ["boboboc", "obobocb", "bobocbc", "obocbcb", "bocbcbo", "ocbcboc",
# "cbcbocb", "bcbocbc", "cbocbco", "bocbcob", "ocbcobc", "cbcobco",
# "bcobcod", "cobcodb", "obcodbd", "bcodbdd", "codbddo", "odbddod",
# "dbddodd", "bddoddb", "ddoddbd", "doddbdd", "oddbddo", "ddbddob",
# "dbddobd", "bddobdd", "ddobddo", "dobddod", "obddodd", "bddoddb",
# "ddoddbd", "doddbdd", "oddbddo", "ddbddob"]
ar.each { |substring| arr << substring if substring =~ regex }
There are no matches with one spacer, so arr
remains unchanged:
arr #=> ["bobo", "bobo"]
For n = 2
spacers:
regex = Regexp.new(word_arr.join(".{#{2}}")) #=> /b.{2}o.{2}b.{2}o/
str_arr.each_cons(4 + 2 * (4 - 1)) #=> str_arr.each_cons(10)
ar = str_arr.each_cons(10).map(&:join)
#=> ["bobobocbcb", "obobocbcbo", "bobocbcboc", "obocbcbocb", "bocbcbocbc",
# "ocbcbocbco", "cbcbocbcob", "bcbocbcobc", "cbocbcobco", "bocbcobcod",
# ...
# "ddoddbddob"]
ar.each { |substring| arr << substring if substring =~ regex }
arr #=> ["bobo", "bobo", "bddoddbddo", "bddoddbddo"]
No matches are found for more than two spacers, so the method returns
["bobo", "bobo", "bddoddbddo", "bddoddbddo"]