Algorithm to remove a character from a word such that the reduced word is still a word in dictionary

Question

Here is the scenario: given a word, remove a single character at every step such that the reduced word is still a word in the dictionary. Continue until no characters are left.

Here is the catch: you need to remove the right character. For example, a word may contain two characters that could each be removed to leave a valid word, but at a later stage one choice may reduce all the way to the end (i.e. no characters left) while the other gets stuck.

Example:

planet
plant
pant
pan
an
a

OR

planet
plane
lane
No further reduction is possible here, assuming lan is not a word. I hope you get the idea.

Please see my code. I'm using recursion, but I would like to know if there are more efficient solutions.

public class isMashable
{

  static void initiate(String s)
  {
    mash("", s);
  }

  static void mash(String prefix, String s)
  {
    int N = s.length();
    String subs = "";

    if (!((s.trim()).equals("")))
      System.out.println(s);

    for (int i = 0 ; i < N ; i++)
    {
      subs = s.substring(0, i) + s.substring(i+1, N);
      if (subs.equals("abc")||subs.equals("bc")||subs.equals("c")||subs.equals("a")) // check in dictionary here
        mash("" + s.charAt(i), subs);
    }
  }

  public static void main(String[] args)
  {
    String s = "abc";
    initiate(s);
  }
}

You should have a dictionary (like a `Map`) or some other way to check that the current word is still a valid word. Also, you should try to send the word's letter combinations instead of just sending substrings of your whole word. For example, if you send `planet`, your algorithm won't be able to test the `pet` combination. — Luiggi Mendoza, Jun 28 '12 at 05:46
Javascript example (warning: jsfiddle is a bit slow): http://jsfiddle.net/BA8PJ/ — biziclop, Jun 28 '12 at 06:20
you might want to use a directed graph containing each word as node. you create an edge from node A to node B iff it's possible to pass from A to B removing only one letter. To simplify graph creation, you first test word's length before trying to eliminate a letter. — Atmocreations, Jun 28 '12 at 06:38
@LuiggiMendoza: Yup, thanks for finding the bug. I have fixed it :) — nmd, Jun 28 '12 at 17:40
@NitishMD I would like to see your final solution to the problem, I'm really interested but don't have enough time to make one myself :(. — Luiggi Mendoza, Jun 28 '12 at 17:55
@LuiggiMendoza: Sure, I'll post it. I'm fixing a few things in it :) — nmd, Jun 29 '12 at 06:21
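
The directed-graph idea from the comments could be sketched roughly as follows. This is only an illustration: `WordGraph` and `build` are hypothetical names, and instead of comparing word lengths pairwise it generates every one-letter deletion of a word and looks it up in the dictionary set, which yields the same edges.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class WordGraph {

    // Maps every dictionary word to the dictionary words reachable
    // from it by deleting exactly one letter (edge A -> B).
    static Map<String, Set<String>> build(Set<String> dictionary) {
        Map<String, Set<String>> edges = new HashMap<>();
        for (String word : dictionary) {
            // A set drops duplicates, e.g. deleting either 'o' of "too" gives "to".
            Set<String> targets = new TreeSet<>();
            for (int i = 0; i < word.length(); i++) {
                String shorter = word.substring(0, i) + word.substring(i + 1);
                if (dictionary.contains(shorter)) {
                    targets.add(shorter);
                }
            }
            edges.put(word, targets);
        }
        return edges;
    }
}
```

A word can then be reduced completely if some path in this graph leads from it to a one-letter word. Building the graph once may pay off when many words are queried against the same dictionary.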

score 2 · Answer 1 · answered Jun 28 '12 at 06:33


Run a BFS. If more than one character can be removed, remove each of them individually and put the resulting words in a priority queue. If you want to retrace the path, keep a pointer to the parent (the word from which this word was created by removing a character) in the node itself. When you have removed all the characters, terminate and retrace the path; if there is no valid path, you will end up with an empty priority queue.
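
A minimal sketch of this BFS, under a few assumptions: the dictionary is a `Set<String>`, the search stops at a one-letter dictionary word, and a plain FIFO queue stands in for the priority queue mentioned above (every word on a BFS level has the same length, so the ordering should not matter). `BfsMash` and `findChain` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class BfsMash {

    // Returns one chain from start down to a one-letter word, or null if none exists.
    static List<String> findChain(String start, Set<String> dictionary) {
        if (!dictionary.contains(start)) return null;

        // Parent pointers: each word maps to the word it was derived from.
        Map<String, String> parent = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);

        while (!queue.isEmpty()) {
            String word = queue.remove();
            if (word.length() == 1) {
                // Retrace the parent pointers back to the starting word.
                List<String> chain = new ArrayList<>();
                for (String w = word; w != null; w = parent.get(w)) {
                    chain.add(w);
                }
                Collections.reverse(chain);
                return chain;
            }
            for (int i = 0; i < word.length(); i++) {
                String next = word.substring(0, i) + word.substring(i + 1);
                // Enqueue each valid word only once.
                if (dictionary.contains(next) && !parent.containsKey(next)) {
                    parent.put(next, word);
                    queue.add(next);
                }
            }
        }
        return null; // queue exhausted: no chain reaches a one-letter word
    }
}
```

Because each word is enqueued at most once, a word that is reachable by several removal orders is expanded a single time.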

Pankaj Jindal


When only one path is needed, [DFS](http://en.wikipedia.org/wiki/Depth-first_search) is quicker on average (the only difference between DFS and [BFS](http://en.wikipedia.org/wiki/Breadth-first_search) is that they use a stack and a queue, respectively; the rest is identical). However, if the OP wants to check for all possible paths (not just return a single path), or if no path exists, then they are equivalent. – acattle Jun 28 '12 at 06:51

score 1 · Answer 2 · answered Jun 28 '12 at 07:54

I have used Porter Stemming in a couple of projects - that will of course only help you trim off the end of the word.

The Porter stemming algorithm (or ‘Porter stemmer’) is a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.

A reprint occurred in M.F. Porter, 1980, An algorithm for suffix stripping, Program, 14(3) pp 130−137.

Martin even has a Java version available on his site.

Answer 3 · answered Jun 28 '12 at 10:00

Here you go. The mash method will find a solution (a list of dictionary words) for any given String, using a dictionary passed to the constructor. If there's no solution (a chain ending at a one-letter word), the method will return null. If you are interested in all partial solutions (chains ending before reaching a one-letter word), you should tweak the algorithm a bit.

The dictionary is assumed to be a set of uppercase Strings. You could of course use your own class/interface instead.

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class WordMash {

    private final Set<String> dictionary;

    public WordMash(Set<String> dictionary) {
        if (dictionary == null) throw new IllegalArgumentException("dictionary == null");
        this.dictionary = dictionary;
    }

    public List<String> mash(String word) {
        return recursiveMash(new ArrayList<String>(), word.toUpperCase());
    }

    private List<String> recursiveMash(ArrayList<String> wordStack, String proposedWord) {
        if (!dictionary.contains(proposedWord)) {
            return null;
        }
        wordStack.add(proposedWord);

        if (proposedWord.length() == 1) {
            return wordStack;
        }

        for (int i = 0; i < proposedWord.length(); i++) {
            String nextProposedWord =
                proposedWord.substring(0, i) + proposedWord.substring(i + 1);
            List<String> finalStack = recursiveMash(wordStack, nextProposedWord);
            if (finalStack != null) return finalStack;
        }

        // Dead end: backtrack so this word does not leak into the final result.
        wordStack.remove(wordStack.size() - 1);
        return null;
    }

}

Example:

Set<String> dictionary = new HashSet<String>(Arrays.asList(
        "A", "AFRICA", "AN", "LANE", "PAN", "PANT", "PLANET", "PLANT"
));
WordMash mash = new WordMash(dictionary);

System.out.println(mash.mash("planet"));
System.out.println(mash.mash("pant"));


System.out.println(mash.mash("foo"));
System.out.println(mash.mash("lane"));
System.out.println(mash.mash("africa"));
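
The tweak mentioned in this answer (collecting partial solutions as well) could look roughly like the sketch below. It is an assumption about what such a tweak might be, not the answer's own code: `AllChains` and `collect` are hypothetical names, words are used as given (no upper-casing), and every maximal chain is recorded, whether it ends at a one-letter word or gets stuck earlier.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AllChains {

    // Returns every maximal chain starting at word; empty if word is not in the dictionary.
    static List<List<String>> collect(String word, Set<String> dictionary) {
        List<List<String>> chains = new ArrayList<>();
        if (dictionary.contains(word)) {
            walk(word, dictionary, new ArrayDeque<>(), chains);
        }
        return chains;
    }

    private static void walk(String word, Set<String> dictionary,
                             Deque<String> path, List<List<String>> chains) {
        path.addLast(word);
        Set<String> tried = new HashSet<>(); // skip repeated deletions such as "too" -> "to"
        boolean extended = false;
        for (int i = 0; i < word.length(); i++) {
            String next = word.substring(0, i) + word.substring(i + 1);
            if (dictionary.contains(next) && tried.add(next)) {
                extended = true;
                walk(next, dictionary, path, chains);
            }
        }
        if (!extended) {
            // Nothing shorter is valid, so the current path is a maximal chain.
            chains.add(new ArrayList<>(path));
        }
        path.removeLast(); // backtrack
    }
}
```

A chain is a complete solution when its last word has one letter; any other chain is a partial solution that got stuck.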

score 1 · Answer 4 · answered Oct 16 '16 at 02:07

Here is an algorithm that uses depth-first search. Given a word, check whether it is valid (in the dictionary). If it is valid, remove one character from the string at each index and recursively check whether the 'chopped' word is valid. If the chopped word is invalid at any point, you are on the wrong path and go back to the previous step.

import java.util.HashSet;
import java.util.Set;

public class RemoveOneCharacter {
    static Set<String> dict = new HashSet<String>();

    public static boolean remove(String word){
        // Check the dictionary first so that an invalid one-letter word is rejected too.
        if(!dict.contains(word))
            return false;

        if(word.length() == 1)
            return true;

        for(int i=0;i<word.length();i++){
            String choppedWord = removeCharAt(word,i);
            boolean result = remove(choppedWord);
            if(result)
                return true;
        }
        return false;
    }

    public static String removeCharAt(String str, int n) {
        String f = str.substring(0, n);
        String b = str.substring(n+1, str.length());
        return f + b;
    }

    public static void main(String args[]){
        dict.add("heat");
        dict.add("eat");
        dict.add("at");
        dict.add("a");

        dict.add("planets");
        dict.add("planet");
        dict.add("plant");
        dict.add("plane");
        dict.add("lane");
        dict.add("plants");
        dict.add("pant");
        dict.add("pants");
        dict.add("ant");
        dict.add("ants");
        dict.add("an");


        dict.add("clean");
        dict.add("lean");
        dict.add("clan");
        dict.add("can");

        dict.add("why");

        String input = "heat";
        System.out.println("result(heat) " + remove(input));
        input = "planet";
        System.out.println("result(planet) " + remove(input));
        input = "planets";
        System.out.println("result(planets) " + remove(input));
        input = "clean";
        System.out.println("result(clean) " + remove(input));
        input = "why";
        System.out.println("result(why) " + remove(input));
        input = "name";
        System.out.println("result(name) " + remove(input));


    }

}
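
A plain depth-first search can explore the same shorter word more than once, because different removal orders may lead to it and a failed word is never remembered. A hedged sketch of a memoized variant follows; `MemoizedReducer` and `reducible` are hypothetical names, and the cache is only valid as long as the dictionary does not change.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class MemoizedReducer {

    private final Set<String> dictionary;
    // Caches the answer for every multi-letter dictionary word already explored.
    private final Map<String, Boolean> memo = new HashMap<>();

    MemoizedReducer(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    // True if word can be reduced one letter at a time down to a one-letter word.
    boolean reducible(String word) {
        if (!dictionary.contains(word)) return false;
        if (word.length() == 1) return true;

        Boolean cached = memo.get(word);
        if (cached != null) return cached;

        boolean result = false;
        for (int i = 0; i < word.length() && !result; i++) {
            result = reducible(word.substring(0, i) + word.substring(i + 1));
        }
        memo.put(word, result);
        return result;
    }
}
```

With the cache, each distinct dictionary word is expanded at most once per reducer instance, which should matter most for long words with many dead ends.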

score 0 · Answer 5 · answered Jun 28 '12 at 07:40

OK, this is JavaScript rather than Java, but you can probably port it:

http://jsfiddle.net/BA8PJ/

function subWord( w, p, wrapL, wrapR ){
  return w.substr(0,p)
      + ( wrapL ? (wrapL + w.substr(p,1) + wrapR ):'')
      + w.substr(p+1);
}

// wa = word array:         ['apple','banana']
// wo = word object/lookup: {'apple':true,'banana':true}
function initLookup(){
  window.wo = {};
  for(var i=0; i < wa.length; i++) wo[ wa[i] ] = true;
}



function initialRandomWords(){
  // choose some random initial words
  var level0 = [];
  for(var i=0; i < 100; i++){
    var w = wa[ Math.floor(Math.random()*wa.length) ];
    level0.push({ word: w, parentIndex:null, pos:null, leaf:true });
  }
  return level0;
}



function generateLevels( levels ){
  while(true){
    var nl = genNextLevel( levels[ levels.length-1 ]);
    if( ! nl ) break;
    levels.push( nl );
  }
}

function genNextLevel( P ){ // P: prev/parent level
  var N = [];               // N: next level
  var len = 0;
  for( var pi = 0; pi < P.length; pi ++ ){
    var pw = P[ pi ].word; // pw: parent word
    for( var cp = 0; cp < pw.length; cp++ ){ // cp: char pos
      var cw = subWord( pw, cp ); // cw: child word
      if( wo[cw] ){
        len++;
        P[ pi ].leaf = false;
        N.push({ word: cw, parentIndex:pi, pos:cp, leaf:true });
      }
    }
  }
  return len ? N : null;
}



function getWordTraces( levels ){
  var rows = [];
  for( var li = levels.length-1; li >= 0; li-- ){
    var level = levels[ li ];
    for( var i = 0; i < level.length; i++ ){
      if( ! level[ i ].leaf ) continue;
      var trace = traceWord( li, i );
      if( trace.length < 2 ) continue;
      rows.push( trace );
    }
  }
  return rows;
}

function traceWord( li, i ){
  var r = [];
  while(true){
    var o = levels[ li ][ i ];
    r.unshift( o );
    i = o.parentIndex;
    if( i === null ) break; // compare with null: a parentIndex of 0 is valid but falsy
    li--;
    if( li < 0 ) break;
  }
  return r;
}



function compareTraces( aa, bb ){
  var a = aa[0].word, b = bb[0].word;
  if( a == b ){
    if( aa.length < bb.length ) return -1;
    if( aa.length > bb.length ) return +1;
  }

  var len = Math.min( aa.length, bb.length )
  for( var i = 0; i < len; i++ ){
    var a = aa[i].word, b = bb[i].word;
    if( a < b ) return +1;
    if( a > b ) return -1;
  }

  if( aa.length < bb.length ) return -1;
  if( aa.length > bb.length ) return +1;

  return 0;
}


function prettyPrintTraces( rows ){
  var prevFirstWord = null;
  for( var ri = rows.length-1; ri >= 0; ri-- ){
    var row = rows[ ri ];

    if(  prevFirstWord != row[0].word  ){
      if( prevFirstWord ) $('body').append('<div class="sep"/>');
      prevFirstWord = row[0].word;
    }

    var $row = $('<div class="row"/>');
    for( var i = 0; i < row.length; i++ ){

      var w = row[i].word;
      var c = row[i+1];
      if( c )  w = subWord( w, c.pos, '<span class="cut">', '</span>');

      var $word = $('<div class="word"></div>').html( w ).toggleClass('last-word', w.length < 2 );
      $row.append( $word );
    }
    $('body').append( $row );
  }
};

function main(){
  initLookup();

  window.levels = [ initialRandomWords() ];

  generateLevels( levels );

  rows = getWordTraces( levels );

  rows.sort( compareTraces );

  prettyPrintTraces( rows );
}

I did it just for fun. The other solutions are indeed much shorter, so you can ignore this. level[0] contains the original words, level[1] has all words with 1 char removed, level[2]: 2 chars removed, plus they point to their parent word, so we can build all possible valid paths from the shortest words to the original word. I use multiple words initially, because I wasn't sure if the short wordlist I found somewhere will contain any path. — biziclop, Jun 28 '12 at 19:43
BTW this is the latest fiddle, I forgot to update answer: http://jsfiddle.net/BA8PJ/1/ — biziclop, Jun 28 '12 at 19:48

score 0 · Answer 6 · edited Jun 28 '12 at 17:57


Make a trie (or suffix tree) from the characters in the given word (no repetitions allowed), and check each subtree of the trie against the dictionary. This should help you.

For reference visit

edited Jun 28 '12 at 17:57 by Luiggi Mendoza · answered Jun 28 '12 at 08:24 by Imposter

But how does the trie help? We are removing one character at a time. I did not understand this solution. Can you explain? – nmd Jun 28 '12 at 19:37
