Where can I get the CFG used in Stanford Parser?

Question

I am trying to identify the key phrase(s) in a question along with the type of answer expected. I am using Stanford Parser to generate the parse tree of the question. I need to traverse this parse tree and decide at each node, based on some heuristics, whether it is a key phrase. If I had access to the complete CFG used in Stanford Parser, I could extend the heuristics to cover every child that might appear under a node in the tree.

The Stanford Parser: A statistical parser

This is a near-duplicate to my own question a few weeks ago - http://stackoverflow.com/questions/27023506/exporting-pcfg-from-stanford-lexicalized-parser. Unfortunately, we haven't seen an answer to that one either... — AaronD, Feb 06 '15 at 19:26

score 0 · Answer 1 · edited May 23 '17 at 11:49

Every trained LexicalizedParser instance has fields bg and ug, which are the learned BinaryGrammar and UnaryGrammar instances. Each of these classes has methods that let you look up binary or unary rewrite rules by parent or by child (or by sibling, in the binary case). Every rewrite rule (see the Rule interface) has an associated log-probability under the field score. You can use LexicalizedParser#stateIndex to get the int IDs needed for any given tree constituent.

That being said, it's not clear that seeing all possible productions would be a good thing for your purposes. (The grammar is pretty ugly!) You might do better examining the k-best parses for a given sentence.
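As a rough illustration of the k-best idea, here is a minimal sketch that collects the phrase-level productions actually observed in a set of parses. It assumes each parse is already available as a bracketed Penn Treebank-style string in which every node carries a label, such as (ROOT (S ...)). The class name ProductionExtractor and its methods are hypothetical helpers written for this example, not part of the Stanford Parser API, and it uses only the Java standard library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts the CFG productions seen in bracketed parse trees. */
class ProductionExtractor {
    private final List<String> tokens;
    private final Map<String, Integer> counts;
    private int pos = 0; // index of the next unread token

    private ProductionExtractor(List<String> tokens, Map<String, Integer> counts) {
        this.tokens = tokens;
        this.counts = counts;
    }

    /** Splits a bracketed tree into "(", ")" and symbol/word tokens. */
    static List<String> tokenize(String tree) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (char c : tree.toCharArray()) {
            boolean space = Character.isWhitespace(c);
            if (c == '(' || c == ')' || space) {
                if (cur.length() > 0) {
                    out.add(cur.toString());
                    cur.setLength(0);
                }
                if (!space) {
                    out.add(String.valueOf(c));
                }
            } else {
                cur.append(c);
            }
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out;
    }

    /** Reads one node (or one leaf word) and returns its label (or the word). */
    private String parseNode() {
        String tok = tokens.get(pos++);
        if (!tok.equals("(")) {
            return tok; // a leaf word
        }
        String label = tokens.get(pos++);
        List<String> children = new ArrayList<>();
        boolean hasWordChild = false;
        while (!tokens.get(pos).equals(")")) {
            boolean isWord = !tokens.get(pos).equals("(");
            String child = parseNode();
            if (isWord) {
                hasWordChild = true;
            } else {
                children.add(child);
            }
        }
        pos++; // consume ")"
        // Skip tag -> word rules so that only phrase-level productions remain.
        if (!hasWordChild && !children.isEmpty()) {
            counts.merge(label + " -> " + String.join(" ", children), 1, Integer::sum);
        }
        return label;
    }

    /** Returns each observed production with its count, sorted by rule text. */
    static Map<String, Integer> extract(List<String> trees) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String tree : trees) {
            new ProductionExtractor(tokenize(tree), counts).parseNode();
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> parses = Arrays.asList(
            "(ROOT (SBARQ (WHNP (WP Who)) (SQ (VBD wrote) (NP (DT the) (NN book))) (. ?)))");
        extract(parses).forEach((rule, n) -> System.out.println(n + "\t" + rule));
    }
}
```

Feeding it the k-best parses for a representative set of questions gives the children that really occur under each node type, which may be a more manageable basis for heuristics than the full grammar. The sketch assumes well-formed input and will throw on unbalanced brackets.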
