Experiments | Lastly, for the WSJ40 runs, we used a simple right-branching binarization in which each active state is annotated with its previous sibling and first child. |
The Model | Binarization of rules (Earley, 1970) is necessary to obtain cubic parsing time, and closure of unary chains is required for finding total probability mass (rather than just best parses) (Stolcke, 1995). |
The Model | This was done by collapsing all allowed unary chains into single unary rules and disallowing multiple unary rule applications over the same span.¹ We give the details of our binarization scheme in Section 5. |
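As a rough illustration of the unary-chain collapsing described above, the following is a minimal sketch and not the authors' implementation. It assumes the allowed chains are given explicitly as tuples of symbols with a chain probability each. The function name `collapse_unary_chains`, the input layout, and the toy probabilities are all hypothetical. Chains sharing the same top and bottom symbol are merged into one unary rule whose probability is the sum over those chains, which is what computing total probability mass would call for (a best-parse variant would take the maximum instead). The most probable chain is kept so the intermediate symbols can be restored after parsing.

```python
from collections import defaultdict


def collapse_unary_chains(chains):
    """Collapse unary chains into single unary rules.

    `chains` maps a tuple of symbols (top, ..., bottom) to the probability
    of that whole chain (hypothetical input format). Returns:
      rules: {(top, bottom): summed probability of all chains between them}
      best:  {(top, bottom): the single most probable chain}, kept so the
             intermediate symbols can be re-expanded after parsing.
    """
    rules = defaultdict(float)
    best = {}
    for chain, prob in chains.items():
        if len(chain) < 2:
            raise ValueError(f"a unary chain needs at least two symbols: {chain!r}")
        key = (chain[0], chain[-1])
        rules[key] += prob  # sum: total mass, not just the best chain
        if key not in best or prob > chains[best[key]]:
            best[key] = chain
    return dict(rules), best


# Toy input (made-up numbers): two different chains lead from S to VB.
toy_chains = {
    ("S", "VP"): 0.25,
    ("S", "VP", "VB"): 0.125,
    ("S", "X", "VB"): 0.0625,
    ("NP", "NN"): 0.5,
}
rules, best = collapse_unary_chains(toy_chains)
```

Because every collapsed rule already stands for a complete chain, a parser using `rules` can restrict itself to at most one unary rule application per span without losing any of the allowed chains.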