Sentence splitting and DP structure

I just pushed sentence-splitting code to the repository. parse_entries.php now splits text into sentences before feeding it to the parser, which makes a lot more sense, since the parser did not handle multi-sentence paragraphs well. We’re using Adwait Ratnaparkhi’s MXTERMINATOR sentence splitter.
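To make the split-then-parse flow concrete, here is a rough Python sketch. It is not the code in parse_entries.php, and it does not use MXTERMINATOR: the regular-expression rule is a naive stand-in for a trained splitter, and the names split_sentences, parse_paragraph and parse_sentence are all made up for illustration.

```python
import re

# Naive stand-in for a trained splitter (an assumption, not MXTERMINATOR):
# a boundary is '.', '!' or '?' followed by whitespace and then an
# uppercase letter or an opening quote.
_BOUNDARY = re.compile(r'(?<=[.!?])\s+(?=["\'A-Z])')


def split_sentences(paragraph):
    """Split a paragraph into sentences using the punctuation rule above."""
    parts = _BOUNDARY.split(paragraph.strip())
    return [p for p in parts if p]


def parse_paragraph(paragraph, parse_sentence):
    """Run the parser once per sentence instead of once per paragraph.

    parse_sentence is a hypothetical callable that wraps the real parser.
    """
    return [parse_sentence(s) for s in split_sentences(paragraph)]
```

A rule this simple will wrongly split after abbreviations such as "Dr. Smith", which is one reason to prefer a trained splitter for real text.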

We’ve also retrained the parser and reparsed the whole database, using David Vadas’ NP structure additions to the Penn Treebank. Together, the two changes have increased the constituency percentage by about 6%, which is slightly less than I expected.
