Enhancement: Statistical language model #5

Description

@GoogleCodeExporter
One very useful enhancement to SimpleNLG would be to add a statistical language 
model (which is present in some other realisers, such as OpenCCG).  Many of 
the problems reported with SimpleNLG reflect the difficulty of making some 
syntactic choices without semantic and pragmatic knowledge.  Examples include:

Adjective ordering: e.g., "happy old lady" vs "old happy lady"

Use of bare infinitive: e.g., "I see John eat an apple" vs "I see John thinks he 
is smart"

Use of mass nouns as count nouns: e.g., "There is a lot of sand on the beach" vs 
"Many sands contain iron impurities"

I think a good statistical language model could address many of these issues; 
my suggestion would be to encode the choices in the lexicon (not the grammar), 
then overgenerate candidate realisations and select the best-scoring one using 
an n-gram model.
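
The overgenerate-and-select idea can be sketched roughly as follows. This is only an illustration in Python, not SimpleNLG code: the toy corpus, the function names (`train_bigram_counts`, `log_prob`, `select_best`) and the choice of an add-one-smoothed bigram model are all assumptions made for the example. A real implementation would train on a large corpus and would take its candidates from alternatives encoded in the lexicon.

```python
import math
from collections import Counter

# Toy corpus invented for this sketch; a real model needs far more data.
TOY_CORPUS = [
    "the happy old lady smiled",
    "a happy old man waved",
    "the old lady sat down",
    "a happy child laughed",
]


def train_bigram_counts(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability of a sentence under an add-one-smoothed bigram model."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    vocab_size = len(unigrams)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen bigrams from having zero probability.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def select_best(candidates, unigrams, bigrams):
    """Return the overgenerated candidate that the model scores highest."""
    return max(candidates, key=lambda c: log_prob(c, unigrams, bigrams))


unigrams, bigrams = train_bigram_counts(TOY_CORPUS)

# Overgeneration step: both adjective orders are produced as candidates.
candidates = ["the happy old lady smiled", "the old happy lady smiled"]
best = select_best(candidates, unigrams, bigrams)
print(best)
```

On this toy corpus the model prefers "the happy old lady smiled", because the bigrams "happy old" and "old lady" are attested while "old happy" and "happy lady" are not. The same selection step would apply to the bare-infinitive and mass/count alternatives, given candidates for each.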

I doubt I will have time to do this myself, but I would be happy to discuss my 
ideas with anyone who is interested in doing this.  Please email me at 
e.reiter@abdn.ac.uk.


Original issue reported on code.google.com by ehud.rei...@gmail.com on 24 Mar 2011 at 7:54
