One very useful enhancement to simplenlg would be to add a statistical language
model (which is present in some other realisers, such as OpenCCG). Many of
the problems reported with simplenlg reflect the difficulty of making some
syntactic choices without semantic and pragmatic knowledge. Examples include:
Adjective ordering: e.g., "happy old lady" vs "old happy lady"
Use of bare infinitive: e.g., "I see John eat an apple" vs "I see John thinks he
is smart"
Use of mass nouns as count nouns: e.g., "There is a lot of sand on the beach" vs
"Many sands contain iron impurities"
I think a good statistical language model could address many of these issues;
my suggestion would be to encode choices in the lexicon (not grammar), and then
overgenerate and select using an n-gram model.
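
As a rough illustration of the overgenerate-and-select idea, here is a minimal Python sketch. It is not simplenlg code and does not use any simplenlg API: the function and class names (overgenerate, BigramModel, select_best), the toy corpus, and the choice of a bigram model with add-one smoothing are all assumptions made for the example. A real implementation would read the alternatives from the lexicon and use a language model trained on a large corpus.

```python
import math
from collections import Counter
from itertools import permutations


def tokenize(sentence):
    """Lowercase, split on whitespace, and add sentence boundary markers."""
    return ["<s>"] + sentence.lower().split() + ["</s>"]


class BigramModel:
    """Toy bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in corpus:
            tokens = tokenize(sentence)
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def log_prob(self, sentence):
        """Sum of smoothed log P(word | previous word) over the sentence."""
        tokens = tokenize(sentence)
        total = 0.0
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[(prev, word)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def overgenerate(determiner, adjectives, noun, rest):
    """Yield one candidate sentence per possible adjective ordering."""
    for order in permutations(adjectives):
        yield " ".join([determiner, *order, noun, rest])


def select_best(candidates, model):
    """Return the candidate the language model scores highest."""
    return max(candidates, key=model.log_prob)


# Hypothetical stand-in for a real training corpus.
TOY_CORPUS = [
    "the happy old lady smiled",
    "a happy old man waved",
    "the old lady waved",
]

model = BigramModel(TOY_CORPUS)
candidates = list(overgenerate("the", ["old", "happy"], "lady", "smiled"))
best = select_best(candidates, model)
print(best)
```

Both candidates here have the same length, so raw log probabilities are comparable. When candidates differ in length, the scores would need some form of length normalisation.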
I doubt I will have time to do this myself, but I would be happy to discuss my
ideas with anyone who is interested in doing this. Please email me at
e.reiter@abdn.ac.uk
Original issue reported on code.google.com by
ehud.rei...@gmail.com on 24 Mar 2011 at 7:54