Other Announcements / Re: Advanced magic, freeform system
« on: June 16, 2008, 08:20:51 AM »
I think the overlapping tokens problem is what's called lexical ambiguity in computational linguistics. If by 'realistic syntax' you mean 'naturalistic syntax', there's no reason to avoid it. Syntactic ambiguity is more difficult to resolve.
Native speakers resolve ambiguity so quickly they don't even notice it's there. Natural language parsers usually resolve structural ambiguity by ranking the competing parse results in descending order of frequency and picking the topmost result; i.e. if several syntactic patterns fit the sentence, they pick the pattern that native speakers use most often.
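To make the frequency idea concrete, here's a rough sketch in Python. The pattern names and counts are made up for illustration; a real parser would take its frequencies from a corpus, and this isn't meant as how any particular parser does it.

```python
from collections import Counter

# Hypothetical counts of how often each syntactic pattern occurs.
# These numbers are invented for the example, not taken from a corpus.
PATTERN_COUNTS = Counter({
    "verb-attaches-PP": 70,
    "noun-attaches-PP": 30,
})

def pick_parse(candidates):
    """Return the candidate pattern that occurs most often."""
    # Sort from most to least frequent; unseen patterns count as 0.
    ranked = sorted(candidates, key=lambda p: PATTERN_COUNTS[p], reverse=True)
    return ranked[0]

# Both patterns fit the sentence, so the more frequent one wins.
best = pick_parse(["noun-attaches-PP", "verb-attaches-PP"])
```

With these invented counts, `best` ends up as "verb-attaches-PP".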
Since you have the option of defining the language itself, you can avoid the problem altogether by restricting the grammar so that a word's functional category (e.g. subject/main verb/object) follows from its position in the sentence. If you avoid syntactic ambiguity, lexical ambiguity is no longer an issue, because the position tells you which reading of an overlapping token is meant.
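A minimal sketch of such a position-based grammar, assuming a fixed subject/verb/object order. The lexicon and the three-slot rule are my own assumptions for the example, not anything from your system.

```python
# Hypothetical lexicon: each word lists the categories it may fill.
# "fire" is lexically ambiguous (noun or verb).
LEXICON = {
    "fire": {"noun", "verb"},
    "mage": {"noun"},
    "wall": {"noun"},
    "burn": {"verb"},
}

# Assumed fixed word order: position alone decides the role
# and the category a word must have in that slot.
SLOTS = [("subject", "noun"), ("verb", "verb"), ("object", "noun")]

def parse(sentence):
    """Assign each word a role purely from its position."""
    words = sentence.lower().split()
    if len(words) != len(SLOTS):
        raise ValueError("expected exactly %d words" % len(SLOTS))
    roles = {}
    for word, (role, category) in zip(words, SLOTS):
        # Reject words that cannot fill the category this slot requires.
        if category not in LEXICON.get(word, set()):
            raise ValueError("%r cannot be used as a %s" % (word, category))
        roles[role] = word
    return roles

# "fire" appears twice, but each occurrence has only one possible reading.
result = parse("fire fire wall")
```

Here the first "fire" can only be the subject and the second can only be the verb, so the overlap never has to be guessed at.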