Message-ID: <3B976DFF.64ACF3DB@csi.com>
Date: Thu, 06 Sep 2001 08:37:19 -0400
From: John Colagioia <JColagioia@csi.com>
Organization: No Conspiracy Here...
X-Mailer: Mozilla 4.77 [en] (Win98; U)
X-Accept-Language: en,fr,ru,es,it,ga,de,ja,gd,eu
MIME-Version: 1.0
Newsgroups: rec.arts.int-fiction
Subject: Re: wanna write parser
References: <24efca22.0109040321.4270c7d@posting.google.com> <3b94d7ef_2@excalibur.gbmtech.net> <24efca22.0109050057.553c739e@posting.google.com>

Juris Kalnins wrote:

> "John Colagioia" <JColagioia@csi.com> wrote
> > That would probably BE that TADS3 (or Inform) library that you want to
> > experiment with.  I'm not sure exactly how TADS is laid out, but Inform's
> > library includes a (relatively) easily extensible and (very) easily modifiable
> > parser.
>   Yes, but Inform's parser is too aware of grammatical structure of language.
> It makes assumptions like parsing first word as verb or person's name. Changing
> these would require retooling the whole thing, and that's what I want to try and
> avoid.

Well, that's not entirely accurate.  You don't necessarily need to retool the
entire thing, since the existing parser is still needed for the part of the job
it already does.  What you'd need to do (depending on your goal) is probably add
small bits of processing before and after it (through hooks like BeforeParsing())
to trim the sentence down to the critical parts that Inform does know how to
parse.
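
As a rough illustration of that trimming step, here is a minimal sketch in
Python rather than Inform.  The FILLER list and the trim_input() name are
made up for the example; a real game would tune the list to its own grammar
and do the equivalent work inside BeforeParsing().

```python
# Hypothetical filler words; not part of any Inform library.
FILLER = {"please", "kindly", "now", "just", "then"}

def trim_input(line):
    """Drop filler words so only the words the stock parser expects remain."""
    words = line.lower().split()
    return " ".join(w for w in words if w not in FILLER)

print(trim_input("Please just take the lamp now"))
```

The stock parser would then see only the trimmed command.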


>   I know nobody asked, but here's my idea, in two (2, 3, ..., well not many)
> words: I want parser to consist just of a large set of translation
> rules, that are matched and applied to input string, producing large quantities
> of mostly dead-end productions. The few that are converted to 'understandable'
> form are weighted, and the winner is chosen as an interpretation of user input.
> Z machine just isn't fast enough for this.

Well, I can't be responsible for your patience threshold, of course...

However, what it's starting to sound like you need is some sort of dynamic,
iterative grammar production system.  That is, something that'll take your
"left-hand sides" and convert them to all possible "right-hand sides."  I'm not
aware of any existing tool that does this, but it could probably be coded in a
LISP or a Prolog without excessive hassle.
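
To make the idea concrete, here is a small sketch of such a production system,
in Python rather than LISP or Prolog.  Everything in it is an assumption made
for illustration: the RULES table, the weights, the "verb plus one noun" test
for an understandable form, and the step limit.  It rewrites the input
breadth-first, keeps the best weight seen for each form, and picks the
highest-weighted understandable form as the winner.

```python
from collections import deque

# Hypothetical rules: (left-hand side, right-hand side, weight).
RULES = [
    (("pick", "up"), ("take",), 1.0),
    (("grab",), ("take",), 0.8),
    (("the",), (), 1.0),
]
UNDERSTOOD_VERBS = {"take"}  # hypothetical set of verbs the game accepts

def expand(tokens, max_steps=1000):
    """Apply every rule at every position; return {form: best weight}."""
    start = tuple(tokens)
    best = {start: 1.0}
    queue = deque([start])
    steps = 0
    while queue and steps < max_steps:  # step limit guards against runaway rules
        form = queue.popleft()
        steps += 1
        for lhs, rhs, weight in RULES:
            n = len(lhs)
            for i in range(len(form) - n + 1):
                if form[i:i + n] == lhs:
                    new = form[:i] + rhs + form[i + n:]
                    score = best[form] * weight
                    if score > best.get(new, 0.0):
                        best[new] = score
                        queue.append(new)
    return best

def interpret(line):
    """Return the best 'verb noun' form reachable from the input, or None."""
    forms = expand(line.lower().split())
    understood = {f: s for f, s in forms.items()
                  if len(f) == 2 and f[0] in UNDERSTOOD_VERBS}
    if not understood:
        return None
    return max(understood, key=understood.get)

print(interpret("pick up the lamp"))
```

Most of the forms it generates are dead ends, as you describe; only the ones
that reach the understandable shape compete on weight.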


> > Also nifty about Inform is that the Z-Machine does some lexical analysis for
> > you, making any parsing job significantly easier, even when you're not using
> > the Inform parser.
>   But it doesn't keep history of user input and program output, which I want to
> have at hand to resolve things like 'it', 'them', 'which' and so on. My idea
> of interactive fiction is that you should always have the log (written
> part) of it available to both user and program, and it's also
> impossible in Inform.

Not impossible at all.  Simple, in fact:  copy the input during BeforeParsing()
into whatever arrays you have available.
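
A sketch of what that bookkeeping might look like, again in Python for
brevity.  The Transcript class, its method names, and the rule that "it"
means the most recently mentioned known noun are all assumptions for the
example, not anything Inform provides.

```python
class Transcript:
    """Keeps a running log of input and output for later pronoun lookup."""

    def __init__(self, known_nouns):
        self.known_nouns = set(known_nouns)  # hypothetical game vocabulary
        self.entries = []                    # (speaker, text) pairs, oldest first
        self.last_noun = None

    def record_input(self, line):
        self.entries.append(("player", line))
        for word in line.lower().split():
            if word in self.known_nouns:
                self.last_noun = word        # remember the latest noun seen

    def record_output(self, text):
        self.entries.append(("game", text))

    def resolve(self, line):
        """Replace 'it' with the last noun mentioned, if there is one."""
        if self.last_noun is None:
            return line
        words = [self.last_noun if w.lower() == "it" else w
                 for w in line.split()]
        return " ".join(words)

log = Transcript({"lamp", "sword"})
log.record_input("take lamp")
log.record_output("Taken.")
print(log.resolve("drop it"))
```

The same log can be shown to the player, since it holds both sides of the
conversation in order.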

Now, mind you, such an extensive parser may be more of a hindrance in games than
otherwise.  Consider:
    > NORTH
    I'm not sure what to do with just a noun.

    > GO THAT WAY
    Ah.  I'm going to assume you mean "go northward."

This is much like the question that occasionally comes up as to "how do I ensure
that the player types exactly the phrase I want to accept?"  The answer is,
"please don't do that, because it'll annoy the player."

What might be better (assuming you're building a game--which I realize is not
necessarily a valid assumption) is to scan the input, left-to-right, for the
verb in BeforeParsing(), then let the Inform parser (or whatever) take it from
there.  If you don't like the pronoun-handling, you'll have to fiddle with that
as well, but that shouldn't be too difficult.
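
The left-to-right scan could be sketched like this (Python again; the
KNOWN_VERBS set and the from_first_verb() name are invented for the example,
and a real game would take its verbs from its own grammar tables):

```python
KNOWN_VERBS = {"take", "drop", "go", "look"}  # hypothetical verb list

def from_first_verb(line):
    """Return the input starting at the first known verb.

    If no known verb is found, the line is returned unchanged so the
    regular parser can produce its usual error message.
    """
    words = line.lower().split()
    for i, word in enumerate(words):
        if word in KNOWN_VERBS:
            return " ".join(words[i:])
    return line

print(from_first_verb("I would like to take the lamp"))
```

Everything before the verb is simply discarded, which keeps the rest of the
parsing exactly as the library already does it.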


