[Cuis-dev] No regex in parser

Thierry Goubier thierry.goubier at gmail.com
Wed Sep 25 23:13:26 PDT 2019


Hi Casey,

Le jeu. 26 sept. 2019 à 00:52, Casey Ransberger via Cuis-dev
<cuis-dev at lists.cuis.st> a écrit :
>
> Holy crap, below
>
> > On Sep 25, 2019, at 3:40 PM, Thierry Goubier via Cuis-dev <cuis-dev at lists.cuis.st> wrote:
> >
> > A chomskyian setup is 20 lines of code plus a table for LR(1), so it
> > is very hard to handwrite a parser that is smaller than that.
>
> DAMN SON. Show me these 20 lines of code. Show me this table.

Attached. A LR(1) is a push down automaton, with two synchronized
stacks and just four possibles actions: shift, reduce, accept, error.
Never backtracks, is deterministic, needs only a single look-ahead
(and anything requiring more look-ahead can be converted to LR(1)).

It uses the tables generated by SmaCC, converted to a simpler,
unoptimized form that requires only a single read from the state
table. SmaCC, like others LR(1) generators, compress the tables
requiring multiple reads to get the action and next state information.

> I shall endeavor to understand.

It's deceptively simple, and not simple. When I teach it, students
have a hard time with the underlying theory...

The scanner part is even simpler, and is 3x faster than the compiled
scanner that SmaCC generates.

Thierry

> —Casey
> --
> Cuis-dev mailing list
> Cuis-dev at lists.cuis.st
> https://lists.cuis.st/mailman/listinfo/cuis-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: step.cs
Type: text/x-csharp
Size: 1054 bytes
Desc: not available
URL: <http://lists.cuis.st/mailman/archives/cuis-dev/attachments/20190926/eb6e0ae5/attachment-0001.bin>


More information about the Cuis-dev mailing list