This is a stripped-down and minimally modified version of chibi parse, a parser combinator library with optional memoization. This version of the library is specifically designed to faciliate building read-eval-print loops with parser combinators.
Apart from parse streams, the central abstraction of this libary is therefore a REPL type which continously reads data from standard input and parses this data with parsers constructed using provided parser combinators.
An exemplary parser may be constructed as follows:
(import (srfi 14) (edward parse))
(define hexadecimal
(parse-map
(parse-seq
(parse-string "0x")
(parse-map
(parse-token char-set:hex-digit)
(lambda (str)
(string->number str 16))))
cadr))
This parser recognizes hexadecimal integers using the character set definition provided by SRFI 14 and transform these characters into a Scheme number type using parse-map. Refer to the documentation below for more information on available combinators.
REPL abstraction which provides a read-eval-print loop that continously reads data from standard input and parses this input data using provided parser combinators. The REPL operates on a parse stream internally.
Create an new REPL instance with the given input prompt
string.
(make-repl prompt)
Predicate which returns true if the given object was created using make-repl.
(repl? obj)
Predicate which returns true if the prompt should be shown.
(repl-prompt? read-eval-print-loop)
Change prompt visibility, a truth value means the prompt is shown.
(repl-set-prompt! read-eval-print-loop new-value)
Start the REPL given by repl
, and continuously parse input using
the provided parser f
. Successfully parsed input is passed to
the success continuation sk
, which receives the line number and
parser result as procedure arguments. If the parser failed for the
current input, the failure continuation fk
is invoked. This
continuation receives the line number and failure reason as
procedure arguments. Lastly, an interrupt continuation must
also be provided which is invoked on SIGINT
. This continuation
is not passed any arguments.
(repl-run repl f sk fk ik)
Run a parser interactively within the REPL. That is, deviate from
the standard REPL parser and instead parse the next input line
with the given parser f
. On success, returns the result of f
otherwise, invokes the provided failure continuation fk
.
(repl-interactive repl f fk)
Parse streams are an abstraction to treat ports as proper streams so that we can backtrack from previous states. A single Parse-Stream record represents a single buffered chunk of text.
Create a parse stream open on the given filename
, with a
possibly already opened port
.
(make-parse-stream filename . o)
Open filename
and create a parse stream on it.
(file->parse-stream filename)
Create a parse stream on a string str
.
(string->parse-stream str)
Returns true iff i
is the first character position in the
parse stream source
.
(parse-stream-start? source i)
Returns true iff i
is the last character position in the
parse stream source
.
(parse-stream-end? source i)
Returns the character in parse stream source
indexed by
i
.
(parse-stream-ref source i)
Returns a string composed of the characters starting at parse
stream s0
index i0
(inclusive), and ending at s1
index i1
(exclusive).
(parse-stream-substring s0 i0 s1 i1)
Procedures for operating on a created parse stream.
Combinator to indicate failure.
(parse-failure s i reason)
Call the parser combinator f
on the parse stream
source
, starting at index index
, passing the result to
the given success continuation sk
, which should be a
procedure of the form (result source index fail)
. The
optional failure continuation should be a procedure of the form
(source index reason)
, and defaults to just returning
#f
.
(call-with-parse f source index sk . o)
Call the parser combinator f
on the parse stream
source
, at index index
, and return the result, or
#f
if parsing fails.
(parse f source . o)
Call the parser combinator f
on the parse stream
source
, at index index
. If the entire source is not
parsed, raises an error, otherwise returns the result.
(parse-fully f source . o)
The fundamental parse iterator. Repeatedly applies the parser
combinator f
to source
, starting at index
, as
long as a valid parse is found. On each successful parse applies
the procedure kons
to the parse result and the previous
kons
result, beginning with knil
. If no parses
succeed returns knil
.
(parse-fold f kons knil source . o)
Parse as many of the parser combinator f
from the parse
stream source
, starting at index
, as possible, and
return the result as a list.
(parse->list f source . o)
As parse->list
but requires the entire source be parsed
with no left over characters, signalling an error otherwise.
(parse-fully->list f source . o)
Return a new parser combinator with the same behavior as f
,
but on failure replaces the reason with reason
. This can be
useful to provide more descriptive parse failure reasons when
chaining combinators. For example, parse-string
just
expects to parse a single fixed string. If it were defined in
terms of parse-char
, failure would indicate some char
failed to match, but it's more useful to describe the whole string
we were expecting to see.
(parse-with-failure-reason f reason)
Combinators to construct new parsers.
Parse nothing successfully.
parse-epsilon
Parse any single character successfully. Fails at end of input.
parse-anything
Always fail to parse.
parse-nothing
The disjunction combinator. Returns the first combinator that succeeds parsing from the same source and index.
(parse-or f . o)
The conjunction combinator. If both f
and g
parse
successfully starting at the same source and index, returns the
result of g
. Otherwise fails.
(parse-and f g)
The negation combinator. If f
succeeds, fails, otherwise
succeeds with #t
.
(parse-not f)
The sequence combinator. Each combinator is applied in turn just past the position of the previous. If all succeed, returns a list of the results in order, skipping any ignored values.
(parse-seq . o)
Convert the list of parser combinators ls
to a
parse-seq
sequence.
(list->parse-seq ls)
The optional combinator. Parse the combinator f
(in
sequence with any additional combinator args o
), and return
the result, or parse nothing successully on failure.
(parse-optional f . o)
The repetition combinator. Parse f
repeatedly and return a
list of the results. lo
is the minimum number of parses
(deafult 0) to be considered a successful parse, and hi
is
the maximum number (default infinite) before stopping.
(parse-repeat f . o)
Parse f
one or more times.
(parse-repeat+ f)
Parse f
and apply the procedure proc
to the result on success.
(parse-map f proc)
Parse f
and apply the procedure proc
to the substring
of the parsed data. proc
defaults to the identity.
(parse-map-substring f . o)
Parses the same streams as f
but ignores the result on
success. Inside a parse-seq
the result will not be
included in the list of results. Useful for discarding
boiler-plate without the need for post-processing results.
(parse-ignore f)
Parse with f
and further require check?
to return true
when applied to the result.
(parse-assert f check?)
Parse with f
once and keep the first result, not allowing
further backtracking within f
.
(parse-atomic f)
Parse with f
once, keep the first result, and commit to the
current parse path, discarding any prior backtracking options.
Can optionally be passed a failure reason with which all resulting
failure messages will be prefixed.
(parse-commit f . o)
Procedures for performing boundary checks within a parser combinator.
Returns true iff index
is the first index of the first parse
stream source
.
parse-beginning
Returns true iff index
is the last index of the last parse
stream source
.
parse-end
Returns true iff source
, index
indicate the beginning
of a line (or the entire stream).
parse-beginning-of-line
Returns true iff source
, index
indicate the end of a
line (or the entire stream).
parse-end-of-line
Underlying combinators which parse a constant input and, contrary to the parsers documented above, cannot be passed parser combinators as procedure arguments.
Parse a single char which matches x
, which can be a
character, character set, or arbitrary procedure.
(parse-char x)
Parse a single char which does not match x
, which can be a
character, character set, or arbitrary procedure.
(parse-not-char x)
Parse the exact string str
.
(parse-string str)
Parse a sequence of characters matching x
as with
parse-char
, and return the resulting substring.
(parse-token x)
Lazy evaluation of parser combinators and memoization.
A delayed combinator. This is equivalent to the parser combinator
f
, but is delayed so it can be more efficient if never used
and f
is expensive to compute. Moreover, it can allow
self-referentiality as in:
(letrec* ((f (parse-lazy (parse-or (parse-seq g f) h))))
...)
Parse the same strings as f
, but memoize the result at each
source and index to avoid exponential backtracking. name
is
provided for debugging only.
(parse-memoize name f)
Additional high-level combinators implemented using the aforementioned basic combinators. The provided high-level combinators, return more useful parser error messages then the basic ones.
Parser which always fails with the given error message.
(parse-fail msg)
Bind a constant value
to a given parser
.
That is, always return this value if the parser succeeds.
(parse-bind value parser)
Run a parser f
, which must return a list, and convert
its return value to a string using the list->string
procedure.
(parse-as-string parser)
Parse one or more digits (i.e. 0-9
), interpret them as a
decimal number, and return this number.
parse-digits
Parse an ASCII lowercase character (i.e. a-z
).
parse-lowercase
Attempt parsing using the given parser f
, if it fails return a default value def
.
(parse-default f def)
Parse a newline character.
parse-newline
Parse a single blank character (i.e. horizontal whitespace).
parse-blank
Parse one or more blank characters.
parse-blanks+
Parse zero or more blank characters.
parse-blanks
Invokes parser f
between the parsers lhs
and rhs
.
(parse-between lhs f rhs)
Returns the result of parser f
but allows preceding its input
with a backslash character to escape it in the parsed input format.
(parse-esc f)
Parse an alist
mapping chars to values which should be returned for each char.
(parse-alist alist)
Parse a regex literal which is enclosed by the character ch
.
(parse-regex-lit ch)
Parse a regex literal which starts with character ch
and is
terminated by the same character or the end of line.
(parse-regex-lit* ch)
Invoke given parser and strip trailing blanks (if any).
(parse-strip-blanks parser)
Like parse-seq but skip blanks before each parser.
(parse-blanks-seq . o)
Parse a single line (excluding the terminating newline character).
parse-line
Feed the result of the parser ctx
to a single argument procedure f
.
The procedure must then return a new parser which is executed
afterwards on the same index as ctx
.
(parse-with-context ctx f)