This is the documentation for the R7RS Scheme library chibi parse as generated using scmdoc, a new documentation tool for the Scheme programming language. The library itself provides parser combinators with optional memoization and convenient syntax.
As an example, consider the following parser for hexademical integers:
(define hexdecimal
(parse-map
(parse-seq
(parse-string "0x")
(parse-map
(parse-token char-set:hex-digit)
(lambda (str)
(string->number str 16))))
cadr))
Parse streams are an abstraction to treat ports as proper streams so that we can backtrack from previous states. A single Parse-Stream record represents a single buffered chunk of text.
Create a parse stream on the given filename
, with a
possibly already opened port
.
(make-parse-stream filename . o)
Open filename
and create a parse stream on it.
(file->parse-stream filename)
Create a parse stream on a string str
.
(string->parse-stream str)
Returns true iff i
is the first character position in the
parse stream source
.
(parse-stream-start? source i)
Returns true iff i
is the last character position in the
parse stream source
.
(parse-stream-end? source i)
Returns the character in parse stream source
indexed by
i
.
(parse-stream-ref source i)
Returns a string composed of the characters starting at parse
stream s0
index i0
(inclusive), and ending at s1
index i1
(exclusive).
(parse-stream-substring s0 i0 s1 i1)
Procedures for invoking a constructed parser on a created parse stream.
Combinator to indicate failure.
(parse-failure s i reason)
Call the parser combinator f
on the parse stream
source
, starting at index index
, passing the result to
the given success continuation sk
, which should be a
procedure of the form (result source index fail)
. The
optional failure continuation should be a procedure of the form
(source index reason)
, and defaults to just returning
#f
.
(call-with-parse f source index sk . o)
Call the parser combinator f
on the parse stream
source
, at index index
, and return the result, or
#f
if parsing fails.
(parse f source . o)
Call the parser combinator f
on the parse stream
source
, at index index
. If the entire source is not
parsed, raises an error, otherwise returns the result.
(parse-fully f source . o)
The fundamental parse iterator. Repeatedly applies the parser
combinator f
to source
, starting at index
, as
long as a valid parse is found. On each successful parse applies
the procedure kons
to the parse result and the previous
kons
result, beginning with knil
. If no parses
succeed returns knil
.
(parse-fold f kons knil source . o)
Parse as many of the parser combinator f
from the parse
stream source
, starting at index
, as possible, and
return the result as a list.
(parse->list f source . o)
As parse->list
but requires the entire source be parsed
with no left over characters, signalling an error otherwise.
(parse-fully->list f source . o)
Return a new parser combinator with the same behavior as f
,
but on failure replaces the reason with reason
. This can be
useful to provide more descriptive parse failure reasons when
chaining combinators. For example, parse-string
just
expects to parse a single fixed string. If it were defined in
terms of parse-char
, failure would indicate some char
failed to match, but it's more useful to describe the whole string
we were expecting to see.
(parse-with-failure-reason f reason)
Simple parser combinators which can be combined to create more complex ones.
Parse nothing successfully.
parse-epsilon
Parse any single character successfully. Fails at end of input.
parse-anything
Always fail to parse.
parse-nothing
The disjunction combinator. Returns the first combinator that succeeds parsing from the same source and index.
(parse-or f . o)
The conjunction combinator. If both f
and g
parse
successfully starting at the same source and index, returns the
result of g
. Otherwise fails.
(parse-and f g)
The negation combinator. If f
succeeds, fails, otherwise
succeeds with #t
.
(parse-not f)
The sequence combinator. Each combinator is applied in turn just past the position of the previous. If all succeed, returns a list of the results in order, skipping any ignored values.
(parse-seq . o)
Convert the list of parser combinators ls
to a
parse-seq
sequence.
(list->parse-seq ls)
The optional combinator. Parse the combinator f
(in
sequence with any additional combinator args o
), and return
the result, or parse nothing successully on failure.
(parse-optional f . o)
The repetition combinator. Parse f
repeatedly and return a
list of the results. lo
is the minimum number of parses
(deafult 0) to be considered a successful parse, and hi
is
the maximum number (default infinite) before stopping.
(parse-repeat f . o)
Parse f
one or more times.
(parse-repeat+ f)
Parse f
and apply the procedure proc
to the result on success.
(parse-map f proc)
Parse f
and apply the procedure proc
to the substring
of the parsed data. proc
defaults to the identity.
(parse-map-substring f . o)
Parses the same streams as f
but ignores the result on
success. Inside a parse-seq
the result will not be
included in the list of results. Useful for discarding
boiler-plate without the need for post-processing results.
(parse-ignore f)
Parse with f
and further require check?
to return true
when applied to the result.
(parse-assert f check?)
Parse with f
once and keep the first result, not allowing
further backtracking within f
.
(parse-atomic f)
Parse with f
once, keep the first result, and commit to the
current parse path, discarding any prior backtracking options.
Since prior backtracking options are discarded, prior failure
continuations are also not used. By default, #f
is
returned on failure, a custom failure continuation can be passed
as the second argument.
(parse-commit f . o)
Procedures for performing boundary checks within a parser combinator.
Returns true iff index
is the first index of the first parse
stream source
.
parse-beginning
Returns true iff index
is the last index of the last parse
stream source
.
parse-end
Returns true iff source
, index
indicate the beginning
of a line (or the entire stream).
parse-beginning-of-line
Returns true iff source
, index
indicate the end of a
line (or the entire stream).
parse-end-of-line
Returns true iff source
, index
indicate the beginning
of a word (or the entire stream).
parse-beginning-of-word
Returns true iff source
, index
indicate the end of a
word (or the entire stream).
parse-end-of-word
Parse the combinator word
(default a parse-token
of
char-alphabetic?
or underscores), ensuring it begins and
ends on a word boundary.
(parse-word . o)
As parse-word
, but instead of an arbitrary word
combinator takes a character predicate pred
(conjoined with
char-alphabetic?
or underscore), and parses a sequence of
those characters with parse-token
. Returns the parsed
substring.
(parse-word+ . o)
Parser for parsing input in constant time.
Parse a single char which matches x
, which can be a
character, character set, or arbitrary procedure.
(parse-char x)
Parse a single char which does not match x
, which can be a
character, character set, or arbitrary procedure.
(parse-not-char x)
Parse the exact string str
.
(parse-string str)
Parse a sequence of characters matching x
as with
parse-char
, and return the resulting substring.
(parse-token x)
We provide an extended subset of SRE syntax (see
SRFI 115).
taking advantage of more general parsing features. These are just
translated directly into parser combinators, with characters and
strings implicitly matching themselves. For example, '(or "foo" "bar")
) matches either of the strings "foo"
or "bar"
. Existing parser
combinators may be embedded directly. This is of course more
powerful than SREs since it is not restricted to regular languages
(or in fact any languages), though it does not provide the same
performance guarantees.
(parse-sre x)
Procedures to avoid expontential backtracking through memoization.
Parse the same strings as f
, but memoize the result at each
source and index to avoid exponential backtracking. name
is
provided for debugging only.
(parse-memoize name f)