Grammar Syntax Reference¶

This page is a complete reference for Grammar-Kit's BNF grammar syntax. For a tutorial-style introduction, see Grammar Syntax.

File Structure¶

A .bnf file consists of an optional attribute header block followed by rule definitions:

{
  // attribute header block
  parserClass="com.example.MyParser"
}

// rule definitions
root ::= item *
item ::= id '=' value

The formal structure is more permissive, allowing attribute blocks and rules in any order:

grammar         ::= grammar_element *
grammar_element ::= attrs | rule

Rules¶

A rule associates a name with an expression. It may have modifiers, inline attributes, and an optional trailing semicolon:

rule ::= modifier* id '::=' expression attrs? ';'?

Rule modifiers¶

Modifier	Effect
`private`	No AST node generated. Child nodes fold into the parent.
`external`	No parsing code generated. The parse function is hand-written.
`meta`	Parametrized rule that takes parse functions as arguments.
`left`	Takes the previous sibling and becomes its parent (for left-associative operators).
`inner`	Used with `left`. Takes the previous sibling and becomes its child.
`upper`	Takes the parent node and replaces it.
`fake`	Only PSI classes are generated. No parsing code is produced.

Multiple modifiers can combine on a single rule:

private meta list_of ::= <<p>> (',' <<p>>) *
left inner assign_expr ::= '=' expr
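
The `external` and `fake` modifiers from the table can be sketched in the same way. This is a minimal illustration rather than an excerpt from a real grammar: `parseRawBlock` is a hypothetical hand-written method, and `raw_block`, `ref_expr`, and `simple_ref_expr` are made-up rule names. It also assumes that the `extends` and `elementType` attributes (see Attribute Reference) let a real rule reuse the PSI class declared by a fake rule.

```bnf
// external: no parse function is generated for raw_block;
// the hand-written method parseRawBlock (hypothetical) is called instead
external raw_block ::= parseRawBlock

// fake: only a PSI class is generated for ref_expr, never parsing code
fake ref_expr ::= expr? '.' id

// a parsed rule that reuses the PSI shape declared by the fake rule (assumed usage)
simple_ref_expr ::= id {extends=ref_expr elementType=ref_expr}
```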

Expressions¶

The right-hand side of a rule is an expression built from sequences, choices, and quantified terms:

expression ::= sequence ('|' sequence)*
sequence   ::= option+
option     ::= predicate | quantified | paren_expr | simple

Choices¶

The | operator separates alternatives. The parser tries each branch in order:

value ::= number | string | object | array
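
Because branches are tried in order, the position of an alternative matters when one branch is a prefix of another. The sketch below uses hypothetical rule names (`type_ref`, `id`) and assumes the first matching branch wins, as the sentence above describes.

```bnf
// The longer alternative comes first, so a generic type is fully consumed
type_ref ::= id '<' type_ref '>' | id

// Reversed, the bare id branch would match first and
// leave '<' type_ref '>' unconsumed:
// type_ref ::= id | id '<' type_ref '>'
```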

Sequences¶

Adjacent terms form a sequence that must match in order:

pair ::= key ':' value

Quantifiers¶

Operator	Meaning	Example
`?`	Zero or one (optional)	`';'?`
`*`	Zero or more	`item *`
`+`	One or more	`item +`

item_list ::= item (',' item) *
optional_semi ::= ';'?
arguments ::= expr (',' expr) +

Predicates¶

Predicates test whether an expression matches at the current position without consuming any input, which makes them the tool for lookahead:

Operator	Meaning	Example
`&`	Positive lookahead (succeeds if the expression matches)	`&'}'`
`!`	Negative lookahead (succeeds if the expression does not match)	`!'}'`

private item_recover ::= !(")" | ",")
private items ::= [!")" item (',' item) *]
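
The two rules above use only the negative form. A small sketch of both predicates side by side follows; the rule names (`block`, `statement`, `assignment`, `expr`) are hypothetical and chosen only for illustration.

```bnf
// Negative lookahead: keep parsing statements until a closing brace is next.
// The predicate itself consumes nothing, so '}' is still there for the final term.
block ::= '{' (!'}' statement) * '}'

// Positive lookahead: commit to this rule only when id is followed by '='.
// After the check succeeds, id and '=' are matched again and consumed.
assignment ::= &(id '=') id '=' expr
```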

Grouping¶

Parentheses `( )`¶

Parentheses group a sub-expression so that a quantifier or choice applies to the group as a whole:

list ::= '(' item (',' item) * ')'

Brackets `[ ]`¶

Brackets denote an optional group, equivalent to (...)?:

// These two forms are equivalent:
optional_items ::= [item (',' item) *]
optional_items ::= (item (',' item) *)?

Braces `{ }` in expressions¶

Within a rule body, braces group a set of alternatives, for example { a | b }. At the top level of the file, and at the end of a rule, braces delimit attribute blocks.
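
A minimal sketch of brace grouping, assuming it behaves like parenthesized grouping of a choice. The rule name `literal` and the tokens it references are hypothetical.

```bnf
// Braces grouping alternatives inside a rule body
literal ::= { number | string | 'null' }

// The same rule written with parentheses (assumed equivalent)
literal ::= ( number | string | 'null' )

// Braces after the expression, with name=value content, are an attribute block
item ::= { number | string } {pin=1}
```

When in doubt, parentheses are the safer spelling, since a brace group whose content starts like `name = value` may be read as an attribute block.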

Tokens and Literals¶

String literals¶

Quoted strings match literal text. Both single and double quotes work:

plus ::= '+'
keyword ::= "while"

Token references¶

Unquoted identifiers reference other rules or declared tokens:

expr ::= number PLUS number

Token declarations¶

The tokens attribute in the header block declares tokens with optional values:

{
  tokens = [
    id="regexp:\w+"       // regexp token (regexp: prefix)
    string                // name only
    PLUS_OP="+"           // text-matched token
    SWITCH="switch"       // keyword token
  ]
}

Tokens have three categories:

Regexp tokens use the regexp: prefix and define a lexer pattern. They are required for Live Preview.
Text-matched tokens have a quoted string value (e.g., PLUS_OP="+").
Name-only tokens have no value and are matched by the lexer based on external configuration.
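
A text-matched token can be referenced in a rule either by its name or by its quoted text. The sketch below assumes Grammar-Kit resolves a quoted literal to the declared token with the same text; `sum_by_name` and `sum_by_text` are hypothetical rule names.

```bnf
{
  tokens = [
    number="regexp:\d+"   // regexp token
    PLUS_OP="+"           // text-matched token
  ]
}

// Both rules reference the same PLUS_OP token
sum_by_name ::= number PLUS_OP number
sum_by_text ::= number '+' number
```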

External Expressions¶

External expressions invoke methods not defined in the grammar. They are enclosed in << >>:

root ::= <<parseRoot item>>
meta comma_list ::= <<p>> (',' <<p>>) *
usage ::= <<comma_list expr>>

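The first identifier inside << >> names the method or meta rule to invoke. The remaining items are passed to it as arguments. Inside a meta rule, an expression such as <<p>> refers to the parameter p. Together, external expressions and meta rules implement parametrized parsing.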
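
Several arguments can be passed in one external expression. The following sketch assumes a meta rule may declare more than one parameter simply by using distinct names inside `<< >>`; `separated_list`, `item`, `sep`, and `args` are hypothetical names.

```bnf
// Two parameters: the element to repeat and the separator between elements
meta separated_list ::= <<item>> (<<sep>> <<item>>) *

// Arguments are supplied in the order the parameters first appear (assumed)
args ::= <<separated_list expr ','>>
```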

Attribute Blocks¶

Attribute blocks appear in curly braces. The header block at the top of the file sets global attributes. Inline attribute blocks on rules set rule-level attributes:

{
  parserClass="com.example.MyParser"
  extends(".*_expr")=expr
}

item ::= number {pin=1 recoverWhile=item_recover}

Attribute syntax:

attrs        ::= '{' attr* '}'
attr         ::= id attr_pattern? '=' attr_value ';'?
attr_pattern ::= '(' string ')'
attr_value   ::= string | number | boolean | value_list | id
value_list   ::= '[' list_entry* ']'
list_entry   ::= (id ('=' string)? | string) ';'?

For the complete list of attributes, see Attribute Reference.

Comments¶

Grammar-Kit supports both line comments and block comments:

// Line comment: everything after // to end of line

/* Block comment:
   can span multiple lines */

Operators Summary¶

Symbol	Name	Meaning
`::=`	Definition	Defines a rule
`|`	Choice	Separates alternatives
`?`	Optional	Zero or one
`*`	Repetition	Zero or more
`+`	One-or-more	One or more
`&`	And-predicate	Positive lookahead
`!`	Not-predicate	Negative lookahead
`=`	Assignment	Assigns an attribute value
`( )`	Parentheses	Groups expressions
`[ ]`	Brackets	Optional group (same as `(...)?`)
`{ }`	Braces	Attribute blocks; alternative grouping inside expressions
`<< >>`	External	External expression call
`//`	Line comment	Comment to end of line
`/* */`	Block comment	Multi-line comment
`;`	Semicolon	Optional statement terminator

Reserved Identifiers¶

Grammar-Kit reserves these identifiers for internal use:

Identifier	Usage
`regexp:`	Prefix for regexp token definitions in the `tokens` attribute
`#auto`	Value for `recoverWhile` that means "not in NEXT set of this rule"
`TokenSets`	Generated inner class name for token set constants