B Syntax

B.1 Notational Conventions

These notational conventions are used for presenting syntax:

[pattern] optional
{pattern} zero or more repetitions
(pattern) grouping
pat1 | pat2 choice
pat<pat'> difference: elements generated by pat
except those generated by pat'
fibonacci terminal syntax in typewriter font

BNF-like syntax is used throughout, with productions having the form:

nonterm       -> alt1 | alt2 | ... | altn

There are some families of nonterminals indexed by precedence levels (written as a superscript). Similarly, the nonterminals op, varop, and conop may have a double index: a letter l, r, or n for left-, right- or nonassociativity and a precedence level. A precedence-level variable i ranges from 0 to 9; an associativity variable a varies over {l, r, n}. Thus, for example

aexp          -> ( exp^(i+1) qop^(a,i) )
actually stands for 30 productions, with 10 substitutions for i and 3 for a.
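
The count of 30 can be checked by enumerating the index pairs. This is only an illustrative sketch: the names Assoc and substitutions are assumptions made for the example and do not appear in the report.

```haskell
-- Associativity index a, ranging over {l, r, n}.
data Assoc = L | R | N deriving (Eq, Show)

-- Every (i, a) pair that can be substituted into the indexed production:
-- precedence levels 0 to 9, each with three associativities.
substitutions :: [(Int, Assoc)]
substitutions = [ (i, a) | i <- [0 .. 9], a <- [L, R, N] ]

-- One production per pair.
productionCount :: Int
productionCount = length substitutions
```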

In both the lexical and the context-free syntax, there are some ambiguities that are to be resolved by making grammatical phrases as long as possible, proceeding from left to right (in shift-reduce parsing, resolving shift/reduce conflicts by shifting). In the lexical syntax, this is the "consume longest lexeme" rule. In the context-free syntax, this means that conditionals, let-expressions, and lambda abstractions extend to the right as far as possible.
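
A few small definitions may help make both rules concrete. This sketch assumes a present-day Haskell compiler; all names in it (affine, pick, letBody, letter, lexemeCheck) are invented for the example.

```haskell
-- Context-free rule: the lambda body extends as far right as possible,
-- so this is \x -> (x * 2 + 1), not (\x -> x * 2) + 1.
affine :: Int -> Int
affine = \x -> x * 2 + 1

-- The else branch takes in "+ 10": if c then 1 else (2 + 10).
pick :: Bool -> Int
pick c = if c then 1 else 2 + 10

-- The let body also extends to the right: let y = 1 in (y + 10).
letBody :: Int
letBody = let y = 1 in y + 10

-- Lexical rule: "letter" is one identifier, not the keyword let followed
-- by "ter"; "<=" is one operator symbol, not "<" followed by "=".
letter :: Int
letter = 5

lexemeCheck :: Bool
lexemeCheck = letter <= 5
```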

B.2 Lexical Syntax

program       -> { lexeme | whitespace }
lexeme        -> varid | conid | varsym | consym | literal | special | reservedop | reservedid
literal       -> integer | float | char | string
special       -> ( | ) | , | ; | [ | ] | _ | ` | { | }

whitespace    -> whitestuff {whitestuff}
whitestuff    -> whitechar | comment | ncomment
whitechar     -> newline | vertab | formfeed | space | tab | nonbrkspc
newline       -> a newline (system dependent)
space         -> a space
tab           -> a horizontal tab
vertab        -> a vertical tab
formfeed      -> a form feed
nonbrkspc     -> a non-breaking space
comment       -> -- {any} newline
ncomment      -> {- ANYseq {ncomment ANYseq} -}
ANYseq        -> {ANY}<{ANY} ( {- | -} ) {ANY}>
ANY           -> any | newline | vertab | formfeed
any           -> graphic | space | tab | nonbrkspc
graphic       -> large | small | digit | symbol | special | : | " | '
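
As an illustration of the comment and ncomment productions, the following sketch shows both comment forms; the name answer is an assumption made for the example.

```haskell
-- An ordinary comment runs to the end of the line.

{- A nested comment may span several lines
   {- and may contain further nested comments, -}
   each of which must be closed before the outer one ends. -}

answer :: Int
answer = {- comments count as whitespace between lexemes -} 42
```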

small         -> ASCsmall | ISOsmall
ASCsmall      -> a | b | ... | z
ISOsmall      -> à | á | â | ã | ä | å | æ | ç | è | é | ê | ë
              |  ì | í | î | ï | ð | ñ | ò | ó | ô | õ | ö | ø 
              |  ù | ú | û | ü | ý | þ | ÿ | ß

large         -> ASClarge | ISOlarge
ASClarge      -> A | B | ... | Z
ISOlarge      -> À | Á | Â | Ã | Ä | Å | Æ | Ç | È | É | Ê | Ë
              |  Ì | Í | Î | Ï | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | Ø
              |  Ù | Ú | Û | Ü | Ý | Þ

symbol        -> ASCsymbol | ISOsymbol
ASCsymbol     -> ! | # | $ | % | & | * | + | . | / | < | = | > | ? | @
              |  \ | ^ | | | - | ~
ISOsymbol     -> ¡ | ¢ | £ | ¤ | ¥ | ¦ | § | ¨ | © | ª | « 
              |  ¬ | ­ | ® | ¯ | ° | ± | ² | ³ | ´ | µ | ¶
              |  · | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ | × | ÷

digit         -> 0 | 1 | ... | 9
octit         -> 0 | 1 | ... | 7
hexit         -> digit | A | ... | F | a | ... | f

varid         -> (small {small | large | digit | ' | _})<reservedid>
conid         -> large {small | large | digit | ' | _} 
reservedid    -> case | class | data | default | deriving | do | else
              |  if | import | in | infix | infixl | infixr | instance
              |  let | module | newtype | of | then | type | where
specialid     -> as | qualified | hiding

varsym        -> ( symbol {symbol | :} )<reservedop>
consym        -> (: {symbol | :})<reservedop>
reservedop    -> .. | :: | = | \ | | | <- | -> |  @ | ~ | =>
specialop     -> - | !
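
The identifier and operator productions above can be read as a classifier over lexeme strings. The sketch below is a simplification, not the report's lexer: it handles only the ASCII character classes (the ISO letters and symbols are omitted), and the names LexClass, classify, and the helpers are assumptions made for the example.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

data LexClass = VarId | ConId | VarSym | ConSym | ReservedId | ReservedOp
  deriving (Eq, Show)

reservedIds :: [String]
reservedIds =
  [ "case", "class", "data", "default", "deriving", "do", "else"
  , "if", "import", "in", "infix", "infixl", "infixr", "instance"
  , "let", "module", "newtype", "of", "then", "type", "where" ]

reservedOps :: [String]
reservedOps = [ "..", "::", "=", "\\", "|", "<-", "->", "@", "~", "=>" ]

-- ASCsymbol only; ISOsymbol is left out of this sketch.
isSym :: Char -> Bool
isSym c = c `elem` "!#$%&*+./<=>?@\\^|-~"

-- Characters allowed after the first character of a varid or conid.
isIdTail :: Char -> Bool
isIdTail c = isAsciiLower c || isAsciiUpper c || isDigit c || c == '\'' || c == '_'

-- Reserved words and operators are checked first, mirroring the
-- <reservedid> and <reservedop> differences in the grammar.
classify :: String -> Maybe LexClass
classify s
  | s `elem` reservedIds = Just ReservedId
  | s `elem` reservedOps = Just ReservedOp
classify (c : cs)
  | isAsciiLower c && all isIdTail cs = Just VarId
  | isAsciiUpper c && all isIdTail cs = Just ConId
  | isSym c && all symTail cs         = Just VarSym
  | c == ':' && all symTail cs        = Just ConSym
  where symTail x = isSym x || x == ':'
classify _ = Nothing
```

For instance, classify ">>=" yields Just VarSym, classify ":+" yields Just ConSym, and classify "::" yields Just ReservedOp because the reserved check runs first.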

varid                                                   (variables)
conid                                                   (constructors)
tyvar         -> varid                                 (type variables)
tycon         -> conid                                 (type constructors)
tycls         -> conid                                 (type classes)
modid         -> conid                                 (modules)

qvarid        -> [ modid . ] varid
qconid        -> [ modid . ] conid
qtycon        -> [ modid . ] tycon
qtycls        -> [ modid . ] tycls
qvarsym       -> [ modid . ] varsym
qconsym       -> [ modid . ] consym

decimal       -> digit{digit}
octal         -> octit{octit}
hexadecimal   -> hexit{hexit}

integer       -> decimal
              |  0o octal | 0O octal
              |  0x hexadecimal | 0X hexadecimal
float         -> decimal . decimal[(e | E)[- | +]decimal]
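
The literal forms can be compared directly. In this sketch the names fifteen and floats are assumptions made for the example.

```haskell
-- Three spellings of the same integer: decimal, octal, hexadecimal.
fifteen :: [Integer]
fifteen = [15, 0o17, 0xF]

-- A float needs digits on both sides of the point; the exponent,
-- with its optional sign, may be omitted.
floats :: [Double]
floats = [1500.0, 1.5e3, 15.0E+2, 15000.0e-1]
```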

char          -> ' (graphic<' | \> | space | escape<\&>) '
string        -> " {graphic<" | \> | space | escape | gap} "
escape        -> \ ( charesc | ascii | decimal | o octal | x hexadecimal )
charesc       -> a | b | f | n | r | t | v | \ | " | ' | &
ascii         -> ^cntrl | NUL | SOH | STX | ETX | EOT | ENQ | ACK 
              |  BEL | BS | HT | LF | VT | FF | CR | SO | SI | DLE 
              |  DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN 
              |  EM | SUB | ESC | FS | GS | RS | US | SP | DEL
cntrl         -> ASClarge | @ | [ | \ | ] | ^ | _
gap           -> \ whitechar {whitechar} \
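
The escape and gap productions are easier to follow with concrete literals. The following sketch assumes a present-day Haskell compiler; the names are invented for the example.

```haskell
-- Four spellings of the character 'A' (code 65): plain, decimal,
-- octal, and hexadecimal escapes.
capitalAs :: [Char]
capitalAs = ['A', '\65', '\o101', '\x41']

-- A control character by ASCII name, by caret notation, and by number.
startOfHeading :: [Char]
startOfHeading = ['\SOH', '\^A', '\1']

-- "\SOH" is a single character; "\SO\&H" uses the empty escape \& to
-- separate SO from H, giving two characters.
oneChar, twoChars :: String
oneChar  = "\SOH"
twoChars = "\SO\&H"

-- A gap (backslash, whitespace, backslash) contributes nothing to the
-- string, which lets a literal continue on the next line.
gapped :: String
gapped = "abc\
         \def"
```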

B.3 Layout

Definitions: The indentation of a lexeme is the column number indicating the start of that lexeme; the indentation of a line is the indentation of its leftmost lexeme. To determine the column number, assume a fixed-width font with this tab convention: tab stops are 8 characters apart, and a tab character causes the insertion of enough spaces to align the current position with the next tab stop.
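
The tab convention can be sketched as a column computation. This is a simplification: it treats only spaces and tabs as whitespace and ignores comments, and the names advance and indentation are assumptions made for the example.

```haskell
-- Column reached after consuming one character that starts at column col.
-- Columns are numbered from 1, so the tab stops fall at columns 1, 9, 17, ...
advance :: Int -> Char -> Int
advance col '\t' = ((col - 1) `div` 8 + 1) * 8 + 1
advance col _    = col + 1

-- Indentation of a line: the column of its first character that is not a
-- space or tab, or Nothing if the line holds only spaces and tabs.
indentation :: String -> Maybe Int
indentation = go 1
  where
    go _ [] = Nothing
    go col (c : cs)
      | c == ' ' || c == '\t' = go (advance col c) cs
      | otherwise             = Just col
```

With these definitions a line starting with one tab and a line starting with two spaces and a tab both have indentation 9, since each tab advances to the next stop.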

In the syntax given in the rest of the report, declaration lists are always preceded by the keyword where, let, do, or of, and are enclosed within curly braces ({ }) with the individual declarations separated by semicolons (;). For example, the syntax of a let expression is:

let { decl1 ; decl2 ; ... ; decln [;] } in exp

Haskell permits the omission of the braces and semicolons by using layout to convey the same information. This allows both layout-sensitive and -insensitive styles of coding, which can be freely mixed within one program. Because layout is not required, Haskell programs can be straightforwardly produced by other programs.

The layout (or "off-side") rule takes effect whenever the open brace is omitted after the keyword where, let, do, or of. When this happens, the indentation of the next lexeme (whether or not on a new line) is remembered and the omitted open brace is inserted (the whitespace preceding the lexeme may include comments). For each subsequent line, if it contains only whitespace or is indented more, then the previous item is continued (nothing is inserted); if it is indented the same amount, then a new item begins (a semicolon is inserted); and if it is indented less, then the declaration list ends (a close brace is inserted). A close brace is also inserted whenever the syntactic category containing the declaration list ends; that is, if an illegal lexeme is encountered at a point where a close brace would be legal, a close brace is inserted. The layout rule matches only those open braces that it has inserted; an explicit open brace must be matched by an explicit close brace. Within these explicit open braces, no layout processing is performed for constructs outside the braces, even if a line is indented to the left of an earlier implicit open brace.

Given these rules, a single newline may actually terminate several declaration lists. Also, these rules permit:


f x = let a = 1; b = 2 
          g y = exp2
       in exp1
making a, b and g all part of the same declaration list.
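
To see the correspondence, the example can be written once with layout and once with explicit braces. In this sketch exp1 and exp2 are replaced by concrete expressions so that the code runs; the names viaLayout and viaBraces are assumptions made for the example.

```haskell
-- Layout version: the second line starts in the same column as "a", so a
-- semicolon is inserted before "g" and all three bindings share one list.
-- The "in" line is indented less, which closes the list.
viaLayout :: Int -> Int
viaLayout x = let a = 1; b = 2
                  g y = y + a + b
               in g x

-- The same function with the braces and semicolons written out.
viaBraces :: Int -> Int
viaBraces x = let { a = 1 ; b = 2 ; g y = y + a + b } in g x
```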

To facilitate the use of layout at the top level of a module (an implementation may allow several modules to reside in one file), the keyword module and the end-of-file token are assumed to occur in column 0 (whereas normally the first column is 1). Otherwise, all top-level declarations would have to be indented.

Section 1.5 gives an example which uses the layout rule.

B.4 Context-Free Syntax

module        -> module modid [exports] where body
              |  body
body          -> { [impdecls ;] [[fixdecls ;] topdecls [;]] }
              |  { impdecls [;] }

impdecls      -> impdecl1 ; ... ; impdecln                

exports       -> ( export1 , ... , exportn [ , ] )        

export        -> qvar
              |  qtycon [(..) | ( qcname1 , ... , qcnamen )]     
              |  qtycls [(..) | ( qvar1 , ... , qvarn )]     
              |  module modid
qcname        -> qvar | qcon

impdecl       -> import [qualified] modid [as modid] [impspec]
impspec       -> ( import1 , ... , importn [ , ] )        
              |  hiding ( import1 , ... , importn [ , ] )     

import        -> var
              |  tycon [ (..) | ( cname1 , ... , cnamen )]     
              |  tycls [(..) | ( var1 , ... , varn )]     
cname         -> var | con

fixdecls      -> fix1 ; ... ; fixn                        
fix           -> infixl [digit] ops 
              |  infixr [digit] ops
              |  infix  [digit] ops 
ops           -> op1 , ... , opn                          
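
The effect of the three fixity keywords can be observed with small user-defined operators. The operators |-|, ^^^ and =~= below are invented for this sketch.

```haskell
-- Fixity declarations, one of each kind.
infixl 6 |-|
infixr 8 ^^^
infix  4 =~=

(|-|) :: Int -> Int -> Int
a |-| b = a - b

(^^^) :: Int -> Int -> Int
a ^^^ b = a ^ b

-- "Approximately equal": differs by at most one.
(=~=) :: Int -> Int -> Bool
a =~= b = abs (a - b) <= 1

leftAssoc :: Int
leftAssoc = 10 |-| 3 |-| 2      -- parsed as (10 |-| 3) |-| 2

rightAssoc :: Int
rightAssoc = 2 ^^^ 3 ^^^ 2      -- parsed as 2 ^^^ (3 ^^^ 2)

mixed :: Bool
mixed = 1 |-| 1 =~= 2 |-| 1     -- level 6 binds tighter than level 4
```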

topdecls      -> topdecl1 ; ... ; topdecln                
topdecl       -> type simpletype = type
              |  data [context =>] simpletype = constrs [deriving]
              |  newtype [context =>] simpletype = con atype [deriving]
              |  class [context =>] simpleclass [where { cbody [;] }]
              |  instance [context =>] qtycls inst [where { valdefs [;] }]
              |  default (type1 , ... , typen)            
              |  decl

decls         -> decl1 ; ... ; decln                      
decl          -> signdecl
              |  valdef
decllist      -> { decls [;] }

signdecl      -> vars :: [context =>] type

vars          -> var1 , ..., varn                      (n>=1)

type          -> btype [-> type]                       (function type)

btype         -> [btype] atype                         (type application)

atype         -> gtycon
              |  tyvar
              |  ( type1 , ... , typek )               (tuple type, k>=2)
              |  [ type ]                              (list type)
              |  ( type )                              (parenthesised constructor)

gtycon        -> qtycon
              |  ()                                    (unit type)
              |  []                                    (list constructor)
              |  (->)                                  (function constructor)
              |  (,{,})                                (tupling constructors)

context       -> class
              |  ( class1 , ... , classn )             (n>=1)
class         -> qtycls tyvar 

simpletype    -> tycon tyvar1 ... tyvark
constrs       -> constr1 | ... | constrn               (n>=1)
constr        -> con [!] atype1 ... [!] atypek         (arity con = k, k>=0)
              |  (btype | ! atype) conop (btype | ! atype)  (infix conop)
              |  con { fielddecl1 , ... , fielddecln }  (n>=1)
fielddecl     -> vars :: (type | ! atype)
deriving      -> deriving (dclass | (dclass1, ... , dclassn)) (n>=0)
dclass        -> qtycls
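
The constr alternatives correspond to the following declaration shapes. This is a sketch with invented type and constructor names, assuming a present-day Haskell compiler.

```haskell
-- Ordinary constructors; the ! marks a strict field.
data Shape = Circle !Double | Rect Double Double
  deriving (Eq, Show)

-- An infix constructor (a consym, so it starts with a colon).
data Cplx = Double :+ Double
  deriving (Eq, Show)

-- A constructor with labeled fields; px and py share one fielddecl.
data Point = Point { px, py :: Int }
  deriving (Eq, Show)

-- newtype takes one constructor applied to one atype.
newtype Age = Age Int
  deriving (Eq, Ord, Show)

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h

realPart :: Cplx -> Double
realPart (re :+ _) = re
```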

simpleclass   -> tycls tyvar 
cbody         -> [ cmethods [ ; cdefaults ] ]
cmethods      -> signdecl1 ; ... ; signdecln           (n >= 1)
cdefaults     -> valdef1 ; ... ; valdefn               (n >= 1)

inst          -> gtycon
              |  ( gtycon tyvar1 ... tyvark )          (k>=0, tyvars distinct)
              |  ( tyvar1 , ... , tyvark )             (k>=2, tyvars distinct)
              |  [ tyvar ]
              |  ( tyvar1 -> tyvar2 )                  (tyvar1 and tyvar2 distinct)
valdefs       -> valdef1 ; ... ; valdefn               (n>=0)
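
A class with a default method and instances using several of the inst forms might look as follows. The class Describable and the type Color are invented for this sketch.

```haskell
-- Method signatures come first, then default definitions
-- (cmethods ; cdefaults).
class Describable a where
  name     :: a -> String
  describe :: a -> String
  describe x = "<" ++ name x ++ ">"

data Color = Red | Green

-- inst is a plain gtycon; describe falls back to the default.
instance Describable Color where
  name Red   = "red"
  name Green = "green"

-- inst of the form ( gtycon tyvar ), with a context; describe is overridden.
instance Describable a => Describable (Maybe a) where
  name Nothing  = "nothing"
  name (Just x) = "just " ++ name x
  describe m    = "[" ++ name m ++ "]"

-- inst of the form [ tyvar ].
instance Describable a => Describable [a] where
  name xs = unwords (map name xs)
```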

valdef        -> lhs = exp [where decllist]
              |  lhs gdrhs [where decllist]

lhs           -> pat^0
              |  funlhs

funlhs        -> var apat { apat }
              |  pat^(i+1) varop^(a,i) pat^(i+1)
              |  lpat^i varop^(l,i) pat^(i+1)
              |  pat^(i+1) varop^(r,i) rpat^i

gdrhs         -> gd = exp [gdrhs]

gd            -> | exp^0
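
The valdef, funlhs and gdrhs productions cover the following kinds of definition. The names in this sketch are invented for the example.

```haskell
-- Guarded right-hand sides; the where bindings scope over every guard.
describeTemp :: Int -> String
describeTemp t
  | t < cold  = "cold"
  | t < warm  = "mild"
  | otherwise = "hot"
  where
    cold = 10
    warm = 25

-- An infix funlhs: the operator being defined sits between its
-- argument patterns.
(<+>) :: (Int, Int) -> (Int, Int) -> (Int, Int)
(a, b) <+> (c, d) = (a + c, b + d)

-- A pattern binding: the lhs is a pattern rather than a funlhs.
q, r :: Int
(q, r) = 17 `divMod` 5
```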

exp           -> exp^0 :: [context =>] type            (expression type signature)
              |  exp^0
exp^i         -> exp^(i+1) [qop^(n,i) exp^(i+1)]
              |  lexp^i
              |  rexp^i
lexp^i        -> (lexp^i | exp^(i+1)) qop^(l,i) exp^(i+1)
lexp^6        -> - exp^7
rexp^i        -> exp^(i+1) qop^(r,i) (rexp^i | exp^(i+1))
exp^10        -> \ apat1 ... apatn -> exp              (lambda abstraction, n>=1)
              |  let decllist in exp                   (let expression)
              |  if exp then exp else exp              (conditional)
              |  case exp of { alts [;] }              (case expression)
              |  do { stmts [;] }                      (do expression)
              |  fexp
fexp          -> [fexp] aexp                           (function application)

aexp          -> qvar                                  (variable)
              |  gcon                                  (general constructor)
              |  literal 
              |  ( exp )                               (parenthesised expression)
              |  ( exp1 , ... , expk )                 (tuple, k>=2)
              |  [ exp1 , ... , expk ]                 (list, k>=1)
              |  [ exp1 [, exp2] .. [exp3] ]           (arithmetic sequence)
              |  [ exp | qual1 , ... , qualn ]         (list comprehension, n>=1)
              |  ( exp^(i+1) qop^(a,i) )               (left section)
              |  ( qop^(a,i) exp^(i+1) )               (right section)
              |  qcon { fbind1 , ... , fbindn }        (labeled construction, n>=0)
              |  aexp{qcon} { fbind1 , ... , fbindn }  (labeled update, n >= 1)
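
Several of the aexp alternatives are shown together in the sketch below. The type Point and all value names are invented for the example.

```haskell
data Point = Point { px, py :: Int } deriving (Eq, Show)

-- Arithmetic sequences: with and without a second element and a limit.
seqs :: [[Int]]
seqs = [ [1 .. 5], [1, 3 .. 9], take 3 [10 ..], take 3 [0, 5 ..] ]

-- List comprehension with a generator, a let qualifier, and a filter.
comp :: [(Int, Int)]
comp = [ (x, y) | x <- [1 .. 3], let y = x * x, odd x ]

-- Left section supplies the left operand; right section the right one.
leftSection, rightSection :: Int -> Int
leftSection  = (10 `div`)
rightSection = (`div` 2)

-- Labeled construction, then labeled update of one field.
origin, moved :: Point
origin = Point { px = 0, py = 0 }
moved  = origin { px = 5 }
```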

qual          -> pat <- exp 
              |  let decllist
              |  exp 

alts          -> alt1 ; ... ; altn                     (n>=1)
alt           -> pat -> exp [where decllist]
              |  pat gdpat [where decllist]

gdpat         -> gd -> exp [ gdpat ]
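
A case expression may mix plain and guarded alternatives, and an alternative may carry its own where clause. The function sizeOf is invented for this sketch.

```haskell
sizeOf :: [a] -> String
sizeOf xs = case xs of
  []  -> "empty"
  [_] -> "singleton"
  -- A guarded alternative (pat gdpat) with a where clause of its own.
  _ | n < limit -> "short"
    | otherwise -> "long"
    where
      n     = length xs
      limit = 5
```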

stmts         -> exp [; stmts]
              |  pat <- exp ; stmts
              |  let decllist ; stmts
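
The three stmts forms can be seen in one do expression. This sketch uses the Maybe monad; safeDiv, compute and compute' are names invented for the example.

```haskell
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv a b = Just (a `div` b)

-- Layout version, using all three statement forms.
compute :: Int -> Int -> Int -> Maybe Int
compute a b c = do
  q <- safeDiv a b          -- pat <- exp ; stmts
  let scaled = q * 10       -- let decllist ; stmts
  safeDiv scaled c          -- final exp

-- The same with explicit braces and semicolons.
compute' :: Int -> Int -> Int -> Maybe Int
compute' a b c =
  do { q <- safeDiv a b ; let { scaled = q * 10 } ; safeDiv scaled c }
```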

fbinds        -> { fbind1 , ... , fbindn }             (n>=0)
fbind         -> var | var = exp
 

pat           -> var + integer                         (successor pattern)
              |  pat^0
pat^i         -> pat^(i+1) [qconop^(n,i) pat^(i+1)]
              |  lpat^i
              |  rpat^i
lpat^i        -> (lpat^i | pat^(i+1)) qconop^(l,i) pat^(i+1)
lpat^6        -> - (integer | float)                   (negative literal)
rpat^i        -> pat^(i+1) qconop^(r,i) (rpat^i | pat^(i+1))
pat^10        -> apat
              |  gcon apat1 ... apatk                  (arity gcon = k, k>=1)

apat          -> var [ @ apat]                         (as pattern)
              |  gcon                                  (arity gcon = 0) 
              |  qcon { fpat1 , ... , fpatk }          (labeled pattern, k>=0)
              |  literal
              |  _                                     (wildcard)
              |  ( pat )                               (parenthesised pattern)
              |  ( pat1 , ... , patk )                 (tuple pattern, k>=2)
              |  [ pat1 , ... , patk ]                 (list pattern, k>=1) 
              |  ~ apat                                (irrefutable pattern)

fpat          -> var = pat
              |  var
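
Most of the pattern forms can be exercised with short functions. The sketch below assumes a present-day Haskell compiler and therefore leaves out successor patterns (var + integer), which Haskell 2010 removed from the language; all function names are invented for the example.

```haskell
-- As pattern: name the whole list while also matching its head.
dupHead :: [a] -> [a]
dupHead whole@(x : _) = x : whole
dupHead []            = []

-- Tuple pattern with a wildcard.
firstOfPair :: (a, b) -> a
firstOfPair (x, _) = x

-- List pattern of exactly two elements.
isPair :: [a] -> Bool
isPair [_, _] = True
isPair _      = False

-- Irrefutable pattern: matching is deferred until a bound variable is
-- demanded. None is bound here, so the argument is never inspected.
lazyConst :: (a, b) -> Int
lazyConst ~(_, _) = 0

-- Negative literal pattern.
isMinusOne :: Int -> Bool
isMinusOne (-1) = True
isMinusOne _    = False
```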

gcon          -> ()
              |  []
              |  (,{,})
              |  qcon

var           -> varid | ( varsym )                    (variable)
qvar          -> qvarid | ( qvarsym )                  (qualified variable)
con           -> conid | ( consym )                    (constructor)
qcon          -> qconid | ( qconsym )                  (qualified constructor)
varop         -> varsym | ` varid`                     (variable operator)
qvarop        -> qvarsym | ` qvarid`                   (qualified variable operator)
conop         -> consym | ` conid`                     (constructor operator)
qconop        -> qconsym | ` qconid`                   (qualified constructor operator)
op            -> varop | conop                         (operator)
qop           -> qvarop | qconop                       (qualified operator)
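
The conversions between identifiers and operators work in both directions. In this sketch the type Pair and its constructors are invented for the example.

```haskell
-- A varid in backquotes is a varop; a varsym in parentheses is a var.
viaBackquotes, viaParens :: Int
viaBackquotes = 17 `div` 5
viaParens     = (+) 3 4

-- A conid in backquotes is a conop; a consym in parentheses is a con.
data Pair = Pair Int Int | Int :& Int deriving (Eq, Show)

p1, p2 :: Pair
p1 = 1 `Pair` 2
p2 = (:&) 1 2
```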

The Haskell 1.3 Report