rflex has no command-line options.
$ rflex target.l
This command generates a target.rs file in the same directory.
The syntax of rflex is very similar to that of flex. The first '%%' marks the beginning of the rules, and the second one marks the end of the rules.
%%
%class Lexer
%result_type i32
abc println!("match abc rule"); return Ok(0i32);
[a-z]+ println!("'{}'", self.yytext());
  return Ok(10i32); /* an action can span multiple lines; each continuation line starts with white space */
" " /* Skip white space. This comment cannot be omitted. */
%%
Each rule consists of a pattern followed by an action.
The pattern is a regular expression.
The action is Rust code that is executed when the pattern is accepted.
In the above example, abc is the pattern and println!("match abc rule"); return Ok(0i32); is the action.
%class and %result_type are special directives that replace the default struct name and return type of the generated code.
The patterns abc and [a-z]+ both accept abc.
When patterns are ambiguous like this, rflex gives priority to the pattern that appears first.
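The tie-breaking described above can be sketched in plain Rust. This is only an illustration, not the code rflex generates: the Rule struct, the matcher functions, and pick_rule are hypothetical names, and the longest-match step is an assumption borrowed from flex that the text above does not confirm for rflex.

```rust
/// A hypothetical rule: a display name plus a matcher that returns how many
/// bytes at the start of the input it accepts, if any.
struct Rule {
    name: &'static str,
    matcher: fn(&str) -> Option<usize>,
}

/// Accepts the literal prefix "abc".
fn match_abc(input: &str) -> Option<usize> {
    if input.starts_with("abc") { Some(3) } else { None }
}

/// Accepts one or more ASCII lowercase letters, like the pattern [a-z]+.
fn match_lower(input: &str) -> Option<usize> {
    let len = input.bytes().take_while(|b| b.is_ascii_lowercase()).count();
    if len > 0 { Some(len) } else { None }
}

/// Rules in declaration order, mirroring the example rules above.
fn example_rules() -> Vec<Rule> {
    vec![
        Rule { name: "abc", matcher: match_abc },
        Rule { name: "[a-z]+", matcher: match_lower },
    ]
}

/// Picks the winning rule. A later rule replaces the current best only if
/// its match is strictly longer (assumed), so on a tie the rule declared
/// first wins.
fn pick_rule(rules: &[Rule], input: &str) -> Option<(&'static str, usize)> {
    let mut best: Option<(&'static str, usize)> = None;
    for rule in rules {
        if let Some(len) = (rule.matcher)(input) {
            match best {
                Some((_, best_len)) if best_len >= len => {}
                _ => best = Some((rule.name, len)),
            }
        }
    }
    best
}

fn main() {
    let rules = example_rules();
    // Both rules accept all of "abc", so the first rule is chosen.
    println!("{:?}", pick_rule(&rules, "abc")); // Some(("abc", 3))
    // No rule accepts a digit.
    println!("{:?}", pick_rule(&rules, "123")); // None
}
```

The order of the rules in the file therefore matters: in this sketch, swapping the two rules would make [a-z]+ win for the input abc.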
Actions and user programs can call the following functions of the scanner.
For example, the yylength function returns the length of the accepted string.
- pub fn yylex(&mut self) -> Result<i32, Error>
  - Returns the next token as i32. i32 can be replaced with the %result_type directive.
- pub fn is_eof(&self) -> bool
  - Returns whether the scanner has reached EOF.
- pub fn yybegin(&mut self, new_state: usize)
  - Uses the new_state lexer state in the next scan.
- pub fn yystate(&self) -> usize
  - Returns the current lexer state.
- pub fn yylength(&self) -> usize
  - Returns the length of the accepted string.
- pub fn yytext(&self) -> String
  - Returns the accepted string.
- pub fn yytextpos(&self) -> std::ops::Range<usize>
  - Returns the position of the accepted string.
- pub fn yybytepos(&self) -> std::ops::Range<usize>
  - Returns the byte position of the accepted string. It can be used to slice a str.
- pub fn yycharat(&self, pos: usize) -> Option<char>
  - Returns the character at the relative position (0-origin) in the accepted string.
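The difference between a character position and a byte position can be illustrated without rflex. The sketch below assumes that yytextpos counts characters while yybytepos counts UTF-8 bytes, which the descriptions above suggest but do not state outright. char_range_to_byte_range is a hypothetical helper, not part of rflex.

```rust
use std::ops::Range;

/// Converts a range of character positions into the matching byte range.
/// Hypothetical helper for illustration; panics if the range lies outside
/// the input.
fn char_range_to_byte_range(input: &str, chars: Range<usize>) -> Range<usize> {
    // Byte offset of every character, followed by the total byte length so
    // that an end position one past the last character is also valid.
    let offsets: Vec<usize> = input
        .char_indices()
        .map(|(byte_idx, _)| byte_idx)
        .chain(std::iter::once(input.len()))
        .collect();
    offsets[chars.start]..offsets[chars.end]
}

fn main() {
    // '\u{e9}' (e with acute accent) is one character but two bytes in UTF-8.
    let input = "n\u{e9} abc";
    // "abc" sits at character positions 3..6 but at byte positions 4..7.
    let bytes = char_range_to_byte_range(input, 3..6);
    println!("{:?}", bytes); // 4..7
    // Only a byte range can be used to slice a str directly.
    println!("{}", &input[bytes]); // abc
}
```

For pure ASCII input the two ranges are identical, so the distinction only matters once the input contains multi-byte characters.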
The yylex function returns Result<any_type, Error>, where any_type is the type set by %result_type.
When the end of the file is reached, yylex returns Err(Error::EOF).
It returns Err(Error::Unmatch) if the input is not accepted by any pattern.
The Error enum type is defined as follows.
#[derive(Debug, PartialEq)]
pub enum Error {
EOF,
Unmatch,
}
See also the code of example1 and example2.
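A typical way to drive yylex is a loop that stops on Error::EOF and reports Error::Unmatch. The sketch below is self-contained, so the Lexer in it is a hand-written stand-in rather than code generated by rflex: only the yylex signature and the Error enum come from the text above, and collect_tokens is a hypothetical helper. The stand-in returns 0 for abc, 10 for other lowercase words, and skips spaces, loosely following the example rules.

```rust
#[derive(Debug, PartialEq)]
pub enum Error {
    EOF,
    Unmatch,
}

/// Hand-written stand-in for a generated lexer (assumption: real generated
/// code differs internally; only the yylex signature is taken from above).
pub struct Lexer<'a> {
    rest: &'a str,
}

impl<'a> Lexer<'a> {
    pub fn new(input: &'a str) -> Lexer<'a> {
        Lexer { rest: input }
    }

    pub fn yylex(&mut self) -> Result<i32, Error> {
        // Skip spaces, as a rule with an empty action is assumed to do.
        self.rest = self.rest.trim_start_matches(' ');
        if self.rest.is_empty() {
            return Err(Error::EOF);
        }
        // Accept a run of ASCII lowercase letters, like [a-z]+.
        let len = self.rest.bytes().take_while(|b| b.is_ascii_lowercase()).count();
        if len == 0 {
            return Err(Error::Unmatch);
        }
        let (word, rest) = self.rest.split_at(len);
        self.rest = rest;
        Ok(if word == "abc" { 0 } else { 10 })
    }
}

/// Collects tokens until EOF. An Unmatch error is passed to the caller.
pub fn collect_tokens(input: &str) -> Result<Vec<i32>, Error> {
    let mut lexer = Lexer::new(input);
    let mut tokens = Vec::new();
    loop {
        match lexer.yylex() {
            Ok(token) => tokens.push(token),
            Err(Error::EOF) => return Ok(tokens),
            Err(err) => return Err(err),
        }
    }
}

fn main() {
    println!("{:?}", collect_tokens("abc hello abc")); // Ok([0, 10, 0])
    println!("{:?}", collect_tokens("abc 123")); // Err(Unmatch)
}
```

Treating EOF as the normal end of the loop and every other error as a failure keeps the caller's match exhaustive even if more error variants are added later.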
// Write your own Rust code here.
// This code will be inserted into the header of generated lexer file.
use std::io; // example
%%
%class Lexer
%result_type i32
abc println!("match abc rule"); return Ok(0i32);
[a-z]+ println!("'{}'", self.yytext()); return Ok(10i32);
%%
// Write your own Rust code that will be inserted into
// `Lexer` impl.
// So this code can be executed like `lexer.remain();`.
pub fn remain(&self) -> usize {
self.current.clone().count()
}
See the code of example1.
%field SpaceCounter space_counter
rflex has a %field directive that appends arbitrary fields to the lexer struct.
The directive above gives the Lexer struct a space_counter field, and the generated impl looks like this:
pub fn new(input: &'a str, space_counter: SpaceCounter) -> Lexer<'a> { /* omission */ }
pub fn get_space_counter(&mut self) -> &mut SpaceCounter { &mut self.space_counter }
We can then pass a SpaceCounter to new and access it via get_space_counter.
The %field directive can be specified multiple times.
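The following sketch shows how such a field might be used from a user program. It is self-contained, so Lexer here is a hand-written stand-in: only the signatures of new and get_space_counter mirror the generated impl quoted above. The definition of SpaceCounter, the scan_all method, and count_spaces are hypothetical and exist only for illustration.

```rust
/// Hypothetical user type; the README only names it.
#[derive(Debug, Default)]
pub struct SpaceCounter {
    pub count: usize,
}

/// Stand-in for the generated struct (assumption: the real one holds more
/// state for scanning).
pub struct Lexer<'a> {
    input: &'a str,
    space_counter: SpaceCounter,
}

impl<'a> Lexer<'a> {
    pub fn new(input: &'a str, space_counter: SpaceCounter) -> Lexer<'a> {
        Lexer { input, space_counter }
    }

    pub fn get_space_counter(&mut self) -> &mut SpaceCounter {
        &mut self.space_counter
    }

    /// Hypothetical stand-in for scanning: counts every space in the input,
    /// as an action attached to a space rule might do through
    /// self.space_counter.
    pub fn scan_all(&mut self) {
        let spaces = self.input.chars().filter(|c| *c == ' ').count();
        self.space_counter.count += spaces;
    }
}

/// Builds a lexer with a fresh counter, scans, and reads the counter back.
pub fn count_spaces(input: &str) -> usize {
    let mut lexer = Lexer::new(input, SpaceCounter::default());
    lexer.scan_all();
    lexer.get_space_counter().count
}

fn main() {
    println!("{}", count_spaces("a b  c")); // 3
}
```

Because the field is passed into new, the caller decides its initial state, and the getter returns a mutable reference so the program can both read and reset it between scans.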