Watch a lexer scan source code character by character, recognizing tokens through a deterministic finite automaton
A lexer (lexical analyzer) is the first phase of a compiler. It reads raw source code as a stream of characters and groups them into meaningful units called tokens, such as identifiers, keywords, numbers, and operators.
Scanner States:
The scanner uses a Deterministic Finite Automaton (DFA) to decide state transitions: each character determines the next state. When no further transition is possible, a token boundary has been reached, and the buffered characters are emitted as a token of the current state's kind.
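The process above can be sketched with a minimal hand-written DFA scanner. This is an illustrative sketch, not a real lexer: the token set (identifiers and integers), the state names, and the `scan` function are all hypothetical, chosen to show how character-driven transitions and buffered emission work.

```python
def scan(source):
    """Minimal DFA scanner sketch for a hypothetical token set:
    identifiers ([a-z][a-z0-9]*) and integers ([0-9]+)."""
    tokens = []
    state, buf = "START", ""
    src = source + "\0"  # sentinel character forces the final token out
    i = 0
    while i < len(src):
        ch = src[i]
        if state == "START":
            if ch.isalpha():
                state, buf = "IDENT", ch
            elif ch.isdigit():
                state, buf = "NUMBER", ch
            # whitespace and the sentinel are skipped in START
            i += 1
        elif state == "IDENT":
            if ch.isalnum():
                buf += ch  # stay in IDENT, keep buffering
                i += 1
            else:
                # no transition on ch: token boundary, emit the buffer
                tokens.append(("IDENT", buf))
                state, buf = "START", ""  # do not advance: reprocess ch
        elif state == "NUMBER":
            if ch.isdigit():
                buf += ch
                i += 1
            else:
                tokens.append(("NUMBER", buf))
                state, buf = "START", ""
    return tokens

print(scan("count 42"))  # [('IDENT', 'count'), ('NUMBER', '42')]
```

Note that when a boundary is hit, the offending character is not consumed; it is reprocessed from the START state so it can begin the next token.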
Maximal munch: The lexer always consumes as many characters as possible before emitting a token (e.g., "===" is scanned as a single token, not as "==" followed by "=" or as three "=" tokens).
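One common way to implement maximal munch for operators is to try candidates longest-first. The operator list and the `munch` helper below are hypothetical, a sketch of the idea rather than any particular lexer's code.

```python
# Hypothetical operator set, sorted longest-first so the longest match wins.
OPERATORS = ["===", "==", "="]

def munch(source, pos):
    """Return the longest operator starting at pos, or None if none matches."""
    for op in OPERATORS:
        if source.startswith(op, pos):
            return op
    return None

print(munch("=== b", 0))  # '===' -- one token, not three '=' tokens
print(munch("== b", 0))   # '=='
```

Because the list is ordered longest-first, the first match is guaranteed to be the longest one, which is exactly the maximal-munch rule.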