1. Lexical Analysis:
- Definition: Lexical analysis (scanning) is the first phase of compilation. It breaks the source text into lexemes, the raw character sequences such as "count" or "<=", and maps each lexeme to a token, which is a category optionally paired with a value.
- Goal: The primary goal of lexical analysis is to group characters into lexemes and classify each one as a keyword, identifier, literal, operator, or punctuation symbol according to the patterns (usually regular expressions) defined by the programming language. Whitespace and comments are typically discarded at this stage.
- Output: Lexical analysis produces a stream of tokens, each associated with a specific type (e.g., identifier, keyword, operator, punctuation) and usually the lexeme it was built from.
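The idea can be sketched with a small regex-based scanner. This is a minimal illustration, not how any particular compiler does it: the token categories, the keyword list, and the function name tokenize are all assumptions chosen for the example.

```python
import re

# Hypothetical token categories for a tiny illustrative language.
# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("KEYWORD", r"\b(?:if|else|while|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[+\-*/=<>]"),
    ("PUNCTUATION", r"[();{}]"),
    ("SKIP", r"\s+"),      # whitespace is recognized but not emitted
    ("MISMATCH", r"."),    # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, lexeme) pairs for the given source text."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append((kind, match.group()))
    return tokens
```

Calling tokenize("if (x1 = 42) return x1;") yields pairs such as ("KEYWORD", "if") and ("NUMBER", "42"), while a character outside the defined patterns, such as "$", raises a SyntaxError.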
2. Syntax Analysis:
- Definition: Syntax analysis, also known as parsing, is the second phase of compilation. It checks whether the sequence of tokens produced by the lexical analyzer conforms to the grammar of the programming language, which is usually specified as a context-free grammar.
- Goal: The objective of syntax analysis is to recognize how the tokens combine into constructs such as expressions, statements, and declarations, and to report a syntax error when they do not (for example, an unbalanced parenthesis or a missing operand).
- Output: Syntax analysis generates a parse tree or an abstract syntax tree (AST) that represents the hierarchical structure and relationships between the syntactic components of the source code.
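As a rough sketch of this phase, the recursive-descent parser below turns a token list into an AST for arithmetic expressions. The grammar, the (type, lexeme) token format, the tuple-based AST nodes, and the class name Parser are assumptions made for this example only.

```python
class Parser:
    """Recursive-descent parser for a hypothetical expression grammar:

        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := NUMBER | IDENTIFIER | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens  # list of (token_type, lexeme) tuples
        self.pos = 0

    def peek(self):
        # A synthetic EOF token marks the end of the input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek()[0] != "EOF":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in (("OPERATOR", "+"), ("OPERATOR", "-")):
            op = self.advance()[1]
            node = ("binop", op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OPERATOR", "*"), ("OPERATOR", "/")):
            op = self.advance()[1]
            node = ("binop", op, node, self.factor())
        return node

    def factor(self):
        kind, text = self.advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENTIFIER":
            return ("var", text)
        if (kind, text) == ("PUNCTUATION", "("):
            node = self.expr()
            if self.advance() != ("PUNCTUATION", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")
```

Because term is called from inside expr, multiplication and division bind more tightly than addition and subtraction, so the tree shape itself encodes operator precedence. Input that does not fit the grammar, such as the tokens for "1 +", raises a SyntaxError.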
3. Semantic Analysis:
- Definition: Semantic analysis is the third phase of compilation and the last of the analysis phases; intermediate code generation, optimization, and target code generation follow it. It checks that a syntactically valid program also obeys the language's static meaning rules, which a grammar alone cannot express.
- Goal: Semantic analysis aims to ensure that the program is consistent with those rules: it checks type compatibility, detects undeclared or redeclared variables, enforces scoping rules, and reports any other semantic errors.
- Output: Semantic analysis produces an annotated AST (for example, with a type attached to each expression) together with a symbol table that records the names declared in the program and their attributes. These are then used to generate the intermediate representation (IR).
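A minimal sketch of these checks is shown below. It assumes a hypothetical tuple-based AST with "declare" and "assign" statements and two types, "int" and "string"; the function names analyze and type_of are illustrative and not taken from any real compiler.

```python
def type_of(expr, symbols, errors):
    """Return the type of an expression, or None if an error was reported."""
    tag = expr[0]
    if tag == "num":
        return "int"
    if tag == "str":
        return "string"
    if tag == "var":
        name = expr[1]
        if name not in symbols:
            errors.append(f"undeclared variable '{name}'")
            return None
        return symbols[name]
    if tag == "binop":
        _, op, left, right = expr
        left_type = type_of(left, symbols, errors)
        right_type = type_of(right, symbols, errors)
        if left_type is None or right_type is None:
            return None  # an operand already produced an error
        if left_type != right_type:
            errors.append(f"type mismatch: {left_type} {op} {right_type}")
            return None
        return left_type
    raise ValueError(f"unknown expression node {tag!r}")


def analyze(program):
    """Build a symbol table and collect semantic errors for a list of statements."""
    symbols, errors = {}, []
    for stmt in program:
        if stmt[0] == "declare":
            _, name, declared_type = stmt
            if name in symbols:
                errors.append(f"redeclaration of '{name}'")
            else:
                symbols[name] = declared_type
        elif stmt[0] == "assign":
            _, name, expr = stmt
            value_type = type_of(expr, symbols, errors)
            if name not in symbols:
                errors.append(f"undeclared variable '{name}'")
            elif value_type is not None and value_type != symbols[name]:
                errors.append(
                    f"cannot assign {value_type} to {symbols[name]} variable '{name}'"
                )
        else:
            raise ValueError(f"unknown statement node {stmt[0]!r}")
    return symbols, errors
```

Every statement in such a program can be syntactically valid and still fail here: assigning to a name that was never declared, or adding an int to a string, is only caught once the symbol table and types are consulted.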
In summary:
- Lexical analysis groups characters into lexemes and categorizes them as tokens.
- Syntax analysis checks the grammatical structure and syntax of the source code.
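- Semantic analysis verifies that the program obeys the language's static meaning rules, such as type compatibility and declaration before use.
Together these three phases form the analysis part (front end) of a compiler. Conceptually they run in sequence, each consuming the previous phase's output and reporting the errors it is able to detect. The later phases (intermediate code generation, optimization, and code generation) then translate the analyzed program into target code.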