461: Compiler Construction =========================== Theory and practice of designing and implementing a compiler front-end. Topics ------ - Formal language theory (regular and context-free grammars) - Lexical analysis and finite automata - Top-down and bottom-up parsing (LL, LR) - Abstract syntax trees and semantic analysis - Symbol tables and type checking Projects -------- Lexical Analyzer (``461/project/prac/scan.l``) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A lexer written using the Flex tool (``lex``-compatible). Tokenizes a custom language, recognizing keywords, identifiers, integer/float literals, operators, and comments. Demonstrates the use of regular expressions to define token patterns and associated semantic actions. .. code-block:: text ID_CHAR [a-zA-Z_] %% "stego" { printf("FOUND: stego\n"); return("stego"); } {ID_CHAR}({ID_CHAR})* { printf("IDENTIFIER\n"); return(0); } "asinine" { printf("FOUND: asinine\n"); return("asinine"); } . { /* ignore unrecognised characters */ } %% yywrap() { return(1); } main() { int i = yylex(); while (i > 0) i = yylex(); } Scan Exercises (``461/project/scan/``) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A progression of four Flex lexers (``scan1.l`` through ``scan4.l``) of increasing complexity, exploring whitespace handling, multi-character tokens, and state-based scanning. .. code-block:: text D [0-9] L [a-zA-Z_] H [a-fA-F0-9] %% {L}({L}|{D})* { return Table_Lookup(Table); } /* identifier */ 0[xX]{H}+ { return CONSTANT; } /* hex literal */ {D}+ { return CONSTANT; } /* decimal */ {D}*"."{D}+ { return CONSTANT; } /* float */ %% Memory Allocation Exercises (``461/hw/alloca/``) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Low-level homework programs (``alloca.c``, ``alloca2.c``, ``malloc.c``) stress-testing memory allocation by calling ``malloc`` in tight loops, illustrating heap behavior and the cost of dynamic allocation within a compiler runtime context. .. code-block:: c /* Stress-test the heap: allocate one int per iteration, 1 million times */ int call_alloca(void) { return (int)(intptr_t)malloc(sizeof(int)); } for (i = 0; i < 1000000; i++) call_alloca(); File Statistics (``461/hw/stat/``) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A series of programs (``stat.c`` through ``stat6.c``) progressively building a utility that uses ``stat(2)`` to query file metadata, then reads and prints file contents — an exercise in POSIX system calls used by compiler toolchains. .. code-block:: c struct stat *buf = malloc(sizeof(struct stat)); stat(argv[1], buf); /* query file metadata */ /* allocate a buffer the exact size of the file, then read it */ char *buffer = malloc(buf->st_size); fread(buffer, 1, buf->st_size, infile); printf("%s", buffer);