Toriskia's Blog

2026.04.01

MLIR Notes (1): Language & AST

1. "Toy" Language

  • Tensors: rank ≤ 2
  • Data type: 64-bit floating point
  • Values are immutable
  • Memory is managed automatically
```toy
def main() {
  # Define a variable `a` with shape <2, 3>, initialized with the literal value.
  # The shape is inferred from the supplied literal.
  var a = [[1, 2, 3], [4, 5, 6]];

  # b is identical to a, the literal tensor is implicitly reshaped: defining new
  # variables is the way to reshape tensors (element count must match).
  var b<2, 3> = [1, 2, 3, 4, 5, 6];

  # transpose() and print() are the only builtin, the following will transpose
  # a and b and perform an element-wise multiplication before printing the result.
  print(transpose(a) * transpose(b));
}
```
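To make the program above concrete, here is a minimal C++ sketch of what `print(transpose(a) * transpose(b))` computes. The `Tensor` struct and the `reshape`, `transpose`, and `multiply` helpers are hypothetical names chosen for illustration, not part of the Toy implementation; the sketch assumes row-major storage and rank-2 tensors only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-2 tensor of 64-bit floats, stored row-major.
struct Tensor {
  std::size_t rows, cols;
  std::vector<double> data; // rows * cols elements
  double at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Mirrors `var b<2, 3> = [1, 2, 3, 4, 5, 6];`: same elements, new shape.
Tensor reshape(const std::vector<double> &flat, std::size_t rows,
               std::size_t cols) {
  assert(flat.size() == rows * cols && "element count must match");
  return Tensor{rows, cols, flat};
}

// Values are immutable in Toy, so each operation returns a fresh tensor.
Tensor transpose(const Tensor &t) {
  Tensor out{t.cols, t.rows, std::vector<double>(t.data.size())};
  for (std::size_t r = 0; r < t.rows; ++r)
    for (std::size_t c = 0; c < t.cols; ++c)
      out.data[c * out.cols + r] = t.at(r, c);
  return out;
}

// Element-wise multiplication; both operands must have the same shape.
Tensor multiply(const Tensor &lhs, const Tensor &rhs) {
  assert(lhs.rows == rhs.rows && lhs.cols == rhs.cols && "shapes must match");
  Tensor out{lhs.rows, lhs.cols, std::vector<double>(lhs.data.size())};
  for (std::size_t i = 0; i < lhs.data.size(); ++i)
    out.data[i] = lhs.data[i] * rhs.data[i];
  return out;
}
```

With `a` and `b` both built from `[1, 2, 3, 4, 5, 6]` as shape <2, 3>, `multiply(transpose(a), transpose(b))` yields a <3, 2> tensor holding `[[1, 16], [4, 25], [9, 36]]`.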
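  • Type checking: static. Tensor shapes are specified only where needed.
  • Functions: all parameters are unranked, meaning their dimensions are unknown; each call site is specialized according to the shapes of its actual arguments.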
```toy
# User defined generic function that operates on unknown shaped arguments.
def multiply_transpose(a, b) {
  return transpose(a) * transpose(b);
}

def main() {
  # Define a variable `a` with shape <2, 3>, initialized with the literal value.
  var a = [[1, 2, 3], [4, 5, 6]];
  var b<2, 3> = [1, 2, 3, 4, 5, 6];

  # This call will specialize `multiply_transpose` with <2, 3> for both
  # arguments and deduce a return type of <3, 2> in initialization of `c`.
  var c = multiply_transpose(a, b);

  # A second call to `multiply_transpose` with <2, 3> for both arguments will
  # reuse the previously specialized and inferred version and return <3, 2>.
  var d = multiply_transpose(b, a);

  # A new call with <3, 2> (instead of <2, 3>) for both arguments will
  # trigger another specialization of `multiply_transpose`.
  var e = multiply_transpose(c, d);

  # Finally, calling into `multiply_transpose` with incompatible shapes
  # (<2, 3> and <3, 2>) will trigger a shape inference error.
  var f = multiply_transpose(a, c);
}
```
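The specialization behavior described in the comments above can be modeled with a small cache keyed by argument shapes. This is only an illustrative sketch, not how the Toy compiler implements it: `Shape`, `Inferred`, `inferTranspose`, `inferMul`, and `Specializer` are all hypothetical names, and the sketch hard-codes the body of `multiply_transpose`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Shape = std::vector<int>; // e.g. {2, 3} stands for <2, 3>

// Result of shape inference; `ok == false` models a shape inference error.
struct Inferred {
  bool ok;
  Shape shape;
};

// transpose swaps the two dimensions of a rank-2 shape.
Shape inferTranspose(const Shape &s) {
  assert(s.size() == 2 && "this sketch only handles rank-2 shapes");
  return Shape{s[1], s[0]};
}

// Element-wise `*` requires both operands to have the same shape.
Inferred inferMul(const Shape &lhs, const Shape &rhs) {
  if (lhs != rhs)
    return Inferred{false, Shape{}};
  return Inferred{true, lhs};
}

// Hypothetical specializer for `multiply_transpose(a, b)`: one cache entry
// per distinct pair of argument shapes that infers successfully.
class Specializer {
public:
  Inferred call(const Shape &a, const Shape &b) {
    std::pair<Shape, Shape> key(a, b);
    auto it = cache.find(key);
    if (it != cache.end())
      return Inferred{true, it->second}; // reuse the earlier specialization
    Inferred result = inferMul(inferTranspose(a), inferTranspose(b));
    if (result.ok)
      cache[key] = result.shape; // record the new specialization
    return result;
  }
  std::size_t numSpecializations() const { return cache.size(); }

private:
  std::map<std::pair<Shape, Shape>, Shape> cache;
};
```

Replaying the four calls from `main` against this model: `c` creates the first specialization and returns <3, 2>, `d` reuses it, `e` creates a second one for <3, 2> arguments, and `f` fails because <3, 2> and <2, 3> cannot be multiplied element-wise.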

2. AST

```ast
Module:
  Function
    Proto 'multiply_transpose' @test/Examples/Toy/Ch1/ast.toy:4:1
    Params: [a, b]
    Block {
      Return
        BinOp: * @test/Examples/Toy/Ch1/ast.toy:5:25
          Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:10
            var: a @test/Examples/Toy/Ch1/ast.toy:5:20
          ]
          Call 'transpose' [ @test/Examples/Toy/Ch1/ast.toy:5:25
            var: b @test/Examples/Toy/Ch1/ast.toy:5:35
          ]
    } // Block
  Function
    Proto 'main' @test/Examples/Toy/Ch1/ast.toy:8:1
    Params: []
    Block {
      VarDecl a<> @test/Examples/Toy/Ch1/ast.toy:11:3
        Literal: <2, 3>[ <3>[ 1.000000e+00, 2.000000e+00, 3.000000e+00], <3>[ 4.000000e+00, 5.000000e+00, 6.000000e+00]] @test/Examples/Toy/Ch1/ast.toy:11:11
      VarDecl b<2, 3> @test/Examples/Toy/Ch1/ast.toy:15:3
        Literal: <6>[ 1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00] @test/Examples/Toy/Ch1/ast.toy:15:17
      VarDecl c<> @test/Examples/Toy/Ch1/ast.toy:19:3
        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:19:11
          var: a @test/Examples/Toy/Ch1/ast.toy:19:30
          var: b @test/Examples/Toy/Ch1/ast.toy:19:33
        ]
      VarDecl d<> @test/Examples/Toy/Ch1/ast.toy:22:3
        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:22:11
          var: b @test/Examples/Toy/Ch1/ast.toy:22:30
          var: a @test/Examples/Toy/Ch1/ast.toy:22:33
        ]
      VarDecl e<> @test/Examples/Toy/Ch1/ast.toy:25:3
        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:25:11
          var: c @test/Examples/Toy/Ch1/ast.toy:25:30
          var: d @test/Examples/Toy/Ch1/ast.toy:25:33
        ]
      VarDecl f<> @test/Examples/Toy/Ch1/ast.toy:28:3
        Call 'multiply_transpose' [ @test/Examples/Toy/Ch1/ast.toy:28:11
          var: a @test/Examples/Toy/Ch1/ast.toy:28:30
          var: c @test/Examples/Toy/Ch1/ast.toy:28:33
        ]
    } // Block
```

3. Compiler Frontend

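```cpp
// `fileOrErr` holds the loaded input file (its creation is not shown here).
auto buffer = fileOrErr.get()->getBuffer();
// The lexer reads characters from the buffer and produces tokens.
LexerBuffer lexer(buffer.begin(), buffer.end(), std::string(filename));
// The parser consumes those tokens and returns the module's AST.
Parser parser(lexer);
return parser.parseModule();
```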

Here the Lexer turns characters into tokens, and the Parser turns tokens into an AST.
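As a rough picture of the "characters to tokens" step, here is a minimal tokenizer sketch. It is not the Toy `Lexer` class: `TokKind`, `Token`, and `tokenize` are hypothetical names, and the sketch assumes only three token kinds plus `#` line comments.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

enum class TokKind { Identifier, Number, Punct };

struct Token {
  TokKind kind;
  std::string text;
};

// Hypothetical helper: true for characters allowed inside an identifier.
static bool isIdentChar(char c) {
  return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Splits a source string into identifiers/keywords, numbers, and
// single-character punctuation, skipping whitespace and `#` comments.
std::vector<Token> tokenize(const std::string &src) {
  std::vector<Token> out;
  std::size_t i = 0;
  while (i < src.size()) {
    unsigned char ch = static_cast<unsigned char>(src[i]);
    if (std::isspace(ch)) {
      ++i;
      continue;
    }
    if (ch == '#') { // a comment runs to the end of the line
      while (i < src.size() && src[i] != '\n')
        ++i;
      continue;
    }
    std::size_t start = i;
    if (std::isalpha(ch) || ch == '_') {
      while (i < src.size() && isIdentChar(src[i]))
        ++i;
      out.push_back({TokKind::Identifier, src.substr(start, i - start)});
    } else if (std::isdigit(ch)) {
      while (i < src.size() &&
             (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.'))
        ++i;
      out.push_back({TokKind::Number, src.substr(start, i - start)});
    } else {
      out.push_back({TokKind::Punct, std::string(1, src[i])});
      ++i;
    }
    assert(i > start && "every token consumes at least one character");
  }
  return out;
}
```

A parser would then consume this token sequence and build tree nodes such as the `VarDecl` and `Call` entries shown in the AST dump; keywords like `var` and `def` come out as plain identifiers here and would still need to be recognized separately.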

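At this point only the syntactic structure has been checked; nothing yet guarantees that the program is semantically correct. For example, none of the following has been verified: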

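  • Whether variables are declared before use
  • Whether the called function exists
  • Whether the number of arguments matches
  • Whether types are compatible
  • Whether `return` is consistent with the function's semantics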
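The first three checks in the list can be sketched as a single pass over a function body. This is a simplified illustration under stated assumptions, not code from the Toy frontend: `CallStmt` is a hypothetical flattened stand-in for a statement of the form `var x = f(args)`, and `checkBody` is a hypothetical helper. Type compatibility and `return` checks are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one statement: `var <declared> = <callee>(<args>)`.
struct CallStmt {
  std::string declared;          // variable introduced by the statement
  std::string callee;            // function being called
  std::vector<std::string> args; // variables passed as arguments
};

// Returns diagnostics; an empty vector means all three checks passed.
// `arity` maps known function names to their parameter counts, and
// `declared` starts with the names already in scope (e.g. parameters).
std::vector<std::string>
checkBody(const std::vector<CallStmt> &body,
          const std::map<std::string, std::size_t> &arity,
          std::set<std::string> declared) {
  std::vector<std::string> errors;
  for (const CallStmt &s : body) {
    assert(!s.declared.empty() && "statement must declare a variable");
    auto fn = arity.find(s.callee);
    if (fn == arity.end())
      errors.push_back("unknown function '" + s.callee + "'");
    else if (fn->second != s.args.size())
      errors.push_back("wrong argument count for '" + s.callee + "'");
    for (const std::string &arg : s.args)
      if (!declared.count(arg))
        errors.push_back("use of undeclared variable '" + arg + "'");
    declared.insert(s.declared); // visible only to later statements
  }
  return errors;
}
```

Because `declared` grows as the loop advances, a variable used before its own `var` statement is reported, which is the "declared before use" rule in its simplest form.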