Difference between an AST and an ASR

Let us take a simple Fortran code:

integer function f(a, b) result(r)
integer, intent(in) :: a, b
r = a + b
end function

and look at what the AST and ASR look like.

AST

from lfortran.ast import src_to_ast, print_tree
from lfortran.ast.ast_to_src import ast_to_src
src = """\
integer function f(a, b) result(r)
integer, intent(in) :: a, b
r = a + b
end function
"""
ast = src_to_ast(src, translation_unit=False)
print_tree(ast)
Legend: Node, Field, Token
program_unit.Function
├─name='f'
├─args=↓
│ ├─AST.arg
│ │ ╰─arg='a'
│ ╰─AST.arg
│   ╰─arg='b'
├─return_type=None
├─return_var=expr.Name
│ ╰─id='r'
├─bind=None
├─use=[]
├─decl=↓
│ ╰─unit_decl2.Declaration
│   ╰─vars=↓
│     ├─AST.decl
│     │ ├─sym='a'
│     │ ├─sym_type='integer'
│     │ ├─dims=[]
│     │ ╰─attrs=↓
│     │   ╰─attribute.Attribute
│     │     ├─name='intent'
│     │     ╰─args=↓
│     │       ╰─AST.attribute_arg
│     │         ╰─arg='in'
│     ╰─AST.decl
│       ├─sym='b'
│       ├─sym_type='integer'
│       ├─dims=[]
│       ╰─attrs=↓
│         ╰─attribute.Attribute
│           ├─name='intent'
│           ╰─args=↓
│             ╰─AST.attribute_arg
│               ╰─arg='in'
├─body=↓
│ ╰─stmt.Assignment
│   ├─target=expr.Name
│   │ ╰─id='r'
│   ╰─value=expr.BinOp
│     ├─left=expr.Name
│     │ ╰─id='a'
│     ├─op=operator.Add
│     ╰─right=expr.Name
│       ╰─id='b'
╰─contains=[]

The AST does not have any semantic information, but has nodes to represent declarations such as integer, intent(in) :: a. Variables such as a are represented by a Name node, and are not connected to their declarations yet.
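The point about Name nodes can be illustrated with a small self-contained sketch. The classes below (`Name`, `BinOp`, `Decl`) and the helper `lookup_type` are toy stand-ins invented for this illustration, not the real `lfortran.ast` classes. They only show that, in an AST-like structure, a variable reference carries nothing but its spelling, so finding its type means searching the declarations by name.

```python
from dataclasses import dataclass

# Toy stand-ins for AST nodes (hypothetical, not LFortran's real classes).
@dataclass
class Name:
    id: str  # only the spelling of the variable, no type, no link

@dataclass
class BinOp:
    left: Name
    op: str
    right: Name

@dataclass
class Decl:
    sym: str
    sym_type: str
    intent: str

# "integer, intent(in) :: a, b" and "a + b" as purely syntactic nodes.
decls = [Decl("a", "integer", "in"), Decl("b", "integer", "in")]
expr = BinOp(Name("a"), "Add", Name("b"))

def lookup_type(name: Name, decls: list) -> str:
    """Resolve a Name by searching the declarations for a matching string."""
    for d in decls:
        if d.sym == name.id:
            return d.sym_type
    raise KeyError(f"{name.id!r} is not declared")

left_type = lookup_type(expr.left, decls)
print(left_type)  # integer
```

Nothing in `expr` itself says that `a` is an integer. That information lives in a separate declaration node and has to be matched up by name.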

ASR

from lfortran.semantic.ast_to_asr import ast_to_asr
from lfortran.asr.pprint import pprint_asr
asr = ast_to_asr(ast)
pprint_asr(asr)
unit.TranslationUnit
├─global_scope=Scope
│ ╰─f = fn.Function
│   ├─name='f'
│   ├─args=↓
│   │ ├─a
│   │ ╰─b
│   ├─body=↓
│   │ ╰─stmt.Assignment
│   │   ├─target=r
│   │   ╰─value=expr.BinOp
│   │     ├─left=a
│   │     ├─op=operator.Add
│   │     ├─right=b
│   │     ╰─type=ttype.Integer
│   │       ├─kind=4
│   │       ╰─dims=[]
│   ├─bind=None
│   ├─return_var=r
│   ├─module=None
│   ╰─symtab=Scope
│     ├─a = expr.Variable
│     │ ├─name='a'
│     │ ├─intent='in'
│     │ ├─dummy=True
│     │ ╰─type=ttype.Integer
│     │   ├─kind=4
│     │   ╰─dims=[]
│     ├─b = expr.Variable
│     │ ├─name='b'
│     │ ├─intent='in'
│     │ ├─dummy=True
│     │ ╰─type=ttype.Integer
│     │   ├─kind=4
│     │   ╰─dims=[]
│     ╰─r = expr.Variable
│       ├─name='r'
│       ├─intent=None
│       ├─dummy=True
│       ╰─type=ttype.Integer
│         ├─kind=4
│         ╰─dims=[]
╰─items=[]

The ASR has all the semantic information (types, etc.). Nodes like Function have a symbol table and do not have any declaration nodes. Variables are simply pointers to entries in the symbol table.
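The "pointers to the symbol table" idea can be sketched in plain Python. The classes below (`Integer`, `Variable`, `BinOp`) are hypothetical stand-ins modelled loosely on the printed tree, not the real `lfortran.asr` classes. The sketch assumes that a variable reference in an expression is the very same object that is stored in the function's symbol table.

```python
from dataclasses import dataclass
from typing import Optional

# Toy stand-ins for ASR nodes (hypothetical, not LFortran's real classes).
@dataclass
class Integer:
    kind: int = 4

@dataclass
class Variable:
    name: str
    intent: Optional[str]
    type: Integer

@dataclass
class BinOp:
    left: Variable
    op: str
    right: Variable
    type: Integer  # the expression itself carries a type

# Symbol table of the function f: one entry per variable, no declaration nodes.
symtab = {
    "a": Variable("a", "in", Integer()),
    "b": Variable("b", "in", Integer()),
    "r": Variable("r", None, Integer()),
}

# "a + b": the operands are references to the symbol table entries.
value = BinOp(symtab["a"], "Add", symtab["b"], Integer())

same_object = value.left is symtab["a"]
print(same_object)           # True
print(value.left.type.kind)  # 4
```

Because the operand is the symbol table entry itself, its type and intent are available directly, with no lookup by name.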

Discussion

The above was a simple example. The difference becomes more apparent in more complicated examples, such as:

integer function f2b(a) result(r)
use gfort_interop, only: c_desc1_int32
integer, intent(in) :: a(:)
interface
    integer function f2b_c_wrapper(a) bind(c, name="__mod1_MOD_f2b")
    use gfort_interop, only: c_desc1_t
    type(c_desc1_t), intent(in) :: a
    end function
end interface
r = f2b_c_wrapper(c_desc1_int32(a))
end function

The AST must represent all the use statements and the interface block explicitly, and these nodes must be kept semantically consistent with each other.

The ASR, on the other hand, keeps track of c_desc1_int32, c_desc1_t and f2b_c_wrapper in the symbol table, and it knows that c_desc1_int32 and c_desc1_t are defined in the gfort_interop module, so the ASR does not have any of these declaration nodes.

When converting from ASR to AST, LFortran will create all the appropriate AST declaration nodes automatically and correctly.
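As a rough illustration of why declarations can be regenerated mechanically, the sketch below walks a toy symbol table and emits one Fortran declaration line per entry. The function `declarations_from_symtab` and the dictionary layout are assumptions made for this example. They are not LFortran's actual ASR to AST conversion, which produces AST nodes rather than strings.

```python
def declarations_from_symtab(symtab):
    """Build one declaration line per symbol from its stored type and intent."""
    lines = []
    for name, var in symtab.items():
        attrs = [var["type"]]
        if var["intent"] is not None:
            attrs.append(f"intent({var['intent']})")
        lines.append(f"{', '.join(attrs)} :: {name}")
    return lines

# Toy symbol table for the function f from the first example (hypothetical layout).
symtab = {
    "a": {"type": "integer", "intent": "in"},
    "b": {"type": "integer", "intent": "in"},
    "r": {"type": "integer", "intent": None},
}

decl_lines = declarations_from_symtab(symtab)
for line in decl_lines:
    print(line)
# integer, intent(in) :: a
# integer, intent(in) :: b
# integer :: r
```

In this toy version the result variable `r` gets its own declaration line, whereas the original source gave its type in the function prefix. A real converter would have to choose one of the two forms, since Fortran does not allow both at once.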