Small languages: Handling labels in Awk

Created: Tue Sep 18 00:29:32 CEST 2018

Last modified: Tue Dec 4 04:27:38 CET 2018


I recently purchased The Awk Programming Language for a few bucks.

While running the little assembler example around page 130, I made a few modifications so that it prints the code instead of interpreting it. (I wanted to experiment with the symbol table.)

At first I simply added a print $0 statement at the end of the second-pass loop. I also added nextmem = 0 right before the first pass.

Without this second change, nextmem is still uninitialized when there is a label on the first line, and an uninitialized Awk variable is both 0 and "" depending on context. Awk then can't tell whether to interpret symtab[some label] as 0 or "". (On FreeBSD 11.2, Awk chose the empty string, which is not what I wanted there.)
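To see the difference in isolation, here is a minimal sketch run from the shell. The init flag and the one-line "opc lbl" record are made up for the demonstration; only the names symtab and nextmem are borrowed from the assembler. It does the same label substitution as the second pass, once with nextmem left uninitialized and once with it set to 0.

```sh
# Hypothetical demo: resolve one label with and without initializing nextmem.
# "init" is a made-up flag; symtab and nextmem mirror the assembler names.
prog='BEGIN {
  if (init) nextmem = 0        # the extra line added before the first pass
  symtab["lbl"] = nextmem      # label on the very first line
  $0 = "opc lbl"               # an instruction that refers to the label
  $2 = symtab[$2]              # same substitution as the second pass
  print "[" $0 "]"             # brackets make an empty operand visible
}'

uninit=$(awk "$prog")
init=$(awk -v init=1 "$prog")

echo "$uninit"   # [opc ]
echo "$init"     # [opc 0]
```

Because the value ends up in a field and is then printed, it is used as a string, so the uninitialized case comes out empty rather than as 0.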

So here is the full code:

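BEGIN {
  srcfile = ARGV[1]
  tempfile = "asm.temp"

  FS = "[ \t]+"
  nextmem = 0    # without this, a label on the first line maps to "" instead of 0

  # Pass 1: record label addresses and copy the instructions to tempfile
  while ((getline <srcfile) > 0) {
    sub(/#.*/, "")    # strip comments
    symtab[$1] = nextmem
    if ($2 != "") {
      print $2 "\t" $3 >tempfile
      nextmem++
    }
  }
  close(tempfile)

  # Pass 2: replace label operands with their addresses and print the result
  while ((getline <tempfile) > 0) {
    if ($2 !~ /^[0-9]*$/)
      $2 = symtab[$2]
    print $0
  }
}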

Side note: if you expect labels to be referenced only in later instructions, you can get away with a single pass.
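A rough sketch of what that single pass could look like, assuming every label is defined on an earlier line than any instruction that uses it. This is my own variation, not the book's code; the single_pass, operand and out names are made up here, and it reads from standard input instead of ARGV[1]. Operands are resolved as soon as the line is read, so no temporary file is needed.

```sh
# Single-pass sketch: assumes every label is defined before it is used.
single_pass='
BEGIN { FS = "[ \t]+"; nextmem = 0 }
{
  sub(/#.*/, "")                      # strip comments
  if ($1 != "") symtab[$1] = nextmem  # remember where the label points
  if ($2 != "") {
    operand = $3
    if (operand !~ /^[0-9]*$/)        # neither empty nor a number, so a label
      operand = symtab[operand]       # already known, given the assumption
    print $2, operand
    nextmem++
  }
}'

out=$(printf 'lbl  opc\n     opc\nlbl2 opc lbl\n     opc lbl2\n' | awk "$single_pass")
echo "$out"
# opc 
# opc 
# opc 0
# opc 2
```

A forward reference would come out as an empty operand here, since symtab has no entry for the label yet when the instruction is printed.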

Input:

lbl  opc
     opc
lbl2 opc lbl
     opc lbl2

Output:

opc  
opc 
opc 0
opc 2
