Thursday, 20 September 2018 - Björn Gustavsson
This blog post continues the exploration of the new SSA-based intermediate representation through multiple examples. Make sure to read the Introduction to SSA if you missed it.
The first example calls a guard BIF that may fail with an exception:
element_body(T) ->
element(2, T).
The (optimized) SSA code looks like this:
function blog:element_body(_0) {
0:
%% blog.erl:5
_1 = bif:element literal 2, _0
@ssa_bool = succeeded _1
br @ssa_bool, label 3, label 1
3:
ret _1
1:
@ssa_ret = call remote (literal erlang):(literal error)/1, literal badarg
ret @ssa_ret
}
Let's go through the code a few lines at a time:
%% blog.erl:5
_1 = bif:element literal 2, _0
@ssa_bool = succeeded _1
The bif:element
instruction calls the guard BIF element/2
, assigning
the value to the variable _1
if the call is successful.
What if the call is not successful?
The succeeded _1
instruction tests whether the previous instruction
assigning to _1
was successful. true
will be assigned to @ssa_bool
if the second element of the tuple was successfully fetched from the tuple,
and false
will be assigned otherwise.
br @ssa_bool, label 3, label 1
The br
instruction tests whether @ssa_bool
is true
. If true
, execution
continues at block 3, which returns the value of the second element from the
tuple. If false
, execution continues at block 1.
It was mentioned in the previous blog post that block 1 is a special block that the SSA code generator always emits. In the previous examples, it was never referenced and therefore removed by one of the optimization passes.
In this example, it is used as the target when the call to element/2
fails.
The BEAM code generator treats references to block 1 specially. Here follows the BEAM code for the function. As usual, I have omitted the function header.
%% Block 0.
{line,[{location,"blog.erl",5}]}.
{bif,element,{f,0},[{integer,2},{x,0}],{x,0}}.
return.
Note that no code has been generated for block 1.
The line
instructions gives the file name and line number of the source file.
It will be used in the stack backtrace if the following instruction fails.
The bif
instruction calls the given guard BIF, element/2
in this case.
The {f,0}
operand gives the action to take if the element/2
fails. The number
0
is a special case, meaning that a badarg
exception should be raised if the
call of element/2
fails.
In the next example, element/2
is called in a guard:
element_guard(T) when element(2, T) =:= true ->
ok;
element_guard(_) ->
error.
The SSA code looks like this:
function blog:element_guard(_0) {
0:
%% blog.erl:7
_1 = bif:element literal 2, _0
@ssa_bool = succeeded _1
br @ssa_bool, label 4, label 3
4:
@ssa_bool:5 = bif:'=:=' _1, literal true
br @ssa_bool:5, label 6, label 3
6:
ret literal ok
3:
ret literal error
}
The first two instructions in block 0 are the same as in the previous
example. The br
instruction has different labels, though. The
failure label refers to block 3, which returns the value error
. The
success label continues execution at block 4.
4:
@ssa_bool:5 = bif:'=:=' _1, literal true
br @ssa_bool:5, label 6, label 3
Block 4 is the translation of =:= true
part of the Erlang code.
If the second element in the tuple is equal to true
, execution
continues at block 6, which returns the value ok
. Otherwise
execution continues at block 3, which returns the value error
.
Here is the BEAM code:
{bif,element,{f,5},[{integer,2},{x,0}],{x,0}}.
{test,is_eq_exact,{f,5},[{x,0},{atom,true}]}.
{move,{atom,ok},{x,0}}.
return.
{label,5}.
{move,{atom,error},{x,0}}.
return.
In the bif
instruction, {f,5}
means that execution should
continue at label 5 if the element/2
call fails. Otherwise
execution will continue at the next instruction.
Here is the next example:
case1(X) ->
case X of
1 -> a;
2 -> b;
_ -> c
end.
Translated to SSA code:
function blog:case1(_0) {
0:
switch _0, label 3, [ { literal 2, label 5 }, { literal 1, label 4 } ]
4:
ret literal a
5:
ret literal b
3:
ret literal c
}
The switch
instruction is a multi-way branch to one of any number of
other blocks, based on the value of a variable. In this example, it
branches based on the value of the variable _0
. If _0
is equal
to 2, execution continues at block 5. If _0
is equal to 1,
execution continues at block 4. If the value is not equal to any
of the values in the switch list, execution continues at the block
referred to by the failure label, in this example block 3.
The BEAM code looks like this:
{select_val,{x,0},{f,10},{list,[{integer,2},{f,9},{integer,1},{f,8}]}}.
{label,8}.
{move,{atom,a},{x,0}}.
return.
{label,9}.
{move,{atom,b},{x,0}}.
return.
{label,10}.
{move,{atom,c},{x,0}}.
return.
As mentioned in the previous blog post, the last instruction in a block is
called a terminator. A terminator either returns from the function or transfers
control to another block. With the introduction of switch
, the terminator story
is complete. To summarize, a block can end in one of the following terminators:
ret
to return a value from the function.
br
to either branch to another block (one-way branch), or branch to one
of two possible other blocks based on a variable (two-way branch).
switch
to branch to one of any number of other blocks.
Here is a slightly different example:
case2(X) ->
case X of
1 -> a;
2 -> b;
3 -> c
end.
In this case, X
must be one of the integers 1, 2, or 3. Otherwise, there will
be a {case_clause,X}
exception. Here is the SSA code:
function blog:case2(_0) {
0:
switch _0, label 3, [ { literal 3, label 6 }, { literal 2, label 5 }, { literal 1, label 4 } ]
4:
ret literal a
5:
ret literal b
6:
ret literal c
3:
_2 = put_tuple literal case_clause, _0
%% blog.erl:20
@ssa_ret:7 = call remote (literal erlang):(literal error)/1, _2
ret @ssa_ret:7
}
The failure label for the switch
is 3. Block 3 builds the {case_clause,X}
tuple and calls erlang:error/1
.
Here is the BEAM code:
{select_val,{x,0},
{f,16},
{list,[{integer,3},
{f,15},
{integer,2},
{f,14},
{integer,1},
{f,13}]}}.
{label,13}.
{move,{atom,a},{x,0}}.
return.
{label,14}.
{move,{atom,b},{x,0}}.
return.
{label,15}.
{move,{atom,c},{x,0}}.
return.
{label,16}.
{line,[{location,"blog.erl",20}]}.
{case_end,{x,0}}.
The case_end
instruction is an optimization to save space. It is shorter than
the equivalent:
{test_heap,3,1}.
{put_tuple2,{x,0},{list,[{atom,case_clause},{x,0}]}}.
{line,[{location,"blog.erl",20}]}.
{call_ext_only,1,{extfunc,erlang,error,1}}.
(The put_tuple2
instruction was introduced in
#1947: Introduce a put_tuple2 instruction, which was recently merged
to master
.)
It's time to address the kind of case
similar to what was teased at the
end of the previous blog post.
In this example, the variable Y
will be assigned different values in
each clause of the case
:
case3a(X) ->
case X of
zero ->
Y = 0;
something ->
Y = X;
_ ->
Y = no_idea
end,
{ok,Y}.
Perhaps a more common way to write this case
would be:
case3b(X) ->
Y = case X of
zero -> 0;
something -> X;
_ -> no_idea
end,
{ok,Y}.
In either case, the problem remains. Static Single Assignment means that each variable can only be given a value once. So how can this example be translated to SSA code?
Here follows the SSA code for case3a/1
. The SSA code for case3b/1
is almost
identical except for variable naming.
function blog:case3a(_0) {
0:
switch _0, label 4, [ { literal something, label 6 }, { literal zero, label 5 } ]
5:
br label 3
6:
br label 3
4:
br label 3
3:
Y = phi { literal no_idea, 4 }, { literal 0, 5 }, { _0, 6 }
_7 = put_tuple literal ok, Y
ret _7
}
Let's jump right to the interesting (and confusing) part of the code:
3:
Y = phi { literal no_idea, 4 }, { literal 0, 5 }, { _0, 6 }
Clearly, Y
is only given a value once, so the SSA property is
preserved.
That's good, but exactly what is the value that is being assigned?
The name of the instruction is phi
, which is the name of the
Greek letter φ. Having an unusual name, the instruction
deserves to have unusual operands, too. Each operand is a pair, the
first element in the pair being a value and the second element a block
number of a predecessor block. The value of the phi
node will be one
of the values from one the pairs. But from which pair? That depends on
the number of the previous block that branched to the phi
instruction.
To make that somewhat clearer, let's look at all operands:
{ literal no_idea, 4 }
: If the number of block that executed br label 3
was 4, the value of the phi
instruction will be the value in this pair,
that is, the atom no_idea
. The failure label for the switch
instruction
is 4, so this pair will be chosen when _0
does not match any of the values
in the switch list.
{ literal 0, 5 }
: If the number of block that executed br label 3
was 5, the value of the phi
instruction will be the integer 0. The
switch
instruction will transfer control to block 5 if the value of
_0
is the atom zero
.
{ _0, 6 }
: Finally, if _0
is the atom something
, the switch
will transfer control to block 6, which will transfer control to
block 3. The value of the phi
instruction will be the value of the
variable _0
.
The concept of phi
instructions probably feels a bit strange at
first sight (and at second sight), and one might also think they
must be terribly inefficient.
Leaving the strangeness aside, let's talk about the efficiency. phi
instructions is a fiction convenient for representing and optimizing
the code. When translating to BEAM code, the phi
instructions are
eliminated.
Here follows an example that is not SSA code, because it assigns
the variable Y
three times, but gives an idea how the phi
instruction is eliminated:
%% Not SSA code!
function blog:case3a(_0) {
0:
switch _0, label 4, [ { literal something, label 6 }, { literal zero, label 5 } ]
5:
Y := literal 0
br label 3
6:
Y := _0
br label 3
4:
Y := no_idea
br label 3
3:
_7 = put_tuple literal ok, Y
ret _7
}
The BEAM code generator (beam_ssa_codegen
) does a similar rewrite
during code generation.
Here is the unoptimized BEAM code, slightly edited for clarity:
%% Block 0.
{select_val,{x,0},
{f,53},
{list,[{atom,something},{f,55},{atom,zero},{f,57}]}}.
%% Block 5.
{label,57}.
{move,{integer,0},{x,0}}.
{jump,{f,59}}.
%% Block 6.
{label,55}.
%% The result is already in {x,0}.
{jump,{f,59}}.
%% Block 4.
{label,53}.
{move,{atom,no_idea},{x,0}}.
{jump,{f,59}}.
%% Block 3.
{label,59}.
{test_heap,3,1}.
{put_tuple2,{x,0},{list,[{atom,ok},{x,0}]}}.
return.
Here is the final BEAM code after some more optimizations:
{label,18}.
{select_val,{x,0},
{f,20},
{list,[{atom,something},{f,21},{atom,zero},{f,19}]}}.
{label,19}.
{move,{integer,0},{x,0}}.
{jump,{f,21}}.
{label,20}.
{move,{atom,no_idea},{x,0}}.
{label,21}.
{test_heap,3,1}.
{put_tuple2,{x,0},{list,[{atom,ok},{x,0}]}}.
return.
Here is the example from the end of the previous blog post:
bar(X) ->
case X of
none ->
Y = 0;
_ ->
Y = X
end,
Y + 1.
And here is the SSA code:
function blog:bar(_0) {
0:
@ssa_bool = bif:'=:=' _0, literal none
br @ssa_bool, label 5, label 4
5:
br label 3
4:
br label 3
3:
Y = phi { _0, 4 }, { literal 0, 5 }
%% blog.erl:52
_6 = bif:'+' Y, literal 1
@ssa_bool:6 = succeeded _6
br @ssa_bool:6, label 7, label 1
7:
ret _6
1:
@ssa_ret = call remote (literal erlang):(literal error)/1, literal badarg
ret @ssa_ret
}
It is left as an exercise to the reader to read and understand the code.
Here is the BEAM code:
{label,28}.
{test,is_eq_exact,{f,29},[{x,0},{atom,none}]}.
{move,{integer,0},{x,0}}.
{label,29}.
{line,[{location,"blog.erl",52}]}.
{gc_bif,'+',{f,0},1,[{x,0},{integer,1}],{x,0}}.
return.
The gc_bif
instruction calls a guard BIF that might need to do a
garbage collection. Since integers can be of essentially unlimited
size in Erlang, the result of +
might not fit in a word. The
1
following {f,0}
is the number of registers that must be
preserved; in this case, only {x,0}
.