Author: Björn Gustavsson <bjorn(at)erlang(dot)org>
Status: Accepted/R13A Proposal is to be implemented in OTP release R13A
Type: Standards Track
Erlang-Version: R12B-5
Created: 28-Jan-2009
Post-History:

EEP 26: Make andalso and orelse tail-recursive

Abstract

Erlang 5.1 added the ability to use 'andalso', 'orelse', 'and', and 'or' in guards. However, the semantics for 'andalso' and 'orelse' differs from that in other related languages, causing confusion and inefficiency.

I propose making 'andalso' and 'orelse' tail-recursive.

This EEP is partly based on Richard O'Keefe's EEP 17, but has a narrower scope.

Specification

Currently, (E1 andalso E2) as an expression acts like

case E1 of
   false -> false;
   true  -> case E2 of
       false -> false;
       true  -> true
   end
end

except that the former raises {badarg,NonBool} exceptions and the latter raises {case_clause,NonBool} ones.

This should be changed to

case E1 of
   false -> false;
   true  -> E2
end.

Currently, (E1 orelse E2) as an expression acts like

case E1 of
    true -> true
    false -> case E2 of
        true  -> true
        false -> false
    end
end

except that the former raises {badarg,NonBool} exceptions and the latter raises {case_clause,NonBool} ones.

This should be changed to

case E1 of
    true  -> true;
    false -> E2
end

Motivation

To unlock the full potential of 'andalso'/'orelse' in Erlang.

Given the current implementation, you either have to make rewrite code that is naturally written using AND and OR operators using 'case', or only use 'andalso'/'orelse' when you know that your lists are relatively short.

For instance, the function all/2 that returns 'true' if all elements of a list satisfies a predicate and 'false' otherwise, can be written like this:

all(Pred, [Hd|Tail]) -> Pred(Hd) and all(Pred, Tail); all(_, []) -> true.

In each recursion, we test that the current element Hd satisfies the predicate AND that the rest of the list also matches the predicate. The code reads almost like English.

Of course, 'and' evaluates both of its operand, so the entire list will be traversed even if the first element of the list fails to satisfy the predicate. Furthermore, 'and' is not tail-recursive, so the function will use stack space proportional to the length of the list.

To avoid the traversing the rest of the list if one element fails to satisfy the predicate, we can use 'andalso':

all(Pred, [Hd|Tail]) -> Pred(Hd) andalso all(Pred, Tail); all(_, []) -> true.

As soon as Pred(Hd) returns false, the recursion will stop and the rest of the list need not be traversed. Since 'andalso' is not tail-recursive, however, the function will need stack space proportional to the number of list elements that are traversed.

To see more clearly that 'andalso' is not tail-recursive, here is all/1 with 'andalso' expanded out to a nested 'case' expression (as it would be in R12B-5):

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        false -> false;
        true  -> case all(Pred, Tail) of
        false -> false;
        true  -> true
        end
    end;
all(_, []) ->
    true.

To make all/1 tail-recursive in R12B-5, you would have to write a 'case' expression yourself:

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        false -> false;
        true  -> all(Pred, Tail)
    end;
all(_, []) ->
    true.

If this EEP is accepted, in R13B we could write like this

all(Pred, [Hd|Tail]) ->
    Pred(Hd) andalso all(Pred, Tail);
all(_, []) ->
    true.

and the all/1 function would be tail-recursive.

In my opinion, the latter is easier to read and write. The 'case' expression is mostly boiler-plate code where 'true' and 'false' must be correctly spelled several times. (Misspellings like 'ture' and 'flase' are quite common, but are in most cases found the first time the program is tested.)

It could be argued that because Erlang has clearly defined truth values (unlike some other languages where 0 is false and everything else true), all operators that operate on booleans should make sure that their arguments are booleans.

Testing both arguments of 'and' and 'or' makes sense, because the code executed for those operators always GETS the values of both operands. But 'andalso' and 'orelse' only test their second operand SOME of the time.

X = 1, X >= 0 andalso X    % checked error
X = 1, X < 0 andalso X     % unchecked error

There doesn't seem to be much point in checking SOME of the time, especially when it does something as dramatic as blocking tail recursion.

Richard O'Keefe's motivation in EEP 17 is "Cultural consistency" with other languages. See EEP 17.

Rationale

Surprisingly (for me), the subject of this EEP turned out to be controversial.

I will start this rationale by listing some of the more serious arguments against this proposal and my counter-arguments, and finish with the arguments for this proposal.

One argument against is to be that the new construct will be confusing for users. 'andalso'/'orelse' can no longer be described as a "boolean operator", but is now a "control structure".

Yes, 'andalso'/'orelse' is no longer a boolean operator in the sense that it no longer GUARANTEES that it returns a boolean. However, using 'andalso'/'orelse' as a 'case' expression

case E1 orelse E2 of
    true -> ....;
    false -> ...
end

works in the same way as before. Most users certainly will not notice any difference. And if an operator is not allowed to not evaluate both of its arguments, it certainly wasn't an operator before either.

Another argument against is that 'andalso'/'orelse' can be used in one-liners to write "ugly code", such as

Debug andalso io:format("...", [...])

instead of

if
    Debug -> io:format("...", [...]);
    true -> ok
end

The code might be "ugly" (according to someone's taste or some definition of "ugly"), but the one-liner is not hard to understand and I don't see how it could turn into a code-maintenance problem.

The main argument for making 'andalso'/'orelse' tail-recursive: The current implementation is dangerous. You could very easily write non-tail-recursive code, for instance

all(Pred, [Hd|Tail]) ->
    Pred(Hd) andalso all(Pred, Tail);
all(_, []) ->
    true.

without realizing it and introduce serious performance problems. (Which has happened in practice).

If you cannot use 'andalso'/'orelse' in this way, these operators become pretty useless. (Some would say "utterly useless".) You have to rewrite beautiful code (in my opinion) to uglier code (in comparison, in my opinion) and more error-prone code (misspelling of 'true'/'false' in the boiler-plate code):

all(Pred, [Hd|Tail]) ->
    case Pred(Hd) of
        false -> false;
        true  -> all(Pred, Tail)
    end;
all(_, []) ->
   true.

Backwards Compatibility

Any code that ran without raising exceptions will continue to produce the same results, except for running faster.

Code that did raise exceptions may raise different exceptions elsewhere later, or may quietly complete in unexpected ways. I believe it to be unlikely that anyone deliberately relied on (E1 andalso 0) raising an exception.

Code that was previously broken because these operators have such surprising behavior will now work in more cases.

Reference Implementation

The proposed change has been implemented and run in our daily builds without finding any code in Erlang/OTP that needed to be updated. One test case in the compiler test suite that that test 'andalso'/'orelse' needed to be updated.

Copyright

This document has been placed in the public domain.