
The use of the terms expression and statement varies between programming languages. However, the following distinction is widely used:

Expressions are syntactic forms that allow software authors to describe computations. They intentionally impose only a few constraints on the order of evaluation.

Statements are syntactic forms that allow software authors to define the order in which, and the conditions under which, computations and state changes take place.

An early description of this distinction between expressions and statements and their respective uses is found in the Report on the Algorithmic Language ALGOL 60:

The basic concept used for the description of calculating rules is the well-known arithmetic expression containing as constituents numbers, variables, and functions. From such expressions are compounded, by applying rules of arithmetic composition, self-contained units of the language - explicit formulae - called assignment statements.

To show the flow of computational processes, certain nonarithmetic statements and statement clauses are added which may describe, e.g., alternatives, or iterative repetitions of computing statements.

The concepts of ALGOL have influenced many later languages (https://en.wikipedia.org/wiki/ALGOL, https://en.wikipedia.org/wiki/Generational_list_of_programming_languages).

Looking at the ordering aspect first, consider the following expression (which is syntactically valid in C, Java, Python, Haskell and many other programming languages, provided the names a-h are suitably declared):

f(a + b) + g(c + d) * (e - h)

In C, the language specification does not define whether a+b has to be computed before c+d or e-h, nor whether the function f is called before g. This is intentionally left unspecified to give the compiler the freedom to choose an order that performs well. Not every language leaves this open: Java and Python, for example, evaluate operands from left to right. Some rules exist everywhere: precedence is defined (* binds more tightly than +, parentheses group subexpressions, ...), but these are not really ordering rules; they define the meaning of the expression. In fact, a compiler may rearrange the expression by applying valid algebraic transformations, as long as the observable result stays the same.
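To make the unspecified call order visible, here is a minimal C sketch. The helper bodies of f and g, the call_log buffer, and the evaluate wrapper are hypothetical and exist only for illustration; each helper records that it was called and then returns a value.

```c
/* Hypothetical helpers: each one records its call, then returns a value. */
static char call_log[8];
static int log_len = 0;

static int f(int x) { call_log[log_len++] = 'f'; return x; }
static int g(int x) { call_log[log_len++] = 'g'; return 2 * x; }

/* Evaluates f(a + b) + g(c + d) * (e - h) and leaves the call order in call_log. */
static int evaluate(int a, int b, int c, int d, int e, int h)
{
    log_len = 0;
    int result = f(a + b) + g(c + d) * (e - h);
    call_log[log_len] = '\0';
    return result;
}
```

The returned value is the same whichever call happens first, but call_log may read "fg" with one compiler and "gf" with another. Both are conforming, because C only requires that the two calls do not overlap, not that they happen in a particular order.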

For pure mathematical computations the ordering has no impact on the result of the computation. However, as soon as state changes are possible, the order becomes important. Consider the following expression from C (and a few other languages that adopted the concept from C), where ++i increments the variable i by one and yields the incremented value:

++i * 3 + (e - h)

In this example, the expression modifies the internal state of the program: whatever value i had before, afterwards the value stored in i is larger by one. This causes no problem in the expression above. C's precedence rules state that ++ binds more tightly than *, so the meaning of ++i*3 is clearly defined, and it does not matter whether ++i*3 is computed before or after e-h. But what if the expression looks as follows:

++i * 3 + (e - h - i)

Which value of i would be used in e-h-i? The one before or the one after incrementing i? Suddenly, ordering becomes important. (In C this expression in fact has undefined behavior, because i is modified and read without any defined order between the two.)

Since ordering matters for state changes, and since the order of computations within an expression is only partially defined, most programming languages provide statements in addition to expressions. Statements handle (among other things) the ordering. The order in which statements are executed is clearly defined, so the order of state changes, whether they happen between statements or from within expressions, becomes defined as well.

Let us go back to the example ++i * 3 + (e - h - i) from above. In C, this expression with undefined behavior can be rewritten, for example, as follows:

++i; // a statement consisting of a single state-changing expression
z = i * 3 + (e - h - i); // another statement, executed afterwards

Here it is clearly defined that both occurrences of i in the expression i * 3 + (e - h - i) use the incremented value, because the sequence of statements defines the ordering. In contrast, the example expression ++i * 3 + (e - h - i) could have also been re-written as:

y = e - h - i; // still using the old value of i
z = ++i * 3 + y; // second statement with the state change

Here, the ordering makes it clear that e-h-i uses the value of i before the state change, while in ++i*3 the new value is used.
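The two rewrites can be compared directly with a small C sketch. The function names increment_first and read_old_value_first are hypothetical wrappers added for illustration; the statements inside them are the ones shown above.

```c
/* Variant 1: increment first, so both reads of i see the new value. */
static int increment_first(int i, int e, int h)
{
    ++i;
    return i * 3 + (e - h - i);
}

/* Variant 2: compute e - h - i with the old value, then increment. */
static int read_old_value_first(int i, int e, int h)
{
    int y = e - h - i;
    return ++i * 3 + y;
}
```

With i = 2, e = 10 and h = 3, the first variant yields 13 and the second yields 14. Both are well defined; the statement order decides which value of i each subexpression sees.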

As mentioned initially, statements are often also about conditions under which computations (including state changes) are performed. In C and related languages there are conditional statements (if, switch, ...) and loop statements (for, while, do, ...) that define if and how often certain computations are performed.

As others have mentioned, some languages solve these problems differently, for example by having if expressions rather than if statements. Others solve the ordering problem in completely different ways (for example, the IO monad in Haskell, which defines the order of changes to state outside the program).
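C itself offers a small taste of the expression-based approach: besides the if statement it has the conditional operator ?:, which selects a value inside an expression. The following sketch contrasts the two forms; the function names are hypothetical and chosen only for this illustration.

```c
/* Statement form: the if statement decides which assignment is executed. */
static int abs_with_statement(int x)
{
    int result;
    if (x < 0)
        result = -x;
    else
        result = x;
    return result;
}

/* Expression form: the conditional operator yields a value directly. */
static int abs_with_expression(int x)
{
    return (x < 0) ? -x : x;
}
```

Both functions return the same value for the same input. Note that ?: is one of the few C operators with a defined order: the condition is evaluated first, and only the selected operand is evaluated afterwards.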

posted almost 2 years ago

CC BY-SA 4.0


Dirk Herrmann‭
