Parent
What are statements and expressions?
When I have tried to read technical explanations of the syntax rules for programming languages, and when I am trying to decipher error messages, I often encounter the terms expression and statement. It comes across that these two are related to each other somehow.
I understand that these terms have something to do with the actual code written in a programming language - not, for example, special sorts of values calculated by the program when it runs - right? But what do they mean exactly? How can I use these concepts to improve my understanding of a programming language?
Post
The use of the terms expression and statement can vary between programming languages. However, the following distinction is widely used:
Expressions are syntactic forms that allow software authors to describe computations. They intentionally impose only few constraints with respect to ordering.
Statements are syntactic forms that allow software authors to define the order / sequence and conditions in which computations and state changes take place.
An early description of this distinction between expressions and statements and their respective uses is found in the Report on the Algorithmic Language ALGOL 60:
The basic concept used for the description of calculating rules is the well-known arithmetic expression containing as constituents numbers, variables, and functions. From such expressions are compounded, by applying rules of arithmetic composition, self-contained units of the language - explicit formulae - called assignment statements.
To show the flow of computational processes, certain nonarithmetic statements and statement clauses are added which may describe, e.g., alternatives, or iterative repetitions of computing statements.
The concepts of ALGOL have influenced many later languages (https://en.wikipedia.org/wiki/ALGOL, https://en.wikipedia.org/wiki/Generational_list_of_programming_languages).
Looking at the ordering aspect first, assume the following expression (which could be syntactically valid in C, Java, Python, Haskell and many other programming languages if the respective names a-h are properly declared):
f(a + b) + g(c + d) * (e - h)
In many languages, including C, the language specification does not define whether a+b has to be computed before c+d or e-h. It is also left undefined whether the function f will be called before g. This is intentionally left unspecified to give the compiler or interpreter the freedom to choose an order that performs well. (Some languages, such as Java and Python, do prescribe left-to-right evaluation of operands; C and C++ do not.) Some rules exist: precedence is defined (* binds more tightly than +, computations in parentheses take precedence, ...), but these are not really ordering rules; they define the meaning of the expression. In fact, a compiler could re-arrange the expression by applying valid algebraic transformations.
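The point about call order can be made observable with a small C sketch. The helpers f and g, the call_log buffer and the functions evaluate and last_call_order are hypothetical names introduced only for this illustration. Each helper records that it was called, so the log shows which order a given compiler happened to pick, while the computed value stays the same either way.

```c
/* Hypothetical helpers: each records its own call, then returns a value. */
static char call_log[8];
static int call_count = 0;

static int f(int x) { call_log[call_count++] = 'f'; return x; }
static int g(int x) { call_log[call_count++] = 'g'; return 2 * x; }

/* Evaluates the example expression. C does not specify whether f or g is
   called first, so call_log may end up as "fg" or "gf"; the returned value
   is the same in both cases because f and g do not depend on each other. */
int evaluate(int a, int b, int c, int d, int e, int h) {
    call_count = 0;
    return f(a + b) + g(c + d) * (e - h);
}

/* Returns the recorded call order as a NUL-terminated string. */
const char *last_call_order(void) {
    call_log[call_count] = '\0';
    return call_log;
}
```

Calling evaluate(1, 2, 3, 4, 10, 8) yields 3 + 14 * 2 = 31 regardless of the order; only the string returned by last_call_order() may differ between compilers or optimization levels.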
For pure mathematical computations the ordering would have no impact on the result of the computation. However, as soon as state changes are possible, the order becomes important. Consider the following expression from C (and a few other languages which have adopted the concept from C), where ++i means that the variable i is incremented by one and the incremented value is used:
++i * 3 + (e - h)
In this example, the internal state of the program is modified by the expression: whatever value i had before, afterwards the value stored in i is larger by one. This is not a problem in the expression above: due to C's precedence rules it is known that ++ binds more tightly than *. Thus, with the meaning of ++i*3 being clearly defined, it is not relevant whether ++i*3 is computed before e-h. But what if the expression looks as follows:
++i * 3 + (e - h - i)
Which value of i would be used in e-h-i? The one before or the one after incrementing i? Suddenly, ordering becomes important (the expression is in fact invalid in C as it has undefined behavior because of this ambiguity).
Since ordering is important when it comes to state changes, and since for expressions the ordering of computations is only partially defined, most programming languages define, in addition to expressions, statements, which handle (among other things) the ordering. The order in which statements are executed is clearly defined, so the order of state changes that occur in different statements, including those caused by expressions inside them, becomes defined as well.
Let us go back to the example ++i * 3 + (e - h - i) from above. In C, this expression with undefined behavior can be re-written, for example, as follows:
++i; // a statement consisting of a single state-changing expression
z = i * 3 + (e - h - i); // another statement, executed afterwards
Here it is clearly defined that both occurrences of i in the expression i * 3 + (e - h - i) use the incremented value, because the sequence of statements defines the ordering. Alternatively, the example expression ++i * 3 + (e - h - i) could have been re-written as:
y = e - h - i; // still using the old value of i
z = ++i * 3 + y; // second statement with the state change
Here, the ordering makes it clear that e-h-i uses the value of i before the state change, while in ++i*3 the new value is used.
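To see that the two rewrites really are different programs, here is a minimal C sketch that wraps each one in a function. The function names increment_first and read_old_value_first are made up for this illustration, and i is passed by value so that each call starts from the same state.

```c
/* Rewrite 1: increment first, so both uses of i see the new value. */
int increment_first(int i, int e, int h) {
    ++i;                          /* statement 1: the state change */
    return i * 3 + (e - h - i);   /* statement 2: uses the new i twice */
}

/* Rewrite 2: read the old value of i first, then increment. */
int read_old_value_first(int i, int e, int h) {
    int y = e - h - i;            /* statement 1: still the old value of i */
    return ++i * 3 + y;           /* statement 2: the state change */
}
```

With i = 1, e = 10 and h = 2, the first version computes 2 * 3 + (10 - 2 - 2) = 12, while the second computes 2 * 3 + (10 - 2 - 1) = 13. Both are well defined; the original single expression is not.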
As mentioned initially, statements are often also about the conditions under which computations (including state changes) are performed. In C and related languages there are conditional statements (if, switch, ...) and loop statements (for, while, do, ...) that define whether and how often certain computations are performed.
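As a small illustration of statements controlling whether and how often a computation happens, consider the following C sketch. The function sum_of_odds is a hypothetical example, not taken from any library.

```c
/* Sums the odd numbers in 1..n.
   The for statement decides how often the body runs; the if statement
   decides whether the state change (sum += k) happens in a given round. */
int sum_of_odds(int n) {
    int sum = 0;
    for (int k = 1; k <= n; ++k) {
        if (k % 2 == 1) {
            sum += k;
        }
    }
    return sum;
}
```

The expressions k <= n, k % 2 == 1 and sum += k describe the computations; the surrounding statements fix their order and conditions. For example, sum_of_odds(5) adds 1, 3 and 5 and returns 9.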
As others have mentioned, there are languages which solve these problems differently, for example by having if expressions rather than if statements, or by solving the ordering problem in completely different ways (such as the IO monad in Haskell, which defines the order of changes to program-external state).
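Even C offers a limited taste of a choice written as an expression: its conditional operator ?: yields a value, whereas the if statement only selects which statements run. The sketch below contrasts the two; the function names are invented for this illustration.

```c
/* Choice as a statement: the if statement selects which assignment runs. */
int max_with_statement(int a, int b) {
    int m;
    if (a > b) {
        m = a;
    } else {
        m = b;
    }
    return m;
}

/* Choice as an expression: ?: produces a value that can be used directly. */
int max_with_expression(int a, int b) {
    return (a > b) ? a : b;
}
```

Both functions return the same result for every input. In languages with if expressions, the second style is the normal way to write a conditional.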