

Comments on What are statements and expressions?

Parent

What are statements and expressions?

+12
−0

When I have tried to read technical explanations of the syntax rules for programming languages, and when I am trying to decipher error messages, I often encounter the terms expression and statement. It comes across that these two are related to each other somehow.

I understand that these terms have something to do with the actual code written in a programming language - not, for example, special sorts of values calculated by the program when it runs - right? But what do they mean exactly? How can I use these concepts to improve my understanding of a programming language?


1 comment thread

A friendly challenge (2 comments)
Post
+0
−1

The use of the terms expression and statement could vary between programming languages. However, the following distinction is widely used:

Expressions are syntactic forms that allow software authors to describe computations. They intentionally impose only a few constraints on the order of evaluation.

Statements are syntactic forms that allow software authors to define the order and the conditions under which computations and state changes take place.
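As a minimal sketch of the two categories in C (the function and variable names are hypothetical, chosen only for illustration), the fragment below marks which parts are expressions and which are statements:

```c
/* Hypothetical illustration; none of these names come from the post. */
int area_plus_border(int width, int height)
{
    int area;                 /* a declaration */

    /* Expression statement: the expression `area = width * height`
       followed by `;`. The right-hand side `width * height` is itself
       an expression that describes a computation. */
    area = width * height;

    /* if statement: decides whether the nested statement runs at all.
       The condition `area > 100` is an expression. */
    if (area > 100) {
        area = area + 10;
    }

    /* return statement containing the expression `area`. */
    return area;
}
```

Calling area_plus_border(2, 3) yields 6, while area_plus_border(20, 10) yields 210, because only in the second call does the if statement let the extra assignment run.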

An early description of this distinction between expressions and statements and their respective uses is found in the Report on the Algorithmic Language ALGOL 60:

The basic concept used for the description of calculating rules is the well-known arithmetic expression containing as constituents numbers, variables, and functions. From such expressions are compounded, by applying rules of arithmetic composition, self-contained units of the language - explicit formulae - called assignment statements.

To show the flow of computational processes, certain nonarithmetic statements and statement clauses are added which may describe, e.g., alternatives, or iterative repetitions of computing statements.

The concepts of ALGOL have influenced many later languages (https://en.wikipedia.org/wiki/ALGOL, https://en.wikipedia.org/wiki/Generational_list_of_programming_languages).

Looking at the ordering aspect first, assume the following expression (which could be syntactically valid in C, Java, Python, Haskell and many other programming languages if the respective names a-h are properly declared):

f(a + b) + g(c + d) * (e - h)

In C and C++, the language specification does not define whether a+b has to be computed before c+d or e-h, nor whether the function f is called before g. (Some other languages, such as Java, do prescribe left-to-right evaluation of operands.) Where the order is left unspecified, this is intentional: it gives the compiler or interpreter the freedom to choose an order that performs well. Some rules do exist: precedence is defined (* binds more tightly than +, parenthesized subexpressions are grouped first, ...), but these are not really ordering rules; they define the meaning of the expression. In fact, a compiler may rearrange the expression by applying valid algebraic transformations.
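One way to observe this in C is to let the functions record when they are called. The sketch below is only an illustration: f and g are hypothetical stand-ins with arbitrary bodies, and call_log, evaluate and last_call_order are names invented here.

```c
#include <string.h>

/* Hypothetical call log: each helper appends its own name. */
static char call_log[8];

int f(int x) { strcat(call_log, "f"); return x; }
int g(int x) { strcat(call_log, "g"); return 2 * x; }

int evaluate(int a, int b, int c, int d, int e, int h)
{
    call_log[0] = '\0';   /* start with an empty log */

    /* C does not specify whether f or g is called first here;
       only the grouping given by precedence is fixed. */
    return f(a + b) + g(c + d) * (e - h);
}

/* Returns "fg" or "gf", depending on the order the compiler chose. */
const char *last_call_order(void) { return call_log; }
```

With these definitions, evaluate(1, 2, 3, 4, 10, 8) always yields 31, because neither helper changes state that the other one reads. The log, however, may legitimately read either "fg" or "gf".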

For pure mathematical computations the ordering would have no impact on the result of the computation. However, as soon as state changes are possible, the order becomes important. Consider the following expression from C (and a few other languages that have adopted the concept from C), where ++i means that the variable i is incremented by one and the incremented value is used:

++i * 3 + (e - h)

In this example, the expression modifies the internal state of the program: whatever value i had before, the value stored in i afterwards is larger by one. This does not cause a problem in the expression above: by C's precedence rules, ++ binds more tightly than *. Thus, with the meaning of ++i*3 clearly defined, it does not matter whether ++i*3 is computed before e-h. But what if the expression looks as follows:

++i * 3 + (e - h - i)

Which value of i would be used in e-h-i? The one before or the one after incrementing i? Suddenly, ordering becomes important. (In fact, the expression has undefined behavior in C because of this ambiguity: i is modified and read without any defined order between the two.)

Since ordering is important when it comes to state changes, and since the order of computations within expressions is only partially defined, most programming languages define, in addition to expressions, statements, which handle (among other things) ordering. The order in which statements are executed is clearly defined, so the order of state changes, whether they happen in separate statements or inside the expressions those statements contain, becomes defined as well.

Let us go back to the example ++i * 3 + (e - h - i) from above. In C, this expression with undefined behavior can be rewritten in the following ways:

++i; // a statement consisting of a single state-changing expression
z = i * 3 + (e - h - i); // another statement, executed afterwards

Here it is clearly defined that both occurrences of i in the expression i * 3 + (e - h - i) use the incremented value, because the sequence of statements defines the ordering. In contrast, the example expression ++i * 3 + (e - h - i) could have also been re-written as:

y = e - h - i; // still using the old value of i
z = ++i * 3 + y; // second statement with the state change

Here, the ordering makes it clear that e-h-i uses the value of i before the state change, while in ++i*3 the new value is used.
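The two rewrites can be wrapped into small functions to compare them side by side. This is a sketch under assumptions: the function names increment_first and increment_last are hypothetical, and i, e and h are passed as int parameters.

```c
/* Rewrite 1: increment first, so both uses of i see the new value. */
int increment_first(int i, int e, int h)
{
    int z;
    ++i;                        /* state change in its own statement */
    z = i * 3 + (e - h - i);    /* executed afterwards */
    return z;
}

/* Rewrite 2: read the old value of i first, then increment. */
int increment_last(int i, int e, int h)
{
    int y, z;
    y = e - h - i;              /* still uses the old value of i */
    z = ++i * 3 + y;            /* i is modified once and not read elsewhere */
    return z;
}
```

For i = 1, e = 10 and h = 2, increment_first yields 2 * 3 + (10 - 2 - 2) = 12, while increment_last yields 2 * 3 + (10 - 2 - 1) = 13. The two rewrites are therefore not interchangeable; the sequence of statements selects which meaning is intended.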

As mentioned initially, statements are often also about conditions under which computations (including state changes) are performed. In C and related languages there are conditional statements (if, switch, ...) and loop statements (for, while, do, ...) that define if and how often certain computations are performed.
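A short sketch of how such statements control whether and how often a computation runs (the function name sum_of_evens is hypothetical):

```c
/* Adds up the even numbers from 1 to n. */
int sum_of_evens(int n)
{
    int sum = 0;

    /* Loop statement: determines how often the body is executed. */
    for (int i = 1; i <= n; ++i) {
        /* Conditional statement: determines whether the state
           change `sum += i` happens in this iteration. */
        if (i % 2 == 0) {
            sum += i;
        }
    }
    return sum;
}
```

For example, sum_of_evens(10) adds 2 + 4 + 6 + 8 + 10 and yields 30.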

As others have mentioned, there are languages that solve these problems differently, for example by having if expressions rather than if statements, or even by solving the ordering problem in completely different ways (such as the IO monad in Haskell, which defines the order of changes to state external to the program).
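C itself has no if expression, but its conditional operator ?: is the closest analogue and gives a feeling for the difference. A minimal sketch, with hypothetical function names:

```c
/* Selection written as an if statement: each branch is a statement,
   and the result has to be stored through an assignment. */
int abs_with_statement(int x)
{
    int result;
    if (x < 0) {
        result = -x;
    } else {
        result = x;
    }
    return result;
}

/* Selection written as an expression: the conditional operator
   itself produces the value. */
int abs_with_expression(int x)
{
    return x < 0 ? -x : x;
}
```

Both functions compute the same result; the second one simply expresses the choice as a value rather than as a sequence of state changes.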


1 comment thread

This is arguably not a distinction and certainly not a widely used one
Derek Elkins‭ wrote 9 months ago

The distinction you mention 1) is not the main (or arguably even a) distinction between statements and expressions, 2) is certainly not widely used, and 3) has, as you mention yourself, plenty of exceptions.

Most programming languages specify the evaluation order of subexpressions; some, such as Haskell, don't because it's irrelevant and arguably meaningless; and some, such as C and C++, only partially specify it and make ambiguous cases illegal. Similarly, many (most?) languages allow expressions to specify some amount of control flow. The most recognizable example is the ternary operator of C, which is used in many languages with C-derived syntax. Haskell mostly doesn't have statements, but still has control flow. (There is a syntactic class of statements for do-notation, but do-notation is thin syntactic sugar that desugars to expressions.) Rust allows pretty much every control structure as an expression, including loops.
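To make the "partially specified" order in C concrete: a few operators, among them &&, ||, ?: and the comma operator, do fix the order in which their operands are evaluated. A minimal sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* The left operand of && is evaluated completely before the right one,
   and the right one is skipped when the left one is false. The
   dereference *p is therefore only reached when p is not NULL. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}
```

Here control flow and ordering are both expressed inside a single expression: is_positive(NULL) yields 0 without ever dereferencing the pointer.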