
Comments on Simultaneous comparison in Python

Parent

Simultaneous comparison in Python

+1
−0

I want to make multiple comparisons at once, of the same type, in a Python program. For example, to check whether all of a certain group of strings are in a longer test string; or whether a specific variable is equal to any of some test values; etc.

I discovered that these naive approaches won't work:

if my_name and your_name in email:
    print("The email is about both of us")

if cheese == "cheddar" or "edam" or "havarti":
    print("Yum!")

The other Q&A explains why not. The question now is: what should I do instead, in Python? How can I write code that does these sorts of comparisons - and more generally, how can I figure out how to write the code?

And what if I want multiple possibilities on both sides of the comparison, or have the possibilities stored in a list (or other sequence)? Are there any special cases or other tricks?


Post
+2
−0

Using and and or correctly

Each side of and and or must be a complete expression in its own right; here, each side should be a full comparison. So we repeat the comparison on each side:

if my_name in email and your_name in email:
    print("The email is about both of us")

if cheese == "cheddar" or cheese == "edam" or cheese == "havarti":
    print("Yum!")
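To see the difference concretely, here is a small sketch using made-up sample values (the strings below are assumptions chosen for illustration, not taken from the question). It evaluates the naive and the corrected conditions side by side:

```python
# Hypothetical sample data, chosen only for illustration.
my_name = "Alice"
your_name = "Bob"
email = "Hi Bob, lunch on Friday?"
cheese = "gouda"

# Naive forms: and/or combine whole expressions, not the values being compared.
# Parsed as: my_name and (your_name in email)
naive_both = bool(my_name and your_name in email)
# Parsed as: (cheese == "cheddar") or "edam" or "havarti"; "edam" is truthy.
naive_yum = bool(cheese == "cheddar" or "edam" or "havarti")

# Corrected forms: repeat the comparison on each side.
fixed_both = my_name in email and your_name in email
fixed_yum = cheese == "cheddar" or cheese == "edam" or cheese == "havarti"

print(naive_both, fixed_both)  # True False
print(naive_yum, fixed_yum)    # True False
```

With this data the naive versions both report a match that isn't really there, while the corrected versions do not.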

The general approach: all and any

If you have many values to test against, it could get annoying to repeat the comparison each time. Also, sometimes you don't know in advance how many values there are to compare against - for example, you might want to create a list of those values somewhere else in the program, and then compare to everything in the list (without knowing its length).

Python provides built-in functions called all and any which help with this task.

all is analogous to and (all([a, b, c]) works like a and b and c):

Help on built-in function all in module builtins:

all(iterable, /)
    Return True if bool(x) is True for all values x in the iterable.
    
    If the iterable is empty, return True.

any is analogous to or (any([a, b, c]) works like a or b or c):

Help on built-in function any in module builtins:

any(iterable, /)
    Return True if bool(x) is True for any x in the iterable.
    
    If the iterable is empty, return False.

The described results for empty inputs might seem counterintuitive, but they're mathematically valid. If unicorns don't exist, then we can make any generalization we like about "all unicorns", because there won't be anything to contradict us; but we can't say "there is some unicorn that..." because we are already defeated before we consider the restriction.
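As a quick sanity check, the sketch below exercises all and any on a few literal lists, including empty ones. It also shows one small way the analogy with and/or is loose: all and any always give back a plain bool, while chained and/or give back one of the operands. The sample values are arbitrary.

```python
# all() and any() always return a plain bool.
all_truthy = all([1, "x", [0]])    # every element is truthy
any_truthy = any([0, "", None])    # no element is truthy

# Chained and/or return one of the operands instead.
chained_and = 1 and "x" and [0]    # the last operand, [0]
chained_or = 0 or "" or None       # the last operand, None

# Empty inputs: "all unicorns..." holds, "some unicorn..." does not.
empty_all = all([])
empty_any = any([])

print(all_truthy, any_truthy)    # True False
print(chained_and, chained_or)   # [0] None
print(empty_all, empty_any)      # True False
```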

Naively, this only lets us avoid repeating the operator:

if all([my_name in email, your_name in email]):
    print("The email is about both of us")

if any([cheese == "cheddar", cheese == "edam", cheese == "havarti"]):
    print("Yum!")

This probably doesn't seem like much of a benefit, since the operators had to be replaced with commas anyway.

But the real power comes when we use a generator expression to create the input sequences:

if all(name in email for name in (my_name, your_name)):
    print("The email is about both of us")

if any(cheese == kind for kind in ("cheddar", "edam", "havarti")):
    print("Yum!")

Python gives us this expressive way to describe, abstractly, the group of values that need to be checked with all or any (i.e., as if they had ands or ors, respectively, written in between them).

Meanwhile, the inner logic of all and any will automatically return as soon as the answer is known - it can "short circuit" the same way that hard-coded and and or operators do. (TODO: make sure the Q&A about and and or discusses this!) Of course, if we pass lists (whether we create them "manually" or with a list comprehension), then the whole list has to be created first anyway, which defeats the purpose. But with generator expressions, we can preserve the short-circuiting. Python will only evaluate the generator as far as is needed. (TODO: links for more Q&A about these concepts)
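The short-circuiting difference can be observed directly. In this sketch, is_positive is a hypothetical helper that records every value it is asked about, so we can see how far all actually got with a generator expression versus a list comprehension:

```python
calls = []

def is_positive(n):
    """Hypothetical check that records each call, so we can count evaluations."""
    calls.append(n)
    return n > 0

numbers = [3, -1, 7, 9]

# Generator expression: all() stops at the first falsey result.
calls.clear()
result_gen = all(is_positive(n) for n in numbers)
checked_with_generator = list(calls)

# List comprehension: every element is evaluated before all() even starts.
calls.clear()
result_list = all([is_positive(n) for n in numbers])
checked_with_list = list(calls)

print(result_gen, checked_with_generator)  # False [3, -1]
print(result_list, checked_with_list)      # False [3, -1, 7, 9]
```

Both calls give the same answer; only the amount of work differs.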

de Morgan's laws with all and any

As noted, all is analogous to and, and any is analogous to or. This entails that a form of de Morgan's laws applies to them. We can:

  • negate each input element;
  • swap all for any or vice-versa;
  • and then negate the result

to get an equivalent expression.

For example, if we want to know whether any of our balls is not red(), this is the same as finding out whether not all of them are red(). And with the generator-expression trick, there's only one place to write the negation for the inputs:

# one way
any(not red(ball) for ball in balls)

# equivalent!
not all(red(ball) for ball in balls)

Notice how naturally it reads.
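To convince yourself of the equivalence, it can be checked on a handful of inputs, including the empty case. In this sketch, red is a hypothetical predicate and each "ball" is simply a colour string; both are assumptions made for the example.

```python
def red(ball):
    """Hypothetical predicate: here a ball is just a colour string."""
    return ball == "red"

def de_morgan_holds(balls):
    """Check both de Morgan forms for one collection of balls."""
    some_not_red = any(not red(ball) for ball in balls)
    not_all_red = not all(red(ball) for ball in balls)

    # The mirrored form: "all are not red" versus "not any is red".
    none_red = all(not red(ball) for ball in balls)
    not_any_red = not any(red(ball) for ball in balls)

    return some_not_red == not_all_red and none_red == not_any_red

sample_collections = [[], ["red"], ["blue"], ["red", "red"], ["red", "blue"]]
results = [de_morgan_holds(balls) for balls in sample_collections]
print(results)
```

Every entry in results should be True, whatever collections are tried.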

Using all and any "two-dimensionally"

A generator expression can also use multiple for clauses to iterate over all the pairs from the left-hand side and right-hand side of the comparison:

botanical_fruits = ['raspberry', 'strawberry', 'tomato']
culinary_vegetables = ['radish', 'spinach', 'tomato']
# is any fruit a vegetable (i.e., equal to some vegetable)?
any(f == v for f in botanical_fruits for v in culinary_vegetables)

However, for plain equality comparisons like this one, nested iteration is rarely the best tool.

In the above example, what we're really trying to figure out is whether the two lists have any overlap - i.e., whether there's some value that's in both.

A simpler and more efficient approach is to use sets instead, and see about their intersection:

botanical_fruits = {'raspberry', 'strawberry', 'tomato'}
culinary_vegetables = {'radish', 'spinach', 'tomato'}
# Empty sets are "falsey"; all others are "truthy".
bool(botanical_fruits.intersection(culinary_vegetables))
# (Or we could e.g. check the `len()` of the result.)

This can't short-circuit, but it takes O(N) time on average instead of the O(N^2) worst case of the nested loops (Python's hash-based sets can normally be built and intersected in linear time). And of course empty sets are automatically handled correctly.
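If only a yes/no answer is needed, the set method isdisjoint is another option worth knowing: it answers "is there no overlap?" directly, without building the intersection set, so an implementation is free to stop at the first shared element. A minimal sketch with the same example data:

```python
botanical_fruits = {'raspberry', 'strawberry', 'tomato'}
culinary_vegetables = {'radish', 'spinach', 'tomato'}

# The & operator on two sets is the same as .intersection().
shared = botanical_fruits & culinary_vegetables
has_overlap = bool(shared)

# isdisjoint() returns a bool without constructing a new set.
has_overlap_alt = not botanical_fruits.isdisjoint(culinary_vegetables)

print(shared)                        # {'tomato'}
print(has_overlap, has_overlap_alt)  # True True
```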

On the other hand, checking whether every value is equal to every other value is... for normal types, just checking whether all the values are the same. So you could just reorganize them to put a single element on one side of the comparison, and use all "normally". (Be careful of the case where there are no values on either side! Then you won't have a comparison value to use.) Or you could put them all in a set and check if the result has at most a single value (as an exercise, convince yourself that "at most" is correct).
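Both suggestions might look like the following sketch. The function names are made up for this example, and treating an empty input as "all equal" is a choice made here, not something the language dictates. The set-based version additionally assumes the values are hashable.

```python
def all_equal_via_first(values):
    """Compare every value against the first one."""
    values = list(values)
    if not values:
        return True  # nothing to compare against, so nothing can differ
    first = values[0]
    return all(v == first for v in values)

def all_equal_via_set(values):
    """Hashable values only: duplicates collapse, so at most one value remains."""
    return len(set(values)) <= 1

for sample in ([], [1], [1, 1, 1], [1, 2, 1]):
    print(sample, all_equal_via_first(sample), all_equal_via_set(sample))
```

The empty list is why "at most one" is the right test: set([]) has zero elements, not one.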

Other special cases

Depending on the operation and on the types of the individual values being compared, there may be a more efficient, clever or simple way to do it. These could involve replacing comparisons with in, using set operations, using regular expressions, and more.
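A few such shortcuts are sketched below, as a rough illustration rather than a complete table. All of the sample values (cheese, filename, email and the names) are assumptions invented for the example; the calls themselves are standard Python.

```python
import re

# Hypothetical sample data.
cheese = "edam"
filename = "report.csv"
email = "Alice and Bob are invited"

# Equal to any of several values: use `in` with a tuple (or a set).
is_known_cheese = cheese in ("cheddar", "edam", "havarti")

# Starts or ends with any of several strings: these methods accept a tuple.
is_data_file = filename.endswith((".csv", ".tsv"))

# Contains any of several substrings: a regular expression with alternation.
mentions_someone = re.search(r"Alice|Carol", email) is not None

# All required words present: a subset test on sets.
# Note this matches whole whitespace-separated words, not substrings.
required = {"Alice", "Bob"}
mentions_everyone = required <= set(email.split())

print(is_known_cheese, is_data_file, mentions_someone, mentions_everyone)
```

Which of these applies depends on whether the comparison is equality, prefix/suffix matching, substring search or whole-item membership.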

(TODO: start a table of common tricks - perhaps with links to other Q&A as they're asked).


1 comment thread

hkotsubo‭ wrote 2 months ago

if any(cheese == kind for kind in ("cheddar", "edam", "havarti")) is a bit overkill. For this specific case, you could simply do if cheese in ("cheddar", "edam", "havarti").

Using any would be suitable for non-exact matches, such as looking for a phrase that contains the word:

cheese = 'cheddar'
if any(cheese in phrase for phrase in ("I like cheddar", "I don't like edam", "I like havarti")):
    print("Yum!")

Or any other cases where the condition is not a simple == comparison. But for exact matches (AKA "check if the string is one of the values"), any isn't necessary.

Regarding all(name in email for name in (my_name, your_name)), I think I'd use this only if there are "a lot" of names to check, or in a function that receives the list of names, so I don't know in advance how many there will be.

But if it's a fixed list of just 2 or 3 names, using my_name in email and your_name in email won't hurt, IMO.

Karl Knechtel‭ wrote 2 months ago

I agree that any and all are overkill here - they're general-purpose tools, meant to illustrate that there's always that fall-back option.

The best approaches for various simple cases are... varied, because there are a lot of them. I didn't get to filling out the table because it's a daunting task. Perhaps it would be better to have separate answers to describe the most common simple cases, since a table summary might not make it obvious how to apply other techniques.