
Why is global evil?

+4
−0

Many languages discourage global variables.

Why is this?


3 answers


+5
−1

The basics of good vs bad program design

All programs are divided into classes. (Or modules/abstract data types/interfaces etc - a rose by any other name.) Each class should only be concerned with its own designated task and not with unrelated parts of the program.

Similarly, each class is autonomous: other unrelated parts of the program do not dictate how the class should do its job internally, nor do they interfere with it by accessing internals of the class directly and meddling with things which are none of their business. Rather, other parts of the program should request that the class does jobs for them, in a manner that they don't know or care about, by only accessing the class through the provided interface.

This is sometimes called loose coupling - a low amount of dependencies across unrelated parts of the program. The opposite is tight coupling, when unrelated parts of the code depend on each other or directly interfere with each other. Tight coupling is a very bad phenomenon because not only does it cause needless, unintuitive dependencies, it also causes bugs to escalate across the program and knock out unrelated parts of it.

For example, let's say that we have vending machine software and there is a bug in the display class, causing the display to go black. If that bug also causes the payment transaction class to act up and start withdrawing the wrong amount of money from the customer, then there is tight coupling between unrelated parts of the program. Because of it, the bug was not restricted to the class where it appeared, but escalated to other parts and therefore caused much more severe problems. If there had been loose coupling, then only the display would have gone black - annoying for the customer, sure, but they could still use the vending machine and get the correct amount of money drawn.

The standard way to ensure loose coupling is to use the object-oriented concept of private encapsulation. Private encapsulation means that only the class itself has access to its own data and no other class can meddle with it, neither intentionally nor accidentally. Most modern programming languages provide means for this through language keywords like private. Older programming languages can often implement it too, but in more crude ways.

Access to privately encapsulated data is done by the class itself as it executes its designated job. But in some cases, we may let the user of the class get a copy of that data through a so-called "getter" function, which is typically just a simple function returning a copy of a private variable. Similarly, we may let the user change the data in a controlled manner through a "setter" function.
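As a minimal sketch of the getter/setter idea, here is how it could look in Python for the vending machine example. The class and function names (`PaymentTransaction`, `Display`, `show_price`) are made up for illustration. Note that Python has no `private` keyword, so the leading underscore is only a convention that signals "internal, hands off".

```python
class PaymentTransaction:
    """Hypothetical payment class for the vending machine example."""

    def __init__(self, price_cents: int) -> None:
        # Leading underscore: private by convention (Python has no `private` keyword).
        self._price_cents = price_cents

    def get_price_cents(self) -> int:
        """Getter: returns a copy of the private value."""
        return self._price_cents

    def set_price_cents(self, price_cents: int) -> None:
        """Setter: the only sanctioned way to change the price, with validation."""
        if price_cents < 0:
            raise ValueError("price must not be negative")
        self._price_cents = price_cents


class Display:
    """Hypothetical display class; it knows nothing about payments."""

    def render(self, text: str) -> str:
        return f"[{text}]"


def show_price(payment: PaymentTransaction, display: Display) -> str:
    # "Payment uses display": the price is read through the getter and
    # handed to the display through its public interface only.
    return display.render(f"{payment.get_price_cents() / 100:.2f}")
```

Here the dependency only goes one way: the payment side asks the display to show something, while `Display` never touches anything inside `PaymentTransaction`.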

Properly designed, our program should only have two kinds of dependencies/coupling remaining: either "class x uses class y" or "class x is a y" (inheritance). At the program design stage we draw out these dependencies and question if they make sense.

For the vending machine example, does it make sense to have "display class uses a payment transaction class"? Surely not - the purpose of a display is to display stuff, it should not know anything else such as transaction business logic. But it might make perfect sense to have "payment transaction uses display", to display the cost.

Spaghetti programming

Another problematic example of program design is when the program flow gets very complex and hard to follow. The classic example of how one can turn the program flow into a nightmare to read and maintain is through excessive use of the "goto" keyword, causing unconditional branching to another place in the program. That was discovered early on in the history of programming, famously through a paper back in 1968: "Go To Statement Considered Harmful" by the esteemed computer scientist/pioneer Edsger Dijkstra.

Programmers have debated this endlessly ever since. The phenomenon where you follow the program counter to one place of the program, only to end up at a different place entirely, was named "spaghetti programming": the program flow is compared with the tangled strands on a plate of spaghetti. Basically a form of chaos. The consensus among programmers has landed somewhere around: spaghetti programming is always bad, but the goto keyword as such does not always create spaghetti programming.

However, many programming languages provide alternatives to goto which are just as effective as goto for the purpose of creating spaghetti programming: all manner of branching, breaking/resuming loops, returning from subroutines, complex uses of exception handling, etc.

One particularly nasty way of doing so is to have global state variables shared across several classes/modules and then change those variables from all over the place. In this case the spaghetti isn't the program counter jumping back and forth, but rather the value of the global variable ("stateghetti/flagghetti"). This is perhaps the most effective way of all to create severe tight coupling and general chaos in a program.

A design with private encapsulation is the best way to avoid that problem, or at least reduce the problem to a local one, inside one particular class.
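To make the "stateghetti" problem a bit more concrete, here is a small Python sketch, again using invented names for the vending machine. The first half shares module-level state between display and payment code; the second half keeps the state inside one class and passes the result along explicitly.

```python
# Anti-pattern: module-level state that any function may change.
shared_state = {"mode": "idle"}


def display_error() -> None:
    # Display code quietly changes state that payment code also reads.
    shared_state["mode"] = "error"


def charge_customer(price_cents: int) -> int:
    # The result depends on whoever last wrote to shared_state.
    return 0 if shared_state["mode"] == "error" else price_cents


# Encapsulated alternative: the state lives inside one class.
class Machine:
    def __init__(self) -> None:
        self._mode = "idle"  # private by convention

    def report_fault(self) -> None:
        self._mode = "error"

    def is_operational(self) -> bool:
        return self._mode != "error"


def charge(machine_ok: bool, price_cents: int) -> int:
    # Everything this function depends on is visible in its signature.
    return price_cents if machine_ok else 0
```

In the first version nothing in the signature of `charge_customer` tells you that a display fault changes what the customer pays. In the second version that dependency has to be passed in on purpose.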

Namespace clutter

Another issue with global variables, or global identifiers in general, is that they are shared across the whole program, meaning that their particular name gets reserved all over it. We may also get name collisions when two different identifiers have the same name, often referred to as "namespace collisions", resulting in compiler and/or linker errors. The term "namespace clutter" is about needlessly "polluting" the global namespace of the program with identifiers when there is no real reason for it. If a variable is to be used by one class only, then we can reduce namespace clutter with private encapsulation.

Perhaps the most infamous example is the two library functions named read and write in *nix and other operating systems. The names are so generic and poorly picked that they easily collide with other identifiers in user applications.

But to have functions "pollute" the global namespace isn't that severe. You get a compiler/linker error, grumble a bit and then rename your read function to something better, end of story. With variables it quickly gets more severe, because they can be directly changed with read/write access. So in some scenarios it might be possible that other parts of the program write to a variable by accident. Or more likely, someone starts to write to it on purpose, and then the tight coupling circus starts.

The general good practice to combat these problems is to reduce scope. Declare variables as locally as possible and use private encapsulation. That way, only the parts of the code which need to access the variable get access to it.

Thread safety

Yet another issue with global variables is that they aren't safe to access directly in programs utilizing multi-threading/multi-processing/parallelism/interrupts and similar. In such cases the problem isn't as much the global aspect of it, but rather that there only exists one single instance of the variable and access to it is unlikely to be atomic (non-interruptible access). Meaning that if two threads do so at the same time, we get so-called race condition bugs.

In this case, simply making the variable private isn't necessarily the fix. You either need to ensure that each instance of the class has its own caller-allocated copy of the variable, or you need to protect the variable with whatever thread safety protection measures your system provides: semaphores/mutex/atomic access/critical section/disabling interrupts etc.

Now if you have only one instance of the variable across multiple instances of the class, the previously mentioned setter/getter functions can be given an additional purpose. Not only can you use them as means to reduce coupling dependencies and namespace clutter etc, you can also use them as "wrapper" functions for the thread safety mechanisms. Because just like the variable itself is no business of another class/caller, neither is the thread safety mechanism. It too should be privately encapsulated if possible.
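A rough sketch of that wrapper idea in Python, using `threading.Lock` from the standard library. The `SalesCounter` class and the `run_workers` helper are hypothetical names picked for this example. The lock is a private detail of the class, so callers never see or handle it.

```python
import threading


class SalesCounter:
    """Hypothetical shared counter; the lock is a private implementation detail."""

    def __init__(self) -> None:
        self._count = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write is done while holding the lock, so two
        # threads cannot interleave in the middle of it.
        with self._lock:
            self._count += 1

    def get_count(self) -> int:
        # Getter doubling as a wrapper around the thread safety mechanism.
        with self._lock:
            return self._count


def run_workers(counter: SalesCounter, workers: int = 4, per_worker: int = 1000) -> int:
    """Increment the counter from several threads and return the final count."""

    def work() -> None:
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.get_count()
```

If the locking strategy later has to change, only `SalesCounter` needs to be touched; the calling code stays the same.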

Conclusion

From all the above examples, we can see how the use of global variables can create many different, severe problems, where the most serious one is perhaps rampant escalation of errors throughout the program, so that modifying one part of the code causes a completely unrelated part of it to fail.

Global variables should almost never be used. In most cases they should get encapsulated inside classes. In some cases they should perhaps get declared at the top application tier of the program, from where they either form the lower tiers (class instances) or get passed to the lower tiers.

It should be possible to draw the ideal dependency graph of the program so that it looks like an umbrella or a binary search tree, with the top tier entry point at the very top, and all dependencies pointing downwards towards the lowest tiers where pure algorithms, library functions or drivers sit.

If the dependency graph rather looks like a crossword puzzle or a plate of spaghetti, then global variables are one of the most likely reasons for it. And problems are dead certain to follow, sooner or later.


+6
−0

Global variables make the code hard to reason about

This is especially visible when debugging. Say you have a function which errors. The stack trace tells you where the function got its arguments, but not who last modified the global variables it might have read. They could have been modified literally anywhere in the program, perhaps asynchronously. Reconstructing the chain of events that led to this particular erroneous state may be next to impossible.

Use function arguments for passing everything. Keep your functions referentially transparent, i.e. the same arguments should always produce the same output. (I recommend looking into functional programming more broadly too while you're at it.)
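A small illustration in Python, with made-up names (`settings`, `price_with_tax_global`, `price_with_tax`): the first function reads module-level state, so its result can change between two identical calls, while the second depends only on its arguments.

```python
# Module-level state that anything in the program may modify.
settings = {"tax_rate": 0.25}


def price_with_tax_global(net: float) -> float:
    # Not referentially transparent: the result depends on hidden state.
    return net * (1 + settings["tax_rate"])


def price_with_tax(net: float, tax_rate: float) -> float:
    # Referentially transparent: same arguments, same result, every time.
    return net * (1 + tax_rate)
```

When `price_with_tax` returns something unexpected, the stack trace shows where both inputs came from. With `price_with_tax_global` you would have to find out who last wrote to `settings`.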


+1
−0

A global variable or object is in scope everywhere. That means it's possible to modify it from any part of your program.

Imagine a mature program, made up of thousands of lines of code and dozens of files. A statement that does something with that global variable could be found anywhere in those. That means if you want to understand the impact of doing something with that variable, or you want to understand how its value will change, you will have to check every line of the program to look for references to the global variable.

Sometimes global variables can get reused, because the global namespace is limited. So for example, maybe the variable is changed in 50 places. But for the particular usage of the variable that you're interested in, only 3 of those are relevant, and the others happen in a different context that does not apply to your particular usage. You can't easily tell which ones are relevant or not, so even when trying to understand a simple pattern, you will still have to deal with many complex code paths - worse, they'll be harder to understand because they don't tie into the thing you're thinking about.

Related - because global variable names are active across the program, you end up having to give them descriptive names, which are long and harder to come up with.

Local objects are much more limited. For example, a local variable in a function is created after the function starts, is destroyed when it exits/returns, and can only be accessed from inside the function. That massively narrows down how much code you need to read to understand it. The namespace is limited, so you can give shorter, more generic names (that are still clear in context).
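For instance, in this Python sketch (the `average` function is just an illustration), the short name `total` is perfectly clear because it only lives inside one small function:

```python
def average(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    # `total` exists only while this call runs; no other code can read
    # or modify it, so a short generic name is enough.
    total = 0.0
    for value in values:
        total += value
    return total / len(values)
```

To understand how `total` changes, you only have to read these few lines, not the whole program.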

It's a lot easier to build software when it is made up of self-contained modules. If you can treat each part of the program as a black box, with simple inputs and outputs and predictable behavior, it becomes easier to understand how the whole thing works. Global variables are effectively additional, poorly defined inputs/outputs that hamper this. It will also be harder to take one module and reuse it in another program, because now you must make sure the global variable is re-created in the new place as well.

It's rare to encounter problems in software engineering that cannot be solved with local variables. They do exist, but they are rare. Often, people use globals not because they're needed, but because they're a sort of nuclear option to the scope problem. They're a way to avoid thinking about scope. It's like giving everyone the key to a room, instead of trying to figure out who actually needs that key. This shortcut saves a little effort in the short term, but leads to a lot of headache in the long term. This is why the advice exists to avoid global variables where possible.
