Sharing Data without Using Globals

Global data has been discouraged in software projects for decades. There are some cases where it is useful or tempting, but developers should use global data only as a last resort. In most cases, a safer pattern allows the same data to be shared with far more control.

The False Allure of Globals 

The motivation behind data sharing stems from the realization that the same data is needed in multiple contexts of an application. Making something available globally is very easy in many programming languages, and once you do so, BOOM! It is now available…everywhere.

However, that’s exactly the problem. When data is available everywhere, it can be mutated from anywhere. This will likely lead to issues down the road, unless great care is taken to ensure that all interactions with this data are continually scrutinized.

The only time global data is safe is when it’s populated before consumers access it and it remains constant throughout the life of the application. In any other case, developers should choose safer method(s) to access and mutate the shared data.
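As a rough illustration of that one safe case, here is a minimal Python sketch of a global that is populated once and then exposed read-only. The setting names and values are made up for the example, and `MappingProxyType` is only one of several ways to get a read-only view.

```python
from types import MappingProxyType

# Hypothetical settings, populated once at import time, before any consumer runs.
_SETTINGS = {"sample_rate_hz": 100, "device_name": "sensor-a"}

# Read-only view: consumers can look values up but cannot assign through it.
SETTINGS = MappingProxyType(_SETTINGS)


def describe() -> str:
    """Example consumer that only reads the shared data."""
    return f"{SETTINGS['device_name']} @ {SETTINGS['sample_rate_hz']} Hz"
```

Any attempt to assign through `SETTINGS` raises a `TypeError`, so accidental mutation fails loudly instead of silently changing state for every consumer.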

The Problems of Global Data 

Unpredictable state

If the shared data is not constant, it can be mutated at any place in your program. Therefore, the state of this data can become unpredictable, leading to synchronization issues or worse. Further complexity arises when nobody can say why the data changed or what, if anything, its consumers should do in response.
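The following contrived Python sketch shows how this can play out. All names are hypothetical; the point is only that the consumer cannot tell, from its own code, which value it will see.

```python
# Hypothetical global state shared by unrelated parts of a program.
app_state = {"current_user": "alice"}


def audit_log(action: str) -> str:
    # Consumer: silently depends on whatever the global holds right now.
    return f"{app_state['current_user']} performed {action}"


def run_maintenance() -> None:
    # Mutator buried in unrelated code: changes the global and never restores it.
    app_state["current_user"] = "system"


before = audit_log("login")
run_maintenance()
after = audit_log("delete record")  # now attributed to "system", not "alice"
```

Nothing in `audit_log` hints that a call to `run_maintenance` changes its result, which is exactly what makes this kind of bug hard to trace.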

Tight coupling

Use of global data, combined with multiple mutators and consumers, leads to very tight coupling between all of these counterparts. Strong coupling results in code that is difficult to maintain and debug since it is hard to determine the culprit and/or scenario under which a bug may arise.

Namespace pollution/collisions

By design, when a global variable is used, that name is taken from the global namespace of the application. This can cause issues when linking in other libraries since the library being linked may have a global of the same name, and the linker cannot protect against this problem. Another developer in the same application may also choose the same global variable name elsewhere, leading to a similar result and very strange, head-scratching bugs.

The Safer Options

Encapsulation is the name of the game when it comes to providing protection for shared data. The shared data should live in an object dedicated to holding it, and that object should provide the accessors and any other operations the data requires. This precaution creates a single portal through which data can be accessed and/or mutated, giving you a first level of protection that also paves the way for adding synchronization mechanisms if your app is running in a multi-threaded context.
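A minimal Python sketch of this idea follows, assuming the shared data is a simple counter. The class and method names are invented for illustration. Because every access goes through the object, a lock can be added in one place.

```python
import threading


class SharedCounter:
    """Owns the data; all reads and writes go through its methods."""

    def __init__(self, initial: int = 0) -> None:
        self._value = initial
        self._lock = threading.Lock()

    def get(self) -> int:
        with self._lock:
            return self._value

    def increment(self, amount: int = 1) -> int:
        # The single write path is also a natural place to validate input.
        if amount < 0:
            raise ValueError("amount must be non-negative")
        with self._lock:
            self._value += amount
            return self._value
```

Callers never touch `_value` directly, so the locking and validation rules cannot be bypassed by accident.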

Singletons are a popular pattern for encapsulating shared data/services. A singleton provides basic encapsulation and an object with a defined lifetime, as prescribed above. It is instantiated only once in an application. This prevents a wrapping object from being instantiated in multiple places, a problem which can result in additional state/synchronization issues.
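One common way to sketch a singleton in Python is a class-level accessor that creates the instance on first use. The `Registry` class and its methods below are hypothetical, and this is only one of several ways to express the pattern.

```python
import threading
from typing import Optional


class Registry:
    """Hypothetical shared store; names are illustrative only."""

    _instance: Optional["Registry"] = None
    _instance_lock = threading.Lock()

    def __init__(self) -> None:
        self._items = {}

    @classmethod
    def instance(cls) -> "Registry":
        # Create the single instance on first use; later calls return it.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def set(self, key: str, value: str) -> None:
        self._items[key] = value

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        return self._items.get(key, default)
```

Note that this sketch relies on callers using `Registry.instance()`; it does not stop someone from calling `Registry()` directly, and it does not synchronize access to the stored items themselves.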

Dependency injection provides another step of protection beyond a singleton. The instantiation of injected objects is controlled by a framework, ensuring that nested dependencies are handled in a sane manner and that necessary dependent resources are ready for use. The object being injected in this case still needs to handle concurrency and other concerns, but you do gain some additional control.
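The core idea can be sketched in Python without a framework by passing the dependency in through the constructor. This hand-wired version is an assumption made to keep the example small; a real injection framework would do the wiring for you. All class names are hypothetical.

```python
import time
from typing import Protocol


class Clock(Protocol):
    def now(self) -> float: ...


class SystemClock:
    """Production implementation backed by the real clock."""

    def now(self) -> float:
        return time.time()


class FakeClock:
    """Test double whose time only moves when told to."""

    def __init__(self, start: float) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class SessionTracker:
    """Consumer: receives its clock instead of reaching for a global."""

    def __init__(self, clock: Clock, timeout_s: float) -> None:
        self._clock = clock
        self._timeout_s = timeout_s
        self._started_at = clock.now()

    def expired(self) -> bool:
        return self._clock.now() - self._started_at >= self._timeout_s
```

In production code the tracker would be built with `SystemClock()`, while a test can pass a `FakeClock` and control time explicitly. That substitution is much harder when the clock is a global.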

Services are very close cousins to dependency injection. They can be a little more convenient, since they simplify construction of the consuming object by allowing it to specify what services it wants to acquire from the system.
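The article does not pin down a specific services mechanism, so the Python sketch below assumes a simple registry from which a consumer acquires the services it names. `ServiceRegistry`, `MemoryLog`, and `OrderProcessor` are hypothetical names and not the API of any particular framework.

```python
class ServiceRegistry:
    """Hypothetical registry holding one object per service name."""

    def __init__(self) -> None:
        self._services = {}

    def register(self, name: str, service: object) -> None:
        if name in self._services:
            raise ValueError(f"service {name!r} is already registered")
        self._services[name] = service

    def acquire(self, name: str) -> object:
        try:
            return self._services[name]
        except KeyError:
            raise LookupError(f"no service registered as {name!r}") from None


class MemoryLog:
    """Trivial logging service that keeps messages in a list."""

    def __init__(self) -> None:
        self.lines = []

    def write(self, message: str) -> None:
        self.lines.append(message)


class OrderProcessor:
    """Consumer: states which services it wants when it is constructed."""

    def __init__(self, services: ServiceRegistry) -> None:
        self._log = services.acquire("log")

    def process(self, order_id: int) -> None:
        self._log.write(f"processed order {order_id}")
```

The consumer's constructor stays short because it only takes the registry, at the cost of hiding its real dependencies behind string lookups.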

The Refactoring Process

Refactoring out a global must be done carefully, but following the steps below provides a safe method:

  1. Create a method (getter) which returns the data of interest, and replace all reads of the global with this accessor.
  2. Identify all write accesses/mutators to the data, and move each access and mutation of the data into an accessor. Let two call sites share an accessor only if their operations on the data are identical.
  3. Change the global data to be local to the encapsulation object.
  4. Refactor the API for the new object to eliminate duplication in accessors.
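As a sketch of where these steps might end up, assume the original global was a module-level `retry_limit` that several call sites read and wrote directly. The Python below shows a possible end state; the names and the particular write sites are invented for the example.

```python
class RetryPolicy:
    """Encapsulation object that replaces the former global."""

    def __init__(self, limit: int = 3) -> None:
        # Step 3: the data that used to be global now lives inside the object.
        self._limit = limit

    # Step 1: every read of the old global goes through this getter.
    def limit(self) -> int:
        return self._limit

    # Step 2: each write site became a mutator. Step 4 merged two
    # identical ones (set from config, set from CLI) into this method.
    def set_limit(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self._limit = limit

    # Step 2: a write site with different intent keeps its own method.
    def disable_retries(self) -> None:
        self._limit = 0


def should_retry(policy: RetryPolicy, attempts: int) -> bool:
    # Former reader of the global, now using the accessor from step 1.
    return attempts < policy.limit()
```

Small assertions such as `should_retry(RetryPolicy(), 2)` can be run after each step to confirm that behavior has not changed.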

Most importantly, test stability of the application as you progress through the sequence above. You should ONLY see improvements as you progress, so tackle any issues discovered along the way.

Conversation
  • Tim Walker says:

    Really interesting. It seems the biggest drawback of globals is unforeseen instability “down the road” when the architecture complexity surpasses the ad-hoc control of globals. How then can you test this stability during development, since it is not (by definition) “down the road”? Thanks!

    • Greg Williams says:

      Tim,

      That is an excellent observation. With most long-running / complex projects, the risk and effects of taking shortcuts and not addressing the root causes grow exponentially. As deadlines loom, pushes are made to squeeze in more and more features rather than to maintain the health/malleability of the codebase.

      A sense for the real dangers of global variables is a “smell” that developers seem to acquire over time, by actually making these mistakes and having to live with the repercussions.

      From a project management perspective, I strongly advocate for tracking velocity over time, along with developer happiness/sanity, as clues that the system is unstable and inflexible.

      It should be made clear to developers that time can and should be allotted to identifying AND tackling technical debt throughout the normal course of development. And sometimes, dedicated periods of time may need to be set aside to conquer bigger/uglier beasts.

      Automated tests are very handy to keep tabs on a system, especially during times of internal refactoring, where outward-facing behavior changes are undesired.
