Preview Chapter 1
Rule 1 - “Different Things Vary”
“It is a damn poor mind that can think of only one way to spell a word.”
― Andrew Jackson
When I graduated college, I was lucky that my first mentor was an old-school Yankee engineer named Brian. Brian was a natural designer - he only had an associate’s degree, but he could walk into his office with a pencil, a sheet of vellum, a perf-board and a box of components and walk out a day later with a working design no one had ever thought of before.
As I was fresh out of school, I hadn’t yet learned the subtleties of Being An Engineer - how minute differences between “A” and “B” could be the difference between something working and not working. When I brought such problems to Brian, he usually at least pointed me in the right direction and, when I found the problem, we would both proclaim that “Different things vary!”.
As I transitioned to being a team leader and beyond, this axiom never failed to prove true. Many a time someone would come to me with a problem that was “impossible” - because the code “hadn’t changed” and “was the same as before”. Or perhaps the code “worked in the debugger” or maybe it worked on the development machine, but not the production server.
In the latter two cases, clearly the “difference” was in the environment. But when the code “was the same”, all I’d ask is to put the current version of the code up side-by-side with the last one that worked. Looking through it usually turned up a small tidbit that caused the problem - something had changed which “shouldn’t have made a difference” but did. Or a subtle coding typo that gave the system an aneurysm. And then I’d say: “Different things vary.”
The classic case of this is something like the following:
if ( something_important = 1234 ) {
    print 'Life is Good'
}
You’ll stare at the code in the file over and over and over and completely gloss over this kind of simple bug. What makes it so insidious is that not only will it perform the conditional logic when maybe it isn’t supposed to, but it also overwrites something_important with 1234, whatever it held before … which can cause all manner of problems downstream.
What makes this example even more insidious is that sometimes you want this kind of logic. Sometimes you want to make a decision based on an assignment (though obviously not to a constant). So your logical mind will scan this and go: “Looks good to me”. It’s only when you put it up side-by-side with the prior version that you’ll notice the difference.
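To make the difference concrete, here is a minimal Ruby sketch of the same slip. The method names buggy_check and correct_check are made up for illustration, and the comparison value is pulled into a local called expected purely to keep the example tidy.

```ruby
# Hypothetical helpers, not from any real codebase.

# BUG (deliberate): `=` assigns, so the condition is always truthy
# and the value that was passed in is silently overwritten.
def buggy_check(something_important)
  expected = 1234
  if (something_important = expected)
    [true, something_important]
  else
    [false, something_important]
  end
end

# Intended version: `==` compares and leaves the value alone.
def correct_check(something_important)
  expected = 1234
  if something_important == expected
    [true, something_important]
  else
    [false, something_important]
  end
end

p buggy_check(1)    # => [true, 1234]
p correct_check(1)  # => [false, 1]
```

Called with 1, the buggy version reports success and hands back 1234; the correct one reports failure and leaves the 1 alone. One character of difference, two very different systems.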
Another typical occurrence is when someone either takes over or starts maintaining code you wrote and decides to “clean up” some of the global definitions or enums. It all looks fine when you look at the code, because your logical/lexical mind automatically translates something like prospect_message to buyer_message. But the code isn’t quite as prescient. Especially when tokens can be derived from string variables.
So something like the following would crash and burn in epic fashion if the enum strings were changed.
Messages = {
  :buyer_message => 'Buyer message ...',
  :agent_message => 'Agent message ...',
  :owner_message => 'Owner message ...'
}

def foo( recipient_type )
  message = Messages[ "#{recipient_type}_message".to_sym ]
  return( message )
end

# ... meanwhile, somewhere else in the system:
send_message( foo('prospect') ) # Should be 'buyer'!!!!
This will error out in whatever calls foo() because it’s returning a nil value. So it will look like your messaging code is broken. But it isn’t. As with the first case, you’ll end up staring at the code over and over, pounding your desk like Khrushchev at the UN. And when you see the problem - the subtle change that just wrecked your whole morning - you’ll utter a few more profanities and then repeat the mantra: “Different Things Vary”.
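One way to make this kind of mismatch blow up closer to its cause, sketched here on the assumption that a missing message should count as an error: Ruby’s Hash#fetch raises a KeyError for an unknown key instead of quietly handing back nil. The names MESSAGES and strict_foo are hypothetical stand-ins for the code above.

```ruby
MESSAGES = {
  buyer_message: 'Buyer message ...',
  agent_message: 'Agent message ...',
  owner_message: 'Owner message ...'
}.freeze

# Hypothetical stricter variant of foo(): fetch raises KeyError on a bad key
# rather than returning nil for someone else to trip over later.
def strict_foo(recipient_type)
  MESSAGES.fetch("#{recipient_type}_message".to_sym)
end

puts strict_foo('buyer')

begin
  strict_foo('prospect') # the stale name from before the "clean up"
rescue KeyError => e
  puts "Lookup failed: #{e.message}"
end
```

The failure now names the key that could not be found, which points at the renamed token rather than at the messaging code.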
The bottom line is: if your system is behaving differently, something changed. You may not be able to see it right away. But it’s there.
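The side-by-side comparison can be made mechanical. Below is a deliberately naive Ruby sketch that compares two versions line by line. It assumes lines were edited in place rather than added or removed, which is exactly where real tools such as diff or git diff earn their keep. The method name changed_lines and the two sample strings are invented for illustration.

```ruby
# Naive comparison: reports lines that differ at the same position.
# Inserted or deleted lines will throw it off; this is only a sketch.
def changed_lines(old_source, new_source)
  old_lines = old_source.lines(chomp: true)
  new_lines = new_source.lines(chomp: true)
  count = [old_lines.length, new_lines.length].max

  (0...count).each_with_object([]) do |i, changes|
    next if old_lines[i] == new_lines[i]
    changes << { line: i + 1, old: old_lines[i], new: new_lines[i] }
  end
end

# Hypothetical "last version that worked" and "version that doesn't".
LAST_GOOD_VERSION = "if (something_important == 1234)\n  puts 'Life is Good'\nend\n"
BROKEN_VERSION    = "if (something_important = 1234)\n  puts 'Life is Good'\nend\n"

changed_lines(LAST_GOOD_VERSION, BROKEN_VERSION).each do |change|
  puts "line #{change[:line]}:"
  puts "  was: #{change[:old]}"
  puts "  now: #{change[:new]}"
end
```

Your eyes may skip over a missing equals sign; a string comparison will not.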
And what about when it’s not the code?
When the code actually hasn’t changed but the behavior has, then there’s still something different. It may be subtle. It may be something you didn’t do. It may not even make sense. But something is there. Some areas to investigate include:
- Has the core stack been updated? A new version of Ruby/PHP/Node can lead to different behavior.
- Have any Gems/packages/plug-ins/libraries been updated? Again, any changes here can lead to a different outcome.
- Did you add any debug code or print statements? In some environments and languages, queries aren’t executed until something is going to be done with the result. Such as printing it out to debug. There can be other similar issues in other environments where just looking at something can change the way the system behaves. Kind of like a “quantum bug”. Yes, this sounds totally bizarre and impossible - but it happens.
- Have your core stack and Gems/plugins/etc. not been updated in a long while? When your stack gets very old, it kind of accumulates unfixed bugs. Bug fixes you’d have if you, you know, updated your stack regularly. At some companies this isn’t possible - there aren’t resources to stop forward progress for a few days to update the stack. But the longer you go, the more detritus your stack will be saddled with. And eventually, like any over-taxed immune system, the bugs win.
- Is there any asynchronous stuff going on that may have had its timing shift ever so slightly? This is the classic “race condition”, where the outcome depends on timing: for example, an asynchronous process returns a result before the main application is ready to receive it. If you’ve recently recoded some things to work in the background, this can be a delayed-reaction bug. Something you changed weeks ago is now acting up.
- For UI-related issues, the first thing that could have changed is the browser you’re viewing the site on.
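The “debug code changes the behavior” item above is easier to believe once you have seen it. The sketch below uses Ruby’s built-in lazy enumerators as a stand-in for a lazily executed query; the method name observe_lazy_query and the log array are illustrative assumptions, not any particular ORM’s API.

```ruby
# Hypothetical demo: a lazy pipeline standing in for a deferred query.
def observe_lazy_query
  log = [] # records each "row" that actually gets loaded

  # Nothing inside this block runs until something asks for the results.
  rows = (1..3).lazy.map do |id|
    log << id
    "row #{id}"
  end

  loaded_before_debug = log.dup

  # The innocent debug line: dumping the rows forces the whole pipeline to run.
  dumped = rows.to_a

  { before: loaded_before_debug, dumped: dumped, after: log.dup }
end

result = observe_lazy_query
p result[:before] # => []
p result[:after]  # => [1, 2, 3]
```

Before the dump, no rows have been touched; after it, all of them have. Any code that depends on when that work happens will behave differently with the debug line in place.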
Here’s a fine example of that browser case: You’re adding Google Analytics to a site. You copy the code you used elsewhere to pull the token-string from a config and paste it, along with the Google code, into your layout. The exact same thing you did on another project.
Exact. Same. Code.
Annnnnnd … it doesn’t work. Google Analytics isn’t recording page views.
You look at the page source - it’s fine. You try swapping in a token-string from another site that you know is working - still won’t work. “But it’s the exact same code!!!!!”, you scream to the heavens.
Next you open the console and see that the scripts are loading. Oh wait … what’s this? The analytics.js script isn’t getting a 200 load status. It’s being blocked by an extension.
Oh damn. There’s a privacy plugin running on the browser. Switch it off annnnnnnd … everything suddenly works fine.
Different. Things. Vary.
Learn it. Know it. Live it.
The lesson here is that when you’re proclaiming that “Nothing has changed!” or “It’s the same exact %$#@ing code!”, the question you should be asking is “What has changed?”. Because something almost certainly is different. And the bigger the puddle of digital vomit your system has suddenly turned into, the smaller and subtler the problem probably is. Kind of like the old “Butterfly Effect”, except in code.