Christian M. Mackeprang

Web development and computer programming

How terrible code gets written by perfectly sane people

When I found out I would be working on porting an old Python codebase to Node, I was slightly excited. These kinds of projects always give you more creative freedom than the ordinary code maintenance gig, and something about the challenge of rewriting other people’s code makes it fun as hell.

The excitement significantly faded once I actually got a look at what we were going to work with. Legacy code can be nasty, but I’ve been programming for 15 years and only a couple of times had I seen something like this. The authors had created their own framework, and it was a perfect storm of anti-patterns: no separation of concerns, mixed spaces/tabs for indentation, multiple names for the same concept, variables being overwritten by the exact same data coming from a different yet nearly identical method, magic strings… this mess could only have been the product of a room full of babbling monkeys copying code randomly from Google.

And yet, it was not the code’s dismal quality that piqued my interest and led me to write this article. What I discovered after some months working there was that the authors were actually an experienced group of senior developers with good technical skills. What could lead a team of competent developers to produce and actually deliver something like this? What I’ve come up with is a list. These are some bad habits that even experienced teams can get into which will severely affect your end product, more than any static code checker or development methodology could rescue it from.

Giving excessive importance to estimates

An important component of this project was the focus on deadlines, even to the detriment of code quality. If your developers have to focus on delivering rather than on writing good code, they will eventually have to compensate to make you happy. The two ways in which they can do this are over-estimating and over-promising, and they both come with added baggage.

Typically it’s more difficult to over-estimate because the effects are immediately evidenced on project cost, so your developers might choose to over-promise instead, and then skip important work such as thinking about architectural problems, or how to automate tasks, in order to meet an unrealistic deadline. These tasks are often seen as added value, so they get cut without notice. Product quality will decline as you accumulate technical debt, and you’ll find out about it later than you’d really want to, because reorganizing code later in a project costs exponentially more.

As an example, on this project I would find code that was obviously duplicated elsewhere, but it seemed that people were in such a rush to deliver that some developers would not bother to check if someone else had written the same method or SQL query before.

Sometimes estimates can even be deceiving. For example, Agile has a term called “velocity”. The idea is to calculate how fast your team can deliver, and make the necessary changes to go faster. The problem is that it’s not possible to come up with an accurate velocity in the short to mid term. Expecting otherwise is an appeal to the law of averages: you can’t look at a few weeks of past performance to gauge how fast you can go right now, because over such a small sample past performance is not a good indicator of future results.

The law of averages is a layman’s term for a belief that the statistical distribution of outcomes among members of a small sample must reflect the distribution of outcomes across the population as a whole.

In truth, a developer can write a large amount of code one day, and she can take three days to write five lines of code after reading documentation and collaborating with teammates. Averaging estimates will not net you valuable information in the short or mid term.
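To make the point about short-term averages concrete, here is a minimal sketch in Python. The sprint numbers are invented for illustration, and using the standard error of the mean as a yardstick is my own choice, not something Agile prescribes.

```python
import statistics

def velocity_estimate(points_per_sprint):
    """Return (mean, standard error) for a list of completed story points."""
    mean = statistics.mean(points_per_sprint)
    # The standard error shrinks only with the square root of the sample size.
    std_err = statistics.stdev(points_per_sprint) / len(points_per_sprint) ** 0.5
    return mean, std_err

# Hypothetical data: story points completed in each sprint.
recent = [12, 31, 8]
longer = [12, 31, 8, 22, 15, 27, 9, 30, 18, 21, 11, 26]

for label, data in (("last 3 sprints", recent), ("last 12 sprints", longer)):
    mean, err = velocity_estimate(data)
    print(f"{label}: {mean:.1f} +/- {err:.1f} points")
```

With only three sprints, the uncertainty is a large fraction of the average itself, which is why a short-term velocity figure says little about the next sprint.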

Giving no importance to project knowledge

As your project progresses, your team learns about the business, the concepts behind it and how they connect together. They also learn about the implementation as they write code, because you can never fully visualize how things will turn out and which challenges you will face. Some business fields even have an inherent complexity that takes a long time to digest.

As this was a full rewrite of old code, it was particularly interesting in this regard, because a rewrite shows whether your team’s management understands project knowledge and how it impacts development. If you’re in a large project and there are modules for which you have no expert, no one to ask, that’s a big red flag. The value in rewriting code rests entirely on taking advantage of what you learned the first time around, so value that knowledge dearly.

If you put a different team to do the rewriting, as was done in my case, you are ignoring all your learning and relying solely on your new team’s skills, which likely won’t make up for the lack of information. A mediocre developer is going to do a better job at rewriting something he wrote himself, and he’ll do it faster, than if you give the job to someone else entirely.

Even hiring is impacted by project knowledge. The more information the project carries, the longer it will take to bring people up to speed, so not only is domain knowledge more valuable then, it’s also important to focus on hiring good people. If you don’t hire well, you’ll get tied up to a bad team that will go nowhere for months.

Focusing on poor metrics such as “issues closed” or “commits per day”

When a measure becomes a target, it ceases to be a good measure.
– Goodhart’s law

At some point while I was getting up to speed on this project, somebody asked me why another developer was closing issues much faster than me, as if delivering faster is a good thing. As you can imagine, it took me a glance at his code to find four bugs in a single line. Focusing on unreliable metrics such as this will completely derail your project, and cause people as much stress as deadlines.

One metric that few seem to focus on is regression rate of issues. There are bugs such as null pointer exceptions that might show up much later, and if you’re not tracking regressions, it will seem as if bugs keep springing up out of nowhere. In this situation, you’ll never find the source of your problems, because you’re looking in the wrong place.
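As a rough sketch of how such a metric could be tracked, assume each issue record carries a count of how many times it was reopened after being closed. The `Issue` class and its fields are hypothetical and not tied to any particular issue tracker.

```python
from dataclasses import dataclass

@dataclass
class Issue:
    key: str
    closed: bool
    reopen_count: int  # hypothetical field: times the issue came back after closing

def regression_rate(issues):
    """Share of issues closed at least once that were later reopened."""
    ever_closed = [i for i in issues if i.closed or i.reopen_count > 0]
    if not ever_closed:
        return 0.0
    regressed = [i for i in ever_closed if i.reopen_count > 0]
    return len(regressed) / len(ever_closed)

issues = [
    Issue("APP-1", closed=True, reopen_count=0),
    Issue("APP-2", closed=True, reopen_count=2),
    Issue("APP-3", closed=False, reopen_count=1),  # currently open again
    Issue("APP-4", closed=False, reopen_count=0),  # never closed, so ignored
]
print(f"regression rate: {regression_rate(issues):.0%}")
```

A developer who closes issues quickly but sees many of them reopened would stand out on this number, even while looking great on "issues closed".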

Ultimately what matters is what is delivered to the client, how happy they are with the product, and how it affects their bottom line, but it takes a lot of self-control to focus on delivered quality and ignore juicy metrics such as commit rate or issues closed.

A good way to know whether a metric is useful is to try to understand what personal values it outlines. Concentrate on metrics that advertise good attention to detail, good communication skills and a good attitude, especially if they require great effort to cheat.

Assuming that good process fixes bad people

Good process is portrayed as a kind of silver bullet in business. In my experience some companies, especially large ones with a poor hiring methodology, end up making their process increasingly strict to put toxic people in line, in turn restricting the creative freedom of the ones carrying the team. Not only that, but you still need people to carry out your process properly in the first place for it to work.

It never ends, and this entire problem can be disregarded by just fixing the hiring. Talent makes up for any other inefficiency in your team. That’s the entire point of favoring smart work over hard work.

Developers can be especially tricky to communicate with. In such a complex codebase, I had to constantly ask others for help, and would not often have people happily set time aside to give me a hand. That does not reflect a good attitude, and in tough tasks things can get especially stressful if you have to ask for help and can only count on a few of your teammates to have both the knowledge and the will to give you a hand.

You need people who can apply common sense and good taste, and who will call you out if your process is holding them back. Every build tool, every static checker and every communication tool has good and bad use cases, and you need people to tell you about it, not to blindly apply something that looked good in a different context months ago.

Ignoring proven practices such as code reviews and unit testing

Staying up to date on modern software development processes might not be enough to put a derailed project back on its tracks, but it’s certainly necessary anyway if you want your team to stay competitive. This is where proven practices step in, and there are a few of them. Test-driven development has been shown to reduce defect rates by 40% to 90%, with an increase in development time in the 15%-35% range. Code reviews have also been shown to decrease defect rates, in some cases up to 80% over manual testing.

Imagine my dismay when I had to collaborate with a colleague on that legacy project and his screen displayed Notepad in its full glory. Using “search” to find methods might have been rad back in the nineties, but these days, refraining from using tools such as modern IDEs, version control and code inspection will set you back tremendously. They are now absolutely required for projects of any size.

For an in-depth look at what is proven to work in software development, check out the book Making Software: What Really Works, and Why We Believe It. It has a nice list of valuable and proven practices which have been backed by studies over the years.

Hiring developers with no “people” skills

It’s not that developers can’t talk to other people. I was once a shy developer myself and eventually managed to stand in front of an audience and give a talk just fine.

The problem comes when someone isn’t willing to even try, or becomes annoyed by requests to improve communication. If there’s one thing that can speed up development time more than anything else that I’ve mentioned, it’s improving communication. Particularly when you reduce distances and work on informational proximity, you enable a more fervent and articulate connection in the workplace. It doesn’t matter that the other guy might be ten thousand miles away. One Skype call can turn a long coding marathon into a five-minute fix.

Conclusions

When you enable and encourage working smart by using the best tools, proven techniques and great communication, software development will definitely flow more naturally. What you can’t assume is that, just because you’ve signed up to apply Agile or some other tool, nothing else matters and things will sort themselves out. There’s a synergistic effect in action here which can make a team exponentially more productive if they’re set up right, and terribly slow and sloppy when no attention is paid to the details.

Memorizing APIs and other tips for coding fluently

In “Your Brain at Work”, Dr. David Rock gives a quick introduction to the current state of neuroscience and goes on to give a great deal of advice on how to handle the perils of office life, which most of us developers can probably relate to. In this post, I will attempt to take the main ideas from this book and apply them to common software development situations.

Your CPU at work

It turns out that the brain does work somewhat like a CPU. Think of it as a processor that can only handle a limited number of concurrent tasks, but has a really large amount of storage. Let’s see how this analogy can help us develop software better.

Multitasking

As a general rule, you should work on no more than one or two short tasks at once. The brain gets overloaded more easily than you might think.

Your mental CPU has a limited stack, with enough space for 3 or 4 items at the most. Just like an OS switching between programs, when you work on more than one thing, your brain has to repeatedly unload relevant information for one task and load things for another one. We can’t do this as fast as a CPU, and certainly nowhere near as often, so take it easy and focus on one thing.

When you switch tasks, your mind has to figure out what you were thinking about previously

Muscle memory

Repetitive training in a task will eventually make your brain run it on its “default network”. This is like the brain’s muscle memory or a CPU’s internal cache. Once you repeat a task enough times, it will become almost effortless.

It’s important to note that even when you’re mentally exhausted, you can still do tasks you’ve got a lot of practice with.

You can take advantage of this process by reviewing things you do once in a while, and training yourself on that a bit every day. There are code challenge websites which help with this, but from what I’ve seen, they are usually focused on rather useless algorithms, like sorting, which you can find in a library most of the time, and memorizing them will offer little benefit.

A more useful training task might be writing an SQL query, or the code for a website’s controller, or string handling, or practicing regular expressions, or even command line tools such as grep or awk.

I suggest that, when starting a new project which will be using a new framework, you take some time every day to go through each one of the tutorials. Eventually, you will be able to concern yourself with just your project’s code. The rest, the glue, you will be able to do blind.

It can be tempting to use code generation tools in this case, such as Yeoman. They can be useful, but if you use them while learning, you just won’t know the code nearly as well as if you had built everything yourself.

Vocabulary

I have found that having a strong vocabulary on a particular subject can greatly increase the speed at which you process ideas on it. It makes sense since language is our main tool when interpreting thoughts and events.

As an example, it is indeed far easier to think “referential transparency” or “pure function” than to think “those functions that always return the same value and don’t have side-effects”.
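For readers who have not met these two terms yet, a small sketch in Python may help. The function names and the shopping-cart scenario are made up for illustration.

```python
import random

def add_tax(price, rate):
    """Pure: the result depends only on the arguments, and nothing else changes."""
    return round(price * (1 + rate), 2)

cart_total = 0.0

def add_to_cart(price):
    """Impure: reads and mutates module-level state as a side effect."""
    global cart_total
    cart_total += price
    return cart_total

def random_discount(price):
    """Impure: the same input can give a different output on every call."""
    return price * random.uniform(0.8, 1.0)

# Referential transparency: a call to a pure function can be replaced by its value.
assert add_tax(100, 0.2) == add_tax(100, 0.2)
# The impure function gives different answers for the same argument.
assert add_to_cart(10) != add_to_cart(10)
```

Once the vocabulary is in place, "is `add_to_cart` pure?" is a one-word question instead of a paragraph.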

For building a strong vocabulary, there is the obvious habit of reading a lot, which at this point most developers already know they need. Tools like Pocket and Feedly come in handy.

I like to complement this with memorization of the most difficult concepts, especially things I don’t run into very often while reading. You can use Anki for this (or any other spaced-repetition software). Load it up with lots of interesting definitions. Be creative: anything from databases, Big O formulas, C library functions, anything that might turn out to be useful.

Some definitions I’ve been studying recently

The key takeaway from this is that you don’t know that you need something until you know that it exists. For instance, the benefit of memorizing the C standard library is not just to remember parameter orders, but also to have more tools at your disposal by knowing about more functions when you run into new problems.

Wrapping it up

  1. Streamline your work by splitting it into small tasks
  2. Avoid multitasking
  3. Practice coding problems every day, beyond just your project’s tasks
  4. Memorize

Additional reading

If you’re interested in this, I strongly recommend that you give Dr. Rock’s book a shot. It goes into a lot more detail and is certainly more authoritative and eloquent than me. Don’t miss it!

Being a versatile hacker is becoming more important than knowing frameworks

Back around 2013, the term Full Stack developer started to come up in job descriptions and blog posts. Companies were realizing that hiring developers with expertise in only one language just wasn’t enough anymore. A web developer that can handle a variety of tasks and environments is considerably more useful, and was starting to become the norm.

In spite of this, knowledge about web architecture itself did not become widespread. Many developers have been building websites without having a good grasp of how things work behind the scenes. Web forms, caching, the HTTP protocol, Apache. All of these were secondary good-to-haves.

How e-learning affects the job market

Perhaps as a consequence of the online learning boom that had started a few years earlier, the self-taught web developer knows surprisingly little about the web’s underlying technology. Language-oriented courses cannot cover the complete web stack, and students will end up clueless about what an htaccess file does, or how to restart a Unix daemon, or how the different types of POST encoding work.
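As a quick illustration of that last point, the two encodings that HTML forms most commonly use are `application/x-www-form-urlencoded` and `multipart/form-data`. The sketch below uses only Python's standard library. The field names and the boundary string are made up, and the multipart body is assembled by hand purely to show the wire format, not as a replacement for an HTTP client.

```python
from urllib.parse import urlencode

fields = {"name": "Ada Lovelace", "lang": "en"}  # hypothetical form fields

# 1. application/x-www-form-urlencoded: one line of key=value pairs.
urlencoded_body = urlencode(fields)
print(urlencoded_body)

# 2. multipart/form-data: each field is its own part, separated by a boundary
#    that is announced in the Content-Type header. Lines end with CRLF.
def multipart_body(fields, boundary):
    lines = []
    for key, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{key}"')
        lines.append("")  # blank line separates the part headers from the value
        lines.append(value)
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    return "\r\n".join(lines)

boundary = "example-boundary-123"  # arbitrary; must not occur inside the values
print(f"Content-Type: multipart/form-data; boundary={boundary}")
print(multipart_body(fields, boundary))
```

The first form is compact but has to escape everything, while the second is verbose but can carry raw file contents, which is why file uploads use it.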

What is a full stack developer supposed to know, anyway? Job descriptions frequently mention combinations of frontend and backend technologies such as JavaScript and Node, PHP and jQuery, Angular and Spring, and many others. In reality there is a significant amount of information outside those realms that would improve someone’s ability to build a website, and gone are the days when you could stick with what you know and make a career out of a single technology.

The future of web development

If sticking to your guns won’t suffice anymore, then what can we do, and how can we keep up with the exponential multiplication of web libraries? There is so much software being released today, that the number of possible combinations between technologies is increasing very rapidly. This combinatorial explosion will drive software development into a more ad-hoc territory. Your chances of knowing how to integrate two random libraries X and Y are ever diminishing, and any help that googling can provide is diminishing at the same rate. The window is about to close, and a time will soon come when we’ll be required to figure these tough problems out on the spot, every time. Not something for the lazy ones among us.
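A back-of-the-envelope calculation shows how quickly the integration space grows. The sketch below only counts pairs of libraries, and the library counts are arbitrary round numbers chosen for illustration.

```python
from math import comb

# Number of distinct pairs you could be asked to integrate among n libraries.
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} libraries -> {comb(n, 2):>10,} possible pairs")
```

Pairs alone grow with the square of the number of libraries, and real projects combine far more than two at a time, so the odds that someone has already documented your exact combination keep shrinking.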

Hackers: the antifragile programmers

I was introduced to this very interesting concept in an article by programming rockstar John Carmack. It’s described in the following quote from the Antifragile book:

“Just as human bones get stronger when subjected to stress and tension, and rumors or riots intensify when someone tries to repress them, many things in life benefit from stress, disorder, volatility, and turmoil. What Taleb has identified and calls “antifragile” is that category of things that not only gain from chaos but need it in order to survive and flourish.”

This idea reflects the attitude shared by those that used to be called hackers. Today the word has a negative connotation, but in the early days, it referred to a person with a certain attitude towards technology. As defined by the Jargon File, a hacker is: “A person who enjoys exploring the details of programmable systems and how to stretch their capabilities, as opposed to most users, who prefer to learn only the minimum necessary.”

Antifragile - Things that gain from disorder

There was a time when looking things up on Stack Overflow whenever you had a problem just wasn’t an option, and many pieces of software had unreadable documentation, if they had any at all. I remember trying to fix a sound card issue as a kid, and reading the card’s manual, only to find assembly code listings there, with interrupt codes and all. That is the environment where hackers thrived, and that’s what we are going back to, sooner or later. If your first instinct when dealing with a complex issue that affects multiple technologies is to start with a Google search, you should reconsider your working habits.

Granted, being too curious can often lead you down the wrong path, especially in the corporate environment where time is always short. As an example, it can be very enlightening to write test code for the basic use cases when learning about a new library, but coders looking to impress the boss will take the more pragmatic approach of copying the examples from the documentation, fully unaware of how they work. Giving value as a developer requires a certain amount of skill in time management and in setting expectations, so as to allow you to seek the knowledge you need and to save the company money in the long term.
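Here is a minimal sketch of what such test code can look like. It uses Python's standard `shlex` module as a stand-in for "the new library"; the test class name is made up, and the three cases are simply the first things I would want to confirm before relying on it.

```python
import shlex
import unittest

class ShlexLearningTest(unittest.TestCase):
    """Pin down what we believe shlex.split does before relying on it."""

    def test_splits_on_whitespace(self):
        self.assertEqual(shlex.split("grep -r TODO src"), ["grep", "-r", "TODO", "src"])

    def test_keeps_quoted_text_together(self):
        self.assertEqual(shlex.split('echo "hello world"'), ["echo", "hello world"])

    def test_unbalanced_quote_is_an_error(self):
        with self.assertRaises(ValueError):
            shlex.split('echo "oops')

# Run the suite directly so the sketch works when pasted into a script or REPL.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShlexLearningTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these take minutes to write, document your assumptions, and keep paying off when a library upgrade quietly changes behavior.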

Rethinking the roles

How do you find the hackers? You need to find someone with a particular mindset, the particular curiosity and persistence that I’ve described. This has nothing to do with analytical intelligence, or with being able to memorize a particular set of academic algorithms, so whiteboard coding is out, and Fermi estimation problems don’t look too promising, either. Ask a candidate what he likes to do in his spare time, or what fun projects he worked on as a hobby, and you might be onto something. I have met many programmers that don’t write code in their spare time, and that can sometimes be a factor that tells you that they just don’t enjoy it. Look for clues which reveal their personality.

If you’re a developer, you might be worried that you don’t have that kind of drive or curiosity yourself, so what can you do about it?

Here are some pointers:

  • Whenever you have to google some error message or problem, read all the answers. Get as much context as possible on your problem, and do not be satisfied just with having come across a solution.
  • Learn about the technology, but also about the trade-offs that were made during its design and development.
  • Ask yourself what it would take for you to consider yourself a “complete” developer, and write down a path for you to get there.
  • Do what other people don’t like doing, go where they don’t want to go, and often enough you will be enlightened by the experience.

Software development is growing fast. Learning to code is easier than ever, and soon enough we will be in a survival of the fittest environment. But the guy that makes it is not going to be the guy that first learned about the cool new framework. It’s going to be the guy that asked himself what’s new about it, and what’s different this time. If you want to stay up to date with technology stacks, then stop worrying so much about being up to date, and start hacking.

Note: this article first appeared on TechBeacon.

The other kind of JavaScript fatigue

People have been talking about JavaScript fatigue lately, referring to how we’re continuously having to learn about libraries that don’t usually last very long.

I think this discussion evidences how good programmers cannot be lazy people. You have to get a liking for learning new things. It’s not that big of a deal, being forced into it: at worst, you’ll become better at identifying good libraries and clean software architecture. Embrace change, and it will make you a better developer.

There is another kind of JavaScript fatigue which doesn’t have as many positive elements to it, and it hasn’t received as much attention. I want to talk about it because I believe it is taking a toll on the JS ecosystem in particular.

Richard Stallman envisioned free software as an environment where people shared their code and other people would improve on it and this would gradually drive progress in the software world in more ways than a closed source environment ever could. I think there are some requirements for such an ecosystem to work properly. A certain balance has to be maintained.

The Cathedral and the Bazaar, which should be required reading for any open source developer, also hinted at this problem when saying that good programmers know what to write, but great ones know what to rewrite (and reuse).

JavaScript as a language does not have the correct focus that is necessary for that to happen. It is simply far too easy to create a new library or package, and since anyone can do it in a few seconds now, this generates a lot of noise. The language has even evolved to make it as easy as possible to wrap a mess of DOM-manipulating stateful code in a couple of export lines and ship the module to the world.

It is just too easy now to share poor code

Think about the process that you undergo when you want to find a package that solves your problem. You search GitHub, or the npm registry, see what is popular for that, take a look at comparisons between the top candidates, maybe check which ones have unit tests, good code size, reasonable architecture, some minimal documentation, responsive project leaders… it can take hours, even days, to decide.

The forgotten edge cases

Recently I needed a library to build query strings, just a small one so that I wouldn’t have to include jQuery just for that. After a couple of hours of research, I had found several candidates, but it quickly became evident that even though they were working and updated libraries, none of them was able to handle nested objects and arrays properly, in the way that jQuery could.
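To show the kind of edge case involved, here is a small sketch of the bracket convention that jQuery uses for nested data, written in Python to keep the examples on this page in one language. The function name is my own, and I am not claiming byte-for-byte parity with jQuery (details such as how spaces are escaped can differ); it only illustrates why flat `key=value` serializers fall short.

```python
from urllib.parse import quote

def to_query_string(data):
    """Serialize a dict with nested dicts and lists using bracket notation."""
    pairs = []

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, item in value.items():
                walk(f"{prefix}[{key}]" if prefix else str(key), item)
        elif isinstance(value, (list, tuple)):
            for index, item in enumerate(value):
                # Scalars share an empty bracket; nested containers need an index
                # so that their own keys stay grouped together.
                nested = isinstance(item, (dict, list, tuple))
                walk(f"{prefix}[{index}]" if nested else f"{prefix}[]", item)
        else:
            text = "" if value is None else str(value)
            pairs.append(f"{quote(prefix, safe='')}={quote(text, safe='')}")

    walk("", data)
    return "&".join(pairs)

params = {"user": {"name": "Ada", "tags": ["math", "code"]}, "page": 2}
print(to_query_string(params))
```

Decoded, the result reads `user[name]=Ada&user[tags][]=math&user[tags][]=code&page=2`. The recursion is only a few lines, but it is exactly the part that the small libraries I tried had skipped.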

If you’re familiar with the 80/20 rule: edge cases are the 20% of the job that takes 80% of the time, and unfortunately most programmers are not going to put in the effort if most of the benefit of releasing their library has already been garnered. Writing complete and bug-free programs is challenging. Sharing poor code is not. Slap a cool animation on your Readme file and everyone will assume your project is awesome.

There are upwards of 200,000 JS projects out there, with language features that focus on increasing that number, and tooling that makes it even easier still. It’s easy to share bad code, and hard to identify and improve it.

Oh, so you’ve spent a couple of days fixing that bug on the library you downloaded? Great job, and I hope you remembered to manually check how active the project is, and if they accept PRs, and if a pending PR doesn’t already fix the bug. Project maintainers don’t have enough tools to make their jobs easy and, unfortunately, the tools that they have are often complex and off-putting.

Consider how difficult it is to keep track of which forks are interesting, even though making a new fork only takes the click of a button.

In truth, some tools have been created which aim to facilitate writing good code, such as linters, code checkers, and so on, but they are not nearly as numerous, and building tools to parse ASTs can be tough. It’s an uphill battle.

An ode to productivity

Programmers love being productive. Once you learn that a couple of lines of code can save you hours of work, you might fall into a trap of believing that productivity is the end goal and any library or framework that makes you solve problems faster must be great. In fact, many people believe this, and tools that simplify your work will often become fads with many people swearing by them.

This can go so far as to reach project managers and recruiters and get them to actively seek people who have experience with the current trendy framework, because all the top tech companies are using it, and their developers swear by it, so clearly it must be good.

This drive towards productivity is the cause for the existence of an increasing number of build tools, code generation tools, and services which aim to make shipping as easy as possible, and nobody will question if you should be shipping your package at all.

Top result from SO disregards the most common edge case

The fragmented nature of large open source ecosystems has become evident since the advent of Android. It’s a far-reaching issue and it is also self-reinforcing because languages with smaller communities such as Go and Nim don’t yet suffer from it, and that makes them more appealing, leading to fragmentation on an even larger scale as developers rally to greener pastures.

Fragmentation is like a virus which is driven by the human passion for creation, and we need to join forces and actively fight it if we are ever going to stop it.

Remember Google’s Polymer? Angular 1? Express? Perhaps the organizations and individuals which cause these abrupt termination events should carry a stigma. After all, misleading thousands of people does not generate a positive attitude toward open source, and can lead to bitterness and more fragmentation [Edit: I’ve been called out for this statement, but just look at how Google has abused open source to keep other companies under their belt]

Instead of nurturing narcissistic language ambassadors that drop their projects like they change fedora hats every time they get a new idea, let’s focus on improving code quality and foster a sense of community. Human progress is not going to happen by default. That was not the case with clean energy, or with quality education, and it will not be different with open source fragmentation. These problems require active monitoring and organized effort.

Writing good code: how to reduce the cognitive load of your code

Low bug count, good performance, easy modification. Good code is high-impact, and is perhaps the main reason behind the existence of the proverbial 10x developer. And yet, despite its importance, it eludes new developers. Literature on the subject usually amounts to disconnected collections of tips. How can a new developer just memorize all that stuff? “Code Complete”, the greatest exponent in this matter, is 960 pages long!
