The Smell of Hell

How often have you been here?

You’re given an archive to unpack; something to port and run and see how high it gets before it crashes. What you find is a bunch of nested directories full of source code. There’s a makefile or a project file, maybe a directory called ‘doc’ and a ‘readme.txt’ explaining some obvious stuff, but you don’t go there. Instead you open up ‘include’ and find a file named something like ‘types.h’ and see that the very first line of that file is:

#define INT int

and a little farther down is the clearly dubious

#define FOREVER while(1)

… and you confirm that you are in a world of hurt when you encounter

#define FI }

It’s not code smell. What experience has taught us to pay attention to is the involuntary little cold shiver we get in the presence of evil.

It’s gonna be another FORTRAN day. How often have you wished Moses had written a bunch of extra commandments, just for programming?

—-

Desperate homeowners who are trying to sell their underwater mansions will sometimes bake cookies before an open house in order to entice potential buyers at the subconscious level. Likewise, outhouses have a distinct and unpleasant odor indicating that they are full of stuff that is not good for you. Smells send deep, terse and very effective messages to some of the oldest processing hardware we own, namely the bit of our brains responsible for smells. The “smell brain” belongs to one of the oldest parts of our brains, in an evolutionary sense, and smells are a great way to for us to quickly and efficiently summarize things like “The last time your great-great-great-grandpa Thog smelled what you’re smelling right now, half his tribe got eaten by something really impolite, so it’d be a good idea to leave really fast right now.”

What does this have to do with software?

I hate the term “code smell.” It seems so unprofessional. I mean, you walk up to a piece of unfamiliar code, take a peek and pronounce it as either poop or freakin’ Turing Award material in a minute or two?

Often, yeah.

Desperate developers who are running on too little experience and insufficient intellectual horsepower will sometimes try to skate by on poor design and bad quality issues by pretending to know more than they actually do. When I can’t see the core logic for all the factories, singletons, fly-weights, visitors and inheritance crap, I get the feeling that someone has “baked cookies”, and when I look at the details, often the core of the code in question is exposed as a rat burger, grilled over a burning outhouse and smothered in long expired kimchi.

The reason I hate the smell analogy is that while real-world smells can universally encode “it’s time to run like hell, but first leave some of your feces behind,” this is not universally true with code. For all but egregious examples, with code smells you need a practiced nose. It’s not a question of “Pasta or Poop?” but rather “Cabernet, or Night Train?” And depending on the task you have in mind, Night Train might do the job just fine.

So you usually can’t go to your lead and say, “This stuff is rotten, fire this coder and erase their work from the face of the earth before there is more catastrophic damage” because most managers will just see curly braces in the right spots and not get down and really try to understand the inner sickness of a diseased hunk of software that is ready to metastasize its putrid API all over the soul of your project. Your bad attitude is easily mistaken for “Not Invented Here.” And your lead might be right, too. Sometimes just rebooting the damned server every eight hours works.

Here are my top five bad portents.

1. Fucked-up looking code. Things like inconsistent brace style and indentation, declaring things (types, variables, and especially parameters) that aren’t used, mixed use of tabs and spaces, bad names, and comments like the classic “increment i” and “print result” and “harry loves lucy.”

2. Mixed error handling strategies. For instance, half the code returns a simple BOOL result in case of failure, the other half returns some set of enums or #define’d error codes, and probably someone is throwing exceptions around all of that. Mix in a healthy dose of some functions returning ints, others unsigned values, others leaving errors around in globals (yes, errno, I’m talking about you) and you’ve got a good idea of why there’s no literary prize for programs, just halfway houses and work release programs.

3. Design pattern overload. If I were king I’d just start beheading people for writing factories that make factories. It’d collectively save us billions of dollars. And every time you make a singleton, God kills a start-up, two if you think you’ve made it thread-safe.

4. Anything that claims it is standards compliant. A comment like “This implements X.509 cert cracking” means that you should go fill your Super Soaker with a 50/50 mix of holy water and battery acid because you’re going to be hosing down demons and crazy undead-bugs until the sun goes out (and I mean entropy out, not just sundown). Working with an ISO networking standard? Save time by jumping off a high place onto someone in management.

5. Abstraction layers that just pass the buck, with each layer adding, recomputing, removing or re-arranging parameters. Abstractions are like peanut butter; a layer or two is fine, but once you’re down five or six layers and nothing has fucking happened yet you’re in definite “even if I could get my mouth open to scream now, that would just let in the evil stuff I’m swimming in” territory.

Sometimes Night Train will do the job, but that doesn’t make mornings any more pleasant.

Does anything else send waves of cold up your spine?

This entry was posted in Rantage. Bookmark the permalink.

39 Responses to The Smell of Hell

  1. Shannon says:

    How about having to work with a vendor’s DDK library and “special” build tools that force you to write some of this kind of code… blech.

  2. Unsigner says:

    I cringe when I see 30-line copy-pasted standard comment headers before 5-line functions; it usually doesn’t take long to find they are rarely updated and document what the function was when it was written.

  3. Arthur says:

    I know I’m in trouble when I see seemingly random use of references and the surrounding code copying values around in a clear attempt at achieving _something_ but it’s not clear what and when pressed the author also cannot give clear reasons for it.

    Then, of course, there is my father, whose javascript code uses functions more like chapter headers, form over function..

  4. pinbacker says:

    Couldn’t agree more with you on how fucked up ISO networking standards are. Been there, done that.

  5. Tom says:

    More a developer smell than a code smell, but: evangelism.

    If you’re not careful before you know it you’re having to wrestle a hammer and handful of square pegs from the previously diligent developer of Project Round Hole.

  6. james says:

    Boilerplate GNU ‘README’ and ‘INSTALL’ files that don’t actually tell you anything. They’re almost as bad as automatic Doxygen created “documentation” and the instructions “To see how this works, go into the ‘examples’ directory of the SDK”.

    Can’t stand documentation that doesn’t tell me anything meaningful.

  7. Peter Ibbotson says:

    Biggest pet hate in C code (although your FI example comes close)

    #define TRUE 1

    There is just so much wrong with that.

  8. Fersis says:

    Well my friend you better dont look at out code base. :3

  9. Jim Tilander says:

    Hi Landon,

    I’ve been enjoying your blog now for a while, keep on the good work!

    As for smells. I think this one indicates that you’re going to need some hokey gear to protect you from all the hurt that’s going to ensue:

    if( ptr )
    {
    // stuff that reads from ptr, but doesn’t change the rvalue of ptr.
    if( ptr )
    {
    // more stuff that reads from ptr.
    }
    }

    You’ve just stepped on a thread mine… :)

  10. Kyzer says:

    Interesting views, James.

    I include a boilerplate GNU INSTALL file because it explains some of the standard GNU build/install features to someone who doesn’t know them, and it also clues in an experienced user that I haven’t rolled my own build system, which means

    - cross-compiling will work
    - site defaults will work
    - builddirs will work
    - ./configure && make install will install into /usr/local
    - and many other GNU standards

    Perhaps you meant “software that doesn’t contain any documentation at all” by your example, rather than “software which is well documented and additionally contains boilerplate docs”.

  11. klange says:

    #include
    #include
    #define Begin {
    #define End }
    #define If if (
    #define Then )
    #define Equals =
    #define Print std::cout <>
    #define Not !
    #define IsNot !=
    #define For for(
    #define While while(
    #define Do )
    #define Less
    #define Increment ++
    #define HaltAndCatchFire for(;;) fork();
    #define Split fork();
    #define Or ||
    #define And &&
    #define BitwiseAnd &
    #define BitwiseOr |

    Your move.

  12. Jim Perry says:

    While factories that make factories is a bit silly, there’s nothing wrong with singletons if used “properly”.

  13. wheels says:

    Not so much a generic problem, but I once had to work with a customer app that was getting too large for the target EPROM. To save space, one of the customer’s programmer had stripped all the comments out of the code.

    What bothers me more as a generic problem is lack of functional decomposition. Maybe it’s a routine to copy a string from one location to another, but based on passed-in arguments, it has options for converting character case, translating between ASCII and EBCDIC, reversing the string, sorting the characters, initializing an attached LCD, and taking the square root of the ADC input – all in a single, three-page routine with no comments.

  14. JD says:

    You are dead on with respect to the Singleton Pattern. It has the super-pattern ability to make any codebase turn to spaghetti and become as brittle as uncooked pasta. There is nothing as tortuous as having to track down phantom errors due to global variables being written and read by everyone , everywhere.

  15. Bill Smith says:

    Always a bad idea to attempt to use macros to postpone learning a new language. At best, the macros create a leaky abstraction, as in “leak like a sieve”.

  16. mark says:

    hmmm. 1 & 2. don’t be such a goddamn princess! if the code is formatted poorly, use a freakin’ macro to straighten it out. bad comments. ha. that’s all comments, they only vary in the degree to which they suck. inconsitent naming? when more than one person works on the code, its inconsistent. when the project gets big its inconsistent. mixed error handling? so what. be happy that there IS error handling and move on. besides, most code is a melding of different libraries and API’s and the 3rd party libraries and API’s means a hodge podge of error handling.

    3 & 5 are dead right on. my favorite line: “And every time you make a singleton, God kills a start-up, two if you think you’ve made it thread-safe.” LOL.

  17. Mike Ellery says:

    Here’s mine:

    if (a < b) return true;
    return false;

    …which only portends further misery in the code-base.

  18. noodlesgc says:

    I’m not sure if I understand #2.

    Shouldn’t different functions return different types?

    An example:

    Some software I’m working on does text processing. The text must be verified, with a set of functions. Some verification functions can either pass or fail. Other verification functions can fail with a status, so I used a set of #defines to list the possible failure statuses.
    Also, the text manipulation functions can fail, so I made a set of #defines for those too.

    How else would this be done? And if each function and error code is properly documented, why is this a bad system?

  19. Mark says:

    I get nervous when I see:
    – code that checks all parameters for null…in *every single method*
    – methods with hundreds (sometimes thousands) of lines
    – inheritance hierarchies with > 3 levels

  20. ImNotWorthy says:

    I’ll just collectively apologize for myself and everyone other self-taught programmer who isn’t hitting the nail on the head with the first swing.

    Not meeting your standards truly is inexcusable. Sorry we polluted your world with incompetence.

  21. Tux2000 says:

    void somefunc(void) + all input + all output via global variables

    goto

    goto to break out of loops

    goto into loops

    goto end instead of return

    goto end, goto end2, goto end3, goto really_end, goto really_really_end

    missing error checks on file I/O, DB I/O

    commit without any conditional rollback around

    sql glued together with unverified user input instead of using placeholders

    #define FALSE 1
    #define TRUE 0

    copy-and-paste instead of subroutines

    i++ /* increment i */

    if (function_that_may_fail()==OK) {
    /* 1000 lines of code */
    } else {
    /* handle error */
    return FAILED;
    }

    int parse_file(char * name, int flag)
    {
    FILE * f;
    int i=0;
    GetOpenFilename(name);
    CopyFile(name,”C:\\some\\where\\backup.txt”);
    f=fopen(“C:\\some\\where\\backup.txt”);
    while(!eof(f)) {
    lines[++i]=fgets(f);
    }
    fclose(f);
    return 0;
    }

  22. brone says:

    1. Any system that has an extension mechanism that is totally a special case, with the default functionality written in some privileged way. You can basically guarantee that extensions will be crippled, and unable to do anything useful, because even the original author couldn’t be bothered to use them. If you don’t have the source, you’re SOL; if you’ve got the source, probably best to just add your extensions in directly — and delete the extensions system while you’re there, better all round. (Some might suggest that it’s best to clean things up; this is because so far they’ve only had to do this once or twice.)

    2. Any system that has its own main function, that calls your code as some kind of callback, “for your convenience”. You can be sure that this means the thing is 100% impossible to shut down cleanly, and that none of its authors have ever actually tried to use it from the other side in anger. Most likely, it’s probably got most of its tentacles round everything even before main starts anyway, so I never understood why it has to stamp its boot in your face quite so hard, when it’s already kicked you in the balls.

    3. No hooks for memory allocation or file I/O. There are possibly no hooks because it’s using STL and nobody could be bothered making typedefs for all the allocators. Maybe it was put together by programmers who’ve never had to actually put together a product that has to go in a box in the shops. Maybe the original coders were just lazy. Or perhaps they’re all, like, “ahahaha, that’s so old school, maybe we should write it in assembly language lol”? (See: never had to actually put together…). Anyway, too late to care why, because it’s your problem now. It will allocate memory at random times, and read files at random times. There will probably be a thread you didn’t ask for, that does more of the same. Every time you run the program it will behave subtly differently, and everything will be at a different address. It will probably create caches that you don’t want and grow lists that can’t be made smaller. Even if there aren’t any actual memory leaks in there, it will take you a week to work that out.

    4. Libraries with typedefs like “int32″ or “BOOL”. When they conflict with something in your own code, the library author will probably tut at you for not namespacing your code properly.

  23. Paul says:

    Personal all-time favourite:

    #define STRCMP(L,OP,R) (strcmp((L), (R)) OP 0)

    This tells me everything I will ever need to know about the person who wrote it.

  24. NevilleDNZ says:

    re: “unpack … and run and see how high it gets before it crashes.”

    c.f. http://rosettacode.org/wiki/Stack_traces#Using_no_extensions

    { Code implements C’s #define BEGIN & #define END with “macro magic” }

  25. SomeGuy says:

    1) EnterCriticalSection() and LeaveCriticalSection() scattered randomly in an apparent attempt to fix threading issues no one ever understood.

    2) A single class dozens of CRITICAL_SECTION members.

  26. NCC says:

    How about a 31,000 line header file. You ever see one of those?

  27. Josh says:

    int i, i1, i2, i3, i4, i5;

    This is from a real application, and most functions of any importance use at least “i”, “i1″, and “i2″.

    Ugh.

  28. EvanM says:

    My “favorite” bad portent is the presence of large, dense, comment blocks, that go into painstaking detail about why the code does things the way it does. Granted, if accurate, such comments can be helpful, but in my experience they typically are a symptom of poor design. Good code shouldn’t require huge paragraphs to be written to explain its function.

  29. landon says:

    The debugger for CP/M 68K was written by a linguist, and he’d used #define to declare new control structures as well as naming all of his variables in different languages.

    @NCC: I think there are some aggregated headers (for the Mac, as well as Windows) that are tens-of-KSLOCs.

    @Josh: I’ve seen “high performance arithmetic code” (some bignum stuff, for instance) where I honestly think the author thought that using smaller variable names and crunching out whitespace made the code run faster.

  30. Josh says:

    Here’s a more common smell that I see a lot:

    int f (…) {
    if (x) {
    if (y) {
    if (z) {
    /* Do something */

    return 0;
    }
    }
    }
    return 1;
    }

    Somebody doesn’t realize that they can check x, y, and z without all of that nesting.

  31. NM says:

    Exception handlers which catch _everything_ … and then ignore it. Particularly bad in Python, because so many things cause runtime exceptions:

    try:
    x = foo.bar()
    except:
    pass

    even better is when further down the code you encounter:

    try:
    do_something_with(x)
    except:
    pass

    because when the first exception happens, “x” isn’t defined, so do_something_with(x) fails and what the hell, let’s just wrap that in an exception handler too.

    The horror, the horror.

  32. Alexander says:

    I also like this type of function definition

    virtual inline int GetData() { return m_data; }

  33. brone says:

    Tastes differ but expanding && out over several clauses is usually something I like. It suggests that whoever wrote the code is aware that not all platforms have particularly good debuggers, and that people will want to step through unfamiliar code at times without tearing their hair out trying to anticipate what will happen ahead of time because there’s no way to step over one thing at a time. Same thing goes for stuff like:

    if(!flag)
    return false;

    return true;

    This often raises eyebrows but makes breakpointing super-easy (because not all targets have conditional breakpoints that work reliably).

    Alternatively, yes, maybe the author of the code WAS just clueless. Hard to say until you see the actual code :)

  34. rshepherd says:

    Absolutely hysterical. A really great post.. I love some of the analogies. I will be stealing them.

  35. Ashley Sands says:

    Ha ha ha!

    I love “If I were king I’d just start beheading people for writing factories that make factories”.

  36. Sverre says:

    I appreciate all you points, but 1) strikes me. Either you haven’t experienced many multi-year, multi-person projects, or you know some nazi that is actually able to force people to follow their style consistently. The third option is that you in fact have experience in many multi-year, multi-person projects in which follow the code standards, but then, those people are most likely morons that wouldn’t make good code anyway.

    Pick one out of two: Perfect world, or good programmers. You cannot choose both.

  37. landon says:

    @Sverre: Been there; big projects, lots of engineers, millions of SLOCs. Nazis can’t scale fast enough to keep up with entropy, but they’re still good to have. Best you can do is to have lots of eyes and a sane schedule.

    My experience is that coding standards don’t work well, but peer pressure does, so pick your cow-orkers carefully.

    I don’t ask for a perfect world any more. I ask for humility and excellence in the face of impossible odds, certain doom, and the vast uncaring maw of the modern software consumer.

  38. Frank says:

    You are dead on in regards to design patterns. I don’t know exactly when it happened, but some time in the last 10 years everything has to use a documented pattern at all times, or it’s complete crap.

    Code still gets processed from the top-down, right? I don’t think I missed any geek memos. Pretty sure the linear approach is still the best way to go for a lot things…

  39. Mike Albaugh says:

    #define int short

    In code from a very good game designer who ended up much richer than me.

    Of course, some blame should fall on Greenhills C for encouraging that sort of silliness instead of admitting that the 68000 was not _exactly_ native 32-bit.
    (and then having some very interesting bugs if you depended too heavily on
    their optimizations)

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>