Building a mystery

Once upon a time I thought that makefiles were a cool idea. Okay, this was the early 80s, rocks were still young, and I didn’t have a version of make on any of the platforms I was using, so I wrote one. My own version of make wasn’t very good, but it was simple, did what I needed at the time, and I gave it away for free. (You can probably still find it on the net. One of the reasons that Richard Stallman doesn’t like me much is that it’s close to my total contribution to Free Software. Trust me, it’s not worth the hunt.)

Fast-forward a few decades and I’m wrasslin’ with makefiles large enough to have detectable gravitational pull, with dizzying levels of nested includes, wrapper programs bolting together metric buttloads of definitions from auxiliary files that were first cut in clay tablets by Hammurabi’s scribes, macro systems from hell that wrap back on themselves through higher dimensions to form legal XML, and default rules that actually reach back in time and break builds that have already succeeded (which sure explains a lot, doesn’t it?).

And yet, with these hundreds of thousands of lines of intricate and fragile declarations accreted over uncountable hardscrabble engineer-years, and with the multi-hour turnaround time, the only friend at my back is an ECHO statement that lets me go back and stick tracing statements where I think the problem might have been. I don’t need ECHO, I need a time machine.

What I’d actually like to have is a fapping debugger, but I suspect it’s easier to build a gizmo to tear apart and reconstruct the elementary fabric of the universe than it is to interrogate the infernal interiors of NMAKE after things have gone sour. (Yes, NMAKE. Don’t get all superior on me: I’ve used GNU make as well, and while GNU make is better, this is still like expressing a preference for a particular brand of cyanide in your coffee).

---

A person from outside the culture of modern software engineering would respond with something like “Pull the other one,” or more likely, “Stop whining,” and that person would, sadly, be right on the money in both cases. We have no one to blame but ourselves. With all the fancy languages we employ, all the type-safety and exception safety and interface meta-languages and theorem proving, why are we jack-legging software together with rubber bands and duct tape?

Make was written in 1977, and it hasn’t fundamentally improved in decades. Instead of improvements, we got features: Powerful macro expander syntax, looping constructs, electric and non-electric variable definitions, some lame attempts at parallelism, but all of these extras just added complexity and didn’t address the everyday problem of figuring out why a build is failing two hours in.
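For reference, the core idea that has not changed since 1977 is tiny: a target is stale if it is missing, or if any prerequisite is stale or newer than it. Here is a minimal sketch of that check in Python. The rule table, file names, and integer timestamps are all made up for illustration, and no real make implementation is this simple.

```python
# Hypothetical rule table: target -> list of prerequisites.
RULES = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "errors.h"],
    "util.o": ["util.c"],
}

# Hypothetical modification times (bigger number = more recent).
MTIME = {
    "main.c": 10, "util.c": 10, "errors.h": 50,
    "main.o": 20, "util.o": 20, "app": 30,
}

def out_of_date(target, rules, mtime):
    """Make-style staleness check using timestamps only."""
    if target not in mtime:
        # A file that does not exist always needs building.
        return True
    for dep in rules.get(target, []):
        # Stale prerequisites, or prerequisites newer than the target,
        # make the target stale too.
        if out_of_date(dep, rules, mtime) or mtime[dep] > mtime[target]:
            return True
    return False

for name in ("main.o", "util.o", "app"):
    print(name, "stale" if out_of_date(name, RULES, MTIME) else "up to date")
```

Note what the function hands back: a bare yes or no. Which prerequisite tripped the check, and why, is gone the moment it returns, and that is the part you want two hours into a failed build.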

If I were to tell someone “I’m going to design a programming language that’s going to be used by millions of programmers every day: It’s going to have tons of hidden state, no obvious control flow, obscure and terse syntax, and programs written in it are going to run for upwards of six hours before bombing with an error message like ‘File not found’ — oh, and I’m not going to write a debugger, and all the state will be hidden and completely lost when things go wrong” — I’d be strung up in the stairwell alongside the guy who invented trigraphs.

Don’t get me started on autoconf. (Someone else wrote a nice flame; there have been others). Tools like this just paper over what’s really wrong: We have too much crap and we have to build it all the time.

---

Make is only part of the problem. Modern compilers are still rooted in the smelly primordial ooze of the paper tape era of computing.  Well, maybe magnetic tape.

Imagine I’m building a house; to achieve this, I will be nailing some boards together. Given that I am a relatively savvy and modern software engineer, what I do is:

1. Grow a tree

2. Cut it down and drag it out of the woods behind my ox

3. Extract a board, using a pit and a great big bloody saw

4. Dry the board out in a kiln

5. Cut and plane the board to proper dimensions

6. Repeat 1-5 with another tree, resulting in another board

(I’ve omitted steps involving mining iron ore, making coke, refining the ore, smelting same, making steel, and pounding out a nail)

7. Nail the stupid boards together. Oh, you wanted glue? I don’t do glue; there are good reasons for that.

… and in about forty thousand years that house is finally assembled (which is about par for how late a lot of software projects are). This is pretty much the life-cycle of a compiler: Suck in several megabytes of header files and/or precompiled headers, process a miserable handful of ten or fifteen functions and methods, spew out some object code and a fuck-ton of debugging info, then do the whole thing over again with the next set of sources. After all that’s done, you feed it to a linker. (Don’t get me started on linkers; I did a lot of work on linkers in the 80s and 90s, talk about a thankless job…).

Of course, modern build systems get rid of some of the duplication of effort here, since they will precompile headers for you and do some dependency analysis. But I dare you to change one common structure, or touch one common header file containing, say, a list of error codes. It’s time to recompile the world; see you in a few hours.
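To see why one innocent header edit recompiles the world, here is a rough sketch in Python that walks an include graph backwards from the changed file. The file names and the graph are invented for illustration; a real build would get this information from the compiler's dependency output.

```python
from collections import defaultdict, deque

# Hypothetical include graph: each file maps to the headers it includes directly.
INCLUDES = {
    "errors.h": [],
    "net.h": ["errors.h"],
    "ui.h": ["errors.h"],
    "net.c": ["net.h"],
    "ui.c": ["ui.h"],
    "main.c": ["net.h", "ui.h"],
    "util.c": [],
}

def dirty_sources(includes, changed):
    """Return every .c file that directly or transitively includes `changed`."""
    # Invert the graph: header -> files that include it.
    included_by = defaultdict(set)
    for path, deps in includes.items():
        for dep in deps:
            included_by[dep].add(path)

    # Breadth-first walk from the changed file to everything that depends on it.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in included_by[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(p for p in seen if p.endswith(".c"))

print(dirty_sources(INCLUDES, "errors.h"))  # every source that can see the error codes
print(dirty_sources(INCLUDES, "ui.h"))      # a much smaller blast radius
```

In this toy graph only util.c survives a change to errors.h. Scale the same shape up to a few thousand source files that all pull in the list of error codes and you have your lost afternoon.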

C and C++ need a module system so badly that we should pretty much stop adding features (yes, Mr. Freaky Template Thingy I’ll Never Use in a Responsible Real-World Project, I’m looking at you) and do nothing else to these languages until this is fixed. Architecturally. We need to ban #include (and no, precompiled headers are not the answer) and get a type definition and importing system that actually fucking works and that scales to tens of millions of lines of code. Once we have that, I’ll hunt down every single use of #include and #if/else/endif and club them to death.

---

Something absolutely magical happens when you have turnaround time that is less than about five seconds. It almost doesn’t matter what language you are programming in. If you’ve got a system that gives you five seconds from source change to running code, it’s possible you’ll forget to eat and starve at the keyboard, even if you’re hacking away in assembly.

Build times sneak up on you. Pretty much every project I’ve worked on from scratch has gone from that magical “seconds” window to “minutes” (tolerable), to ten minutes (get coffee), and somehow reaches 45 minutes to an hour (go to lunch, surf the web, do email, write documentation, attend meetings, play video games). Around the two or three hour mark, you’re talking about doing SCA re-enactments in the hallways using parts from build servers as props.

Frankly I don’t see this problem being solved any time soon, at least for the kind of dead-bits “EXE” development that happens in embedded work and high-performance cores of video games or operating systems. While it may be possible for hardware to get to the point where we can JIT and message-pass ourselves to Nirvana and forget about cache line awareness and punt global optimization, weren’t we saying that ten years ago, too?

Makefiles and text-based includes were, at their core, 70s-era hacks of convenience that went only so far. Speed of turnaround is a language feature, and you don’t have to be a dope-addled Smalltalk hacker to appreciate the beauty of dropping into the debugger, changing the structure of a structure, and continuing blithely along as if nothing extraordinary had happened. We’ll never be there with C (at least, a language that supports that probably doesn’t look very much like C), but it’s interesting to contemplate.

46 Responses to Building a mystery

  1. Max says:

    I guess I should consider myself fortunate to work in several languages, at the same time, that all compile automatically in the background. It’s like defying causality. I save the file and all the work is done before I can command it to act.

    Of course, technically, these languages are dirt slow.

    Perhaps Google Go will save us?

  2. Karl says:

    And while you’re off being interestedly contemplative about the wonderful future ahead of you, PHP and other web language developers (and trust me, there is plenty wrong with those languages, but we digress…) are sitting there going, “Wow, those guys put up with that because… ?!”

  3. abhijeet says:

    Very well written. I know the pain you have faced, as I also face it every second day: those 1-2 hours of build time, which sometimes stretch to 4-5 hours.

    When this becomes an everyday affair, coffee and cooler talk are of no value…

    Very well written indeed..

    Abhijeet

  4. William says:

    That’s a beautiful rant.

    I forget where I read this, but somebody was making the argument that the successor to a complex system is never a somewhat simpler system. It’s always a slightly more complicated system. That, or total collapse.

    He was talking about political systems, but it seems to apply to technical ones. Although in the case you describe, I imagine that C and make are really more a political system than a technical one at this point, at least as far as improving core issues.

  5. creaothceann says:

    I’ve been fortunate enough to never experience build times exceeding a few seconds, thanks to Turbo Pascal & Delphi and small project sizes.

    Would be interesting to know if others had different experiences.

  6. Oren says:

    Make itself hasn’t made much progress. But. There are a zillion build tools that have. cake, cook, rake, jam, … give you a dizzying array of options, including smarter rules, generic programming languages, tracing, debugging, time stamped vs. checksum based dependencies, dynamic vs. static dependency lists, etc. etc. Admittedly most of these are for UNIX variants but most of these would work on Windows (at the worst case, via Cygwin, which is a good idea to have on Windows anyway :-)

    These days the only reason to use make (even GNU make) is for “simple enough” projects where its shortcomings aren’t an issue. For large projects like you describe, it is the wrong tool for the job.

  7. UmberGryphon says:

    Have you tried http://www.scons.org/? It scales so much better than make, although the underlying problem is still there.

  8. softie says:

    Any thoughts on the xml based system that msbuild uses?

  9. Phil says:

    D is a C++-like language with a module system rather than #includes. I will say that it makes things much nicer.

  10. Stu says:

    The phrase “Pull the other one” is a different way of saying “You’re pulling my leg”, i.e. “I don’t believe you”

  11. Dave says:

    I saw Rob Pike compile the elementary fabric of the universe in about ten seconds with Go. I got a shirt too so I was sold.

  12. Joe Ganley says:

    You could always use ant or scons instead of make … except they suck too.

  13. wrm says:

    Hey, we took the BASIC out of C, and then we had all these left-over bits, so we put it into make…

  14. Next big project I have to compile, I hope to use memoize.py instead of make. Much simpler and smarter.

  15. sdf says:

    yeah, ant and even maven are only loved because they aren’t a cactus to the crotch like make was. something better is definitely needed.

  16. WarWeasle says:

    With Lisp, the turnaround from compiling a function and calling it can (sometimes) be measured in microseconds.

  17. Scott says:

    I agree with you about make. However, your dependency problems are a result of your own badly structured code. It’s a pain in the ass, but you need to go through your code and redesign things so everything doesn’t depend on everything else.

  18. OldMrTim says:

    I have seen build times tumble. I remember an old x86 library (around 90 source files), compiled and linked for four memory models (remember SMALL, COMPACT, MEDIUM and LARGE) where a full build used to take around 12 minutes on a 286 in 1990. Ran it again a few years ago on an 800 MHz Pentium and it took 12 seconds! (MSC v5). Today I’m building a large multi-application project with Visual Studio 2008 (C/C++, approx 6 applications, two DLLs and a very large static library). Total full rebuild time (Release and Debug, approx 250 source files) is approx 5 minutes.

  19. DGentry says:

    Your screed could form the mission statement for the Go language: fast compilation of a statically typed language. Though it might be a bit long for a mission statement. Maybe more of a backgrounder.

    Plus, Go is Ken Thompson’s project. Ken “I’m the giant upon whose shoulders you all stand” Thompson.

  20. AChacha says:

    For how much grief people give Microsoft, their built-in system for building projects using solution files is so far ahead of make that I dread the days I have to backport things to linux. msdev will precompile headers, and build projects in parallel. My 300,000 line project builds in under a minute, and if something fails I just open the solution in the GUI and compile; when it breaks I just click on the error and it takes me to the error and a recommendation. Like I said, I dread the days I have to deal with linux builds and makefiles; I immediately get heartburn.

  21. Jose says:

    It would be wonderful. A C language without #include header and with import library. Is it too much to ask for this minor change?

  22. Mike Samuel says:

    Prebake is a new build system that attempts to address the problems that cause make to fail on large projects. It’s very much alpha software at this point, but if you’re looking at alternatives, it might be worth keeping an eye on and I’d love any feedback early in the design process.

  23. Jim Schweizer says:

    Ya know, in a perverted sense I miss my Open Source days, but I don’t miss make files. I don’t miss gcc or make or C coding. Everybody around me uses Outlook and Excel and is hiding behind a VPN. Gone for me are the days when the pioneers were doing something cool.

  24. amrox says:

    I’m familiar with the pain of the C/C++ build system myself. I spent a lot of time cross-compiling existing code for embedded systems. Barrels of fun.

    However, going back and fixing the build and include system in C/C++ might not be the way to go. That’s a monumental task given the huge legacy and existing codebases of the languages.

    The language dictates the build system, so perhaps it would be better to focus effort on a new systems-level language rather than yet another build system for C/C++. I hear good things about Go, and I wouldn’t be surprised if there are other similar efforts out there.

  25. Nick Gaens says:

    Did you take a look at qmake*? It’s Nokia’s Qt building program and, despite not being optimal, provides a more ‘human’ approach to writing configuration files for (large) projects.

    *: http://doc.trolltech.com/qmake-tutorial.html

  26. Johnicholas says:

    I’d like to advocate for the “Make is awesome” point of view.

    Yes, flaws include: GNU make (and many of its successors) are suffering from feature creep. People frequently tie themselves in knots with Makefiles (for the same reason that programming is hard – writing Makefiles is hard). GNU make (and many of its successors) insist on mucking up the semantics with huge gobs of implicit rules and defaults. Timestamp-based recomputation was a mistake. “Recursive Make considered Harmful”.

    However, a pure Makefile is a declarative expression of the dependencies among the artifacts of your project – and it can add incremental recomputation to a nearly arbitrarily complex dataflow. Having a lovely clean Makefile indicates that you’ve spent the effort to figure out what depends on what, and you have to document that structure somewhere.

  27. TheAdmiral says:

    Rewrite the linker to contain run-time declarations; then build with /bin/sh scripts. Easy.

  28. Ori Peleg says:

    Listen to all the people recommending make alternatives based on real programming languages, such as Rake, Groovy+Ant, etc.

    Make itself gets difficult when the logic gets complex, and tools to work with it are limited.

    When your build file is a real program in a real programming language, you implicitly gain 2 benefits: complex logic is much more natural, and the auxiliary tooling is great (want a debugger? you have one!).

  29. mjanes says:

    uh, http://gmd.sourceforge.net/ ?
    it’s faster to google than to rant lots of pages

    make is better when it doesn’t get to this point, but if you have to debug, gmd is functional.

  30. ajuc says:

    This is one reason why dynamic languages are so nice to program in: the change and test cycle is so short that you make many more iterations in the same time, and you don’t lose focus while rebuilding.

    I’m now working in Java EE. Our server application consists of tens of subprojects, built by maven, and hot redeployment almost works == doesn’t (PermGenSpace, stale dependencies, etc). When I have to add something on the server I find myself thinking how to do it in the client just to not have to rebuild everything, restart jboss, and test it once again. It is a pain and it takes the fun and experimenting out of programming; it also makes it hard to properly test everything (unit tests only take you so far, there is still a need to do tests “by hand”).

  31. Txabi says:

    The replies in this thread remind me of my personal life when it comes to food: I happen to be just about the only person in my country who does hate with a passion our national dish.

    How do people around me react upon that revelation? Almost without exception my friends who share my nationality assume it is not that I “hate” that dish, but that I have not tried the proper version of that dish. The “proper version” is undoubtedly their version of the dish, which they diligently prepare and serve me even though, as I told them earlier, I happen to hate that dish. And so it happened that for a few years, going to visit the family in the old country involved me losing plenty of weight having to endure serving after serving of a dish which I despise.

    When someone complains about Make the way the author of this blog did (and I agree wholeheartedly), and I then read replies pointing at things like rake, it leads me to believe that many people in this tech field (the majority, I’d dare say) are prone to perpetually missing the point.

    Modern day personal computers offer aggressive out-of-order multiprocessing, with superscalar operating integer, floating point, and vector facilities, coupled with massive data-parallel processors which offer in excess of 1 Teraflops and can push billions of pixels per second, with extremely large local memories and storage devices and all connected with very high speed networks.

    And yet we still develop using tools that were designed around the assumption that the programmer was using a paper terminal connected serially to a PDP sharing its meager resources to a ton of other people waiting in line…..

  32. Kaishaku says:

    Txabi: well said.

    DH: Great post.

  33. Pal-Kristian Engstad says:

    Everyone that is suggesting “other” make systems are majorly missing the point. The point is that it shouldn’t even *be* an external system for building code!

    In order to reach nirvana, we need the following:

    1. Have a language based on C, that has a basic module support.
    2. Move dependency and build logic into the compiler itself.
    3. Let the “compiler” be a *server*, such that you can interact with it.
    4. Have the “compiler” be a *debugger*, such that you can inspect data, add new code and patch in code while the program still is running.

    It’s a tall order, but it is totally doable. Except point 1, we already did that! It was called GOAL, and it was a wonder to behold. Instead of 1, we instead had a Lisp/Scheme-ish language, which gave us benefit 1a) a proper macro system which helped in reducing complexity in the compiler proper, and 1b) an easy to parse language so that tools were easy to write on top of it.

    We just need GOAL w/syntax. Let’s call it GOAL++. :-)

  34. Kris says:

    If you’re looking for something like a make debugger, you might be interested in: http://sailhome.cs.queensu.ca/~bram/makao/index.html . It helps you inspect what happened during a build by interpreting the dependencies in a graphical way.

    Anyway, just wanted to say that there’s other stuff out there than echo’s. :-)

  35. Mike Albaugh says:

    I’m one of those who actually used Landon’s MAKE (on VMS, which meant “everything you know about characters special to make, except TAB, is wrong”).

    It served a purpose. I’m currently dealing with a project that needs to build on a small, stand-alone FreeBSD machine, but I do compile-checks on a PPC Mac. Ah, the wonders of the subtle differences between the two makes. And then there’s jam. But I digress.

    I really wanted to mention BUILD, which was a form used by a fair bit of DECUS stuff. It consisted of some scripts and a set of stylized comments that were actually in the source-code itself.

    So, when you said “build foo” it looked for a file foo.* (it had a list of suffixes to try), and parsed the build-rules in that file (e.g. foo.c) indicating its direct dependencies and what to do when they were “up to date”, recursing appropriately.

    Yes, it used timestamps, but on a machine that did not mount anything off a filer with a spread-spectrum clock (Lookin’ at _you_, NetAPP), that was OK.

    We’re talking “2Meg? How can you use that much memory?” machines here.

    Anyway, yeah, lots of stuff about that was like Make, but the take-away was that dependencies were actually in the files themselves. No version-skew hell between the code and the makefile. Single Source of Truth. Not nirvana, but closer.

    Now, if this was _my_ blog, I’d rant about folks who decide the “nightly build” takes too long and “fix that” by leaving out a few unimportant regression checks, where “unimportant” means “yours, not theirs”

  36. David P says:

    Scott says:
    July 14, 2010 at 5:27 am

    I agree with you about make. However, your dependency problems are a result of your own badly structured code. It’s a pain in the ass, but you need to go through your code and redesign things so everything doesn’t depend on everything else.

    I say:

    Wow. Good thing no one ever inherits a finicky codebase coded by someone who believes comments are only for the weak, that’s scattered across hell’s half acre, with responsibility to maintain and continue development, and no one is ever faced with deadlines that don’t leave you the luxury of spending a month or two cleaning up parts of the codebase (introducing a few pesky new errors along the way), then explaining to the non-technical manager that that month was productive: see, the product now has more errors than when I began! And the other products depending on it now glitch because they include work-arounds that now cause some obscure error.

  37. Anon says:

    Ages ago someone tested a couple of different build systems for speed http://gamesfromwithin.com/the-quest-for-the-perfect-build-system . The problem seems to be these days there are only more build systems and I guess they all break in different ways…

    It does look like using a different language may be the only way out.

  38. Sajith says:

    Wonderful. It’s absolutely amazing that in all these years there are so few people that haven’t had these “teeth marks in the rear end” (those aren’t battle scars, silly!) and yet so little progress has been made in the primitiveness of tools we use.

  39. mike says:

    that’s why i prototype anything simple enough in PureData or Max/MSP, even if it’s just a small piece of a bigger project. usually by the time i’m done what i’m working on looks like spaghetti, but i’ve tried doing things a bunch of different ways quickly and have a better developed idea of how i want to execute it.

  40. Anon says:

    As suggested by others elsewhere do things like SparkBuild ( http://www.sparkbuild.com/ ) or remake ( http://bashdb.sourceforge.net/remake/mdb.html ) help?

  41. Ashleigh says:

    ” C language without #include header and with import library…”

    Ahhh…. you want ADA.

    That kind of thing was the whole reason-for-being for Ada. No #includes, no conditional compilation, no header files (but there are specs and bodies).

    Ahhh Ada, what a joy to use. I spent a mere 8 years programming in Ada and even got to be quite good at it.

    You still need build systems for it though. But don’t even contemplate using make (unless using gnat – ada based on the gcc engine). Most other Adas compile into an el-magico library and there ain’t no way to even contemplate make. Just build it all over again…. I well remember the joys of a 100K source line Ada project that built in a mere 5 hours. Bliss.

  42. Oisín says:

    I’m not sure if newer versions of make and your build tools do this already, but I used ccache (http://ccache.samba.org/) a while ago when repeatedly doing long exploratory builds in Linux that often failed. Distcc came in (slightly) handy as well, since there were a few Linux boxes on our LAN.

    Neither of these address the problem that make/scons/jam/maven/[insert your build tool] suck, but at least it’s nice to recover some of the compile time from failed builds.

  43. landon says:

    @Ashleigh: ADA would definitely fix the problem, because I would probably go back to washing dishes for a living.

  44. Programmers are masochists says:

    Programmers are paid, professional masochists. Who else would put themselves through this crap?

  45. By following these rules, my make process is simple, fast, and reliable:

    * All C, all POSIX.
    * No dependency cycles.

  46. cxseven says:

    Quote: (I’ve omitted steps involving mining iron ore, making coke, refining the ore, smelting same, making steel, and pounding out a nail)

    Sounds like somebody’s been playing Minecraft.
