Some assembly required

A little (very little) more history.

I spent more time at Atari waiting for assemblies to finish than you’d probably believe. I mean, assembly language; how hard can it be? Yet the Assembler/Editor cartridge was famous for its lack of speed, the cross-assemblers on the MV/8000 could take 45 minutes to crunch through 16K of output during loaded hours, and even the CoinOp assemblers on the Vaxes were not remarkably fast.

My roommate and I found the Synapse Assembler on the 800 to be incredibly fast. It would process 8K of output in just a few seconds. Combined with a 128K RAMDisk, and a parallel cable to a slave Atari 800 and a debugger, you could turn around a piece of code in a couple of quick keystrokes. I wrote a (very) tiny Emacs patch for the SynAssembler, and for a few months we were in fast turnaround heaven. It almost doesn’t matter what language you’re working in if the turnaround time is quick enough.

I started writing assemblers as a hobby. I hated the slow tools we had and really wanted something better. Pre-tokenization sped things up a lot. I got some other people to actually use my second or third efforts at “really fast” assemblers, and got some good feedback (e.g., when I added a listings output feature, people started taking the assemblers seriously — there’s something about hexadecimal numbers on 132-column fan-fold paper that gives assembly programmers a warm fuzzy feeling).

Things (like the company nearly going belly-up, and the cessation of 6502-based development pretty much everywhere at Atari) intervened, and I didn’t return to that hobby for a couple of years.

The 68000 assembler we used for the Atari ST was really intended to be used as a back-end to a C compiler. It had very few creature comforts; no macros, no includes, no real listings mode or cross-reference generation. Writing assembly in it was moderately painful; doable, but not fun. It was also not very fast.

So I got pissed off at it and wrote MadMac. Mission #1, be a decent tool for writing assembly (macros, etc.) because we were still writing at that level a lot in those days. Mission #2, be fast. So MadMac uses some smart buffering (it tries hard not to copy a string out of the disk buffer unless it has to), uses DFAs to recognize keywords, boils input text down to easily processed tokens as early as possible, and so on. I’m sure it could be faster (just as I’m sure there’s plenty of too-complex premature optimization), but it was pretty good for its time (I remember measuring it at 50,000 lines/minute on an 8 MHz 68000, but it’s possible my memory is exaggerating things).
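To make the DFA idea concrete, here is a minimal sketch in C of table-driven keyword recognition. It is not MadMac’s actual code: the keyword list, the token names, and the functions `dfa_build` and `dfa_match` are all made up for illustration. The point is only that the matcher walks the characters in place (a pointer plus a length, so nothing is copied out of the input buffer) and does one table lookup per character.

```c
#include <stddef.h>
#include <string.h>

#define MAX_STATES 64   /* plenty for this toy keyword set */
#define ALPHABET   26   /* letters a-z only, case-insensitive */

/* Hypothetical token ids; a real assembler would have many more. */
enum { TOK_NONE = -1, TOK_ADD, TOK_ADDQ, TOK_BRA, TOK_MOVE, TOK_MOVEQ, TOK_COUNT };

static const char *const keywords[TOK_COUNT] = {
    "add", "addq", "bra", "move", "moveq"
};

static short next_state[MAX_STATES][ALPHABET]; /* 0 means "no transition" */
static short accept[MAX_STATES];               /* token id, or TOK_NONE */
static int   state_count;

/* Build the transition table once at startup. State 0 is the start
 * state; no transition ever leads back to it, so 0 can double as the
 * "dead" marker in next_state. */
static void dfa_build(void)
{
    memset(next_state, 0, sizeof next_state);
    for (int s = 0; s < MAX_STATES; s++)
        accept[s] = TOK_NONE;
    state_count = 1;

    for (int k = 0; k < TOK_COUNT; k++) {
        int s = 0;
        for (const char *p = keywords[k]; *p; p++) {
            int c = *p - 'a';
            if (next_state[s][c] == 0)
                next_state[s][c] = (short)state_count++;
            s = next_state[s][c];
        }
        accept[s] = (short)k;
    }
}

/* Run the DFA over text[0..len) without copying it anywhere.
 * Returns a token id, or TOK_NONE if the text is not a keyword. */
static int dfa_match(const char *text, size_t len)
{
    int s = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (ch >= 'A' && ch <= 'Z')
            ch = (unsigned char)(ch - 'A' + 'a');
        if (ch < 'a' || ch > 'z')
            return TOK_NONE;
        s = next_state[s][ch - 'a'];
        if (s == 0)
            return TOK_NONE;
    }
    return accept[s];
}
```

After a single call to `dfa_build()`, `dfa_match("MOVEQ", 5)` returns `TOK_MOVEQ` and `dfa_match("mov", 3)` returns `TOK_NONE`. Because the matcher takes a length rather than a NUL-terminated string, it can be pointed straight at the first five bytes of a source line such as `moveq #1,d0` sitting in a read buffer.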

But MadMac has a 6502 mode. WTF? Who ever heard of an assembler doing both 68000 and 6502 code generation?

Around the time I was finishing up MadMac, unbeknownst to the ST software group, another group had hired a contractor to do some work on a new development system; I think it was for the 7800 console, but it might have been another project. Some 6502-based thing, anyway. I noticed this guy’s printouts in the machine room and couldn’t resist leafing through them; he had finished the design of a pretty vanilla 6502 assembler and was starting to write code. His partially completed work included pretty much all the stuff that I’d already done in MadMac, but his stuff wasn’t as good (his assembler was going to be slow, and he’d made some bad compromises in functionality: no macros or listings, for instance).

I got mad that we were paying someone for months of work that I could do a better job of in like a week. So MadMac got a 6502 mode, I cost a contractor his job, and I guess it saved Jack Tramiel some thousands of dollars. Later I heard that the people using MadMac were mostly using it for 6502 development, and that they loved it.


Today, for the most part you can just hack away in Java or C# or C/C++ and not worry about the underpinnings of things, but when it comes to the performance-sensitive bottlenecks of modern systems, out come the assemblers.  For a “real” OS there’s always more of it than you think, and for modern systems things can get pretty complex.  We had a decent macro-assembler for the Apple Newton that made the kernel development tons easier, and I’ve seen other systems since then that have more assembly language than you’d expect.  Assembly is still relevant and it makes sense to have decent tools at the bottom.  [I get a chuckle out of people questioning whether C is still relevant . . . little do they know…]


23 Responses to Some assembly required

  1. DGentry says:

    Absolutely! I still do some work in assembly, though for MIPS rather than 6502 or 68k. I also occasionally disassemble the output of the C compiler, to see where the compiler had a difficult time and try to restructure that code to get better results. There are places where squeezing that last few percent out is worth a few hours of effort.

  2. PeterI says:

    Hmm, I wrote my own assembler for the Apple II ‘cos I was too poor to afford the proper Apple one. I used the mini-assembler in the Integer BASIC ROM to bootstrap it and, IIRC, Applewriter I as an editor to type in source code. I wrote a shoot ’em up (that got published on a magazine disk), Asteroids and a Forth.

    I seem to remember code crunching the games so they’d fit on the first track of the disk in which case the Apple boot rom would load the whole lot into memory without any OS.

    The Forth was interesting as I did it using Loeliger’s Threaded Interpretive Languages book and a copy of Leo Brodie’s Starting Forth, without having a working Forth to compare to. I seem to remember it ran pretty well and I got as far as a working Breakout clone, complete with a disk block system and editor that used a sparse file under ProDOS.

    In the end I paid for the ProDOS developers kit, which came with the official Apple assembler and a proper debugger with breakpoints, and that made life a lot easier.

  3. landon says:

    @PeterI: I had a FORTH working, too, based on the Digital Group “Convers” system. Well, the manual for Convers, which I had somehow acquired a copy of. There was also a DDJ article on FORTH that was inspirational.

    I think that everyone should write one. It has some great ideas.

    Loeliger’s book is pretty neat (as was the BYTE issue devoted to FORTH), but I read that after I was pretty much done (had a visual editor and some other stuff working, then I discovered C and no longer cared…).

  4. John says:

    We have an amazingly hard time finding good C/C++ programmers where I work. Right here in the heart of Microsoftland and we can’t get good ones, and we pay a premium. Seems most programmers think that C# is the future and that it replaces C++. I can only imagine what they think of assembly, if they even know about it. 🙂

  5. rick says:

    Wow, I am just enamored of historic talk about computers. I just read ‘The Dream Machine’, the story of Licklider. It was amazing, everything from Ethernet to Xerox PARC.

    Can anyone recommend another interesting historic computer book?

  6. landon says:


    Kidder, _Soul of a New Machine_

    (?), _The Supermen_ (mostly about Seymour Cray)

    (?), _A Few Good Men from Univac_ (ditto)

    Pugh, various books on the IBM 360 systems

    The ACM books from the History of Programming Language conferences (HOPL III should be out soon)

    … and many others.

  7. Dru Nelson says:

    hey landon,

    GREAT STUFF. Keep it coming! Occasionally, I do miss that
    era of computing.

    Quick question: what did you run Emacs on if you used the Synapse
    Assembler on one Atari 800 and had the slave with a debugger on
    the other 800?

  8. landon says:

    @Dru: I wrote a very, very small version of Emacs (maybe a couple dozen commands, just the basics, but enough to be productive) that patched itself into the SynAssembler command handler. Type ‘EDI’ and you’d be editing in a line-number stripped environment; ^X^C would return to the SynAssembler command processor. I think I wired up ^X^M to do an assembly and download.

    The debugger was in a ROM cartridge on the target machine. No source level debugging, but hardly necessary in assembly language. 🙂

    I thought the setup was slicker than hell. But the owner of Synapse wasn’t interested, and a year or two later it was irrelevant.

  9. e-tate says:

    > [I get a chuckle out of people questioning whether C is still relevant . . . little do they know…]

    @OP & John

    Sure, C and ASM are sometimes good tools for getting performance out of the hardware, when you really need to be accessing hardware directly. But really, would you not rather program (for the large % of the time where development time matters more than efficiency) in a language which a) lets you write 2 lines of code to produce efficient *enough* assembly (that you may later think about optimizing at the low level), b) doesn’t allow you to make costly mistakes (evil side effects, eeevil :) ), and c) forces you to write coherent programs that are maintainable by x and Lemmy Tweakit?

  10. landon says:

    @e-tate: In principle I agree. I’d love it if systems factored out as 99.99% high level stuff, with just a couple of lines of asm. In practice this doesn’t seem to happen.

    On the PDP-11 the uses of asm were “trivial” and you could bounce back into C pretty quickly. On the PowerPC (for instance) you have to write all kinds of stuff that truly needs to be as small and as fast as possible (e.g., TLB miss handlers) because they impact performance system-wide. Interrupt handlers, deferral mechanisms (you *do* want to be able to handle device interrupts very, very quickly), system timers, etc. — it’s a larger list than you’d think.

    The Newton had maybe 3,000 lines of assembly (it might have been more). A fair amount of code was spent speeding up inter-process communication and the other stuff that lives between the C-level kernel and the gnarly hardware-level.

  11. John says:

    Having to support legacy OSes has limited our ability to adopt certain technologies (C# for instance). Only now do we have the luxury of not supporting NT 4.0! (yeah finally!!) Unfortunately for me we have a product based on Windows PE where the .Net framework is not supported and all of our code has to work on there as well.

  12. Dru Nelson says:

    @e-tate: sure, but does C even move the needle that much anymore in terms of programmer productivity? A good macro assembler and dev env is almost as productive (almost). Where C could really shine (and where assembler starts to fall short) is in the libraries area. They have such limited support for the interesting data structures that people need (hashtables, lists, etc.). To me, this is still an interesting area to keep an eye on.

  13. landon says:

    @Dru: I’d much rather write in C, even K&R C, than the best macro assembler in the world. I prefer C++ to C (well, a safe subset of C++ that doesn’t include MI, exceptions, operator overloading, and only a smattering of the useful things in the STL). I much, much prefer C# any time I can get away with it. (I did a bunch of Java eight or nine years ago and hated it).

    So that’s the hierarchy of my prejudices.

    I don’t do web programming at all. JavaScript? I’m a dinosaur. No idea where to start.

  14. Brad says:

    Off Topic – I relocated to a new building at work recently, and as I was heading towards my new office (lab) last Friday, I happened to glance at a door, and saw your name on it. The name didn’t click right away, but I knew I had seen it somewhere before, and I just realized it was you. Small world!

    On Topic – Just out of curiosity, do you recall which implementation/version of Emacs you used with SynAssembler?

    re: 6502 assembly
    There’s a fantastic 6502 simulator at “”, which is an offshoot of the XScreenSaver package for X-Windows. It runs graphical demos rather well, but I’m not certain how fast it can run anything complex.

  15. Thomas says:

    Thanks for another post that took me right back to my programming roots… I haven’t done any assembler programming in about ten years, but back in the late 80s and early 90s I did a lot of 68K assembly on Atari STs, Macs and before that on Sinclair QLs.

    I had a summer job around 1990 where I helped a gang of Hungarian coders build a 68030-based multitasking OS in assembler (it was a very complex system that had to fit with A LOT of other hardware onto a PCI card) with a row of shiny Mega-STs as development environment. It was three months of the best coding ever! Those guys were amazing engineers and for months we did long days of nothing but 68K assembler… Good times…

    Nowadays it’s all JavaScript all the time for me and while I enjoy that too, it sure doesn’t feel as direct and fresh as running assembly straight on a chip.

  16. xor ax,ax says:

    Yeah, I miss having my grubby fingers on the metal…. It’s a major accomplishment to get even the most trivial of tasks done in asm. Anyway, that’s what it felt like when we were ~10.

    After three layers of garbage collection and VMs, how much performance has a 2 GHz dual core over an 80 MHz 386? Why do I need more than two freaking gigabytes of RAM just to run a browser and a word processor? I’m talking to you, Vista! Don’t be like that fat smelly guest that won’t let you sit on your own couch and surprises you in the morning with an empty fridge!

  17. Great stuff. Cross-assembly happened on other platforms too. There was at least one Amiga assembler that had a 6502 mode, and some of the best Commodore 8-bit software was written in that. I can see why; there were several pretty good C-64/128 assemblers, but none had a prayer of keeping up with a 68000, and a 68K-based machine had a lot more horsepower to give you a better editor.

    I gave up programming soon after college, but I get frustrated with most of the developers I have to work with. Everything’s abstracted so much these days, many developers know very little about computer internals. In the 8-bit days, the programmers knew the machine better than anyone, including the engineers who designed it. Today? My worst horror story is a developer wanting me to check the firewall because his program was giving a weird error communicating with

  18. Jay Craswell says:

    “Hardware Abstraction Layers” is one of the reasons I lost heart with writing software. That and the argument that high level compilers are so much more effective (or readable) than low level languages. I think an analogy with today’s printers fits. I think we all used to get excited about claims that a printer would do 8 (or whatever) pages per minute. Right? Now we know that this is BS because during the first minute most chug around doing god knows what until the first page is printed. And even if you discount the startup fiddling, the only way it prints 8 pages per minute is if the text is one line that says “Sucker…..”

    I stuck with micro controller development and tried to stay isolated in my own world. To save my sanity I tossed all (but one) of the machines running Windows and went to a small distro of Linux. All the things I hated about Unix in the old days of Xenix are now good features. It took 14 minutes to start up when the “fast” machine was an 80286 clocked at 6 (or was it 12?) MHz. I think it needed 4 Megabytes of RAM to be “happy.” Now the most awful piece of “worthless” computer makes that lash-up look like it was used by Fred Flintstone. Thank god some of the folks who maintain Unix/Linux have not been “upgrading” it like Microsoft. Although some of the window “managers” are good attempts.

    A couple of nights ago I saw a friend from the very old days who is still plugging along in the biz and he showed me some source in C (or was it C++?) that was actually readable!?! I think the quality of the code (source) has sunk down so low that “crap” is the standard. And so many people seem to cut and paste code from the web without any clue what it’s doing or how it works.

    A few years ago the guy I just mentioned wrote some test routines to measure how many lines per second (arcs, filled areas, etc.) things run at, at various levels of “abstraction,” and it’s unbelievable. He wanted me to explain (like I could?) how things could be so bad. We ended up talking to a “game” programmer who said that Microsoft had some huge graphics developer library that we needed. The only call he used in this giganto bunch of junk was “At what address is the start of screen memory?” Arghhhhhhh……..

    Why people continue to buy faster and faster hardware to maintain the ability to peck out a letter or browse the web is beyond me.

    Oh well. This all sounds like sour grapes. I was tasked to write a program that would work with (only) X11 and use the “accepted” high level language. If nothing else X11 has the right credo. And I’ve seen some source that doesn’t make me hurl so life is good!

    Interrupt what am dat?

  19. Chris says:

    I remember being shocked at how poorly the MPW (Mac) 68k assembler performed. And it was considered fast by people at Apple who’d used other assemblers.

    (I’m still waiting for the HOPL-III proceedings to appear in dead-tree form. I have the HOPL-I and -II books.)

    • landon says:

      The sad thing is that the engineer who wrote the MPW assembler had filled it with micro-optimizations. Like, adding a new keyword involved lots of bit-level magic, and nobody but him could do it.

      Two lessons there:

      (A) Measure your cleverness to see if you’re actually being clever.

      (B) Never, ever be in a situation where people in your group are afraid of your code. Corollary: Never be afraid of someone’s code.

      Both of those behaviors are slow death in a big organization — fast death in a small one, or one that takes people skills seriously.

  20. Speaking of MadMac, the person who took the MadMac code that you posted on your blog and ran with it added an .incbin directive (among other things) that read from the included file one byte at a time. I couldn’t believe my eyes… This caused a fairly small project (~5K lines of code) with modest assets to assemble in around five or so seconds. Once I patched that directive to read the file and stuff it into the assembly all in one go, the assembly time went down to somewhere in the neighborhood of two tenths of a second. It’s wicked fast. 🙂
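For readers wondering what “all in one go” might look like, here is a rough sketch in C. It is not the actual patch, which isn’t shown here; `read_whole_file` is a hypothetical helper name, and a real .incbin handler would copy the result into the assembler’s output section rather than hand back a buffer. The idea is simply to pull the file in with a handful of large `fread` calls instead of one call per byte.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an entire binary file into a malloc'd
 * buffer using large reads. Returns NULL on failure; on success the
 * caller owns the buffer and *out_len holds the byte count. */
static unsigned char *read_whole_file(const char *path, size_t *out_len)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    size_t cap = 4096, len = 0;
    unsigned char *buf = malloc(cap);
    if (!buf) {
        fclose(f);
        return NULL;
    }

    for (;;) {
        /* Ask for as much as the buffer can still hold. */
        size_t n = fread(buf + len, 1, cap - len, f);
        len += n;
        if (n == 0)
            break;                  /* end of file, or a read error */
        if (len == cap) {           /* buffer full: double it */
            size_t new_cap = cap * 2;
            unsigned char *tmp = realloc(buf, new_cap);
            if (!tmp) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
    }

    if (ferror(f)) {                /* distinguish error from EOF */
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *out_len = len;
    return buf;
}
```

Growing the buffer as it fills avoids asking the C library for the file size up front, at the cost of a few `realloc` calls on larger files.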

    I found other fun stuff in there, like putting pointers (!) in the token stream that prevented it from being usable on 64-bit platforms, but will refrain from making accusations since I don’t know who the guilty party is. 😉
