A classic bug

We were at lunch, and I was bitching to my cow-orkers about the USB device I was trying to get running. “It’s flaky as hell,” I said.  “One moment it’s working, the next it’s in the weeds and not transferring data at all.”

It was true; the system would transfer data for a while, then things would stop flowing, transfers would start timing out, and eventually the device would stop enumerating and nothing would work. Then, with no changes, it would start working again.

“It was working great when I got in this morning,” I continued, “but now…”

I’d been twiddling code and writing tests for a week, but hadn’t found the root cause of the problem. One difficulty was that both the hardware and the software were new. Another was that there were a bunch of barely documented “analog” registers that controlled things like drive current and various transistor-level settings; mucking with these seemed to help, but not always, and what they really controlled was a mystery.

I thought some more. It had worked yesterday morning, too. Then again, after lunch. Then there had been a meeting, and —

“I know what it is,” I said.

---

Back at my desk, I started up my tests. They were working. I left the unit running, went to the kitchen and drank a can of apple juice. I kept the can.

When I returned to my cube, the unit had gone into flake mode again. I used my pocket knife to cut the top off the can and pushed out the dent in the bottom of it with a pencil eraser. I added some insulating tape around the bottom’s rim, and then perched the can on top of the chip I was working with.

I filled the can with ice. Ten seconds later the device enumerated and my tests started working again.

I took the can off: about thirty seconds later the device failed. Can on: working. Can off: busted.

I’m a software guy, but I know a temperature problem when I see one.

---

“The manufacturer says they know what the problem is, and that they’ll fix it in the next chip revision.”

“That only cost me a week.”

Oh well. 🙂

This is a classic problem in bring-up. I’m embarrassed that I hadn’t thought of it sooner.

3 Responses to A classic bug

  1. FeepingCreature says:

    Nice!

    I used to have a really shitty ISP-provided router with no fan at all; it liked to overheat and drop connections when put under any kind of real load. The solution: two garbage bags and a bucket of water. The sad thing is that it worked.

  2. Steve says:

    It’s easier to just leave the juice in the can and balance the full can on top. I kept a knackered router running for months with this technique. Eventually got the router replaced when someone important drank a hot can of Coke.
