Chasing a “Heisenbug”

I have a crash that occurs at somewhat random times. It may run for 20 minutes or it may crash within a few seconds. The blink code on my LED indicates a Prefetch Abort. In other words, the instruction it went to execute was bogus. When I break in the debugger, the registers only point to the bogus instruction, and its address is well outside the valid RAM or ROM address spaces. I suspect that a jump is using a bogus address and jumping into the weeds.

But every time I try to debug this, the behavior changes. I added a bit of code to sanity-check an index into a jump table (a good possibility for the cause), and now there are no more crashes. Hence the Heisenbug.
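For reference, here is a minimal sketch of the kind of sanity check I mean. It is not my actual code: the handler names, the `command` enum, and the `bad_index_count` / `last_bad_index` variables are made up for illustration. The idea is to reject an out-of-range index and also record it, so that a bad index leaves evidence behind instead of silently disappearing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical commands and handlers; the real table is not shown here. */
enum command { CMD_IDLE, CMD_START, CMD_STOP, CMD_COUNT };

static int handle_idle(void)  { return 0; }
static int handle_start(void) { return 1; }
static int handle_stop(void)  { return 2; }

typedef int (*handler_fn)(void);

static const handler_fn jump_table[] = { handle_idle, handle_start, handle_stop };
#define JUMP_TABLE_LEN (sizeof jump_table / sizeof jump_table[0])

/* Compile-time check that the table and the enum stay in sync (C11). */
static_assert(sizeof jump_table / sizeof jump_table[0] == CMD_COUNT,
              "jump table and command enum are out of sync");

/* Evidence left behind when an index is rejected; inspect in the debugger. */
static volatile uint32_t bad_index_count;
static volatile uint32_t last_bad_index;

/* Calls the handler for `index`; returns -1 if the index is out of range. */
int dispatch(uint32_t index)
{
    if (index >= JUMP_TABLE_LEN) {
        bad_index_count++;
        last_bad_index = index;
        return -1;
    }
    return jump_table[index]();
}
```

With something like this in place, a breakpoint on the `bad_index_count++` line (or a look at the counter after a long run) would show whether the check is actually catching bad indices, or whether the crash went away only because the extra code changed the timing or layout.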

When it did crash, the bogus address of the instruction was only ever one of two values: 0x62616E58 or 0x79746976. Neither of these values appears in RAM or ROM. They make no sense as ASCII strings (banX or ytiv). They do not ring any bells as data. They certainly are not addresses.
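The ASCII readings above take the bytes most significant first. If the core is running little-endian (an assumption here; it depends on how the part is configured), text sitting in memory would load into a register with its first character in the least significant byte, so it is worth reading the bytes in the other order too. A small host-side sketch, with function names made up for illustration, that decodes a 32-bit word both ways:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Replace non-printable bytes with '.' so the result is always readable. */
static char printable(uint8_t b)
{
    return (b >= 0x20 && b <= 0x7E) ? (char)b : '.';
}

/* Decode `word` as four characters, most significant byte first. */
void word_to_ascii_msb_first(uint32_t word, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = printable((uint8_t)(word >> (24 - 8 * i)));
    out[4] = '\0';
}

/* Decode `word` as four characters, least significant byte first
   (the order the bytes would sit in memory on a little-endian core). */
void word_to_ascii_lsb_first(uint32_t word, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = printable((uint8_t)(word >> (8 * i)));
    out[4] = '\0';
}

/* Print both readings of one word on a single line. */
void print_both_orders(uint32_t word)
{
    char msb[5], lsb[5];
    word_to_ascii_msb_first(word, msb);
    word_to_ascii_lsb_first(word, lsb);
    printf("0x%08" PRIX32 "  msb-first \"%s\"  lsb-first \"%s\"\n", word, msb, lsb);
}
```

Calling `print_both_orders` on the two addresses gives "banX" and "Xnab" for 0x62616E58, and "ytiv" and "vity" for 0x79746976. All eight bytes are printable characters either way, which would at least be consistent with text landing on top of a pointer, though that is only a guess.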

Do I let this slide, or do I spend more time tracking a bug I can no longer reproduce? Argh.
