I’m a contractor at a rocket launch service provider. The final build of the ground control software is compiled and deployed to the launch pad with debug flags enabled because of a “fly like you test” mandate.
Millions of dollars and tons of time invested by brilliant people are riding on rockets that are launched using software with debug flags because of an “if it ain’t broke don’t fix it” mentality and archaic test strategies.
Perfect, now you just have to wrap your program inside a debugger in production!
We test AND develop in production. Get on my level.
One of our customers does that. It happened multiple times already that one dev fixed an issue in production, and the next regular deployment overwrote everything.
But fortunately, it’s just critical infrastructure and nothing important.
When I left my last job they were using the zip file method for version control and one creative developer managed to link two versions of libc at the same time.
Software is so useful that the standard for utility is extremely low.
Heisenbug. Nasty buggers, especially in my domain: Embedded Engineering. When you are in the debugger, the whole processor is stopped, missing tons of data coming in, missing interrupts, getting network timeouts, etc. More often than not, resuming makes no sense, and you have to get straight to reboot.
“You don’t debug embedded” ~my brother, who’s been working in embedded for almost 15 years
Someone has a compiler if statement left somewhere in their code (… probably)
Aren’t those almost always race condition bugs? The debugger slows execution, so the bug won’t appear when debugging.
Turned out that the bug occurred randomly. On the first tries I just had the “luck” that it only happened when the breakpoints were on.
Fixed it by now btw.
someone’s not sharing the actual root cause.
I’m new to Go and wanted to copy some text data from a stream into the output stream of the HTTP response. I was copying the data to and from a []byte with a single Read() and Write() call and expected everything to be copied, as the buffer is always the size of the whole data. Turns out Read() sometimes fills the whole buffer and sometimes doesn’t.
Now I’m using io.Copy().
Note that this isn’t specific to Go. Reading from stream-like data, be it TCP connections, files or whatever, always comes with the risk that not all data is present in the local buffer yet. The vast majority of read operations return the number of bytes that could be read, and you should call them in a loop. Same for write operations actually, if you’re writing to a stream-like object, as the write buffers may be smaller than what you’re trying to write.
This is where printf debugging really shines, ironically.
Honestly, this is why I tell developers that work with/for me to build in logging, day one. Not only will you always have clarity in every environment, but you won’t run into cases where adding logging later makes races/deadlocks “go away mysteriously.” A lot of the time, attaching a debugger to stuff in production isn’t going to fly, so “printf debugging” like this is truly your best bet.
To do this right, look into logging modules/libraries that support filtering, lazy evaluation, contexts, and JSON output for perfect SIEM compatibility (enterprise stuff like Splunk or ELK).
Sounds like a critical race condition or bad memory access (the latter only in languages with pointers).
Since it’s HTTP(S) and judging by the average developer experience in the domain of multi-threading I’ve seen even for people doing stuff that naturally tends to involve multiple threads (such as networked access by multiple simultaneous clients), my bet is the former.
PS: Yeah, I know it’s a joke, but I made the serious point anyways because it might be useful for somebody.
When I write APIs I like to set endpoints to return all status codes. This way, no matter what you’re doing, you can always be confident you’re getting the expected status code.
Heisenbugs are the worst. My condolences for being tasked with diagnosing one.
Clearly you should just ship it with the debugger and call it a day
The term is Heisenbug
Just run your prod env in debug mode! Problem solved.
Haha, heisenbugs, always a fun time.
More seriously, I’d be surprised if this wasn’t a classic race condition
It wasn’t :D
See my comments below.
This happens with so many scripts I’ve tried to debug with strace, because strace requires running as root or with sudo, which elevates the niceness of the process, which prevents certain errors from occurring. So when the script is run with root permissions it runs flawlessly without bugs, and you sit wondering wtf.









