On Fri, 09 Nov 2007 09:34:25 +0000, Invisible wrote:
> Jim Henderson wrote:
>> On Thu, 08 Nov 2007 18:20:39 +0000, Orchid XP v7 wrote:
>>
>>> Nicolas Alvarez wrote:
>>>> Invisible escribió:
>>>>> (I never really understood why programs produce core dumps. I mean,
>>>>> seriously. What chance is there of anybody *ever* deducing anything
>>>>> useful from this data? 10^-78?)
>>>> The original programmer with access to the sourcecode *can* deduce
>>>> data from it.
>>> I seriously doubt it...
>>
>> I've done it, several times, even with things I'm not the original
>> programmer for but have access to the source code.
>
> Really?
Yes, really. :-)
> Well, it still sounds absurdly improbable to me, but I'll take your word
> for it.
Part of the contents of a program in memory is a symbol table, which
basically tells you what data is stored where. After all, the program has
to know where the value for a variable called "a" is stored, which means
that location can be figured out. You don't need a full system core for
this. (That used to be the way it was done; in my experience working with
NetWare, that *was* the way it was done - the entire contents of memory
were dumped along with the current state of the system.) EIP tells you
where the program is currently executing, and the various registers tell
you the state of the machine - such as whether there was an overflow, a
PFPE, or some other sort of issue that caused the crash.
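To make the symbol table idea a bit more concrete, here's a rough Python
sketch of turning a raw address into "name+offset". The addresses and
function names are made up for illustration; a real debugger reads them
from the executable or a separate symbol file, and handles a lot more
cases than this does.

```python
import bisect

# Hypothetical symbol table: (start address, name), sorted by address.
SYMBOLS = [
    (0x08048000, "_start"),
    (0x08048100, "main"),
    (0x08048200, "parse_config"),
    (0x08048300, "read_packet"),
]


def symbolize(addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr."""
    starts = [start for start, _ in SYMBOLS]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return hex(addr)  # below the first symbol: nothing to resolve
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start:#x}"


print(symbolize(0x08048234))  # prints "parse_config+0x34"
```

Without the table, all you have is the bare number, which is the
situation you're in when the symbol info is missing.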
So what you do, working at a machine level, is start with where EIP is
and work backwards. You look at the stack backtrace to see what
functions called what functions, and you can figure out what code path
the program had taken.
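Working backwards from EIP can be sketched roughly like this in Python.
It assumes the classic 32-bit x86 frame layout (saved EBP at [ebp], the
return address at [ebp + 4]) and uses a small made-up dictionary in place
of the dumped stack memory; real code doesn't always keep frame pointers,
so treat it as an illustration rather than how any particular debugger
does it.

```python
def backtrace(memory, eip, ebp, max_frames=64):
    """Walk a saved-frame-pointer chain in a memory snapshot.

    Assumes [ebp] holds the caller's saved ebp and [ebp + 4] holds the
    return address into the caller.
    """
    frames = [eip]  # innermost frame: where the crash happened
    while ebp != 0 and len(frames) < max_frames:
        return_addr = memory.get(ebp + 4)
        saved_ebp = memory.get(ebp)
        if return_addr is None or saved_ebp is None:
            break  # chain runs off the snapshot; stop rather than guess
        frames.append(return_addr)
        ebp = saved_ebp
    return frames


# Made-up snapshot: address -> 32-bit word, standing in for the dump.
memory = {
    0xBFFF0010: 0xBFFF0030, 0xBFFF0014: 0x08048250,  # innermost frame
    0xBFFF0030: 0xBFFF0050, 0xBFFF0034: 0x08048120,  # its caller
    0xBFFF0050: 0x00000000, 0xBFFF0054: 0x08048010,  # outermost frame
}

for addr in backtrace(memory, eip=0x08048310, ebp=0xBFFF0010):
    print(hex(addr))
```

Each address that comes out is a point in some function, innermost
first, which is the code path the program took to get where it died.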
It's a hell of a lot easier if you have symbolic debug information
available to you, but it's not impossible if you don't - just takes a bit
more time to diagnose. Sometimes a *lot* more time.
From a code standpoint, Borland C++ (years ago) came with a source-level
debugger that let you single-step through a program. IIRC, there was a
way to load the data from a crashed program into it as well, but it's
been a long time since I did that.
At the moment, the only core file on my system is from novfsd (the Novell
Client for Linux daemon), which I don't have symbol info for, but even
so, loading the core up in gdb tells me that it suffered a segmentation
fault. With a symbol table, gdb could pinpoint exactly where the daemon
crashed, and from there the developers could code a fix for it. As it
is, the backtrace tells me what values are on the stack, but not the name
of the symbol at each address. That's what gdb's --symbols parameter is
for - to tell it to read in the symbol table.
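If you wanted to script that, a small Python helper could put the gdb
command line together. This is only a sketch: the file names in the
example are placeholders I made up (I don't have a symbol file for
novfsd), and the function just builds the argument list rather than
running anything.

```python
def gdb_backtrace_command(executable, core, symbol_file=None):
    """Build an argv list asking gdb for a backtrace of a core file."""
    argv = ["gdb", "--batch"]
    if symbol_file is not None:
        # Read the symbol table from a separate file.
        argv.append(f"--symbols={symbol_file}")
    argv += [f"--exec={executable}", f"--core={core}", "-ex", "bt"]
    return argv


# Placeholder paths, for illustration only.
print(" ".join(gdb_backtrace_command("./novfsd", "./core", "./novfsd.debug")))
```

Handing the resulting list to subprocess.run() would run the backtrace
non-interactively, assuming gdb is installed and the files exist.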
> Now, how about that Windoze habit of saying "The program has experienced
> an error and will be shut down. Do you want to send debugging
> information?" What do you estimate the chances are that *anybody* will
> do anything at all with the data thus sent? ;-)
Knowing some of the people who do the analysis on that type of data, I'd
say fairly good, actually. It starts with statistical analysis of the
reports coming in, and the things that happen most frequently are likely
candidates for a fix to be written. Of course, different vendors do this
in different ways.
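The statistical side can be as simple as grouping reports by the top few
frames of the stack and counting them. Here's a toy Python sketch; the
report format and function names are invented for the example, and I'm
not claiming this is how any particular vendor does it.

```python
from collections import Counter


def bucket_reports(reports, depth=2):
    """Count crash reports by their top `depth` stack frames."""
    return Counter(tuple(r["stack"][:depth]) for r in reports)


# Made-up reports; "stack" lists function names, innermost first.
reports = [
    {"id": 1, "stack": ["read_packet", "parse_config", "main"]},
    {"id": 2, "stack": ["read_packet", "parse_config", "main"]},
    {"id": 3, "stack": ["free_buffer", "shutdown", "main"]},
]

for signature, count in bucket_reports(reports).most_common():
    print(count, " <- ".join(signature))
```

The bucket with the highest count is the crash most people are hitting,
so that's the one that tends to get looked at first.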
I watched one engineer do live debugging of a client issue on a Windows
system; watching someone use a tool like that who knows what they're
doing is a very interesting thing, because they can pull information
about what's going on at any given point in the process. Doing it live
or doing it from a core file is pretty similar, except that with a core
file the system isn't changing; it's actually the *easier* way to debug,
because you're working from a snapshot of a program at a point in time
instead of watching things change.
It's not much different, procedurally, than what CSIs go through in
trying to reconstruct a crime scene. They arrive and things have already
happened, and they have to piece together what transpired ahead of time
in order to build a case. (Of course, CSI is more difficult because
people don't act logically; computers do, and as such, they're quite a
bit more predictable.)
Though arguably watching things change on a live system can provide more
insight into how you got to a point.
Jim