Post History

Q&A Tools for debugging coredumps

posted 6mo ago by Invisible Mender‭

Answer
#1: Initial revision by Invisible Mender · 2024-05-08T07:33:38Z (6 months ago)
It's rare that you find a software tool that does exactly what you want in the way that you want.  Hence the popularity of scripting languages among software developers for doing all sorts of little tasks.  I've had good success using [Perl](https://www.perl.org/docs.html) to drive [GDB](https://en.m.wikipedia.org/wiki/GNU_Debugger).  

First I'll explain the context a little. The company had a high-reliability data delivery network of legacy C components. These had a simple resilience arrangement of primary and secondary instances, with automatic switchover to the secondary on failure. Most software failures were intermittent and random, and were usually timing-dependent. The network was mission-critical for the company, because it ensured a steady subscription income, with an SLA of [five-nines](https://en.m.wikipedia.org/wiki/High_availability#Percentage_calculation).

My job was analysing the steady stream of core dumps that occurred and fixing the bugs behind them. The first few that I looked at were some 18 to 20 MB in size, after which I stopped looking at the size. While there was no particular time pressure (management could already see that I knew what I was doing), spending a lot of time analysing huge core dumps never looks like an important activity, so I was looking for a way to speed up dump analysis. The tool I chose was Perl with the Expect module, supported by modules such as IO::Stty. I chose Perl because I was familiar with it, but any scripting language with an Expect-like module should be suitable, such as [Python](https://www.python.org/) with [Pexpect](https://pypi.org/project/pexpect/), [Tcl](https://en.m.wikipedia.org/wiki/Tcl), [Lua](https://www.lua.org/) and others.
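As a rough illustration of driving GDB from a script, the sketch below uses Python (one of the alternatives named above) and GDB's batch mode through `subprocess`, rather than an Expect-style interactive session. It is not the script described in this answer. The helper names are invented for the example, and the paths in the usage note are hypothetical.

```python
import subprocess


def gdb_batch_argv(program, core, commands):
    """Build a gdb command line that runs `commands` against a core dump."""
    argv = ["gdb", "--batch", "-nx"]  # --batch: exit when done; -nx: skip .gdbinit
    for command in commands:
        argv += ["-ex", command]      # one -ex option per GDB command
    argv += [program, core]           # executable first, then its core file
    return argv


def run_gdb(program, core, commands, timeout=60):
    """Run GDB once and return whatever it wrote to standard output."""
    completed = subprocess.run(
        gdb_batch_argv(program, core, commands),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.stdout
```

With a hypothetical executable `./server` and core file `core.1234`, `run_gdb("./server", "core.1234", ["bt"])` would return the backtrace text for the script to search or summarise.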

The control script (in Perl in my case) will never be finished because every bug is different, but you can reuse code you wrote for previous bugs, such as code for finding status table entries for a current connection or user. Scripting languages often help with easy-to-write syntax. For example, in Perl you can write

    my @varlist = qw( var1 var2 var3 );
    
for an easy-to-extend list of variables, in the language of the program you're debugging, that your script is going to report on.
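The same idea carries over to other scripting languages. Here is a minimal Python sketch that turns such a list into GDB `print` commands and splits the output back up by variable. The variable names are placeholders, the helper names are invented, and the `@@` marker is an assumption: it must not occur in GDB's own output.

```python
import re

MARKER = "@@"  # hypothetical delimiter, assumed never to appear in GDB output


def print_commands(names):
    """Return GDB commands that print each variable after a marker line."""
    commands = []
    for name in names:
        # GDB's `echo` command expands the \n escape itself.
        commands.append(f"echo {MARKER}{name}{MARKER}\\n")
        commands.append(f"print {name}")
    return commands


def parse_print_output(output):
    """Map each variable name to the text GDB emitted after its marker."""
    sections = {}
    current = None
    for line in output.splitlines():
        match = re.fullmatch(rf"{MARKER}(.+){MARKER}", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    result = {}
    for name, lines in sections.items():
        text = "\n".join(lines).strip()
        # Drop the "$N = " value-history prefix that `print` adds.
        result[name] = re.sub(r"^\$\d+ = ", "", text)
    return result
```

Marking each section by name, rather than relying on the `$1`, `$2` numbering, keeps the mapping right when a variable cannot be printed and GDB emits an error message instead of a value.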

These legacy C components each had an internal [cooperative multitasking system](https://en.m.wikipedia.org/wiki/Cooperative_multitasking) and one of the things my script eventually did was to report a stack backtrace for all current tasks.  I note this to show that the script you write will always be closely linked to the program that you're debugging.  
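Walking an application's own task table is specific to that program and is not shown here. What can be sketched generically is the step after it: turning the frame lines of a GDB `backtrace` into records a script can filter or summarise. The Python below assumes the usual `#N  [address in] function (args) at file:line` layout. The function and file names in the usage note are made up.

```python
import re

# Start of a frame line: "#0  func (" or "#1  0x... in func ("
FRAME = re.compile(
    r"^#(?P<number>\d+)\s+"
    r"(?:0x[0-9a-fA-F]+ in )?"
    r"(?P<function>\S+) \("
)
# Source location at the end of the line, when debug info provides one.
LOCATION = re.compile(r"\) at (?P<file>\S+):(?P<line>\d+)$")


def parse_backtrace(text):
    """Return one dict per frame line found in `text`."""
    frames = []
    for line in text.splitlines():
        head = FRAME.match(line)
        if not head:
            continue  # skip anything that is not a frame line
        where = LOCATION.search(line)
        frames.append({
            "number": int(head["number"]),
            "function": head["function"],
            "file": where["file"] if where else None,
            "line": int(where["line"]) if where else None,
        })
    return frames
```

Given a line such as `#1  0x000000000040123a in task_loop () at sched.c:87`, the parser yields frame number 1, function `task_loop`, file `sched.c` and line 87. Frames without source information keep `None` for the file and line.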

Having found this technique useful for dealing with core dumps, I soon found it useful for dealing with running processes as well. With networked server components like this, even in the test network, everything else won't wait while a process is stopped at a breakpoint and the programmer thinks: things will start to time out, messing up whatever test you were doing. Using the scripting language, the response to the breakpoint can easily be automated. It was even used on the live system, where the scale was hundreds or thousands of clients or connections, although I can only recall this being done with a less sensitive administrative component, the absence of which for short periods wouldn't cause any problems.
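One way to automate a breakpoint response, sketched here in Python, is to generate a GDB command file that uses a breakpoint command list: print a few expressions and resume straight away, so the process is held only briefly. This swaps in GDB's own `commands` mechanism for the Expect-driven approach described above. The location `handle_request`, the expression `conn_id` and the helper names are all hypothetical.

```python
def breakpoint_script(location, expressions):
    """Build GDB command-file text that logs `expressions` at `location`."""
    lines = [
        f"break {location}",
        "commands",   # commands to run each time the breakpoint is hit
        "silent",     # suppress GDB's usual stop announcement
    ]
    lines += [f"print {expr}" for expr in expressions]
    lines += [
        "continue",   # resume immediately after printing
        "end",
        "continue",   # resume the process, which GDB stops on attach
    ]
    return "\n".join(lines) + "\n"


def attach_argv(pid, script_path):
    """Build a gdb command line that attaches to `pid` and runs the script."""
    return ["gdb", "-p", str(pid), "-x", script_path]
```

The text from `breakpoint_script("handle_request", ["conn_id"])` would be written to a file and that file passed to `attach_argv`. The controlling script then reads GDB's output instead of a person typing at the prompt.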

The script I wrote existed only on the company's own systems, so it is not accessible, but I've often thought it might be useful to implement such a script for some open source software that can run at a large scale, such as [Nginx](https://en.m.wikipedia.org/wiki/Nginx), to demonstrate the utility of the technique.