kernel oopses

Robert Latham robl at
Tue Jan 29 10:39:33 EST 2002

On Mon, Jan 21, 2002 at 06:13:44PM -0800, Martin Siegert wrote:
> This is somewhat off topic - sorry for that.

it's a great topic for clusters.  in an ideal world, the kernel never
oopses, but when you have N kernels and possibly dodgy hardware, it

i get frustrated with this list because topics like Martin's get
ignored, while topics like cooling with LN2, game console clusters
and anything athlon get multi-day discussions.

[snip problem report ]

> The first thing I would like to do is to log the oops message. Right now
> it goes to the console only - it does not appear in the log files
> although syslog sends everything of severity *.info to /var/log/messages.

i guess you've read Documentation/oops-tracing.txt, but if not, it's
a good start.

depending on where the panic happens, the part of the kernel that
would normally write that oops out to disk doesn't run.  

So you've got a few options:

. typing off the screen:  sucks.  a lot.  and is highly error prone.
  and the kernel console blanking mechanism might kick in ( and since
  the kernel has panicked, it won't listen for input signals and unblank
  itself ) but if you've got no other option...  
  ( one time a guy took a picture of the oops with a digital camera and
  sent that to me. that was fun.  I don't have any character recognition
  software, but if someone knows of a linux OCR tool that won't mind a
  screenful of hex, i'd like to hear about it )

. serial console:  not bad.  if it's just one machine, you can pass
  parameters to your kernel and capture all kernel messages over the
  serial port.  Documentation/serial-console.txt has all the info you
  need.

. netconsole:
  like a serial console, but using your network device instead of a
  serial device.  It's a kernel patch and a convenience script for the
  sender and a userspace tool for the receiver to display the messages.
  Patching a kernel and setting up yet another tool might be a bit much,
  but man is it cool to see it work :>  

. patch your kernel to support "dump log to swapfile" or "dump log to
  disk".  I haven't set something like this up, but always meant to
  try it out...
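
To make the receiving side of the netconsole idea concrete, here is a
rough sketch of a listener that appends whatever arrives to a file.  It
is not the userspace tool that ships with the patch.  It assumes the
sender emits each kernel message as a plain-text UDP datagram, and the
port number (6666), the log file name and the function names are all
made up for illustration.

```python
import socket
import time


def format_record(sender_ip, payload):
    """Return one log line: local timestamp, sender address, message text."""
    # Kernel messages are expected to be ASCII; replace anything else so a
    # corrupted datagram cannot crash the logger.
    text = payload.decode("ascii", errors="replace").rstrip("\r\n")
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"{stamp} {sender_ip} {text}\n"


def capture(log_path, host="0.0.0.0", port=6666, max_datagrams=None):
    """Append every datagram received on (host, port) to log_path.

    The port is an assumption, not something the patch is known to use.
    max_datagrams=None means run until interrupted.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    seen = 0
    try:
        with open(log_path, "a") as log:
            while max_datagrams is None or seen < max_datagrams:
                payload, (sender_ip, _sender_port) = sock.recvfrom(4096)
                log.write(format_record(sender_ip, payload))
                # Flush at once: the sender may die right after this message.
                log.flush()
                seen += 1
    finally:
        sock.close()
```

Calling capture("oops-capture.log") on a machine that stays up gives
you one file with messages from all N nodes, each tagged with the node
that sent it.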

Basically the name of the game is to get that oops into a form you can
feed to ksymoops, then hope the backtrace it prints out gives you a
clue.  ( like "oh, the last thing it called was do_scsi_service... maybe
i have a dodgy scsi controller" ).
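
For anyone wondering what turning an address into a function name
involves, here is a minimal sketch of the nearest-symbol lookup against
a System.map style listing ( lines of "address type name" ).  It is not
ksymoops, which does a good deal more ( module symbols, decoding the
code bytes ), and the function names below are made up for
illustration.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the closest symbol at or below address.

    Returns None when the address lies below every known symbol.
    """
    addresses = [addr for addr, _name in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return f"{name}+{address - base:#x}"
```

With a three-line map whose addresses are invented for the example
( "c0100000 T _stext", "c0105000 T do_scsi_service",
"c0106000 T schedule" ), resolve(symbols, 0xc0105024) comes back as
"do_scsi_service+0x24".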

Anybody else know of good ways ( even funny bad ways might be
entertaining) to capture an oops?


Rob Latham
                                             A215 0178 EA2D B059 8CDF  
                                             B29D F333 664A 4280 315B
Beowulf mailing list, Beowulf at