Thursday 16 November 2017

How to panic a guest domain in Solaris

I recently came across a Solaris 10 guest domain in a hung state.
I connected to its console from the primary domain, but there was no login prompt or any other output.

The server responded to ping, but I could not log in to it over ssh.

We therefore decided to reboot the guest domain, but we wanted a crash dump to be generated first so that it could be shared with Oracle support for further analysis.

We decided to induce a kernel panic in the guest domain to ensure the generation of a crash dump on system restart.

The command used to accomplish this is ldm panic-domain.

[sahil@primary-domain-p:~] $ sudo ldm panic-domain test-domain-g
[sahil@primary-domain-p:~] $ sudo ldm list test-domain-g
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
test-domain-g    active     -t----  5002    64    158G     100%  100%  157d 1h
[sahil@primary-domain-p:~] $ sudo console test-domain-g
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "test-domain-g" in group "test-domain-g" ....
Press ~? for control options ..
 6:21 100% done
100% done: 1523171 pages dumped, dump succeeded
rebooting...
Resetting...
NOTICE: Entering OpenBoot.
NOTICE: Fetching Guest MD from HV.
NOTICE: Starting additional cpus.
NOTICE: Initializing LDC services.
NOTICE: Probing PCI devices.
NOTICE: Finished PCI probing.


The ldm panic-domain command ensured that a crash dump was generated when the guest domain underwent a reboot.
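
If you want to check the domain's state from a script while the dump is being written, the FLAGS column of ldm list is the field to watch. In the listing above it reads -t----, and as far as I understand the ldm man page, a t in the second position marks a domain in transition. Below is a minimal sketch of a helper that pulls that column out. The function name domain_flags is my own invention, and I am assuming a POSIX-style shell such as ksh or bash.

```shell
# domain_flags: print the FLAGS column for one domain.
# Reads `ldm list` style output on stdin, so it can be tried
# against saved output without a live control domain.
# (Hypothetical helper name; not part of the ldm tooling.)
domain_flags() {
    while read -r name state flags rest; do
        # Column 1 is NAME, column 3 is FLAGS in the default listing.
        if [ "$name" = "$1" ]; then
            printf '%s\n' "$flags"
        fi
    done
}
```

On the primary domain this would be used as: sudo ldm list test-domain-g | domain_flags test-domain-g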

I hope this quick tip was helpful.

Monday 6 November 2017

Shut down a zone stuck in the down state

While working on a patching activity, I came across an issue in which the swap file system temporarily mounted for a zone was not unmounted properly during the installpatchset phase.

swap                   295G     8K   295G     1%    /zones/lab-zone/lu


From the zoneadm list output, I observed that the zone to which the above file system belonged was stuck in the down state.

usport-lab-g# zoneadm list -icv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   7 lab-zone         down       /zones/lab-zone                native   shared
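
On a host with many zones it can be easier to filter this listing than to read it by eye. Here is a rough sketch that prints the names of zones in a given state. The function name zones_in_state is hypothetical, and it assumes the column layout shown above (ID, NAME, STATUS, ...) and a POSIX-style shell such as ksh or bash.

```shell
# zones_in_state: print the names of zones whose STATUS matches $1.
# Reads `zoneadm list -icv` style output on stdin.
# (Hypothetical helper name; not part of the zones tooling.)
zones_in_state() {
    while read -r id name status rest; do
        # The header line has the literal word STATUS here, so it never matches.
        if [ "$status" = "$1" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Usage would look like: zoneadm list -icv | zones_in_state down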

Further investigation revealed that the zoneadmd process for the zone was still active.

usport-lab-g# ps -ef | grep zoneadmd
    root  6962  6863   0 06:31:49 pts/1       0:00 grep zoneadmd
    root 21890     1   0 06:12:45 ?           0:01 zoneadmd -z lab-zone

I forcefully terminated this process with the kill command.

usport-lab-g# kill -9 21890
usport-lab-g# ps -ef | grep zoneadmd
    root  7025  6863   0 06:32:00 pts/1       0:00 grep zoneadmd
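
If you would rather not pick the PID out of the ps output by hand, the lookup can be sketched as a small function. The name zoneadmd_pid is made up for this example. It assumes ps -ef output where the PID is the second column and the command line ends with "zoneadmd -z <zone>", as in the listing above, and a POSIX-style shell such as ksh or bash.

```shell
# zoneadmd_pid: print the PID of the zoneadmd process serving zone $1.
# Reads `ps -ef` style output on stdin.
# (Hypothetical helper name; not part of the zones tooling.)
zoneadmd_pid() {
    while read -r owner pid rest; do
        case "$rest" in
            # Match only lines whose command ends in "zoneadmd -z <zone>",
            # which skips the "grep zoneadmd" line seen in the ps output.
            *" zoneadmd -z $1") printf '%s\n' "$pid" ;;
        esac
    done
}
```

Usage would look like: ps -ef | zoneadmd_pid lab-zone. I believe pgrep -f "zoneadmd -z lab-zone" gives the same answer more directly, but check the pgrep man page on your release first.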

This fixed the problem: the zone moved to the installed state, as I had anticipated.

[ssuri@usport-lab-g:~] $ sudo zoneadm list -icv
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   - lab-zone         installed  /zones/lab-zone                native   shared


I hope this quick tip was helpful for you and I thank you for reading.

Using capture groups in grep in Linux

Introduction Let me start by saying that this article isn't about capture groups in grep per se. What we are going to do here with gr...