This post is a lighthearted recap of “not so bright” actions I’ve seen done unintentionally that wound up rebooting one or more nodes in a CRS cluster. Some were done by DBAs, others by system administrators.
Environment: HP & Solaris, 10gR2
Kill ocssd.bin Process
This instantly crashed the node.
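One cheap safeguard is to have any home-grown "kill runaway process" script refuse to touch a protected name. The following is only a minimal sketch in Python: the function name and the protected set are my own illustration, and ocssd.bin is the only entry taken from this post. The paths in the usage note are hypothetical.

```python
import os
import shlex

# Assumption: only ocssd.bin comes from this post; extend the set for your site.
PROTECTED_PROCESS_NAMES = {"ocssd.bin"}


def is_protected_process(command_line, protected=PROTECTED_PROCESS_NAMES):
    """Return True if the executable in command_line must not be killed.

    command_line is the process command as a single string, for example
    the command column of a process listing.
    """
    args = shlex.split(command_line)
    if not args:
        # An empty command line cannot match a protected name.
        return False
    # Compare only the executable's base name, ignoring its directory.
    return os.path.basename(args[0]) in protected
```

A cleanup script would call is_protected_process() on each candidate and skip the ones that return True. For example, a hypothetical "/some/crs/home/bin/ocssd.bin" would be skipped, while "sleep 60" would not.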
Delete Content under /tmp/.oracle
This hung CRS. Commands such as crsctl check crs never even came back. You have to reboot the node to recreate the socket files under /tmp/.oracle.
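Deletions like this usually come from a well-meaning /tmp cleanup job. Here is a rough sketch, in Python, of a filter such a job could apply before removing anything. The function names are hypothetical, and /tmp/.oracle is the only protected directory I am assuming, taken from the incident above.

```python
import os

# Assumption: the directory named in this post; adjust for your platform.
PROTECTED_DIRS = ("/tmp/.oracle",)


def is_protected(path, protected_dirs=PROTECTED_DIRS):
    """Return True if path is a protected directory or lies inside one."""
    candidate = os.path.normpath(os.path.abspath(path))
    for protected in protected_dirs:
        root = os.path.normpath(os.path.abspath(protected))
        # Match the directory itself or anything beneath it, but not
        # siblings that merely share a prefix (e.g. /tmp/.oracle_old).
        if candidate == root or candidate.startswith(root + os.sep):
            return True
    return False


def filter_cleanup_targets(paths, protected_dirs=PROTECTED_DIRS):
    """Keep only the paths that a /tmp cleanup job may remove."""
    return [p for p in paths if not is_protected(p, protected_dirs)]
```

The cleanup job would pass its candidate list through filter_cleanup_targets() and delete only what comes back, so anything under /tmp/.oracle is left alone.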
Change System Date
This rebooted CRS, flagging a “Cluster Integrity” issue. The poor system administrator almost had a heart attack.
Change Physical Hostname
A system administrator accidentally changed the physical hostname, then changed it back. The hosts file was changed as well. This hung CRS. A reboot took care of it.