What is the Cluster Health Monitor (CHM)?
Introduction
The Cluster Health Monitor (CHM), formerly known as the Instantaneous Problem Detector for Clusters (IPD/OS), is designed to detect and analyze degradation and failures related to the operating system (OS) and cluster resources. Its purpose is to help explain many of the issues, such as node evictions, that occur in clusters running Oracle Clusterware and/or Oracle RAC.
Why should you use CHM?
Because, for example, there is always a Monday morning
Assume the following scenario:
Leaving the office Friday night
Getting an email that one node in the cluster rebooted on Sunday morning
Getting a question from your manager on Monday about why that node rebooted
Typical way of addressing this question:
Gather and analyze Oracle Clusterware and operating system logs (e.g. following MOS doc 330358.1 – CRS 10gR2/ 11gR1/ 11gR2 Diagnostic Collection Guide)
Open a Service Request with Oracle Support
Possible outcomes:
Oracle Support finds the answer in one of the logs
Oracle Support needs more node specific information to answer the question
For the latter case, this is why you need Cluster Health Monitor (CHM)
Based on the previous scenario:
It is determined that the reboot was caused by an abnormally high CPU load in conjunction with extreme IO waits.
Your manager asks you: What caused the high CPU load? What can we do to prevent this in the future?
To answer questions like these, CHM provides a historical view of the collected data:
>crfgui -d "00:05:00" -m 192.168.2.8
Cluster Health Analyzer V1.10
Look for Loggerd via node 192.168.2.8
…reading 300 sec from the past
Connected to Loggerd on rac1
Note: Node rac1 is now up
Cluster 'MyCluster', 2 nodes. Ext time=2010-08-18 23:22:30
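The value passed to -d above appears to be a duration in HH:MM:SS form, which the tool echoes back as "reading 300 sec from the past". As a rough sketch of that arithmetic (hms_to_seconds is a hypothetical helper name used only for illustration, not part of CHM):

```shell
# Hypothetical helper, not part of CHM: convert an HH:MM:SS duration,
# such as the one given to crfgui -d, into a number of seconds.
hms_to_seconds() {
  # awk parses each colon-separated field as a decimal number,
  # so fields with leading zeros such as "05" are handled correctly.
  printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

hms_to_seconds "00:05:00"   # prints 300
```

Under this assumption, looking back over the last hour would mean passing "01:00:00", that is 3600 seconds.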
How to Install CHM?
Use the documentation
Overview of Cluster Health Monitor (CHM) (http://www.oracle.com/technetwork/database/enterprise-edition/ipd-overview-130032.pdf)
Summary of installation steps:
Download the software
Unzip the downloaded file
Do not install from a shared file system
Set up an OS-user for CHM
The user must have passwordless SSH access to all nodes
The user can be the same as the Oracle Grid Infrastructure-owner
Install the software
$CHM_install_DIR/install/crfinst.pl -i {node1,node2…} -b /BDBdirectory
Do not use a shared destination for the location of the BDBdirectory
The software is distributed across all nodes specified under -i automatically
Define one of the nodes as the master node
Run "crfinst.pl -f -b /u01/orachmbdb/" as root on all nodes to enable the tool
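The installation steps above can be sketched as a small pre-flight script. This is only an illustration under stated assumptions: the function names are hypothetical, the node names rac1 and rac2 are placeholders, and the comma-separated node list is one reading of the {node1,node2…} placeholder. The script prints the install command instead of running it.

```shell
# Hypothetical helper: join node names with commas for the -i option.
join_nodes() {
  old_ifs=$IFS
  IFS=,
  # "$*" joins the arguments using the first character of IFS.
  printf '%s\n' "$*"
  IFS=$old_ifs
}

# Hypothetical pre-flight check: succeed only if the current user can log in
# to every listed node without being prompted for a password.
check_passwordless_ssh() {
  for node in "$@"; do
    # BatchMode=yes makes ssh fail instead of asking for a password.
    if ! ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
      echo "passwordless SSH to $node failed" >&2
      return 1
    fi
  done
}

# Print (not run) the install command: first argument is the BDB directory,
# the remaining arguments are the node names (assumed layout).
print_install_cmd() {
  bdb_dir=$1
  shift
  printf '$CHM_install_DIR/install/crfinst.pl -i %s -b %s\n' \
    "$(join_nodes "$@")" "$bdb_dir"
}

print_install_cmd /u01/orachmbdb rac1 rac2
```

On a real cluster, one would call check_passwordless_ssh rac1 rac2 as the CHM user before running the printed command, and check separately that the BDB directory is on local storage.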
Future Development of CHM
What you will find in Oracle Grid Infrastructure 11.2.0.2
Cluster Health Monitor is planned to be integrated with Oracle Grid Infrastructure starting with 11.2.0.2 as follows:
The data gathering part of the tool will be part of the standard installation
CHM will therefore be installed into the Oracle Grid Infrastructure home
The Berkeley DB will be installed in the Oracle Grid Infrastructure home (default)
The GUI remains as a separately downloadable item
Changes in some parts of the architecture are possible, but the principles remain
The tool will provide, for example, more configuration options on the command line
The tool will be enabled by default with an adjustable default retention time
Going forward, all OS supported for Oracle Grid Infrastructure will be supported for Cluster Health Monitor.
Support for more operating systems is planned for CHM as 11.2.0.2 becomes available on them (completion is planned for 11.2.0.3)
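© 2010 – 2011, www.oracledatabase12g.com. All rights reserved. This article may be reposted, provided the source address is credited with a link; otherwise legal action will be pursued.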