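This test came out of a chat with a friend about removing a crashed node from a 10g RAC cluster and adding it back. I had not done any 10g RAC work for a long time, could not find my old operation records, and, sadly, had lost all of my earlier notes.
During the test I ran into a problem I had never hit before.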
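In this test, node 1 is removed by hand to simulate a crash of node 1, and node 1's information is then cleaned up from node 2. The official document is:
Steps to Remove Node from Cluster When the Node Crashes Due to OS/Hardware Failure and cannot boot up (Doc ID 466975.1)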
1. Current state of the cluster resources
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    sol1
ora....L1.lsnr application    ONLINE    ONLINE    sol1
ora.sol1.gsd   application    ONLINE    ONLINE    sol1
ora.sol1.ons   application    ONLINE    ONLINE    sol1
ora.sol1.vip   application    ONLINE    ONLINE    sol1
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g1.inst application    ONLINE    ONLINE    sol1
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
2. Remove the database and CRS information on node 1
www.htz.pw # /oracle/app/oracle/product/10.2.0/db_1/bin/crsctl stop crs
Stopping resources.
This could take several minutes.
Successfully stopped CRS resources.
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    OFFLINE
ora....L1.lsnr application    ONLINE    OFFLINE
ora.sol1.gsd   application    ONLINE    OFFLINE
ora.sol1.ons   application    ONLINE    OFFLINE
ora.sol1.vip   application    ONLINE    ONLINE    sol2
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g1.inst application    ONLINE    OFFLINE
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
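To see at a glance which resources went down once CRS was stopped on node 1, the `crs_stat -t` listing can be filtered on its State column. The following is only a sketch: `offline_resources` is a hypothetical helper name, not an Oracle tool, and it assumes the five-column layout shown above, in which OFFLINE rows have an empty Host column.

```shell
# Hypothetical helper: print the names of resources whose State column
# (4th whitespace-separated field of `crs_stat -t` output) is OFFLINE.
# Reads the listing on stdin; the header and separator rows never match.
offline_resources() {
    awk '$4 == "OFFLINE" { print $1 }'
}

# Usage on a live cluster (assumption: crs_stat is on PATH):
#   crs_stat -t | offline_resources
```

Note that the VIP of sol1 is not reported by this filter, because it has failed over to sol2 and is still ONLINE there.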
Remove the files related to node 1:
www.htz.pw # rm /etc/init.d/init.cssd rm /etc/init.d/init.crs
www.htz.pw # rm /etc/init.d/init.crs
www.htz.pw # rm /etc/init.d/init.crsd
www.htz.pw # rm /etc/init.d/init.evmd
www.htz.pw # rm /etc/rc3.d/K96init.crs
/etc/rc3.d/K96init.crs: No such file or directory
www.htz.pw # rm /etc/rc3.d/S96init.crs
www.htz.pw # rm -Rf /var/opt/oracle/scls_scr
www.htz.pw # rm -Rf /var/opt/oracle/oprocd
www.htz.pw # rm /etc/inittab.crs
www.htz.pw # cp /etc/inittab.orig /etc/inittab
www.htz.pw # rm -rf /var/tmp/.oracle/*
www.htz.pw # rm -rf /tmp/.oracle/*
www.htz.pw # rm -rf /oracle/app
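Some of these files may already be missing (the K96init.crs link above, for example), so it can be useful to list what actually exists before deleting anything. A minimal sketch, assuming a POSIX shell such as ksh or bash (the legacy Solaris /bin/sh may not support `test -e`); `existing_paths` is a hypothetical helper, and the example path list simply repeats the one used above.

```shell
# Hypothetical dry-run helper: print only those arguments that exist,
# so the destructive rm commands can be reviewed before they are run.
existing_paths() {
    for p in "$@"; do
        # -e is true for files, directories and other existing entries
        if [ -e "$p" ]; then
            printf '%s\n' "$p"
        fi
    done
}

# Example (paths taken from the cleanup above; nothing is deleted):
#   existing_paths /etc/init.d/init.cssd /etc/init.d/init.crs \
#       /etc/init.d/init.crsd /etc/init.d/init.evmd \
#       /etc/rc3.d/K96init.crs /etc/rc3.d/S96init.crs /etc/inittab.crs
```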
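3. On node 2, remove node 1's information from CRS
www.htz.pw # pwd
/oracle/app/oracle/product/10.2.0/crs_1/bin
www.htz.pw # ./oifcfg getif
e1000g0 192.168.111.0 global public
e1000g1 192.168.112.0 global cluster_interconnect
Remove node 1's information from oifcfg. Both interfaces above are defined as global, so there is no node-specific entry for sol1 and the command reports that the key does not exist:
www.htz.pw # ./oifcfg delif -node sol1
PROC-4: The cluster registry key to be operated on does not exist.
PRIF-11: cluster registry error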
Remove node 1's information from ONS:
www.htz.pw # cat $CRS_HOME/opmn/conf/ons.config
localport=6100
remoteport=6200
loglevel=3
useocr=on
www.htz.pw # $CRS_HOME/bin/racgons remove_config sol1:6200
racgons: Existing key value on sol1 = 6200.
racgons: sol1:6200 removed from OCR.
www.htz.pw # $CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    OFFLINE
ora....L1.lsnr application    ONLINE    OFFLINE
ora.sol1.gsd   application    ONLINE    OFFLINE
ora.sol1.ons   application    ONLINE    OFFLINE
ora.sol1.vip   application    ONLINE    ONLINE    sol2
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g1.inst application    ONLINE    OFFLINE
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
Remove node 1's resources from CRS:
www.htz.pw # $CRS_HOME/bin/srvctl remove instance -d sol10g -i sol10g1
Remove instance sol10g1 from the database sol10g? (y/[n]) y
www.htz.pw # $CRS_HOME/bin/srvctl remove asm -n sol1
www.htz.pw # $CRS_HOME/bin/srvctl remove nodeapps -n sol1
Please confirm that you intend to remove the node-level applications on node sol1 (y/[n]) y
PRKO-2108 : Node applications are still running on node: sol1
www.htz.pw # $CRS_HOME/bin/srvctl remove nodeapps -n sol1
Please confirm that you intend to remove the node-level applications on node sol1 (y/[n]) y
PRKO-2108 : Node applications are still running on node: sol1
# $CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....L1.lsnr application    ONLINE    OFFLINE
ora.sol1.gsd   application    ONLINE    OFFLINE
ora.sol1.ons   application    ONLINE    OFFLINE
ora.sol1.vip   application    ONLINE    ONLINE    sol2
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
The removal is refused because ora.sol1.vip has failed over to sol2 and is still ONLINE there, so stop it first and try again:
www.htz.pw # $CRS_HOME/bin/crs_stop -f ora.sol1.vip
Target set to OFFLINE for `ora.sol1.LISTENER_SOL1.lsnr`
Attempting to stop `ora.sol1.vip` on member `sol2`
Stop of `ora.sol1.vip` on member `sol2` succeeded.
www.htz.pw # $CRS_HOME/bin/srvctl remove nodeapps -n sol1
Please confirm that you intend to remove the node-level applications on node sol1 (y/[n]) y
PRKO-2112 : Some or all node applications are not removed successfully on node: sol1
# $CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....L1.lsnr application    OFFLINE   OFFLINE
ora.sol1.vip   application    OFFLINE   OFFLINE
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
www.htz.pw # $CRS_HOME/bin/crs_stat|grep lsn
NAME=ora.sol1.LISTENER_SOL1.lsnr
NAME=ora.sol2.LISTENER_SOL2.lsnr
www.htz.pw # $CRS_HOME/bin/crs_unregister ora.sol1.LISTENER_SOL1.lsnr
www.htz.pw # $CRS_HOME/bin/olsnodes -n
sol1 1
sol2 2
Delete node 1 from the cluster configuration. The argument is the node name and node number as reported by olsnodes -n:
www.htz.pw # ./rootdeletenode.sh sol1,1
CRS-0210: Could not find resource 'ora.sol1.LISTENER_SOL1.lsnr'.
CRS-0210: Could not find resource 'ora.sol1.ons'.
CRS-0210: Could not find resource 'ora.sol1.vip'.
CRS-0210: Could not find resource 'ora.sol1.gsd'.
CRS-0210: Could not find resource ora.sol1.vip.
CRS nodeapps are deleted successfully
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully deleted 14 values from OCR.
Key SYSTEM.css.interfaces.nodesol1 marked for deletion is not there. Ignoring.
Successfully deleted 5 keys from OCR.
Node deletion operation successful.
'sol1,1' deleted successfully
www.htz.pw # $CRS_HOME/bin/olsnodes -n
sol2 2
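The `name,number` argument passed to rootdeletenode.sh can be derived from the `olsnodes -n` listing rather than typed by hand. This is a sketch under assumptions: `node_arg` is a hypothetical helper, and it needs an awk that supports `-v` (nawk or /usr/xpg4/bin/awk on Solaris 10, not the legacy /usr/bin/awk).

```shell
# Hypothetical helper: build the "name,number" argument used with
# rootdeletenode.sh from `olsnodes -n` output read on stdin.
# $1 = node name to look up. Prints nothing if the node is not listed.
node_arg() {
    awk -v node="$1" '$1 == node { print $1 "," $2 }'
}

# Example on a live cluster (assumption: olsnodes is on PATH):
#   olsnodes -n | node_arg sol1
```

An empty result means the node is no longer registered, which is the state the second olsnodes -n call above confirms after the deletion.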
Update the inventory, first for the CRS home and then for the database home:
www.htz.pw $ $ORA_CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME "CLUSTER_NODES=sol2" CRS=TRUE
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /oracle/app/oracle/oraInventory
'UpdateNodeList' was successful.
www.htz.pw $ $ORACLE_HOME//oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=sol2"
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
The inventory pointer is located at /var/opt/oracle/oraInst.loc
The inventory is located at /oracle/app/oracle/oraInventory
'UpdateNodeList' was successful.
The inventory after the update:
www.htz.pw $ cat inventory.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 2008 Oracle Corporation. All rights Reserved -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>10.2.0.4.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="OraCrs10g_home" LOC="/oracle/app/oracle/product/10.2.0/crs_1" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="sol2"/>
   </NODE_LIST>
</HOME>
<HOME NAME="OraDb10g_home1" LOC="/oracle/app/oracle/product/10.2.0/db_1" TYPE="O" IDX="2">
   <NODE_LIST>
      <NODE NAME="sol2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>
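To confirm that no home still lists the removed node, the node names can be pulled out of inventory.xml. A rough sketch, assuming one NODE element per line as in the listing above; `inventory_nodes` is a hypothetical helper, and the file is only read, never modified.

```shell
# Hypothetical helper: print the distinct node names found in
# <NODE NAME="..."/> elements of an inventory.xml read from stdin.
# Assumes one NODE element per line, as in the listing above.
inventory_nodes() {
    sed -n 's/.*<NODE NAME="\([^"]*\)".*/\1/p' | sort -u
}

# Example (central inventory path as used in this environment):
#   inventory_nodes < /oracle/app/oracle/oraInventory/ContentsXML/inventory.xml
```

After a successful -updateNodeList for both homes, the only name printed in this environment should be sol2.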
4. Add the node
www.htz.pw $ pwd
/oracle/app/oracle/product/10.2.0/crs_1/oui/bin
www.htz.pw $ ls
addLangs.sh  addNode.sh  attachHome.sh  detachHome.sh  lsnodes  ouica.bat  ouica.sh  resource  runConfig.sh  runInstaller  runInstaller.sh
www.htz.pw $ ./addNode.sh
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
Oracle Universal Installer, Version 10.2.0.4.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved
Clicking Next at this point raises an error. The following entries can be found in the OUI log; the default log path can be read from the file names listed at the end of the excerpt.
INFO: Username:oracle
INFO: Install area Control created with access level 1
INFO: Oracle Universal Installer version is 10.2.0.4.0
INFO: Setting variable 'ORACLE_HOME' to '/oracle/app/oracle/product/10.2.0/crs_1'. Received the value from the command line.
INFO: Setting variable 'PREREQ_CONFIG_LOCATION' to ''. Received the value from variable association.
INFO: Setting variable 'FROM_LOCATION' to '/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml'. Received the value from a code block.
INFO: Setting variable 'ROOTSH_LOCATION' to '/oracle/app/oracle/product/10.2.0/crs_1/root.sh'. Received the value from a code block.
INFO: Setting variable 'ROOTSH_STATUS' to '3'. Received the value from a code block.
INFO: Setting variable 'ORACLE_HOME' to '/oracle/app/oracle/product/10.2.0/crs_1'. Received the value from the command line.
INFO: Setting variable 'PREREQ_CONFIG_LOCATION' to ''. Received the value from variable association.
INFO: Setting variable 'FROM_LOCATION' to '/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml'. Received the value from a code block.
INFO: Setting variable 'ROOTSH_LOCATION' to '/oracle/app/oracle/product/10.2.0/crs_1/root.sh'. Received the value from a code block.
INFO: Setting variable 'ROOTSH_STATUS' to '3'. Received the value from a code block.
INFO: *** Welcome Page***
INFO: Unable to read inventory information for the home: /oracle/app/oracle/product/10.2.0/crs_1.
INFO: Unable to read inventory information for the home: /oracle/app/oracle/product/10.2.0/crs_1.
The part below was written after clicking Next:
/************************************************************************************
INFO: Setting variable 'ORACLE_HOME_NAME' to 'OraCrs10g_home'. Received the value from a code block.
INFO: Unable to read inventory information for the home: /oracle/app/oracle/product/10.2.0/crs_1.
SEVERE: Abnormal program termination. An internal error has occured. Please provide the following files to Oracle Support :
"/oracle/app/oracle/oraInventory/logs/addNodeActions2014-05-08_06-05-16PM.log"
"/oracle/app/oracle/oraInventory/logs/oraInstall2014-05-08_06-05-16PM.err"
"/oracle/app/oracle/oraInventory/logs/oraInstall2014-05-08_06-05-16PM.out"
***************************************************************************************/
The following error can be found in /oracle/app/oracle/oraInventory/logs/oraInstall2014-05-08_06-05-16PM.err:
org.xml.sax.SAXParseException: <Line 889, Column 9>: XML-20210: (Fatal Error) Unexpected EOF.
    at oracle.xml.parser.v2.XMLError.flushErrorHandler(XMLError.java:415)
    at oracle.xml.parser.v2.XMLError.flushErrors1(XMLError.java:284)
    at oracle.xml.parser.v2.XMLReader.popXMLReader(XMLReader.java:540)
    at oracle.xml.parser.v2.NonValidatingParser.parseElement(NonValidatingParser.java:1339)
    at oracle.xml.parser.v2.NonValidatingParser.parseRootElement(NonValidatingParser.java:326)
    at oracle.xml.parser.v2.NonValidatingParser.parseDocument(NonValidatingParser.java:293)
    at oracle.xml.parser.v2.XMLParser.parse(XMLParser.java:209)
    at oracle.sysman.oii.oiii.OiiiInstallXMLReader.readComps(OiiiInstallXMLReader.java:271)
    at oracle.sysman.oii.oiii.OiiiInstallInventory.getCompOHListElement(OiiiInstallInventory.java:1663)
    at oracle.sysman.oii.oiii.OiiiAreaInventory.getAllCompsVect(OiiiAreaInventory.java:1052)
    at oracle.sysman.oii.oiii.OiiiAreaInventory.getTopLevelComps(OiiiAreaInventory.java:1872)
    at oracle.sysman.oii.oiii.OiiiInstallInventory.setOHProperties(OiiiInstallInventory.java:6064)
    at oracle.sysman.oii.oiif.oiifp.OiifpContentsTabPanel.addHomes(OiifpContentsTabPanel.java:777)
    at oracle.sysman.oii.oiif.oiifp.OiifpContentsTabPanel.fillInventoryTree(OiifpContentsTabPanel.java:691)
    at oracle.sysman.oii.oiif.oiifp.OiifpContentsTabPanel.refreshTree(OiifpContentsTabPanel.java:1508)
    at oracle.sysman.oii.oiif.oiifp.OiifpContentsTabPanel.prepareInvTree(OiifpContentsTabPanel.java:2253)
    at oracle.sysman.oii.oiif.oiifd.OiifdInventoryDialog.doModal(OiifdInventoryDialog.java:457)
    at oracle.sysman.oii.oiif.oiifw.OiifwWizDialog.onViewPrivate(OiifwWizDialog.java:863)
    at oracle.sysman.oii.oiif.oiifw.OiifwWizDialog.access$000(OiifwWizDialog.java:330)
    at oracle.sysman.oii.oiif.oiifw.OiifwWizDialog$PrepareInventoryTree.run(OiifwWizDialog.java:1778)
    at java.lang.Thread.run(Thread.java:534)
The error is an XML parse failure: the parser hit an unexpected end of file (at line 889, per the message). At first I assumed the inventory was misconfigured, but opatch lsinventory returned normal results, so I suspected that one of the XML files was damaged.
Use truss to find out which XML files runInstaller opens:
www.htz.pw $ truss -aefo /tmp/123.log ./addNode.sh
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
Oracle Universal Installer, Version 10.2.0.4.0 Production
Copyright (C) 1999, 2008, Oracle. All rights reserved.
The log shows that the following XML files were opened:
115 1728 23646/1: open("/oracle/app/oracle/product/10.2.0/crs_1/oui/jlib/xmlparserv2.jar", O_RDONLY|O_LARGEFILE) = 6
119 1776 23646/1: open("/oracle/app/oracle/product/10.2.0/crs_1/oui/jlib/xml.jar", O_RDONLY|O_LARGEFILE) = 6
1382 15600 23646/1: open("/oracle/app/oracle/oraInventory/ContentsXML/inventory.xml", O_RDONLY|O_LARGEFILE) = 15
1385 15915 23646/1: open("/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/oraclehomeproperties.xml", O_RDONLY|O_LARGEFILE) = 15
1387 15974 23646/1: open("/oracle/app/oracle/product/10.2.0/db_1/inventory/ContentsXML/oraclehomeproperties.xml", O_RDONLY|O_LARGEFILE) = 15
1401 18121 23646/15: open("/oracle/app/oracle/oraInventory/ContentsXML/comps.xml", O_RDONLY|O_LARGEFILE) = 19
1403 18217 23646/15: open("/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml", O_RDONLY|O_LARGEFILE) = 19
1421 19159 23646/15: open("/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml", O_RDONLY|O_LARGEFILE) = 20
1444 22099 23646/1: open("/oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml", O_RDONLY|O_LARGEFILE) = 21
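In the end, the only file with 889 lines, the line number reported by the parser, turned out to be the comps.xml under the CRS home.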
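The search for the file matching the reported line number can be scripted rather than done by eye. The sketch below is an assumption-laden illustration: `xml_opened` is a hypothetical helper, and it expects truss lines of the form `open("path", flags)` as shown above.

```shell
# Hypothetical helper: list the distinct *.xml files that a truss log
# (read from stdin) shows being opened. Jar files such as xml.jar are
# skipped because the path must end in ".xml".
xml_opened() {
    sed -n 's/.*open("\([^"]*\.xml\)".*/\1/p' | sort -u
}

# Example: compare each file's line count with the line in the parser error
#   xml_opened < /tmp/123.log | xargs wc -l
```

A file whose line count equals the line number in the `Unexpected EOF` message is the likely candidate, since the parser ran out of input exactly there.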
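I downloaded an XML editor, and it could not open this comps.xml properly, which confirmed that the file was damaged. Comparing it with a comps.xml copied over from another environment showed that the broken file was missing a large amount of content.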
A healthy comps.xml has the following structure:
www.htz.pw $ cat comps.xml
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 2008 Oracle Corporation. All rights Reserved -->
<!-- Do not modify the contents of this file by hand. -->
<PRD_LIST>
<TL_LIST>
</TL_LIST>
<COMP_LIST>
</COMP_LIST>
<ONEOFF_LIST>
</ONEOFF_LIST>
</PRD_LIST>
www.htz.pw $ pwd
/oracle/app/oracle/oraInventory/ContentsXML
We can also use opatch to check whether an XML file is well formed:
www.htz.pw $ $ORA_CRS_HOME/OPatch/opatch util LoadXML -xmlInput /oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml.back
Invoking OPatch 10.2.0.4.2
Oracle Interim Patch Installer version 10.2.0.4.2
Copyright (c) 2007, Oracle Corporation. All rights reserved.
UTIL session
Oracle Home       : /oracle/app/oracle/product/10.2.0/db_1
Central Inventory : /oracle/app/oracle/oraInventory
   from           : /var/opt/oracle/oraInst.loc
OPatch version    : 10.2.0.4.2
OUI version       : 10.2.0.4.0
OUI location      : /oracle/app/oracle/product/10.2.0/db_1/oui
Log file location : /oracle/app/oracle/product/10.2.0/db_1/cfgtoollogs/opatch/opatch2014-05-08_20-39-18PM.log
Invoking utility "loadxml"
UtilSession failed: Unable to parse the xml file.
OPatch failed with error code 73
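Where opatch is not at hand, a truncated file can often be recognised just by checking whether it ends with the closing tag of its root element. This is a crude sketch, not a substitute for a real parser such as the LoadXML check above; `xml_looks_complete` is a hypothetical helper, and the root element name PRD_LIST is taken from the healthy comps.xml shown earlier.

```shell
# Hypothetical quick check: succeed if the last non-blank line of file $1
# ends with the closing tag of root element $2 (e.g. PRD_LIST).
# This only detects truncation; it does not validate the XML.
xml_looks_complete() {
    # Remember the last line that has at least one field, then strip
    # whitespace so trailing blanks or CRs do not affect the match.
    last=$(awk 'NF { line = $0 } END { print line }' "$1" | tr -d '[:space:]')
    case "$last" in
        *"</$2>") return 0 ;;
        *)        return 1 ;;
    esac
}

# Example:
#   xml_looks_complete comps.xml PRD_LIST || echo "comps.xml looks truncated"
```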
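Because this is a test environment, I simply moved the damaged file out of the way with mv. In production, I recommend copying a good file over from another equivalent environment instead.
www.htz.pw # mv /oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml /oracle/app/oracle/product/10.2.0/crs_1/inventory/ContentsXML/comps.xml.back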
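Running addNode.sh again finally brought up the installer screen (screenshot not included).
Clicking Next all the way through went fine until a confirmation dialog appeared (screenshot not included).
Choose Yes here: the OCR operations are run by root, so the log file is owned by root, and this does not affect the addNode.sh operation.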
Running rootaddnode.sh on the host where addNode.sh was executed reports the following error:
www.htz.pw # /oracle/app/oracle/product/10.2.0/crs_1/rootaddnode.sh
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Attempting to add 1 new nodes to the configuration
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 3: sol1 sol1-priv sol1
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
/oracle/app/oracle/product/10.2.0/crs_1/bin/srvctl add nodeapps -n sol1 -A %s_nodevips%/255.255.255.0/e1000g0 -o /oracle/app/oracle/product/10.2.0/crs_1
PRKO-2109 : Invalid address string: %s_nodevips%/255.255.255.0/e1000g0
The VIP address passed to srvctl is the unsubstituted placeholder %s_nodevips%. For comparison, the VIP configuration of sol2, and where the placeholder comes from in the script:
www.htz.pw # /oracle/app/oracle/product/10.2.0/crs_1/bin/srvctl config nodeapps -n sol2 -a
VIP exists.: /sol2-vip/192.168.111.49/255.255.255.0/e1000g0
www.htz.pw # grep "s_nodevips|CRS_NEW_NODEVIPS" /oracle/app/oracle/product/10.2.0/crs_1/rootaddnode.sh
CRS_NEW_NODEVIPS=%s_nodevips%
www.htz.pw # grep "CRS_NEW_NODEVIPS" /oracle/app/oracle/product/10.2.0/crs_1/rootaddnode.sh
CRS_NEW_NODEVIPS=%s_nodevips%
NODE_VIP=`$ECHO $CRS_NEW_NODEVIPS | $CUT -d',' -f$Ni`
Edit the script /oracle/app/oracle/product/10.2.0/crs_1/rootaddnode.sh by hand. Before the change:
Ni=1
for i in `$ECHO $NODES_LIST`
do
  NODE_NAME=$i
  NODE_VIP=`$ECHO $CRS_NEW_NODEVIPS | $CUT -d',' -f$Ni`
  NODEVIP=$NODE_VIP/$NETMASK/$NETIFs
  $ECHO $CH/bin/srvctl add nodeapps -n $NODE_NAME -A $NODEVIP -o $CH
  $CH/bin/srvctl add nodeapps -n $NODE_NAME -A $NODEVIP -o $CH
  Ni=`expr $Ni + 1`
done
After the change, with the VIP address of sol1 hard-coded:
Ni=1
for i in `$ECHO $NODES_LIST`
do
  NODE_NAME=$i
  #NODE_VIP=`$ECHO $CRS_NEW_NODEVIPS | $CUT -d',' -f$Ni`
  NODE_VIP=192.168.111.48
  NODEVIP=$NODE_VIP/$NETMASK/$NETIFs
  $ECHO $CH/bin/srvctl add nodeapps -n $NODE_NAME -A $NODEVIP -o $CH
  $CH/bin/srvctl add nodeapps -n $NODE_NAME -A $NODEVIP -o $CH
  Ni=`expr $Ni + 1`
done
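The PRKO-2109 failure comes from CRS_NEW_NODEVIPS still holding the literal placeholder %s_nodevips%, which `cut` passes straight through to srvctl. As an alternative to hard-coding the address, a guard along the following lines could catch the unsubstituted placeholder before srvctl is called. This is only a sketch: `pick_vip` is a hypothetical function and is not part of the Oracle-supplied rootaddnode.sh.

```shell
# Hypothetical guard around the cut-based lookup used in rootaddnode.sh.
# $1 = comma-separated VIP list, $2 = 1-based position of the node.
pick_vip() {
    case "$1" in
        *%*%*)
            # A %name% token means the installer never filled in the value
            echo "VIP list still contains a placeholder: $1" >&2
            return 1
            ;;
    esac
    echo "$1" | cut -d',' -f"$2"
}

# Example with the addresses seen in this environment:
#   pick_vip '192.168.111.48,192.168.111.49' 1
```

With a single new node, as here, the list has one entry and position 1 returns it unchanged, which matches what the hard-coded edit achieves.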
www.htz.pw # /oracle/app/oracle/product/10.2.0/crs_1/rootaddnode.sh
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Node sol1 is already assigned nodenum 3.
Aborting: No configuration data has been changed.
clscfg -add -nn nameA,numA,nameB,numB,... -pn privA,numA,privB,numB,...
                [-hn hostA,numA,hostB,numB,...] [-t p1,p2,p3,p4]
  -nn specifies nodenames in the same fashion as -nn in -install mode
  -pn specifies private interconnect names as -pn in -install mode
  -hn specifies hostnames in the same fashion as -hn in -install mode
  -t specifies port numbers to be used by CRS daemons on the new node(s)
     default ports: 49895,49896,49897,49898
WARNING: Using this tool may corrupt your cluster configuration. Do not
         use unless you positively know what you are doing.
/oracle/app/oracle/product/10.2.0/crs_1/bin/srvctl add nodeapps -n sol1 -A 192.168.111.48/255.255.255.0/e1000g0 -o /oracle/app/oracle/product/10.2.0/crs_1
The following is run on node 1:
www.htz.pw # /oracle/app/oracle/product/10.2.0/crs_1/root.sh
WARNING: directory '/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/oracle/app/oracle/product' is not owned by root
WARNING: directory '/oracle/app/oracle' is not owned by root
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
OCR LOCATIONS = /dev/rdsk/c2t0d0s0
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/app/oracle/product/10.2.0' is not owned by root
WARNING: directory '/oracle/app/oracle/product' is not owned by root
WARNING: directory '/oracle/app/oracle' is not owned by root
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: sol1 sol1-priv sol1
node 2: sol2 sol2-priv sol2
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        sol2
        sol1
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Creating VIP application resource on (0) nodes.
Creating GSD application resource on (0) nodes.
Creating ONS application resource on (0) nodes.
Starting VIP application resource on (2) nodes1:CRS-1002: Resource 'ora.sol1.vip' is already running on member 'sol1'
CRS-0223: Resource 'ora.sol1.vip' has placement error.
Check the log file "/oracle/app/oracle/product/10.2.0/crs_1/log/sol1/racg/ora.sol1.vip.log" for more details
...
Starting GSD application resource on (2) nodes1:CRS-0233: Resource or relatives are currently involved with another operation.
Check the log file "/oracle/app/oracle/product/10.2.0/crs_1/log/sol1/racg/ora.sol1.gsd.log" for more details
...
Starting ONS application resource on (2) nodes1:CRS-0233: Resource or relatives are currently involved with another operation.
Check the log file "/oracle/app/oracle/product/10.2.0/crs_1/log/sol1/racg/ora.sol1.ons.log" for more details
...
Done.
Note that quite a few errors are reported here, but they do not affect the result.
The node applications on both nodes are now running normally:
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.sol1.gsd   application    ONLINE    ONLINE    sol1
ora.sol1.ons   application    ONLINE    ONLINE    sol1
ora.sol1.vip   application    ONLINE    ONLINE    sol1
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
www.htz.pw # ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.111.46 netmask ffffff00 broadcast 192.168.111.255
        ether 0:c:29:5a:e5:7a
e1000g0:1: flags=1040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4> mtu 1500 index 2
        inet 192.168.111.48 netmask ffffff00 broadcast 192.168.111.255
e1000g1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 192.168.112.46 netmask ffffff00 broadcast 192.168.112.255
        ether 0:c:29:5a:e5:84
Next, add the node as the oracle user:
www.htz.pw $ ./addNode.sh
This step went smoothly, with no errors at all.
Configure the listener. This can be done manually; because this is a test environment, I configured it with netca from the healthy node. The listener service is interrupted during the configuration.
The listener has to be stopped partway through:
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....L1.lsnr application    ONLINE    ONLINE    sol1
ora.sol1.gsd   application    ONLINE    ONLINE    sol1
ora.sol1.ons   application    ONLINE    ONLINE    sol1
ora.sol1.vip   application    ONLINE    ONLINE    sol1
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
Run dbca on the healthy node to add the instance. dbca worked normally and added the ASM instance information automatically.
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM3.asm application    ONLINE    ONLINE    sol1
ora....L1.lsnr application    ONLINE    ONLINE    sol1
ora.sol1.gsd   application    ONLINE    ONLINE    sol1
ora.sol1.ons   application    ONLINE    ONLINE    sol1
ora.sol1.vip   application    ONLINE    ONLINE    sol1
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g1.inst application    ONLINE    ONLINE    sol1
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
www.htz.pw $ crs_stat|grep asm
NAME=ora.sol1.ASM3.asm
NAME=ora.sol2.ASM2.asm
Note that the ASM instance on sol1 has become ASM3 because it was added automatically. We can fix this by removing it and adding the ASM instance back manually as +ASM1.
www.htz.pw $ srvctl stop instance -d sol10g -i sol10g1
www.htz.pw $ srvctl stop asm -n sol1
www.htz.pw $ crs_unregister ora.sol10g.sol10g1.inst
www.htz.pw $ crs_unregister ora.sol1.ASM3.asm
www.htz.pw $ cat /oracle/app/oracle/admin/+ASM/pfile/init.ora
##############################################################################
# Copyright (c) 1991, 2001, 2002 by Oracle Corporation
##############################################################################
###########################################
# Cluster Database
###########################################
asm_diskgroups='DATA'
background_dump_dest=/oracle/app/oracle/admin/+ASM/bdump
cluster_database=TRUE
core_dump_dest=/oracle/app/oracle/admin/+ASM/cdump
instance_type=asm
large_pool_size=12582912
remote_login_passwordfile=EXCLUSIVE
user_dump_dest=/oracle/app/oracle/admin/+ASM/udump
+ASM2.instance_number=2
+ASM1.instance_number=1
www.htz.pw $ export ORACLE_SID=+ASM1
www.htz.pw $ sqlplus / as sysdba
SQL*Plus: Release 10.2.0.4.0 - Production on Thu May 8 23:14:50 2014
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to an idle instance.
SQL> create spfile from pfile;
File created.
www.htz.pw $ srvctl remove asm -n sol1
www.htz.pw $ srvctl add asm -n sol1 -i +ASM1 -o $ORACLE_HOME -p $ORACLE_HOME/dbs/spfile+ASM1.ora
www.htz.pw $ srvctl start asm -n sol1
www.htz.pw $ srvctl start instance -d sol10g -i sol10g1
www.htz.pw $ srvctl start instance -d sol10g -i sol10g1
Adjust the dependency so that the database instance depends on +ASM1:
www.htz.pw $ srvctl modify instance -d sol10g -i sol10g1 -s +ASM1
www.htz.pw $ srvctl stop asm -n sol1
www.htz.pw $ srvctl start instance -d sol10g -i sol10g1
Everything is back to normal:
www.htz.pw $ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....SM1.asm application    ONLINE    ONLINE    sol1
ora....L1.lsnr application    ONLINE    ONLINE    sol1
ora.sol1.gsd   application    ONLINE    ONLINE    sol1
ora.sol1.ons   application    ONLINE    ONLINE    sol1
ora.sol1.vip   application    ONLINE    ONLINE    sol1
ora.sol10g.db  application    ONLINE    ONLINE    sol2
ora....g1.inst application    ONLINE    ONLINE    sol1
ora....g2.inst application    ONLINE    ONLINE    sol2
ora....SM2.asm application    ONLINE    ONLINE    sol2
ora....L2.lsnr application    ONLINE    ONLINE    sol2
ora.sol2.gsd   application    ONLINE    ONLINE    sol2
ora.sol2.ons   application    ONLINE    ONLINE    sol2
ora.sol2.vip   application    ONLINE    ONLINE    sol2
That completes the node addition. Along the way I ran into a few small problems: the truncated comps.xml, the unsubstituted VIP placeholder in rootaddnode.sh, and the ASM instance being registered as ASM3.