Everything posted by Billy

  1. Yes, I have been updating my original post. The update shows that I am using Pacemaker, which includes Corosync. Previously, the OpenAIS messaging layer was used (or heartbeat); the current messaging and membership layer when installing Pacemaker is Corosync. I find that Pacemaker is truly a great project, with many benefits and resources from other groups within the community. Good luck. Billy
  2. Ummm, yeah, pbxnsip is probably still running. The HTTP server and related files are embedded in the binary, so if the application is running, that is what you are seeing. Hope this helps. Billy
  3. Sure! Windex http://www.windex.com/?sid=SEM&cid=Google
  4. UPDATE:
     * pacemaker-openais and pacemaker-heartbeat are gone; pacemaker now comes in only one flavour, with support for corosync and heartbeat built in. This is based on pacemaker's ability to detect which messaging framework started it and act accordingly.
     * openais is gone. pacemaker 1.0.6 uses corosync.
     I have updated the above tutorial. Have fun! Billy
  5. Distribution used for the pbxnsip cluster: Debian 5.0 Lenny
     Download at http://cdimage.debian.org/cdimage/daily-bu...386-netinst.iso

     This tutorial assumes you have a basic familiarity with Linux:
     1. Knowing how to install the Linux operating system (if you can read, you will have no issues with some of the tutorials on the net)
     2. Logging in
     3. Virtual consoles
     4. Shells and commands
     5. Files and directories
     6. The directory tree
     7. Partitioning of hard drives
     8. Installing from source
     9. Etc.

     I have 2 nodes:
     Each node has 2 network cards.
     Each node has 2 partitions (one 8 GB and one 11 GB: sda and sdb).

     Cluster configuration is Active/Standby.
     Note: Active/Active is possible with OCFS2 on top of DRBD primary/primary. After you tackle this and have read through all the documentation, with a little tweaking you can have an Active/Active pbxnsip Linux cluster.

     sip01: 10.1.10.201 (Primary)
     sip02: 10.1.10.202 (Secondary)
     Virtual IP (VIP): 10.1.10.210
     Default gateway: 10.1.10.1

     Cluster software packages:
     pacemaker (includes the cluster resource manager and corosync; you do not have to install corosync separately)
     drbd 8.3.4 (source only)
     drbd 8.2.x (Deb package)

     Resources for help. I highly recommend that you spend a few hours a day for about a week reviewing this documentation. There is a lot of information and there are many configuration options to select. When something goes wrong or not as expected and you ask for help, you will want to know what you are talking about (or at least sound like you do). It will make everyone's life easier. "I started my cluster, but it is not working" will not cut it.

     pacemaker: http://clusterlabs.org/wiki/Main_Page
     pacemaker: http://oss.clusterlabs.org/mailman/listinfo/pacemaker
     pacemaker: http://clusterlabs.org/mediawiki/images/f/...n_Explained.pdf
     corosync (Cluster Engine): http://www.corosync.org
     openais (Cluster Framework - designed to work with corosync): http://www.openais.org/doku.php?id=faq
     openais (Cluster Framework - designed to work with corosync): http://www.openais.org/doku.php?id=support
     drbd (TCP/IP block-based replication): http://www.drbd.org/users-guide/users-guide.html
     drbd (TCP/IP block-based replication): http://www.nabble.com/DRBD-f14286.html
     drbd (TCP/IP block-based replication): http://www.drbd.org/fileadmin/drbd/publica...onf.eu.2007.pdf

     Let's get started. (I will not reinvent the wheel; there is plenty of documentation on getting Debian installed and configured.) I will paste outputs of my configs and such.

     1. Install Debian (assumes you have an i386 architecture):
     http://www.debian.org/releases/stable/i386/index.html.en

     2. Log in as root (or log in as a normal user and su - to root, or use sudo).
     Disclaimer: This document assumes that you can work with Debian Linux already and know the security implications of working as root and so on.

     3. Configure networking:
     http://qref.sourceforge.net/Debian/referen...gateway.en.html

     My configs:

     sip01:~# cat /etc/network/interfaces
     # This file describes the network interfaces available on your system
     # and how to activate them. For more information, see interfaces(5).

     # The loopback network interface
     auto lo
     iface lo inet loopback

     # The primary network interface
     # Network access
     auto eth0
     iface eth0 inet static
         address 10.1.10.201
         netmask 255.255.255.0
         gateway 10.1.10.1

     # Used for heartbeat (crossover cable)
     auto eth1
     iface eth1 inet static
         address 192.168.10.201
         netmask 255.255.255.0

     sip02:~# cat /etc/network/interfaces
     # This file describes the network interfaces available on your system
     # and how to activate them. For more information, see interfaces(5).

     # The loopback network interface
     auto lo
     iface lo inet loopback

     # The primary network interface
     # Network access
     auto eth0
     iface eth0 inet static
         address 10.1.10.202
         netmask 255.255.255.0
         gateway 10.1.10.1

     # Used for heartbeat (crossover cable)
     auto eth1
     iface eth1 inet static
         address 192.168.10.202
         netmask 255.255.255.0
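     Step 4 below pings the nodes by name (sip02 on the LAN, hb02 on the heartbeat link), so those names need to resolve. If you are not using DNS for this, a minimal /etc/hosts sketch like the following, on both nodes, will do. The local.dom suffix matches the ping output below; hb01 is my assumed name for sip01's heartbeat interface, so adjust the names to your environment.

     sip01:~# cat /etc/hosts
     127.0.0.1       localhost
     # LAN addresses
     10.1.10.201     sip01.local.dom   sip01
     10.1.10.202     sip02.local.dom   sip02
     # Heartbeat (crossover) addresses; hb01 is an assumed name
     192.168.10.201  hb01.local.dom    hb01
     192.168.10.202  hb02.local.dom    hb02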
     ==========================================================================
     4. Test the local network and the heartbeat network

     sip01:~# ping sip02 -c 1
     PING sip02.local.dom (10.1.10.202) 56(84) bytes of data.
     64 bytes from sip02.local.dom (10.1.10.202): icmp_seq=1 ttl=64 time=4.56 ms

     --- sip02.local.dom ping statistics ---
     1 packets transmitted, 1 received, 0% packet loss, time 0ms
     rtt min/avg/max/mdev = 4.566/4.566/4.566/0.000 ms

     sip01:~# ping hb02 -c 1
     PING hb02.local.dom (192.168.10.202) 56(84) bytes of data.
     64 bytes from hb02.local.dom (192.168.10.202): icmp_seq=1 ttl=64 time=2.83 ms

     --- hb02.local.dom ping statistics ---
     1 packets transmitted, 1 received, 0% packet loss, time 0ms
     rtt min/avg/max/mdev = 2.830/2.830/2.830/0.000 ms

     +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
     Note on using subinterfaces (a virtual IP address on a physical interface, e.g. eth0:0)

     I would like to add: NAT breaks the end-to-end UDP/IP model, more specifically TCP/IP. It was a good thought, but a bad idea! In order to understand what I mean, a mention of RFC 1631, RFC 1180, and RFC 768 is in order. You will want to review those RFCs to get an understanding of the examples I am about to explain.

     When we read or hear about TCP/IP, it is not one protocol that is being discussed, explained, or criticized. Do not even get me started on TCP; TCP is flawed in itself. At any rate, that is for another discussion (but hey, do not take my word for it: http://www.linuxsecurity.com/resource_file...-security.html). The good thing is that we do not have to worry too much about TCP in VoIP. The focus here is UDP, as this is the main protocol used in the pbxnsip application. As I was saying before ripping on TCP (I like ripping on Windows too!), TCP/IP is a suite of protocols; more specifically TCP, UDP, and IP. I like to refer to the layer 4 protocol under discussion as either TCP/IP or UDP/IP, for TCP and UDP respectively.

     As I stated, NAT breaks the end-to-end IP protocol. A NAT device (firewall, router, or any device doing the translation) needs to track all the translations. NAT tracks the translations using sessions (the NAT table). Understanding packet flow is not easy in the beginning; I am a Network Engineer by trade, not a Linux System Engineer. I know enough about Linux to get myself into a little trouble.

     When a client sends UDP packets to a pbxnsip server using the VIP, the packets returning to the client will carry the IP address of the physical interface and NOT that of the VIP. When a packet is sent back to the client, it will need to traverse the client's NAT device, which will execute a NAT lookup to see if there is a session already set up for the client; if not, the packet is dropped.

     The easiest way to look at this is:

     Server physical IP on eth0: 24.60.75.20
     Server VIP: 74.90.100.25

     NAT session table:
     Inside Source IP   Destination IP   Inside Source Port   Destination Port   Outside NAT IP
     192.168.1.2        74.90.100.25     32540                5060               24.60.75.25

     The Outside NAT IP is the IP address assigned by the client's ISP. The Destination IP and Port are the server's VIP and UDP SIP port (5060).
     The client sends a UDP SIP registration packet to 74.90.100.25 on port 5060. The server then sends a UDP authentication packet back to 24.60.75.25 port 32540, with a source address of 24.60.75.20 port 5060. The client's NAT device executes a NAT lookup for this return packet and, as you can see from the table, there is no session involving 24.60.75.20; the session was built for 74.90.100.25. Most NAT devices will drop the packet by default. (There are options you can enable on most NAT devices not to drop a packet when there is no session.) Remember, nothing on the SIP server is broken; this is how UDP communications work (UDP is connectionless). Most TCP/IP-based hosts will use the IP address of the physical interface to build the IP packet before sending it down to the data link layer, where the data link layer encapsulates the IP packet into a frame.

     As I said, NAT breaks the end-to-end IP model, so to accommodate your customers that are using NAT, you will need to add routes to your routing table with different metrics and source addresses. There are several ways to resolve this. If you find that binding pbxnsip to the virtual IP address is not an option, you can use this method:

     ip route delete default via 10.1.10.1
     ip route add default via 10.1.10.1 dev eth0 metric 5
     exit

     What this does is add a default route with a metric of 5. For now it is still the only default route in the table, but later on, as you will see, we will add a cluster resource that adds a second default route with a source of 10.1.10.210 and a metric of 0. When the cluster adds that route, there will be two default routes installed in the table, one with a metric of 0 and one with a metric of 5. (I will show how to add a resource to the cluster via the CRM to add a default route.) The route with a metric of 0 is the one that will be preferred. The end result will look like this:

     sip01:~# ip route
     10.1.10.0/24 dev eth0  proto kernel  scope link  src 10.1.10.201
     192.168.10.0/24 dev eth1  proto kernel  scope link  src 192.168.10.201
     default via 10.1.10.1 dev eth0  src 10.1.10.210
     default via 10.1.10.1 dev eth0  metric 5

     sip01:~# route
     Kernel IP routing table
     Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
     localnet        *               255.255.255.0   U     0      0        0 eth0
     192.168.10.0    *               255.255.255.0   U     0      0        0 eth1
     default         10.1.10.1       0.0.0.0         UG    0      0        0 eth0
     default         10.1.10.1       0.0.0.0         UG    5      0        0 eth0

     Now what happens: any traffic destined to the VIP 10.1.10.210 will be answered with return traffic sourced from 10.1.10.210. Alternatively, you can just bind the pbxnsip application to the VIP for 5060 and any other ports that you will need.
     ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
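     A quick way to sanity-check which source address the kernel will pick, once both default routes are in place, is to query the routing table directly. This is just a sketch; the destination here is an arbitrary outside address (reusing the example IP from above), and the exact output format varies by iproute version.

     sip01:~# ip route get 74.90.100.25
     74.90.100.25 via 10.1.10.1 dev eth0  src 10.1.10.210

     If the src shown is the physical address instead of 10.1.10.210, the metric-0 route from the cluster is not installed yet.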
     ======================================
     Initial Configuration and installation
     ======================================

     5. Install the clustering software

     A. Update the apt sources:

     sip01:~# echo "deb http://people.debian.org/~madkiss/ha lenny main" >> /etc/apt/sources.list
     sip01:~# apt-key adv --keyserver pgp.mit.edu --recv-key 1CFA3E8CD7145E30
     sip01:~# apt-get update
     sip01:~# apt-get install pacemaker
     sip01:~# apt-get install psmisc

     sip02:~# echo "deb http://people.debian.org/~madkiss/ha lenny main" >> /etc/apt/sources.list
     sip02:~# apt-key adv --keyserver pgp.mit.edu --recv-key 1CFA3E8CD7145E30
     sip02:~# apt-get update
     sip02:~# apt-get install pacemaker
     sip02:~# apt-get install psmisc

     Execute the next two commands on the primary node only:

     sip01:~# corosync-keygen   (you will need to press keys until it reaches 1024)
     sip01:~# scp /etc/corosync/authkey root@sip02:/etc/corosync

     Then, on both nodes, enable corosync at startup:

     sip01:~# vi /etc/default/corosync   (change start=no to start=yes)
     sip02:~# vi /etc/default/corosync   (change start=no to start=yes)

     =======================
     Edit the config file
     =======================
     Most of the options in the /etc/corosync/corosync.conf file are OK to start with; you must, however, make sure the nodes can communicate, so adjust this section. (Note: read http://www.corosync.org/doku.php?id=faq:configure_openais)

     interface {
         # The following values need to be set based on your environment
         ringnumber: 0
         bindnetaddr: 192.168.10.0
         mcastaddr: 226.94.1.1
         mcastport: 5405
     }

     Save the file and start the cluster:

     sip01:# /etc/init.d/corosync start
     sip02:# /etc/init.d/corosync start

     Check the status of the cluster:

     sip01:/etc/corosync# crm_mon --one-shot -V
     sip02:/etc/corosync# crm_mon --one-shot -V

     You should see similar output if you followed the directions:

     sip01:~# crm_mon --one-shot -V
     crm_mon[5635]: 2009/10/21_15:15:06 ERROR: unpack_resources: No STONITH resources have been defined
     crm_mon[5635]: 2009/10/21_15:15:06 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
     crm_mon[5635]: 2009/10/21_15:15:06 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
     ============
     Last updated: Wed Oct 21 15:15:06 2009
     Stack: openais
     Current DC: sip01 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3076020514 expected votes
     0 Resources configured.
     ============
     Online: [ sip01 sip02 ]

     sip02:/etc/corosync# crm_mon --one-shot -V
     crm_mon[5635]: 2009/10/21_21:13:23 ERROR: unpack_resources: No STONITH resources have been defined
     crm_mon[5635]: 2009/10/21_21:13:23 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
     crm_mon[5635]: 2009/10/21_21:13:23 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
     ============
     Last updated: Wed Oct 21 21:13:23 2009
     Stack: openais
     Current DC: sip01 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3076020514 expected votes
     0 Resources configured.
     ============
     Online: [ sip01 sip02 ]

     As you can see, the setup is complaining about STONITH, but that is OK, since we have not configured that part of the cluster yet. You will also see "partition WITHOUT quorum"; this is OK too. One of the most common reasons for this is the way quorum is calculated for a 2-node cluster. Unlike Heartbeat, OpenAIS/Corosync does not pretend 2-node clusters always have quorum. In order to have quorum, more than half of the total number of cluster nodes need to be online. Clearly this is not the case when a node failure occurs in a 2-node cluster. If you want to allow the remaining node to provide all the cluster services, you need to set no-quorum-policy to ignore.
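     Besides crm_mon, you can ask corosync itself whether the ring is healthy. A quick check (the output below is approximate and varies by version):

     sip01:~# corosync-cfgtool -s
     Printing ring status.
     RING ID 0
             id      = 192.168.10.201
             status  = ring 0 active with no faults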
     All crm configure commands will be run from the primary (not mandatory, but best practice):

     sip01:~# crm configure property no-quorum-policy=ignore
     sip01:~# crm configure property stonith-enabled=false

     Recheck the status of the cluster:

     sip01:~# crm_mon --one-shot -V

     The STONITH errors are gone; however, "partition WITHOUT quorum" is still present. This is OK too.

     Let's configure our ip_pbxnsip resource (the fail-over IP, i.e. the VIP):

     sip01:~# crm configure primitive ip_pbxnsip ocf:heartbeat:IPaddr params ip=10.1.10.210 op monitor interval=10s

     If your local network is on eth0, then a sub-interface will be created for the virtual IP (VIP):

     sip01:~# ifconfig eth0:0
     eth0:0    Link encap:Ethernet  HWaddr 08:00:27:c2:e8:c4
               inet addr:10.1.10.210  Bcast:10.1.10.255  Mask:255.255.255.0
               UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

     sip02 will not have the VIP configured, as it is the standby:

     sip02:/etc/corosync# ifconfig eth0:0
     eth0:0    Link encap:Ethernet  HWaddr 08:00:27:14:e8:8e
               UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

     Let's test the fail-over before moving on:

     sip01:~# /etc/init.d/corosync stop

     Check that the fail-over IP started on the standby:

     sip02:/etc/corosync# ifconfig eth0:0
     eth0:0    Link encap:Ethernet  HWaddr 08:00:27:14:e8:8e
               inet addr:10.1.10.210  Bcast:10.1.10.255  Mask:255.255.255.0
               UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

     Also check from the cluster's perspective:

     sip02:/etc/corosync# crm_mon --one-shot
     ============
     Last updated: Wed Oct 21 21:26:35 2009
     Stack: openais
     Current DC: sip02 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3076098343 expected votes
     1 Resources configured.
     ============
     Online: [ sip02 ]
     OFFLINE: [ sip01 ]
     ip_pbxnsip      (ocf::heartbeat:IPaddr):        Started sip02

     Let's test preemption by sip01. Start corosync on sip01:

     sip01:~# /etc/init.d/corosync start

     Check the status from sip02:

     sip02:/etc/corosync# crm_mon --one-shot
     ============
     Last updated: Wed Oct 21 21:27:36 2009
     Stack: openais
     Current DC: sip02 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3076098338 expected votes
     1 Resources configured.
     ============
     Online: [ sip01 sip02 ]
     ip_pbxnsip      (ocf::heartbeat:IPaddr):        Started sip01

     Migrate the resource to the other node. If, for whatever reason, you want to run the resource on another node than the one it is running on now:

     sip01:# crm
     crm(live)# resource
     crm(live)resource# list
      ip_pbxnsip     (ocf::heartbeat:IPaddr) Started
     crm(live)resource# migrate ip_pbxnsip sip02
     crm(live)resource# bye
     bye

     Stop the resource. You can also stop the resource if you wish:

     sip01:# crm
     crm(live)# resource
     crm(live)resource# stop ip_pbxnsip
     crm(live)resource# bye
     bye
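     If you left ip_pbxnsip stopped after the test above, remember to start it again before continuing, or the VIP checks later in this tutorial will fail. The crm shell also accepts one-shot commands:

     sip01:# crm resource start ip_pbxnsip
     sip01:# crm_mon --one-shot     (ip_pbxnsip should show Started again)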
     ========================================
     Install DRBD: http://www.drbd.org/
     ========================================
     I highly recommend that you install drbd from source. For inexperienced users who just want to install by package, run this on both nodes:

     sip01:~# apt-get install drbd8-utils drbd8-modules-2.6.26-2-686
     sip02:~# apt-get install drbd8-utils drbd8-modules-2.6.26-2-686

     Then skip ahead to the [Configure the DRBD config file:] section; the next items are related to installing by source.

     To install by source, first download the (latest) source tarball, on both nodes:

     sip01:~# apt-get install linux-headers-$(uname -r)
     sip01:~# cd ~
     sip01:~# wget http://oss.linbit.com/drbd/8.3/drbd-8.3.4.tar.gz
     sip01:~# tar xzvf drbd-8.3.4.tar.gz
     sip01:~# cd drbd-8.3.4/drbd
     sip01:~# apt-get install make gcc build-essential flex
     sip01:~# make clean all
     sip01:~# cd ..
     sip01:~# make tools
     sip01:~# make install
     sip01:~# make install-tools

     sip02:~# apt-get install linux-headers-$(uname -r)
     sip02:~# cd ~
     sip02:~# wget http://oss.linbit.com/drbd/8.3/drbd-8.3.4.tar.gz
     sip02:~# tar xzvf drbd-8.3.4.tar.gz
     sip02:~# cd drbd-8.3.4/drbd
     sip02:~# apt-get install make gcc build-essential flex
     sip02:~# make clean all
     sip02:~# cd ..
     sip02:~# make tools
     sip02:~# make install
     sip02:~# make install-tools

     Note!!!! The commands you ran above:

     cd drbd-8.3.4/drbd
     make clean all

     build the DRBD kernel module. Any kernel upgrade will require you to rebuild the module!

     Configure the DRBD config file:

     Find out which disks have been identified:

     sip01:~/drbd-8.3.4# dmesg | grep Attached
     [    7.129455] sd 0:0:0:0: [sda] Attached SCSI disk
     [    7.134073] sd 2:0:0:0: [sdb] Attached SCSI disk

     I have sda and sdb (your configuration might be different). In the config file, replace my sdb with your device. There is a lot of information that you will want to read about in the drbd.conf file; I cannot go over every piece in detail. You will need to spend the time reading and conduct some lab tests. I recommend that you set this environment up using VMware or similar. There will be some fine tuning needed, but most of this information will get you going with a cluster that you can test and tweak. Once you have something that is right for your environment, you can then set it up on your production servers. I suggest you read through the drbd.conf file that ships with drbd 8.3.4; it should be complete with examples and comments. My configuration should get you up and going.

     This will zero out the file!!!

     sip01:# cat /dev/null > /etc/drbd.conf

     Open the file and copy my config below:

     sip01:# vi /etc/drbd.conf

     global {
         usage-count yes;
     }
     common {
         syncer { rate 80M; }
     }
     resource r0 {
         protocol C;
         handlers {
             pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
             pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
             local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
             fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
         }
         startup {
         }
         disk {
             on-io-error detach;
         }
         net {
             after-sb-0pri disconnect;
             after-sb-1pri disconnect;
             after-sb-2pri disconnect;
             rr-conflict disconnect;
         }
         syncer {
             rate 80M;
         }
         on sip01 {
             device /dev/drbd0;
             disk /dev/sdb2;
             address 192.168.10.201:7788;
             flexible-meta-disk internal;
         }
         on sip02 {
             device /dev/drbd0;
             disk /dev/sdb2;
             address 192.168.10.202:7788;
             meta-disk internal;
         }
     }

     After you configure /etc/drbd.conf, scp it to node 2 (sip02):

     sip01:# scp /etc/drbd.conf root@sip02:/etc/

     Create the device metadata. This step must be completed only on initial device creation; it initializes DRBD's metadata. The "resource" is the resource name you defined in drbd.conf; if this is untouched, it should be r0.
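     One thing to double-check before initializing the metadata: the drbd.conf above points at /dev/sdb2, which assumes sdb already carries a partition table. If your second disk is still blank, something along these lines will carve it up. The sizes here are only an illustration for my ~11 GB disk; adjust them (and the disk/partition names in drbd.conf) to your layout, and repeat on both nodes.

     sip01:~# parted /dev/sdb mklabel msdos
     sip01:~# parted /dev/sdb mkpart primary 0 1000       # /dev/sdb1, spare in my layout
     sip01:~# parted /dev/sdb mkpart primary 1000 11000   # /dev/sdb2, the DRBD backing device
     sip01:~# fdisk -l /dev/sdb                           # verify sdb2 exists before running create-md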
     sip01:~# drbdadm create-md r0
     sip02:~# drbdadm create-md r0

     sip01:~# /etc/init.d/drbd start
     sip02:~# /etc/init.d/drbd start

     Check the status of DRBD:

     sip01:~# cat /proc/drbd
     version: 8.3.4 (api:88/proto:86-91)
     GIT-hash: 70a645ae080411c87b4482a135847d69dc90a6a2 build by root@sip01, 2009-10-21 15:50:17
      0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
         ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:10538280

     At this point, this is normal until we complete the setup. As long as you see Secondary/Secondary ds:Inconsistent/Inconsistent, continue on!

     On the primary node:

     sip01:~# drbdadm -- --overwrite-data-of-peer primary r0

     Check the status again; the data should be syncing:

     sip01:~# cat /proc/drbd
     version: 8.3.4 (api:88/proto:86-91)
     GIT-hash: 70a645ae080411c87b4482a135847d69dc90a6a2 build by root@sip01, 2009-10-21 15:50:17
      0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
         ns:55372 nr:0 dw:0 dr:59564 al:0 bm:2 lo:1 pe:290 ua:123 ap:0 ep:1 wo:b oos:10492168
         [>....................] sync'ed:  0.5% (10244/10288)M
         finish: 0:29:08 speed: 5,664 (4,192) K/sec

     At this point, wait for the sync to complete! I will be back in about 29 minutes!

     OK, I am back! Let's check the status of DRBD:

     sip01:~# cat /proc/drbd
     version: 8.3.4 (api:88/proto:86-91)
     GIT-hash: 70a645ae080411c87b4482a135847d69dc90a6a2 build by root@sip01, 2009-10-21 15:50:17
      0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
         ns:10538280 nr:0 dw:0 dr:10538816 al:0 bm:644 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

     At this point, we can create the mount point and filesystem for drbd0. As we are building an Active/Standby cluster, we can use ext3 or ext4; that choice is yours. However, if you are planning to run Active/Active (Primary/Primary) you cannot use ext at all; you will want to use GFS or OCFS2. That is out of scope for this document.

     Create the filesystem:

     sip01:~# mkfs.ext3 /dev/drbd0
     sip01:~# mkdir /usr/local/pbxnsip
     sip02:~# mkdir /usr/local/pbxnsip
     sip01:~# mount /dev/drbd0 /usr/local/pbxnsip

     Check the mount point:

     sip01:~# mount | grep drbd
     /dev/drbd0 on /usr/local/pbxnsip type ext3 (rw)

     Download the pbxnsip binary:

     sip01:~# cd /usr/local/pbxnsip
     sip01:/usr/local/pbxnsip# wget http://www.pbxnsip.com/download/pbxctrl-debian4.0-3.4.0.3201
     sip01:/usr/local/pbxnsip# mv pbxctrl-debian4.0-3.4.0.3201 pbxctrl
     sip01:/usr/local/pbxnsip# chmod 755 pbxctrl

     Create the init script:

     sip01:/usr/local/pbxnsip# cd /etc/init.d/
     sip01:/etc/init.d# vi pbxnsip

     Copy and paste:

     #!/bin/bash
     PBXEXE=/usr/local/pbxnsip/pbxctrl
     PBXDIR=/usr/local/pbxnsip
     # Service script for the pbxnsip PBX
     case "$1" in
         start)
             echo -n "Starting pbxnsip daemon"
             $PBXEXE --dir $PBXDIR || return=$rc_failed
             echo -e "$return"
             ;;
         stop)
             echo -n "Shutting down pbxnsip daemon:"
             killall $PBXEXE || return=$rc_failed
             echo -e "$return"
             ;;
         restart)
             $0 stop && $0 start || return=$rc_failed
             ;;
         status)
             echo -n "Checking for service pbxnsip: "
             checkproc /usr/sbin/pbxnsip && echo OK || echo No process
             ;;
         *)
             echo "Usage: $0 {start|stop|status|restart}"
             exit 1
     esac
     # Inform the caller not only verbosely and set an exit status.
     test "$return" = "$rc_done" || exit 1
     exit 0

     sip01:/etc/init.d# chmod 755 pbxnsip

     Start pbxnsip so it builds its config files and directory structure:

     sip01:/usr/local/pbxnsip# /etc/init.d/pbxnsip start
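     To double-check that the PBX came up and is listening for SIP, look for its UDP sockets. 5060 is the default SIP port; the output below is trimmed and approximate.

     sip01:~# netstat -lnup | grep pbxctrl
     udp        0      0 0.0.0.0:5060      0.0.0.0:*      15798/pbxctrl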
     NOTE!!!! If you have a 64-bit Debian version, you will need to install ia32-libs [apt-get install ia32-libs].

     sip01:/usr/local/pbxnsip# ps -ef | grep pbx
     root     15798     1  1 18:02 pts/0    00:00:00 /usr/local/pbxnsip/pbxctrl --dir /usr/local/pbxnsip

     We are almost done; at this point you should be able to connect to the HTTP interface and log in. This is the basic installation. We will now want to set up the cluster to:

     1. Start the virtual IP
     2. Start DRBD
     3. Mount the file system (/usr/local/pbxnsip) and start the application

     This will be discussed later on! It has to start in this order to work right.

     The first thing we will want to modify is the init script. A couple of things to note. Per pbxnsip (https://www.pbxnsipsupport.com/index.php?_m...v=0,142,143,188), they state to also do this:

     To install the script, just use the command "update-rc.d" like this: update-rc.d pbxnsip defaults

     DO NOT DO THIS!!! The cluster resource manager will start the application.

     Also, for DRBD: use the DRBD OCF resource agent. In this case, you must not let init load and configure DRBD, because the resource agent does that itself. If you followed this tutorial, you are safe; the init scripts are not enabled to start at boot, though the scripts are in the init.d directory. You can install rcconf (Debian only) and double-check the status of those scripts:

     sip01:# apt-get install rcconf
     sip01:# rcconf

     They should not be checked!

     We will want to modify the pbxnsip script that is in /etc/init.d. Why? Well, the folks at pbxnsip have a bug (this is normal, nothing bad; I am sure there will be something in this document that I missed; it happens):

     sip01:~# /etc/init.d/pbxnsip status
     Checking for service pbxnsip: /etc/init.d/pbxnsip: line 23: checkproc: command not found
     No process

     When we add this to the CRM, the cluster resource manager will use that script to check whether the application is running, but the check will fail: there is no such thing as checkproc in Debian 5.0's init functions!

     sip01:~# vi /etc/init.d/pbxnsip

     Find the line:
     PBXDIR=/usr/local/pbxnsip
     Add this after that line:
     . /lib/lsb/init-functions

     Find the line:
     checkproc /usr/sbin/pbxnsip && echo OK || echo No process
     Change that line to read:
     status_of_proc "$PBXEXE" && exit 0 || exit 1

     The final result will be:

     #!/bin/bash
     PBXEXE=/usr/local/pbxnsip/pbxctrl
     PBXDIR=/usr/local/pbxnsip
     . /lib/lsb/init-functions
     # Service script for the pbxnsip PBX
     case "$1" in
         start)
             echo -n "Starting pbxnsip daemon"
             $PBXEXE --dir $PBXDIR || return=$rc_failed
             echo -e "$return"
             ;;
         stop)
             echo -n "Shutting down pbxnsip daemon:"
             killall $PBXEXE || return=$rc_failed
             echo -e "$return"
             ;;
         restart)
             $0 stop && $0 start || return=$rc_failed
             ;;
         status)
             echo -n "Checking for service pbxnsip: "
             status_of_proc "$PBXEXE" && exit 0 || exit 1
             ;;
         *)
             echo "Usage: $0 {start|stop|status|restart}"
             exit 1
     esac
     # Inform the caller not only verbosely and set an exit status.
     test "$return" = "$rc_done" || exit 1
     exit 0

     So now when you check the status of the application, it will work:

     sip01:~# /etc/init.d/pbxnsip status
     Checking for service pbxnsip: is running.

     Let's go ahead and stop it, then get sip02 configured and tested:

     sip01:~# /etc/init.d/pbxnsip stop
     sip01:~# umount /usr/local/pbxnsip
     sip01:~# /etc/init.d/drbd stop
     sip01:~# crm node standby

     sip02:~# drbdadm primary r0
     sip02:~# mount /dev/drbd0 /usr/local/pbxnsip/
     sip02:/usr/local/pbxnsip# ls

     Your data should be there!
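     While sip02 is acting as primary, you can confirm the DRBD roles at any time; drbdadm prints the local role first. Since we stopped DRBD on sip01 above, the peer will show as Unknown, which is expected here:

     sip02:~# drbdadm role r0
     Primary/Unknown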
     Copy over the init script that we created and tweaked on sip01:

     sip01:~# scp /etc/init.d/pbxnsip root@sip02:/etc/init.d
     sip02:~# /etc/init.d/pbxnsip start
     sip02:~# /etc/init.d/pbxnsip status

     It should be running! The cluster is almost finished!

     Log in to the HTTP GUI, change the admin password, save, then log out.

     sip02:~# /etc/init.d/pbxnsip stop
     sip02:~# umount /usr/local/pbxnsip
     sip02:~# drbdadm secondary r0

     sip01:~# crm node online
     sip01:~# /etc/init.d/drbd start
     sip01:~# drbdadm primary r0
     sip01:~# mount /dev/drbd0 /usr/local/pbxnsip/
     sip01:~# /etc/init.d/pbxnsip start

     Log in to the HTTP GUI to ensure you can log in with the new password; the data should have replicated over. We have manually failed over the cluster, so at this point the cluster is operational from a manual perspective.

     We now want to configure the CRM for the pbxnsip, drbd, and file system resources:

     1. Start the virtual IP
     2. Start DRBD
     3. Mount the file system (/usr/local/pbxnsip) and start the application

     In this order!

     crm configure primitive drbd_pbxnsip ocf:linbit:drbd params drbd_resource="r0" op monitor interval="10s"
     crm configure ms ms_drbd_pbxnsip drbd_pbxnsip meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
     crm configure primitive fs_pbxnsip ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/usr/local/pbxnsip" fstype="ext3"
     crm configure primitive ip_pbxnsip ocf:heartbeat:IPaddr params ip=10.1.10.210 op monitor interval=10s

     (ip_pbxnsip was already created earlier; it is listed here again for completeness. Skip it if it already exists.)

     If you chose to add routes rather than bind the application to the VIP and its ports, you will also need the gwsrc_route resource; if not, skip the next two lines:

     crm configure primitive gwsrc_route ocf:heartbeat:Route params destination="0.0.0.0/0" gateway="10.1.10.1" source="10.1.10.210"
     crm configure order gwsrc_route inf: pbxnsip:start gwsrc_route:start

     crm configure primitive pbxnsipd lsb:pbxnsip
     crm configure group pbxnsip fs_pbxnsip ip_pbxnsip pbxnsipd
     crm configure colocation pbxnsip_on_drbd inf: pbxnsip ms_drbd_pbxnsip:Master
     crm configure order pbxnsip_after_drbd inf: ms_drbd_pbxnsip:promote pbxnsip:start

     Also, you might want to include a resource to ping your default gateway and, should that fail, initiate a failover:

     crm configure primitive pingd ocf:pacemaker:pingd params host_list=10.1.10.1 multiplier=100 op monitor interval=15s timeout=5s
     crm configure location my_pbxnsip_cluster_on_connected_node pbxnsip rule -inf: not_defined pingd or pingd lte 0

     sip01:~# crm_mon --one-shot
     ============
     Last updated: Thu Oct 22 15:07:49 2009
     Stack: openais
     Current DC: sip01 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3075885346 expected votes
     2 Resources configured.
     ============
     Online: [ sip01 sip02 ]
     Master/Slave Set: ms_drbd_pbxnsip
         Masters: [ sip01 ]
         Slaves: [ sip02 ]
     Resource Group: pbxnsip
         fs_pbxnsip   (ocf::heartbeat:Filesystem):   Started sip01
         ip_pbxnsip   (ocf::heartbeat:IPaddr):       Started sip01
         pbxnsipd     (lsb:pbxnsip):                 Started sip01

     Test failover:

     sip01:~# crm node standby
     sip01:~# crm_mon --one-shot
     ============
     Last updated: Thu Oct 22 15:09:22 2009
     Stack: openais
     Current DC: sip01 - partition WITHOUT quorum
     Version: 1.0.5-unknown
     2 Nodes configured, 3075885346 expected votes
     2 Resources configured.
     ============
     Node sip01: standby
     Online: [ sip02 ]
     Master/Slave Set: ms_drbd_pbxnsip
         Masters: [ sip02 ]
         Stopped: [ drbd_pbxnsip:0 ]
     Resource Group: pbxnsip
         fs_pbxnsip   (ocf::heartbeat:Filesystem):   Started sip02
         ip_pbxnsip   (ocf::heartbeat:IPaddr):       Started sip02
         pbxnsipd     (lsb:pbxnsip):                 Started sip02
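     When you are satisfied with the failover, do not forget to bring sip01 out of standby, or it will stay excluded from the cluster:

     sip01:~# crm node online
     sip01:~# crm_mon --one-shot     (both nodes should show Online again)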
     Congratulations, you have a fully functional pbxnsip Linux cluster... hopefully. If there is anything you think I missed, or you are stuck somewhere, shoot me an email. I am not sure how this is going to come out, so I am also attaching what I have pasted as a text document. Good luck!

     forum_debain_cluster.txt