
Server Crash: How to transfer the PBX to a backup machine


andy

Our main server crashed, so I had to move the PBX to a spare machine temporarily. Our licence is tied to a dongle, so that is basically no problem.

 

Result: The HTTP interface is available and all trunks are registering, but the phones can't find the server and register with the extensions that are set up, even though the spare machine has the same IP and no firewall.

 

I installed the software on the new machine first from a downloaded PBXNSIP installation package. Then I copied the complete content of my previously backed up PBX folder into the new installation.

 

Can you see any reason why the phones would not find the PBXNSIP server? Maybe there are settings that can't simply be transferred from one machine to another.


Probably the outbound proxy is not correct any more. My suggestion is to use the same IP address on the new machine as on the old machine. This way you can leave the phones untouched.

 

An alternative would be fully automatic plug and play provisioning. Then you may only have to change option 66 on the DHCP server, and that's it.


Your experience drives home the point of having a tested disaster recovery plan in place before the disaster occurs. The phones' logs will have the info showing why they did not locate the new server. Phone and switch MAC tables can cause problems like this, especially if MAC learning is enabled.


You mean the ARP cache does not get updated? That would be a huge surprise and IMHO a bug.

 

I don't think that's the reason. Our replacement machine has the same IP, no firewall and no MAC learning activated. The only difference is the operating system. The broken server was running Server 2003, the replacement machine Win XP. The internal DNS is down, but we are using IP addresses anyway to address all components.

 

Well, the server will be back in operation tomorrow. Still, it would be interesting to find the problem for future reference, because it's not really possible in a small company to have redundant machines for all servers, as breakdowns are quite rare anyway.


Eliminating possibilities without unquestionable evidence has shortened many a technician's career. After 33 years, I've learned never to rule anything out without 100% proof, obtained through successive approximation: isolate the failure in large chunks by discovering what is working. For example: Is DHCP handing out the correct IPs? Can a PC on the LAN get an IP address and ping the new server? Can you port-scan the PBX and identify the SIP ports? Does a softphone register? Can you run Ethereal (Wireshark) and capture registration packets?
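One of the checks above, whether the SIP port answers at all, can be sketched in a few lines of Python. This is only a rough sketch, not a replacement for a softphone or Wireshark: it sends a single SIP OPTIONS request over UDP and reports the status line of whatever comes back. The function names and the "probe" user in the From header are made up for illustration, and whether the PBX answers OPTIONS from an unknown sender is an assumption.

```python
import socket
import uuid


def build_options_request(server_ip, local_ip, local_port, server_port=5060):
    """Build a minimal SIP OPTIONS request as bytes."""
    branch = "z9hG4bK-" + uuid.uuid4().hex[:12]  # magic cookie plus a random part
    lines = [
        f"OPTIONS sip:{server_ip}:{server_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch={branch};rport",
        f"From: <sip:probe@{local_ip}>;tag={uuid.uuid4().hex[:8]}",  # "probe" is a placeholder user
        f"To: <sip:{server_ip}>",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 OPTIONS",
        "Max-Forwards: 70",
        "Content-Length: 0",
        "",
        "",
    ]
    # Joining with CRLF and two empty items ends the message with a blank line.
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(data):
    """Return (code, reason) from the first line of a SIP response."""
    first = data.decode("ascii", errors="replace").split("\r\n", 1)[0]
    version, code, reason = first.split(" ", 2)
    if not version.startswith("SIP/"):
        raise ValueError(f"not a SIP response: {first!r}")
    return int(code), reason


def probe(server_ip, server_port=5060, timeout=2.0):
    """Send one OPTIONS request over UDP; return (code, reason), or None if nothing usable comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((server_ip, server_port))  # for UDP this only fixes the peer address
        local_ip, local_port = sock.getsockname()
        sock.send(build_options_request(server_ip, local_ip, local_port, server_port))
        try:
            return parse_status_line(sock.recv(4096))
        except OSError:
            # Covers a timeout as well as an ICMP "port unreachable" reply.
            return None
```

Calling `probe("192.168.26.130")` from a PC on the same LAN would return something like a status code and reason phrase if a SIP stack answers on that address, and `None` if nothing does. Any SIP answer at all, even an error code, shows that the network path and the port are fine.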

 

SNOM phones have a great log file that would clearly indicate the trouble. What phones are you using?

 

But since we are guessing, the likely problem is IP address, DHCP or subnet mask related. This is basic IP stuff to find.

 

Good Luck and this is my .02 worth

 

Cheers


No solution yet. We are still running our phones on a conventional ISDN phone system which we kept for backup.

 

And luckily I am not a professional technician who gets fired over this matter, but a small company's manager who has to take care of our network besides his regular job.

 

But I believe in technology so it was worth the try to go with a soft PBX. Actually we've been quite happy with PBXNSIP until this incident, as it is way more flexible than a conventional small phone system.

 

Here is a new hint from the PBX's log. 192.168.26.141 is one of our Snom phones. Its request to register seems to find its way into the PBX somehow; otherwise there would be no log entry like this one:

 

[8] 2008/03/30 10:07:21: SIP Rx udp:192.168.26.141:5060:
REGISTER sip:192.168.26.130 SIP/2.0
Via: SIP/2.0/UDP 192.168.26.141:5060;branch=z9hG4bK-fsc2w36u2lpq;rport
From: "andreas" <sip:200@192.168.26.130>;tag=1zxwzngbuq
To: "andreas" <sip:200@192.168.26.130>
Call-ID: 3c267012321c-xkx751bzghuq
CSeq: 2 REGISTER
Max-Forwards: 70
Contact: <sip:200@192.168.26.141:5060>;flow-id=1;q=1.0;+sip.instance="<urn:uuid:6b24e3ed-00ba-4fd5-95c1-5cd623b49ef0>";audio;mobility="fixed";duplex="full";description="snom320";actor="principal";events="dialog";methods="INVITE,ACK,CANCEL,BYE,REFER,OPTIONS,NOTIFY,SUBSCRIBE,PRACK,MESSAGE,INFO"
Contact: <https://192.168.26.141:443>
User-Agent: snom320/7.1.30
Supported: gruu
Allow-Events: dialog
X-Real-IP: 192.168.26.141
Expires: 60
Content-Length: 0

 

 

There is an internal number 200 and it seems the PBX can't find it. These are the next few lines of the log:

 

 

[8] 2008/03/30 10:07:21: SIP Tx udp:192.168.26.141:5060:
SIP/2.0 404 Not Found
Via: SIP/2.0/UDP 192.168.26.141:5060;branch=z9hG4bK-fsc2w36u2lpq;rport=5060
From: "andreas" <sip:200@192.168.26.130>;tag=1zxwzngbuq
To: "andreas" <sip:200@192.168.26.130>;tag=55fce6be84
Call-ID: 3c267012321c-xkx751bzghuq
CSeq: 2 REGISTER
Content-Length: 0

 

 

So far I thought the phones couldn't find the server at all. That does not seem to be the case. Maybe there is someone in this forum who understands the full technical meaning of this log.
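For anyone reading along, the two log entries above can be picked apart with a short Python sketch. It pulls out the method, the domain in the request URI, the account in the To header and the response code, and checks that both messages belong to the same transaction. The helper names are made up for illustration, and the long Contact header is left out of the sample data to keep it short. As far as I understand SIP, a 404 on a REGISTER means the registrar does not know the account for the domain it was asked about, so the request clearly reaches the PBX.

```python
import re

# The request and response from the log above (Contact and a few other headers omitted).
REQUEST = """\
REGISTER sip:192.168.26.130 SIP/2.0
Via: SIP/2.0/UDP 192.168.26.141:5060;branch=z9hG4bK-fsc2w36u2lpq;rport
From: "andreas" <sip:200@192.168.26.130>;tag=1zxwzngbuq
To: "andreas" <sip:200@192.168.26.130>
Call-ID: 3c267012321c-xkx751bzghuq
CSeq: 2 REGISTER
"""

RESPONSE = """\
SIP/2.0 404 Not Found
Via: SIP/2.0/UDP 192.168.26.141:5060;branch=z9hG4bK-fsc2w36u2lpq;rport=5060
From: "andreas" <sip:200@192.168.26.130>;tag=1zxwzngbuq
To: "andreas" <sip:200@192.168.26.130>;tag=55fce6be84
Call-ID: 3c267012321c-xkx751bzghuq
CSeq: 2 REGISTER
"""


def parse_sip_message(text):
    """Split a SIP message into its start line and a dict of headers (first value wins)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    headers = {}
    for ln in lines[1:]:
        name, _, value = ln.partition(":")
        headers.setdefault(name.strip().lower(), value.strip())
    return lines[0], headers


def summarize(request_text, response_text):
    """Extract the facts that matter for a failed registration."""
    req_line, req = parse_sip_message(request_text)
    method, request_uri, _version = req_line.split(" ", 2)
    res_line, res = parse_sip_message(response_text)
    _version, code, reason = res_line.split(" ", 2)

    # Domain = host part of the request URI: drop the scheme, any user part and any port.
    host = request_uri.split(":", 1)[1].split("@")[-1].split(":")[0]
    # Account = user part of the address in the To header.
    user = re.search(r"<sip:([^@>]+)@", req["to"]).group(1)

    return {
        "method": method,
        "domain": host,
        "user": user,
        "status": int(code),
        "reason": reason,
        # Same Call-ID and CSeq means the response answers this very request.
        "same_transaction": req["call-id"] == res["call-id"] and req["cseq"] == res["cseq"],
    }


print(summarize(REQUEST, RESPONSE))
```

Read this way, the log says that the phone asked to register account 200 in the domain 192.168.26.130 and the PBX itself answered that it could not find it. So the question is why the PBX does not know that domain or account after the move, not whether the phone reaches the server.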


I see two things that could be the problem here:

 

1. There is another process occupying the SIP port. Use "netstat -abn" to find out which process is listening on port 5060. If you add the -o switch, netstat also shows the PID, which you can look up in the task manager.

 

2. For some reason, the domain name "localhost" is not there any more and the domain "192.168.26.130" does not exist. Then the PBX rejects the request.
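The first point can also be checked without netstat. The following is a minimal Python sketch, assuming the PBX service is stopped while it runs (otherwise the PBX itself holds the port): it simply tries to bind UDP port 5060 and reports whether something else already has it. The function name is made up for illustration.

```python
import socket


def udp_port_in_use(port, host="0.0.0.0"):
    """Return True if a UDP socket cannot be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"; a permission error would also end up here.
            return True
        return False


if __name__ == "__main__":
    # Run this on the PBX machine with the PBX service stopped.
    print("UDP 5060 occupied by another process:", udp_port_in_use(5060))
```

This only tells you that the port is taken, not by whom, so netstat is still needed to name the process.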


Our networking technician checked the server today and did not find any unusual behaviour or settings.

 

It's a new MS Server 2008, but that should not make a real difference. netstat -abn shows UDP port 5060 listening at IP 0.0.0.0. PBXNSIP shows both IPs, 127.0.0.1 and 192.168.26.130, under Admin/Status/General. The firewall is turned off. The phones are addressing 192.168.26.130 (extension@192.168.26.130:5060).

 

Should I simply delete the complete PBXNSIP setup and install it from scratch? Maybe there is a problem with some of the XML settings taken over from the old machine.


This has sure taken on a life of its own. I can't say enough how important it is not to look for what's broken, but to validate what is working. It comes down to understanding the relationship of IP, DHCP, SIP ports and UDP ports, and using basic troubleshooting skills. Telnet, ping and netstat are all the basic tools.

 

The phones have a complete log of events associated with registration. Install Wireshark and you can trace this from both ends.

 

Sounds to me as if it's time to get paid professional help. I'm sure most of the contributors on this site would be fired if a client had had a similar experience and outage.

 

Good Luck.


Well, we shouldn't discuss here who should get fired but rather find technical solutions. And we have found one, simple but very effective: without any changes to the server or the network, a completely new install and setup solved the problem. Probably some of the XML files were corrupt. But this seems to be extremely rare according to the very helpful people at PBXNSIP. Usually a simple transfer of the settings should do it. Case closed.
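For future reference, a quick sanity check on a backed-up PBX folder before copying it to another machine could look like the Python sketch below. It is only a sketch under assumptions: it assumes the settings are stored in files ending in .xml somewhere under the folder, and it only catches files that are not well-formed XML (truncated, empty or garbled). A file that parses fine but contains wrong settings would not be detected. The function name is made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_broken_xml(folder):
    """Return a list of (path, error message) for .xml files that fail to parse."""
    broken = []
    for path in sorted(Path(folder).rglob("*.xml")):
        try:
            ET.parse(path)
        except (ET.ParseError, OSError) as exc:
            # ParseError: not well-formed (or empty); OSError: file could not be read.
            broken.append((path, str(exc)))
    return broken
```

Calling `find_broken_xml` with the path of the backup folder returns an empty list if every XML file at least parses. Anything it does list would be a candidate for restoring from an older backup instead of reinstalling everything.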
