Trunk failover


Kristan

Hi All,

 

Bit of a feature request - at the moment failover triggers on either all SIP error codes or just 5xx codes. Could this be made a little more flexible?

 

Basically I'd like to be able to fail over from a SIP trunk to, say, ISDN on certain errors like 401, 404, 408, 480 etc., but not on the rest of the 4xx errors. There's no point in trying to ring a number over ISDN if it's just busy, but there is if the problem is at the VoIP provider.

 

I'm not sure how I'd like to see this implemented either, maybe initially just a list of codes in the pbx.xml, but I just thought I'd mention it :blink:

I think the "all error codes" setting also covers non-SIP events like ICMP redirects, for the case where the link is really down, not just SIP error codes. If the remote side is busy on trunk A it should be busy on trunk B as well, so failing over on those codes may not be a good idea. A 486 should be a real busy; the provider should send back a 5xx if they have real problems. Have you hit any issues, or are you just trying to think this through?

I agree - that's why I don't want it to fail over on a 486 busy, but I would like it to fail over on a 404 not found, 401 unauthorised or 408 timeout, and possibly a few others. We've had instances where the VoIP provider has changed something and as a result all calls come back with 401s and die. In these sorts of cases I'd rather it fail over to a secondary trunk - if we're seeing them it's clearly a provider error. Obviously for normal responses like a 407, 486 or 487 it'd be stupid to fail over, as they're just part of a normal conversation.
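
To illustrate the kind of rule being asked for, here is a minimal sketch in Python. It is not PBXnSIP code or configuration: the name `should_failover` and the `FAILOVER_CODES` set are hypothetical, and the codes are simply the ones mentioned in this thread.

```python
# Hypothetical policy: 4xx codes that suggest a provider-side problem.
# These values come from the posts above, not from any PBXnSIP setting.
FAILOVER_CODES = frozenset({401, 404, 408, 480})


def should_failover(status_code, failover_codes=FAILOVER_CODES, include_5xx=True):
    """Return True if a final SIP response code should trigger the next trunk."""
    if status_code in failover_codes:
        return True
    # Keep the existing "5xx" behaviour as an option.
    return include_5xx and 500 <= status_code <= 599


if __name__ == "__main__":
    for code in (401, 404, 407, 486, 487, 503):
        print(code, should_failover(code))
```

With this rule a 486 or 487 ends the call as usual, while a 401 from a misconfigured provider moves on to the secondary trunk.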

Well, all these codes have practically no meaning and the operators use them as they like and might even change their mind. Sometimes the codes even depend on the termination providers down the SIP trunking value chain... The only difference is the code class (4xx, 5xx or 6xx). So much for the defense of the current setup...

  • 2 months later...

I would like to add something here that does fall under trunk failover, but is not directly related to the priority level within the dial plan for failing over to another trunk.

 

My request is for the trunk registration to utilize the NAPTR-->SRV information more efficiently. Let me explain.

 

We utilize the NAPTR-->SRV-->A records lookup that PBXnSIP supports (yeah!!) to connect to our upstream PSTN provider. They require it since they load balance and do maintenance and we work 24x7.

 

In our trunk configuration, we set the outbound proxy to bfp.example.com.

 

PBXnSIP works great with this but lacks what is, IMHO, the next step. Looking at the network captures taken when the trunk initially registers, we see the following.

 

PBX: DNS NAPTR query for bfp.example.com

DNS response with:

_sip._udp.int.example.com type SRV Class IN priority 0 weight 0 port 5060 target sftsw-1.example.com

_sip._udp.int.example.com type SRV Class IN priority 1 weight 0 port 5060 target sftsw-2.example.com

 

PBX: DNS SRV query for _sip._udp.int.example.com

DNS response with:

_sip._udp.int.example.com type SRV Class IN priority 0 weight 0 port 5060 target sftsw-1.example.com

_sip._udp.int.example.com type SRV Class IN priority 1 weight 0 port 5060 target sftsw-2.example.com

 

PBX: DNS A query for sftsw-1.example.com

 

And then registration occurs.

 

What I have noticed, and what would be great to have "enhanced", is this: when a re-registration fails for any reason, the PBX should attempt to re-resolve the Outbound Proxy entry by running the DNS query process again to see if the records have changed. This is not happening at this time. The standard re-registration attempts at 60s intervals continue to hit the IP address returned by the original A record query.

 

I have to restart the PBXnSIP service to re-initialize the DNS query process, so that registration is attempted against the first-priority SRV record and, after failing, moves on to the next SRV record.

 

It would be great to have this done without having to restart the service.
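
The behaviour requested above could be sketched roughly like this in Python. It only shows the control flow and is not how PBXnSIP works internally: `register_with_reresolve`, `resolve_srv` and `try_register` are hypothetical names, the resolver and the registration attempt are passed in as plain callables, and SRV weights are ignored for simplicity.

```python
def register_with_reresolve(resolve_srv, try_register):
    """Re-run resolution, then try each SRV target in priority order.

    resolve_srv:  callable returning (priority, weight, port, target) tuples
                  (hypothetical stand-in for the NAPTR/SRV/A lookup chain).
    try_register: callable(target, port) -> bool, True when REGISTER succeeds.
    Returns the (target, port) that accepted the registration, or None.
    """
    # Resolve fresh on every call instead of reusing a cached address.
    records = sorted(resolve_srv(), key=lambda rec: rec[0])  # lowest priority first
    for _priority, _weight, port, target in records:
        if try_register(target, port):
            return (target, port)
    return None


if __name__ == "__main__":
    # The two SRV records from the capture above.
    srv = [
        (0, 0, 5060, "sftsw-1.example.com"),
        (1, 0, 5060, "sftsw-2.example.com"),
    ]
    # Pretend the first-priority switch is down for maintenance.
    result = register_with_reresolve(
        lambda: srv,
        lambda target, port: target != "sftsw-1.example.com",
    )
    print(result)
```

Calling this from the failed re-registration path, rather than only at service start, is the change being requested.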

Link to comment
Share on other sites

Maybe it would make sense to have an upper limit on the DNS expiry time. Many DNS records have a TTL of a day or so, which is very, very long.

 

Alternatively, maybe we should add a button that just clears the DNS cache. That would also force a reload on the next occasion.
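
Both ideas, the upper limit and the clear button, can be sketched together in Python. This is only an illustration under assumed names: `CappedDnsCache`, `max_ttl` and the 300-second default are hypothetical and not existing PBXnSIP settings.

```python
import time


class CappedDnsCache:
    """Tiny DNS cache whose entries never live longer than max_ttl seconds."""

    def __init__(self, max_ttl=300, clock=time.monotonic):
        self.max_ttl = max_ttl
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # name -> (value, expiry timestamp)

    def put(self, name, value, ttl):
        # Honour the record's TTL, but cap it at the configured maximum.
        expires = self._clock() + min(ttl, self.max_ttl)
        self._entries[name] = (value, expires)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            # Expired: drop it so the caller performs a fresh lookup.
            del self._entries[name]
            return None
        return value

    def clear(self):
        # Equivalent of the proposed "clear DNS cache" button.
        self._entries.clear()
```

With a cap like this, a record published with a one-day TTL would still be looked up again after at most `max_ttl` seconds, and `clear()` forces a fresh lookup on the next occasion.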
