...

By design, Sipxcom always uses its built-in DNS server for querying SRV records. The Sipxcom unmanaged DNS server option, when enabled, allows querying of an unmanaged DNS server for phone registrations. In testing the Sipxcom HA solution in release 16.12, difficulties were encountered using the managed DNS tools of Sipxcom. This document http://wiki.sipxcom.org/display/sipXcom/DNS+Management provides a good overview of the Sipxcom managed DNS capabilities and how they should work. Significant investments were made to make the Sipxcom DNS management tools work with the high availability solution, but the HA solution performed best in the lab when Sipxcom used a DNS server that was completely separate from Sipxcom, i.e. DNS services on each of the Sipxcom servers were disabled. The first part of this document assumes that Sipxcom uses a separate DNS server for all SRV record processing. To do this, careful attention must be paid to how the Sipxcom primary and secondary servers are built and to preventing the DNS service from being enabled at startup. The second part of this document describes how the HA solution works using the managed DNS tools available in Sipxcom.

Configuring Sipxcom for High Availability using Standalone DNS servers

These next few subsections assume that a new Sipxcom server is being built from scratch, that a separate DNS server is used for all Sipxcom DNS processing, and that DNS is turned off on all Sipxcom servers.

...

  • TFTP and SNTP servers point to the primary 10.20.2.31 proxy
  • In the phase 1 tests, the DNS server setting for the phones points to the unmanaged DNS server at 10.20.2.35. In phase 2 testing, where phones are simulated in a separate location, the DNS address is 10.10.17.35 and SRV weights are adjusted so that phones in that location register to the secondary server.
  • The Lines - > Registration - > Primary Registration Server - > Expires value is reduced from 3600 seconds to 120 seconds.
  • The registrations GUI in Sipxcom does not provide any information on which proxy the phones are registered to. A custom configuration file is configured for the Polycom phones that allows remote PCAPs to be enabled. Combined with the SIP Expires setting (previous point), Wireshark is used to validate which proxy phones are registering to. A way to pull this information from Mongo, using Mongo commands from a unix script, is being explored.
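
The Wireshark check in the last bullet can be partly automated once the captured REGISTER requests have been exported to text. The sketch below is one possible approach and is not part of Sipxcom. It assumes a hypothetical CSV export with `src` and `dst` columns (the phone address and the proxy address of each REGISTER request) and counts how many distinct phones registered to each proxy. The column names and the phone addresses are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def phones_per_proxy(csv_text):
    """Count distinct phone addresses seen sending REGISTER to each proxy.

    csv_text is assumed to have a header row with 'src' and 'dst' columns
    (hypothetical export format, one row per captured REGISTER request).
    """
    phones = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        phones[row["dst"].strip()].add(row["src"].strip())
    # Repeated REGISTER refreshes from the same phone count once.
    return {proxy: len(srcs) for proxy, srcs in phones.items()}


# Illustrative capture: phone addresses are invented for the example.
SAMPLE = """src,dst
10.20.2.101,10.20.2.31
10.20.2.102,10.20.2.31
10.20.2.101,10.20.2.31
10.10.17.101,10.10.17.10
"""

if __name__ == "__main__":
    for proxy, count in sorted(phones_per_proxy(SAMPLE).items()):
        print(proxy, count)
```

With the short Expires value, each phone should appear in a capture within a couple of minutes, so a brief capture window is usually enough for this kind of tally.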

Double-Check Lab Configuration

...

Configuring Sipxcom for High Availability using the Managed DNS Tools Available in Sipxcom

These next few subsections assume that a new Sipxcom server is being built from scratch, and that the managed DNS tools and onboard DNS servers are enabled in Sipxcom. The standalone DNS services in the test setup (10.20.2.35 and 10.10.17.35) will still be used as an unmanaged DNS service for site phone registrations.

Build Standalone DNS Servers

Build the standalone DNS servers 10.20.2.35 and 10.10.17.35 as per the previous section. Adjust the weights of the tcp, udp, and rr service records on the 10.20.2.35 DNS server so that phones on the 10.20.2.x subnet register to the pbx2 server while phones on the 10.10.17.x subnet register to the pbx3 server (tcp records illustrated below).

Image Added

Build Primary Sipxcom Server

Build the primary Sipxcom server with a valid upstream DNS forwarder address (e.g. 8.8.8.8). Once the primary Sipxcom server has been built, turn on all services except for Sipxbridge and DHCP (in the lab, phones were statically provisioned). Sipxcom builds the following DNS settings in /etc/named.conf, /etc/resolv.conf, and /var/named/default.view.lvtest.com.zone.

Image Added

NAT Traversal Settings

Configure the NAT traversal settings exactly as in the previous section with standalone DNS servers.

Add Server and Role

Add secondary servers and roles exactly as in the previous section with standalone DNS servers.

Add Secondary Servers to Global Databases

Add secondary servers to Global Databases exactly as in the previous section with standalone DNS servers.

Turn on DNS, SipXproxy, SipRegistrar Services on PBX2/3 and Push Profiles

Once pbx2 and pbx3 are built and successfully added as secondary servers to the Global Databases, perform the following:

  1. Go to System - > Core Services and turn on DNS on pbx2 and pbx3
  2. Go to System - > Telephony Services and turn on SIP Proxy and SIP Registrar services on pbx2 and pbx3
  3. Go to System - > Services and push the server profiles, which replicates the Mongo database and DNS information
  4. Go to Diagnostics - > Job Status and confirm that all replication jobs completed successfully.

Image Added

Check DNS Configuration and Configure Failover

After the high availability cluster is configured and services defined, the DNS configuration on each server should look as follows:

  1. The /etc/resolv.conf file on each system should have the IP address of the server as the first nameserver, followed by the other 2 nameservers.
  2. The /etc/named.conf file should point to the upstream DNS server defined at initial Sipxcom installation (e.g. 8.8.8.8)
  3. The zone file is named default.view.lvtest.com.zone and is located in the /var/named directory.
  4. By default the SRV records in the zone file are configured to distribute services equally across all three servers. For example, if the HA system had 90 registered phones, each server would have about 30 registrations.
  5. The A records for each system are defined at the end of the zone file.
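
The even split in point 4 follows from how SRV weights are meant to be used: among records of equal priority, RFC 2782 asks clients to pick a target with probability proportional to its weight. Below is a minimal sketch of that arithmetic, assuming all three records share one priority. The weight value of 10 and the target names as written here are illustrative and are not read from the lab zone file.

```python
def expected_registrations(weights, phones):
    """Expected registrations per target for same-priority SRV records.

    weights maps target name -> SRV weight; the selection probability is
    weight / total weight, as described in RFC 2782.
    """
    total = sum(weights.values())
    if total == 0:
        # All-zero weights: treat every target as equally likely.
        return {t: phones / len(weights) for t in weights}
    return {t: phones * w / total for t, w in weights.items()}


if __name__ == "__main__":
    equal = {"pbx": 10, "pbx2": 10, "pbx3": 10}
    print(expected_registrations(equal, 90))
```

These are expected values only. How closely a real deployment matches them depends on how each phone implements SRV selection.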

Image Added

Defining Regions for DNS Failover

The System - > Regions and System - > DNS - > Record View features within Sipxcom create separate DNS zone files for each subnetwork. The following architecture and registration rules are used to build DNS regions and failover rules within Sipxcom.

Image Added

The first step is to define two regions within Sipxcom: Main1020, with an IP address range of 10.20.2.x/24, and Local1010, with an IP address range of 10.10.17.x/24. The System - > DNS - > Record View menu then maps each region to a failover plan.

Defining DNS Failover Plans and Record Views

Go to System - > DNS Fail-over Plans and create two plans:

  1. pbx2failover plan where phones always register to the pbx2 server
  2. pbx3failover plan where phones always register to the pbx3 server

Now go to System - > DNS - > Record View and build two record views:

  1. pbx1020 with a fail-over plan of pbx2failover that applies to Main1020 region
  2. pbx1010 with a fail-over plan of pbx3failover that applies to Local1010 region

Image Added

The Sipxcom DNS tools then build the following DNS configuration in the /etc/named.conf file: DNS queries from the 10.20.2.x subnetwork use the pbx1020 zone file, which always returns pbx2 SRV records, while DNS queries from the 10.10.17.x subnetwork use the pbx1010 zone file, which always returns pbx3 SRV records.

Image Added
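
The effect of the regions and record views can be sketched as a lookup from the querying client's address to the server preferred in the SRV answer. This is only a model of the behaviour described above, not the configuration that Sipxcom writes. The /24 prefixes are an assumption based on the region definitions, and the dictionary layout is hypothetical.

```python
import ipaddress

# Region -> subnet, and record view -> (region, preferred server), as
# described in the text. The data layout is illustrative only.
REGIONS = {
    "Main1020": ipaddress.ip_network("10.20.2.0/24"),
    "Local1010": ipaddress.ip_network("10.10.17.0/24"),
}
RECORD_VIEWS = {
    "pbx1020": ("Main1020", "pbx2"),
    "pbx1010": ("Local1010", "pbx3"),
}


def view_for_client(client_ip):
    """Return (view name, preferred server) for a querying client, or None."""
    addr = ipaddress.ip_address(client_ip)
    for view, (region, server) in RECORD_VIEWS.items():
        if addr in REGIONS[region]:
            return view, server
    return None  # client is outside both regions, so this model has no answer


if __name__ == "__main__":
    print(view_for_client("10.20.2.50"))
    print(view_for_client("10.10.17.50"))
```

A client outside both subnets returns None here. What Sipxcom serves to such clients is not covered by this sketch.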

 

Double-Check Lab Configuration with Standalone DNS Servers

SSH into each primary and secondary server, and double-check the following:

  • DNS service is turned off
  • /etc/resolv.conf file points to the unmanaged DNS server at 10.20.2.35
  • The SRV records on the unmanaged DNS service point to pbx3 first, then pbx2, and then pbx - do a dig SRV _sip._tcp.lvtest.com command
  • Double-check that all Sipxcom processes are running by doing a service sipxecs status
  • Using Wireshark, double-check that phones are registering to pbx3.
  • Place internal and external calls on system to validate that everything is working properly.
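
The SRV check in this list can be scripted by feeding the output of dig to a small parser. The sketch below assumes the short answer format (`dig +short SRV _sip._tcp.lvtest.com`), where each line holds priority, weight, port and target. The sample priorities, weights and port are invented for illustration and are not the lab values.

```python
def srv_order(dig_short_output):
    """Return SRV targets sorted by priority (lowest first).

    Ties on priority are broken by higher weight first, which is a
    simplification: RFC 2782 clients pick among equal priorities at
    random in proportion to weight.
    """
    records = []
    for line in dig_short_output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or unexpected lines
        priority, weight, _port, target = parts
        records.append((int(priority), -int(weight), target.rstrip(".")))
    return [target for _, _, target in sorted(records)]


# Invented sample answer; real values come from the DNS server.
SAMPLE = """\
30 10 5060 pbx.lvtest.com.
10 10 5060 pbx3.lvtest.com.
20 10 5060 pbx2.lvtest.com.
"""

if __name__ == "__main__":
    order = srv_order(SAMPLE)
    print(order)
    print("pbx3 first:", order[0] == "pbx3.lvtest.com")
```

Running the same check from a host on each subnet gives a quick way to confirm the expected ordering before looking at packet captures.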

Image Added

Image Added

Image Added

Double-Check Lab Configuration with Sipxcom DNS Servers

SSH into each primary and secondary server, and double-check the following:

  • DNS service is turned on
  • /etc/resolv.conf file has all three Sipxcom primary and secondary servers defined as nameservers
  • The SRV records point to pbx3 first, then pbx2, and then pbx when doing DNS queries from the 10.10.17.x subnetwork - do a dig SRV _sip._tcp.lvtest.com command
  • The SRV records point to pbx2 first, then pbx3, and then pbx when doing DNS queries from the 10.20.2.x subnetwork - do a dig SRV _sip._tcp.lvtest.com command
  • Double-check that all Sipxcom processes are running by doing a service sipxecs status
  • Using Wireshark, double-check that phones are registering to pbx3.
  • Place internal and external calls on system to validate that everything is working properly.
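
The /etc/resolv.conf check can be sketched as a small parser that lists the nameserver entries in order and confirms that the local server comes first. The sample file content below is hypothetical: 10.20.2.31 and 10.10.17.10 appear elsewhere in this document, while 10.20.2.32 is a placeholder for the remaining server.

```python
def nameservers(resolv_conf_text):
    """Return nameserver addresses in the order they appear."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop '#' comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def check_cluster_dns(resolv_conf_text, own_ip, cluster_ips):
    """True if own_ip is listed first and every cluster server is present."""
    servers = nameservers(resolv_conf_text)
    return (
        bool(servers)
        and servers[0] == own_ip
        and set(cluster_ips) <= set(servers)
    )


# Hypothetical file content for illustration.
SAMPLE = """\
search lvtest.com
nameserver 10.20.2.31
nameserver 10.20.2.32
nameserver 10.10.17.10
"""

if __name__ == "__main__":
    cluster = ["10.20.2.31", "10.20.2.32", "10.10.17.10"]
    print(check_cluster_dns(SAMPLE, "10.20.2.31", cluster))
```

On a live server the text would come from reading /etc/resolv.conf, with `own_ip` set to that server's address.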


...


StarTrinity SIP Tester Tool

...

  1. Use the Excel Import capability of Sipxcom to pre-populate a large number of users with the same SIP password.
  2. Go into the Registration (UAC) section of SIP Tester and pull down the Add Batch menu
  3. Provision the first user name, the expiry field (shortened here from 3600 to 300 seconds), the number of user registrations to create, the SIP password, and the IP address of the registrar that the users should register to. In this case, all phones are registered to the 10.10.17.10 Sipx proxy.
  4. Hit the Add link. Double-check the status field to confirm that the users connected correctly, and click the trace link to confirm that users are registering to the correct proxy. Then go to the Sipxcom Diagnostics - > Registrations page to validate that the users have registered correctly to Sipxcom.


Image Added

When the Delete All link in SIP Tester is selected, the tool instructs Sipxcom to un-register each line. The Sipxcom active registrations page may still show active or expired registrations. If so, SSH to the Sipxcom primary server and use the following procedure to clear out all active registrations.

Image Added

Preliminary Phase 1 Test Results

...

In scenario 2, when phones register to pbx3 from the SIP tester on the same 10.10.17.x subnetwork, 5-10 Kbps of replication traffic is generated toward the primary and secondary servers in the 10.20.2.x subnetwork. When 100 or 500 phones register to pbx3 at once, approximately 1 megabit per second of bandwidth (roughly 100 kilobytes (KB) per second) is generated for several seconds toward the primary and secondary servers on the 10.20.2.x subnetwork. This is replication traffic only, not user registrations.
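
For reference when reading these figures, a rate in bits per second converts to bytes per second by dividing by eight. Below is a minimal sketch, using decimal prefixes (1 Mbps = 1,000,000 bits per second), which is an assumption about how the measurements were reported.

```python
def mbps_to_kilobytes_per_sec(mbps):
    """Convert megabits per second to kilobytes per second (decimal units)."""
    return mbps * 1_000_000 / 8 / 1_000


if __name__ == "__main__":
    print(mbps_to_kilobytes_per_sec(1))  # the 1 Mbps burst figure
```

By this conversion a sustained 1 Mbps is 125 KB per second, so the 100 KB figure quoted above is best read as a rounded estimate.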