Saturday, June 6, 2009

Rethink the way you implement OCS Enterprise Voice

I'd like to start off this post by thanking Mike Stacy at Evangelyze for opening my eyes to this new way of thinking about number normalization and routing. This article discusses the options available to OCS voice implementations surrounding the use of E.164 numbers and the plus sign (+) in your dial plan.

I called Mike a month or so ago and we were talking about how to handle click to call on missed call notifications sent by email. I noticed that Microsoft's missed call notifications showed numbers normalized to E.164, and I couldn't figure out how to make this happen. Even if you create a location profile for your Mediation server that transforms a 10 digit inbound number into E.164, Mike indicated the missed call notification is generated with the raw number from the SIP INVITE. The only way I can figure Microsoft makes this happen is at the gateway: they prefix inbound 10 digit numbers with "+1". I tested this on my own and it works. However, this raised a question: why am I adding digits on inbound calls and stripping them on outbound?

This began a debate about what value the plus sign (+) has now that you don't need it in R2. Yes, that's right: you don't need a plus sign in your TEL URI or your normalization rule(s). So you might be asking yourself, why would I care? Why not just add a plus sign to everything?

There are two reasons for this, but first I need to explain the difference between a number with a plus sign (+) and one without. When OCS sees a number with a plus sign, it treats it as a "global number" and doesn't attempt to normalize it. It assumes the number has already been normalized, so it goes straight to finding a route and placing the call. This is the most important concept to grasp here, and I'll elaborate on it later.
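To make the "global number" behaviour concrete, here is a minimal Python sketch of the decision as I understand it. It is only a model, not how OCS is implemented: the `resolve` function and the single rule in `RULES` are hypothetical names made up for illustration.

```python
import re

# Hypothetical normalization rules: (pattern, replacement), tried in order.
RULES = [
    (re.compile(r"^(\d{10})$"), r"1\1"),  # 10 digits -> prefix a "1"
]

def resolve(dialed, rules):
    """Return the number that would be handed to routing."""
    if dialed.startswith("+"):
        # Treated as a global number: assumed to be normalized already,
        # so none of the rules are applied.
        return dialed
    for pattern, replacement in rules:
        if pattern.match(dialed):
            return pattern.sub(replacement, dialed)
    # No rule matched: pass the number through unchanged.
    return dialed

print(resolve("7805551212", RULES))    # 17805551212
print(resolve("+17805551212", RULES))  # +17805551212 (rules skipped)
```

The second call is the case that matters: the moment a number carries a plus sign, your own rules never get a chance to touch it.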

I had a lengthy discussion with a senior escalation engineer at Microsoft recently about the need for the plus sign in OCS voice. The Microsoft engineer indicated I wasn't using E.164 formatting with my OCS solution and that what I was trying to do wasn't supported. So I asked him how you would possibly represent a non-DID such as "extension 2112"? He indicated you could use one of two formats as follows:

Either: +2112
Or use the DID with the extension: +17805552000;ext=2112

Jochen Kunert has an excellent blog about this scenario:

I explained to him that "+2112" isn't an E.164 number. He disagreed. I pointed him to: and explained that simply having a plus sign in front of a series of digits didn't mean it was an E.164 number.

When it comes to using the full number with the extension (+17805552000;ext=2112), this creates its own challenges. First, Exchange UM has trouble calling people when you say their name to the Auto Attendant. Second, there are gateways out there that don't like this format and can't handle the digit manipulation. Lastly, I've seen issues when you try to forward your phone to a non-DID in this format (OCS drops the extension number but keeps the "+17805552000"); likely a bug for now but still a pain in the butt.

So back to the crux of the issue and why this is a problem for us....

1. For click to dial scenarios, which include missed calls or clicking someone's name in your organization (or federated list), you want to be able to click on a number and have your own organization's normalization rules kick in. If the number has a plus sign, it won't be normalized. As I mentioned earlier, OCS treats these numbers as "global numbers" and skips client-side normalization. The end result is that you need rules in your gateway to strip the "+1" on outbound local calls.

2. If you normalize all numbers to E.164 with a plus sign, you can't easily distinguish between local and long distance numbers. Here in Alberta we have a 780 area code which has overlapping local and long distance calling zones. This isn't that uncommon really. In order to restrict users from placing calls to 780 numbers which are long distance, we would have to create elaborate routes covering every local NXX value and tie them to the user's phone policy, phone usage record, and route.

Let's humor this scenario for a moment. If you have very specific routes for local numbers such as ^\+1780(966)(\d{4})$ you still need to build rules at your next hop to remove the "+1", because the phone company needs 10 digits on a local call. So now you have twice as much complexity, built out in both OCS and your gateway(s). Too much to manage.
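Here is a rough Python sketch of that double bookkeeping. Everything in it is an assumption for illustration: the NXX values in `LOCAL_NXX` are made up, and `local_route_pattern` and `gateway_outbound` are hypothetical helpers, not OCS or gateway functions.

```python
import re

# Hypothetical list of exchanges (NXX) that are local within 780.
# In real life this list is long and has to be kept up to date.
LOCAL_NXX = ["966", "555"]

def local_route_pattern(nxx_values):
    """Build one route pattern covering every local NXX (the OCS side)."""
    alternatives = "|".join(sorted(nxx_values))
    return re.compile(r"^\+1780(" + alternatives + r")(\d{4})$")

def gateway_outbound(number, local_route):
    """Strip "+1" on local calls so the carrier gets 10 digits (the gateway side)."""
    if local_route.match(number):
        return number[2:]
    # Anything else is left alone in this sketch.
    return number

route = local_route_pattern(LOCAL_NXX)
print(gateway_outbound("+17809661234", route))  # 7809661234
print(gateway_outbound("+17802001234", route))  # +17802001234
```

Note that the same list of local exchanges ends up driving both the route and the strip rule, which is exactly the duplication I'm complaining about.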

A better way to design this would be to follow these simple rules:

1. Normalize numbers to a format people would normally use when trying to place a phone call. For example, if someone wants to call a 10 digit local number, don't add a "+1" to it. Also, if someone wants to dial a 4 digit internal number, you can normalize it to a DID (e.g. 7805551212) or a non-DID (e.g. 2112).

2. Create a catch-all normalization rule that prefixes 10 digit North American numbers with a "1".

3. Don't add a plus sign to normalized numbers. This includes your numbers in Active Directory. You want client-side normalization to handle all possible scenarios for dialing. You can store numbers in their 11 digit format (e.g. 17805551212), but leave it at that.

4. At the gateway keep it simple. You shouldn't need to populate your gateway with many inbound or outbound rules. The only exception to this would be for click to dial scenarios where you click on a federated contact who has their number shown as E.164.
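The rules above can be sketched in Python as an ordered rule list. This is only a model under assumptions: the extension ranges, the local NXX values, and the `normalize` helper are all hypothetical, and real OCS normalization rules are written as .NET regular expressions in a location profile rather than in Python.

```python
import re

# Hypothetical rules, tried in order. None of them adds a plus sign.
RULES = [
    (re.compile(r"^(2\d{3})$"), r"\1"),                 # internal non-DID: keep as dialed
    (re.compile(r"^1(\d{3})$"), r"7805551\1"),          # internal DID: expand to 10 digits
    (re.compile(r"^(780(?:555|966)\d{4})$"), r"\1"),    # local 10 digit number: keep as dialed
    (re.compile(r"^(\d{10})$"), r"1\1"),                # catch-all: prefix a "1"
]

def normalize(dialed):
    """Apply the first matching rule; return the input unchanged if none match."""
    for pattern, replacement in RULES:
        if pattern.match(dialed):
            return pattern.sub(replacement, dialed)
    return dialed

for dialed in ["2112", "1212", "7809661234", "7802001234", "4035551234"]:
    print(dialed, "->", normalize(dialed))
```

With rules like these, local calls come out as 10 digits, which is what the phone company wants, and long distance calls are the ones that start with "1". That makes them easy to tell apart in a route without building a strip rule at the gateway.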

I encourage discussion on this topic, as I have spent a lot of time thinking about the pros and cons of designing voice solutions this way.