So this one was a doozy!

I get a call from the helpdesk advising me that some VoIP calls in and out are getting fast busy / engaged tones. Naturally I start to test and notice the problem too: every second or third call in or out is dropping. So I go through my standard routine: check the voice router, check the SIP trunk, confirm the ITSP is up, etc. I check the log on the Cisco 4351 ISR router and find the following:

004096: Jun 30 14:09:35.206: %VOICE_IEC-3-GW: SIP: Internal Error (1xx wait timeout): IEC=1.1.129.7.65.0 on callID 3075 GUID=C307C371B9CA11EAA506F597830A4E96

OK, so this is one I have never seen before. It turns out that IEC (Internal Error Code) logging was left on after a previous TAC case, where it was enabled with:
vgy(config)#voice iec syslog
But what does the error mean? Let's decode it:
vgy#show voice iec description 1.1.129.7.65.0
IEC Version: 1
Entity: 1 (Gateway)
Category: 129 (Call setup timeout)
Subsystem: 7 (SIP)
Error: 65 (1xx wait timeout)
Diagnostic Code: 0
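
If you want to pull these codes out of a pile of syslog rather than decoding them one at a time on the router, something like the sketch below could work. It only assumes what the output above shows: the IEC is six dot-separated numbers in the order version, entity, category, subsystem, error, diagnostic code. The names Iec, IEC_PATTERN and parse_iec are made up for this example, and it does not translate the numbers into descriptions (that is still a job for show voice iec description).

```python
import re
from typing import NamedTuple


class Iec(NamedTuple):
    """The six IEC fields, in the order the router prints them."""
    version: int
    entity: int
    category: int
    subsystem: int
    error: int
    diagnostic: int


# Hypothetical pattern for this example: matches the "IEC=a.b.c.d.e.f" token
# as it appears in the syslog line quoted in this post.
IEC_PATTERN = re.compile(r"IEC=(\d+(?:\.\d+){5})")


def parse_iec(syslog_line: str) -> Iec:
    """Extract the dotted IEC from a syslog line and split it into fields."""
    match = IEC_PATTERN.search(syslog_line)
    if match is None:
        raise ValueError("no IEC field found in line")
    return Iec(*(int(part) for part in match.group(1).split(".")))


line = (
    "004096: Jun 30 14:09:35.206: %VOICE_IEC-3-GW: SIP: Internal Error "
    "(1xx wait timeout): IEC=1.1.129.7.65.0 on callID 3075 "
    "GUID=C307C371B9CA11EAA506F597830A4E96"
)
iec = parse_iec(line)
print(iec)
```

For the log line from this post, that gives category 129, subsystem 7 and error 65, matching what the router decoded.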

OK, so we can see it’s a generic SIP timeout error: the gateway gave up waiting for a 1xx provisional response to its call setup. The plot thickens. At this point I turn on the following debugs:
vgy#term mon
vgy#debug ccsip messages
vgy#debug ccsip calls inout

Now I place a call inbound to the gateway from my mobile phone and get the user busy tone. Interesting: we have a timeout placing calls outbound and a user busy from the ITSP inbound. My mobile call never even reached the gateway, as I couldn't see it in the debugs.

Let's check the SIP channel load:
vgy#show sip-ua calls brief | i Total
Total SIP call legs:60, User Agent Client:30, User Agent Server:30

The AHA! moment. This voice service only has 30 SIP channels, and we are maxing out the number of channels that we pay the ITSP for: 60 call legs is 30 calls, since 2 call legs equal one channel or call. That is consistent with the user experience of fast busy / engaged tones and with the timeout IEC error logged above. I also graph the active SIP trunk calls with Cacti, which confirmed that we were indeed hitting our max calls on this trunk.
[Image: Cacti Voice Router]
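
The arithmetic here is simple enough to script. The sketch below takes the Total line from the show sip-ua calls brief output above, halves the call legs to get calls, and compares that to the 30 channels on this service. The names PURCHASED_CHANNELS and channels_in_use are made up for this example, and how you get the line off the router (SSH, an expect script, whatever) is left out.

```python
import re

# From this post: the ITSP service has 30 SIP channels.
PURCHASED_CHANNELS = 30


def channels_in_use(total_line: str) -> int:
    """Return active calls from a 'Total SIP call legs' line (2 legs = 1 call)."""
    match = re.search(r"Total SIP call legs:\s*(\d+)", total_line)
    if match is None:
        raise ValueError("no 'Total SIP call legs' count found in line")
    legs = int(match.group(1))
    # Floor division: a call still being set up may only have one leg so far.
    return legs // 2


line = "Total SIP call legs:60, User Agent Client:30, User Agent Server:30"
used = channels_in_use(line)
print(f"{used}/{PURCHASED_CHANNELS} channels in use")
if used >= PURCHASED_CHANNELS:
    print("Trunk is at capacity, new calls will likely be rejected")
```

Run on a schedule, a check like this could raise an alert before the helpdesk phones start ringing, although the Cacti graph already tells the same story after the fact.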

I’m now off to order more SIP channels from Telstra for this service 😛