Resolving Inefficiencies in Call Logging for a Leading Telco
The client’s enterprise landscape comprised multiple upstream and downstream systems that together fulfilled a business function. Several issues hampered effective triaging across cross-functional microservices teams:
- No standard procedure for log entries and error reporting across the cross-functional teams
- Lack of standardization in logs and ineffective use of log analytics tools such as Splunk meant the client’s teams spent significant time analyzing an issue before it was assigned to the right team. This delay often affected critical business activities
- No unified way to track and visualize a request from the source service to the leaf node in the hierarchy (L0 to L5)
- No visibility into significant latency issues in microservices
Our team did a thorough analysis of the logging system and implemented the following changes:
Enhanced Log Triage
Built effective triage dashboards in Splunk using its built-in indexing and analytics features, fed by data captured through a logging framework.
Identified high-latency downstream APIs and resolved latency issues by tracking the contribution of each service to the latency of its hierarchy.
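As an illustration of tracking each service's contribution to its hierarchy, the sketch below computes a per-service "self time" from standardized log records. It is a minimal example and not the client's actual implementation: the field names (trace_id, service, parent, duration_ms), the service names, and the assumption that child calls run sequentially without overlap are all hypothetical.

```python
from collections import defaultdict

# Hypothetical standardized log records for a single request.
# Field names and service names are assumptions made for illustration.
RECORDS = [
    {"trace_id": "t1", "service": "L0-gateway", "parent": None, "duration_ms": 900},
    {"trace_id": "t1", "service": "L1-orders", "parent": "L0-gateway", "duration_ms": 800},
    {"trace_id": "t1", "service": "L2-billing", "parent": "L1-orders", "duration_ms": 650},
    {"trace_id": "t1", "service": "L2-inventory", "parent": "L1-orders", "duration_ms": 100},
]


def self_time_by_service(records):
    """Return each service's own time in milliseconds.

    Own time is the service's total duration minus the durations of its
    direct children. This assumes child calls are sequential and do not
    overlap; parallel calls would need interval arithmetic instead.
    """
    child_total = defaultdict(int)
    for record in records:
        if record["parent"] is not None:
            child_total[record["parent"]] += record["duration_ms"]
    return {
        record["service"]: record["duration_ms"] - child_total[record["service"]]
        for record in records
    }


contrib = self_time_by_service(RECORDS)
# The service with the largest own time is the first candidate to investigate.
slowest = max(contrib, key=contrib.get)
```

In this sample data the gateway takes 900 ms end to end, yet most of that time is spent further down the hierarchy, which is the kind of attribution a latency dashboard can surface.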
Dynamic Event Sequencing
Built a dynamic sequence of events, reducing the need to maintain sequence diagrams, to support system design and architectural improvements.
Added drill-down capability to track all the logs. The cross-functional and geographically distributed teams found it very effective and were able to identify the team responsible for resolving each issue.
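The idea of deriving a sequence of events from logs, then drilling down to the owning team, can be sketched as follows. This is an illustrative example only: the event fields (trace_id, ts, source, target, status, team) and all service and team names are hypothetical, and it assumes every log entry carries a shared trace identifier and a comparable timestamp.

```python
from collections import defaultdict

# Hypothetical log events, deliberately out of order as they might arrive
# from different services. All field names and values are assumptions.
EVENTS = [
    {"trace_id": "t1", "ts": 3, "source": "L1-orders", "target": "L2-billing",
     "status": "ERROR", "team": "billing-team"},
    {"trace_id": "t1", "ts": 1, "source": "client", "target": "L0-gateway",
     "status": "OK", "team": "gateway-team"},
    {"trace_id": "t2", "ts": 1, "source": "client", "target": "L0-gateway",
     "status": "OK", "team": "gateway-team"},
    {"trace_id": "t1", "ts": 2, "source": "L0-gateway", "target": "L1-orders",
     "status": "OK", "team": "orders-team"},
]


def sequences(events):
    """Group events by trace_id, each group ordered by timestamp."""
    by_trace = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_trace[event["trace_id"]].append(event)
    return dict(by_trace)


def render(trace_events):
    """Render one trace as text lines, similar to a sequence diagram."""
    return [f'{e["source"]} -> {e["target"]} [{e["status"]}]' for e in trace_events]


def first_failure_team(trace_events):
    """Return the team that owns the first failing call, or None if all succeeded."""
    for event in trace_events:
        if event["status"] == "ERROR":
            return event["team"]
    return None


traces = sequences(EVENTS)
diagram = render(traces["t1"])
owner = first_failure_team(traces["t1"])
```

Because the sequence is rebuilt from live log data on each query, it reflects the calls that actually happened rather than a diagram that has to be kept up to date by hand.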
The turnaround time for issue resolution was reduced from multiple days to minutes.
The dynamic performance metrics helped identify and resolve bottlenecks with no effort spent on identifying them.