As a SOC manager, you need to monitor MTTD and MTTR to measure how effectively your SOC detects, responds to, and mitigates security incidents. These metrics help you identify gaps in your security infrastructure and processes. MTTD and MTTR can also be used to set and measure performance against a predefined Service Level Agreement (SLA). The lower the MTTD, the sooner the SOC can respond to a security incident. The question to ask is how to lower MTTD and MTTR:
Detection:
-What kinds of detection tools are being used?
-Are the detection rules up to date and effective?
-Are false positives and false negatives impacting detection?
Response:
-Is there a clear and well-defined IR process in place?
-Are there sufficient resources and staff available to respond to incidents?
-Are the IR team members well trained and equipped?
-Are there playbooks and response plans in place for different incidents?
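Before working through these questions, it helps to have a baseline for both metrics. The following is a minimal sketch, assuming MTTD is measured from the time an incident occurred to the time it was detected, and MTTR from detection to resolution. The record fields (occurred, detected, resolved) and the sample timestamps are hypothetical; your ticketing system or SIEM will use its own names and definitions.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; field names and timestamps are illustrative only.
incidents = [
    {
        "occurred": datetime(2024, 1, 1, 8, 0),
        "detected": datetime(2024, 1, 1, 9, 0),
        "resolved": datetime(2024, 1, 1, 13, 0),
    },
    {
        "occurred": datetime(2024, 1, 2, 10, 0),
        "detected": datetime(2024, 1, 2, 13, 0),
        "resolved": datetime(2024, 1, 2, 15, 0),
    },
]


def mean_hours(records, start_key, end_key):
    """Average elapsed time, in hours, between two timestamps across records."""
    return mean(
        (r[end_key] - r[start_key]).total_seconds() / 3600 for r in records
    )


# Assumed definitions: detect = occurred -> detected, respond = detected -> resolved.
mttd = mean_hours(incidents, "occurred", "detected")
mttr = mean_hours(incidents, "detected", "resolved")
print(f"MTTD: {mttd:.1f} h, MTTR: {mttr:.1f} h")
```

Tracking these two numbers per week or per month makes it easier to see whether changes to detection rules or response playbooks are actually moving them.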
1-False Positive Rate = Total number of false positive alerts / total number of alerts generated by the SIEM over a specific period (a day or a week).
For example, if your SIEM generated 209 alerts during a week, and 50 of those alerts were determined to be false positives, the false positive rate would be calculated as follows:
False Positive Rate = (# of False positive Alerts / Total Number of Alerts) x 100
False Positive Rate = (50/209) x 100
False Positive Rate = 23.92%
What does this indicate?
This indicates that 23.92% of your alerts are false positives. The higher the number, the more fine-tuning your security controls, SIEM use cases, and threat intelligence feeds need. Reducing this metric should be a priority, because it frees up your analysts' time to investigate real incidents and alerts.
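The calculation above can be scripted, and breaking it down per detection rule shows where tuning effort is best spent. This is a rough sketch only: the rule names and the (rule, was_false_positive) export format are assumptions for illustration, not the output of any particular SIEM.

```python
from collections import Counter


def false_positive_rate(false_positives, total_alerts):
    """Return the false positive rate as a percentage."""
    if total_alerts <= 0:
        raise ValueError("total_alerts must be positive")
    return false_positives * 100 / total_alerts


# The worked example from the text: 50 false positives out of 209 alerts.
print(f"False Positive Rate = {false_positive_rate(50, 209):.2f}%")

# Hypothetical (rule name, was_false_positive) pairs exported from a SIEM.
alerts = [
    ("brute_force_login", True),
    ("brute_force_login", True),
    ("brute_force_login", False),
    ("malware_beacon", False),
    ("malware_beacon", False),
    ("dns_tunneling", True),
]

totals = Counter(rule for rule, _ in alerts)
false_pos = Counter(rule for rule, is_fp in alerts if is_fp)

# Per-rule rate highlights which use cases need tuning first.
per_rule = {
    rule: false_positive_rate(false_pos[rule], totals[rule]) for rule in totals
}
noisiest = max(per_rule, key=per_rule.get)
print(f"Noisiest rule: {noisiest} ({per_rule[noisiest]:.0f}% false positives)")
```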
2-Incident Closure Rate = Total number of incidents successfully resolved / total number of incidents identified
For example, if your SOC identified 17 security incidents during a week/month/year and successfully resolved 13 of them, the incident closure rate would be calculated as follows:
Incident Closure Rate = (# of resolved incidents/ total # of incidents) x 100
Incident Closure Rate = (13/17) x 100
Incident Closure Rate = 76.47 %.
What does this indicate?
This indicates that 76.47% of your incidents were successfully resolved. It is important to note the difference between Incident Closure Rate and MTTR: Incident Closure Rate gives you a broader view of the overall effectiveness of the SOC in responding to and resolving security incidents, while MTTR indicates the time it takes your SOC team to respond to an incident.
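If incident statuses can be exported from your ticketing system, the closure rate can be computed directly from them. The sketch below assumes a simple list of status strings in which "resolved" marks a successfully closed incident; the status labels are hypothetical.

```python
def incident_closure_rate(statuses):
    """Return the percentage of incidents whose status is 'resolved'."""
    if not statuses:
        raise ValueError("at least one incident is required")
    resolved = sum(1 for status in statuses if status == "resolved")
    return resolved * 100 / len(statuses)


# The worked example from the text: 17 incidents identified, 13 resolved.
statuses = ["resolved"] * 13 + ["open"] * 4
closure_rate = incident_closure_rate(statuses)
print(f"Incident Closure Rate = {closure_rate:.2f}%")
```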
3-Staff Utilization Rate = Total amount of time SOC analysts are actively engaged in activities / total available time for all analysts.
For example, if you have 5 SOC analysts, and each analyst is available for 40 hours per week, the total available time would be 200 hours (5 analysts x 40 hrs). Let's assume that the total amount of time spent actively engaged in activities during the week is 165 hours. The Staff Utilization Rate would be calculated as follows:
Staff Utilization Rate = (total active time / total available time) x 100
Staff Utilization Rate = (165 / 200) x 100
Staff Utilization Rate = 82.5 %
What does this indicate?
This indicates that SOC analysts spend 82.5% of their available time actively engaged in security monitoring and incident response activities. This is an important indicator for SOC managers: a rate this high may not be sustainable in the long run and can lead to staff burnout and reduced quality of work. At the end of the day, we are human! A good Staff Utilization Rate falls between 70% and 80%, which leaves enough flexibility and maintains a healthy work environment.
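The same calculation, together with a check against the 70% to 80% range mentioned above, might look like the following sketch. The function names and the wording of the assessment labels are assumptions made for illustration.

```python
def staff_utilization_rate(active_hours, analysts, hours_per_analyst=40):
    """Return utilization as a percentage of total available analyst hours."""
    available_hours = analysts * hours_per_analyst
    if available_hours <= 0:
        raise ValueError("available hours must be positive")
    return active_hours * 100 / available_hours


def assess_utilization(rate, low=70.0, high=80.0):
    """Compare a utilization rate with the 70-80% target range from the text."""
    if rate > high:
        return "above target range (burnout risk)"
    if rate < low:
        return "below target range"
    return "within target range"


# The worked example from the text: 5 analysts, 40 hours each, 165 active hours.
rate = staff_utilization_rate(active_hours=165, analysts=5)
print(f"Staff Utilization Rate = {rate:.1f}% -> {assess_utilization(rate)}")
```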
4-Knowledge management: This is a process rather than a metric. SOC managers should have a defined knowledge management process, which includes documenting security incidents and their resolutions, maintaining up-to-date cybersecurity policies and procedures, and providing ongoing training so that SOC staff have the skills and knowledge needed to detect, respond to, and prevent cybersecurity threats. As the saying goes, "There is no more profitable investment than investing in your human resources."
Optimize your KPIs:
1-Prioritize alerting: Categorize alerts based on their severity and potential impact on the organization, and notify the SOC team of the highest-priority alerts first. This ensures that critical alerts are addressed quickly, reducing the risk of major incidents.
2-Improve SIEM rules: Ensure that SIEM rules are optimized to detect relevant threats. This can involve creating new rules, modifying existing ones, or adopting a detection framework such as MITRE ATT&CK.
3-Expand Visibility: Periodically evaluate your current visibility and identify gaps. Remember, you can't detect what you can't see!
4-Automate: Use automation to reduce the time it takes to analyze and respond to alerts. This includes automating investigations, threat hunting & response.
a.Arguably, threat hunting can't be fully automated as human intuition and creativity are often required to uncover new or previously unknown threats.
5-Train, Train, Train: Training is the key element in optimizing your SOC KPIs. All your KPIs depend heavily on your team's ability to DETECT & RESPOND to cybersecurity incidents. Without proper training and knowledge transfer to your team, you will not be able to optimize your SOC KPIs.
6-Improve Staff Utilization: Proper staffing, workflow optimization, and automation that identifies and eliminates unnecessary or redundant tasks will help you improve your Staff Utilization Rate KPI.
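As an illustration of the first item in this list, alert prioritization can be sketched as a priority queue ordered by severity and potential impact. The scoring scheme here (severity weight multiplied by an asset impact value from 1 to 3), the field names, and the sample alerts are all assumptions; a real SOC would define its own scoring model.

```python
import heapq

# Hypothetical severity weights; adjust to match your own classification.
SEVERITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def triage_order(alerts):
    """Return alert names, highest priority first (severity x asset impact)."""
    heap = []
    for index, alert in enumerate(alerts):
        score = SEVERITY[alert["severity"]] * alert["asset_impact"]
        # heapq is a min-heap, so the score is negated to pop the largest first.
        # The index breaks ties in arrival order.
        heapq.heappush(heap, (-score, index, alert["name"]))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


# Hypothetical alerts; asset_impact ranges from 1 (low) to 3 (business critical).
alerts = [
    {"name": "phishing_email", "severity": "medium", "asset_impact": 2},
    {"name": "ransomware_on_file_server", "severity": "critical", "asset_impact": 3},
    {"name": "port_scan", "severity": "low", "asset_impact": 1},
    {"name": "admin_login_anomaly", "severity": "high", "asset_impact": 3},
]

queue = triage_order(alerts)
for position, name in enumerate(queue, start=1):
    print(position, name)
```

Combining severity with asset impact, rather than sorting on severity alone, keeps a high-severity alert on a low-value asset from crowding out an alert on a business-critical system.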
Conclusion:
In conclusion, monitoring and optimizing SOC KPIs such as MTTD, MTTR, false positive rate, incident closure rate, staff utilization, and knowledge management is essential to ensure effective incident response and maintain a high level of security posture. By calculating and analyzing these metrics, SOC managers can identify areas of improvement, optimize processes and workflows, and allocate resources efficiently. Implementing best practices such as automation, continuous training, and collaboration between teams can help reduce MTTD and MTTR and improve incident response time. By continuously monitoring and improving SOC KPIs, organizations can enhance their security capabilities and better protect against cyber threats.