A field engineer receives a call. A customer line is down. The automated system flagged it, raised a ticket, and then stopped. It could not determine the root cause. So now a skilled engineer, someone with years of experience diagnosing complex network faults, is working through a manual investigation that could take hours, drawing on institutional knowledge that exists largely in their head rather than in any accessible system.

This scenario plays out thousands of times every day across telecoms infrastructure networks globally. It is not a failure of automation ambition. Operators have invested heavily in automated monitoring, fault detection, and ticketing systems for years. The gap is what happens when those systems reach their diagnostic limit, and in networks where legacy infrastructure coexists with modern platforms, that limit is reached constantly.

The question facing telecoms operators in 2026 is no longer whether to pursue further automation and AI integration. The industry has settled that question decisively. The real challenge is understanding precisely where human intervention still dominates, why it persists, and what a credible AI-enabled alternative looks like in an environment where legacy systems are not going away any time soon.

The Automation Gap: Why Manual Intervention Persists

The telecoms industry has been pursuing network automation for decades. Provisioning, performance monitoring, alarm management - large portions of these workflows have been automated to varying degrees across most major operators. And yet, as of 2025, the industry still relies heavily on manual remediation when things go wrong. According to analysis published by The Fast Mode, telecoms operators continue to depend significantly on human intervention to prevent or resolve service disruptions, despite years of automation investment.

The GSMA’s autonomous network maturity model maps operator progress across six levels, from fully manual (Level 0) to fully autonomous (Level 5). As of 2025, more than two-thirds of operators globally had not progressed beyond Level 2 - partial automation within specific domains, still largely reactive and incident-driven. Full autonomy at scale is not expected until around 2030 at the earliest.

This is not a technology availability problem. The tools to move further up the maturity curve exist. The constraint is the environment those tools have to operate in.

Large telecoms networks - particularly those managing both legacy copper infrastructure and newer fibre and IP platforms simultaneously - present an integration challenge that generic automation cannot easily handle. Configuration errors on legacy systems do not always surface in ways that modern monitoring platforms can interpret. Fault patterns on older equipment do not always match the training data that AI models are built on. And knowledge of how to diagnose and fix these issues resides in the minds of experienced engineers rather than in any structured, accessible knowledge base.

The Cost of Manual-First Fault Resolution

Mean Time to Repair (MTTR) is the operational metric that captures this cost most directly. In telecoms, every additional hour of downtime risks SLA penalties, customer churn, and reputational damage. High MTTR is not just an efficiency problem - it is a direct commercial and regulatory liability.
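As a rough illustration of how the metric is derived, the sketch below averages the time between a fault being raised and service being restored. The ticket IDs and timestamps are invented for the example, and real operators may define the start and end of the repair window differently.

```python
from datetime import datetime

# Hypothetical incident records: (ticket id, fault raised, service restored).
incidents = [
    ("INC-1", datetime(2025, 3, 1, 9, 0), datetime(2025, 3, 1, 11, 0)),
    ("INC-2", datetime(2025, 3, 2, 14, 0), datetime(2025, 3, 2, 15, 0)),
    ("INC-3", datetime(2025, 3, 3, 8, 0), datetime(2025, 3, 3, 14, 0)),
]


def mean_time_to_repair_hours(records):
    """Average hours between a fault being raised and service being restored."""
    if not records:
        raise ValueError("no incidents to average")
    total_seconds = sum(
        (restored - raised).total_seconds() for _, raised, restored in records
    )
    return total_seconds / len(records) / 3600


mttr_hours = mean_time_to_repair_hours(incidents)
print(f"MTTR: {mttr_hours:.1f} hours")  # repairs of 2h, 1h and 6h average to 3.0
```

A single long-running fault, such as the six-hour repair here, pulls the average up sharply, which is why hard-to-diagnose incidents weigh so heavily on the headline figure.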

But MTTR tells only part of the story. The fuller picture includes three compounding costs that do not always appear in operational dashboards:

Engineer time absorbed by diagnosis rather than resolution. When a fault cannot be automatically diagnosed, a skilled engineer must investigate from scratch. In networks where multiple legacy systems run in parallel, this investigation can be extensive, and the same diagnostic process may be repeated each time a similar fault recurs - because the resolution knowledge is not being captured and shared.

Alarm fatigue eroding operational focus. Traditional network monitoring systems generate high volumes of alerts, many of which are false positives or low-priority notifications. Research by Sutherland suggests that AI-driven anomaly detection can reduce alarm fatigue by up to 40%, allowing operations teams to concentrate on genuinely critical incidents. Without this filtering, teams spend significant time processing noise rather than addressing real issues.

Knowledge that walks out the door. In an industry undergoing significant workforce change, the institutional knowledge required to diagnose complex legacy faults is concentrated in a relatively small pool of experienced engineers. As that workforce evolves, operators face the risk that critical diagnostic expertise becomes inaccessible precisely when it is most needed.

“AI-driven fault resolution automation can reduce Mean Time to Repair by up to 50%, allowing network engineers to shift their focus from reactive troubleshooting to strategic optimisation.” - Sutherland Global

Where AI Changes the Equation

The shift AI enables in fault management is not the removal of human engineers. It is the transformation of what those engineers spend their time doing.

The most immediate and impactful application is intelligent triage. Rather than routing every unresolved fault directly to a human for investigation, AI-powered systems can analyse fault signatures, cross-reference historical incident data, and present engineers with a prioritised diagnosis and a suggested resolution path. The engineer still makes the final judgement call - but they do so with context that would otherwise take hours to assemble manually. Research from Sutherland indicates that AI-driven analytics and automated fault categorisation can reduce MTTR by up to 50% in telecoms environments.
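To make the triage idea concrete, here is a minimal sketch that ranks past incidents by how closely their alarm signature overlaps with a new fault. It assumes faults can be summarised as sets of alarm codes and uses plain Jaccard similarity as the matching rule. The class, function names, alarm codes and thresholds are all hypothetical, and a production system would likely draw on far richer signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PastIncident:
    ticket_id: str
    alarm_codes: frozenset
    root_cause: str
    resolution: str


def jaccard(a, b):
    """Overlap between two sets of alarm codes, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_diagnoses(new_alarm_codes, history, top_n=3, min_score=0.2):
    """Rank past incidents by similarity to the new fault signature."""
    new_codes = frozenset(new_alarm_codes)
    scored = [(jaccard(new_codes, inc.alarm_codes), inc) for inc in history]
    # Drop weak matches so the engineer only sees plausible candidates.
    scored = [(score, inc) for score, inc in scored if score >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical resolved incidents.
history = [
    PastIncident("INC-101", frozenset({"LOS", "HIGH_BER"}),
                 "Degraded copper pair", "Re-terminate pair at cabinet"),
    PastIncident("INC-102", frozenset({"LOS", "PORT_DOWN", "PSU_FAIL"}),
                 "Line card power fault", "Replace line card"),
    PastIncident("INC-103", frozenset({"CFG_MISMATCH"}),
                 "VLAN misconfiguration", "Reapply provisioning template"),
]

suggestions = suggest_diagnoses({"LOS", "PORT_DOWN"}, history)
for score, inc in suggestions:
    print(f"{inc.ticket_id} ({score:.2f}): {inc.root_cause} -> {inc.resolution}")
```

The output is a ranked shortlist rather than a decision, which mirrors the point above: the engineer keeps the final call but starts from prior resolutions instead of a blank page.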

Beyond triage, AI is enabling a shift from reactive to predictive fault management. By continuously monitoring network conditions and identifying anomalies before they become service-affecting incidents, AI systems can surface potential failures early enough for pre-emptive intervention. This changes the operational rhythm fundamentally - from responding to outages to preventing them.
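One simple way to picture early anomaly detection is a rolling z-score: each new reading is compared with the mean and spread of the readings just before it. The sketch below is a deliberately basic statistical stand-in, not a description of any vendor's model. The window size, threshold and sample error counts are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(samples, window=5, threshold=3.0):
    """Return indices whose value deviates sharply from the preceding window."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            # Flat baseline: treat any change at all as a deviation.
            if samples[i] != mu:
                flagged.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical error counts per polling interval on a single line.
error_counts = [2, 3, 2, 4, 3, 3, 2, 15, 3, 2]
anomalies = flag_anomalies(error_counts)
print(anomalies)  # the spike to 15 at index 7 is the only reading flagged
```

In practice the flagged interval would feed a pre-emptive check on the line before customers notice degradation. The threshold trades sensitivity against the alarm noise discussed earlier.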

Critically, AI also addresses the knowledge accessibility problem. AI-powered knowledge assistants can capture and structure diagnostic expertise - turning what currently exists in experienced engineers’ heads into a searchable, accessible resource that any engineer can draw on in real time. When a previously seen fault pattern is identified, the resolution steps are immediately available, regardless of whether the engineer who originally solved it is in the room.

The Industry Has Already Made Its Decision

The investment data makes the direction of travel unambiguous. According to NVIDIA’s State of AI in Telecommunications 2026 survey, 89% of telecoms respondents said their AI budget would increase over the next twelve months - up from 65% the previous year. Network automation has overtaken customer experience as the leading use case for AI investment and deployment across the industry.

IDC projects global spending on AI-supporting technology to reach $749 billion by 2028. Operators that have already built the operational and data foundations to support AI deployment will capture a disproportionate share of that value. Those still operating primarily on manual diagnostic processes face a compounding disadvantage: not only are their current operations less efficient, but they are also less well positioned to absorb and leverage the AI capabilities the industry is converging around.

The research firm Omdia found that 47% of communications service providers expect agentic AI - AI that can autonomously initiate and complete multi-step tasks - to become “very important” for network troubleshooting, live optimisation, and on-demand reporting. This is not a distant aspiration. It is an active investment priority for operators who are serious about operational efficiency.

What Separates Operators Who Benefit from Those Who Do Not

Deploying AI into fault management is not simply a matter of procuring the right tools. Operators who have attempted to introduce AI-driven fault resolution and found it underperforming have typically encountered one of three root causes:

Data that is not ready. AI fault detection models require clean, consistent, and accessible historical incident data to identify patterns and generate reliable diagnoses. In networks where fault data sits in disconnected legacy systems with inconsistent formats, the model has nothing meaningful to learn from.

Knowledge that has not been captured. AI knowledge tools are only as valuable as the expertise fed into them. Operators who have not made a deliberate effort to document and structure diagnostic knowledge before implementing AI find that the tool surfaces incomplete or outdated information.

Integration that stops at the boundary of legacy systems. The faults that are hardest to resolve - and therefore most costly - are disproportionately those involving legacy infrastructure. AI deployments that only cover modern platforms miss the environments where human intervention is currently most concentrated.
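The first root cause, fault data scattered across systems in inconsistent formats, is the most mechanical to illustrate. The sketch below maps two invented record formats, a pipe-delimited legacy export and a JSON-style event from a modern platform, onto one shared schema. The field names, date formats and sample values are assumptions for illustration, not the layout of any real system.

```python
from datetime import datetime


def normalise_legacy(record):
    """Hypothetical legacy export: pipe-delimited text with day-first dates."""
    node, raised, code = record.split("|")
    return {
        "node": node.strip().lower(),
        "raised_at": datetime.strptime(raised.strip(), "%d/%m/%Y %H:%M"),
        "fault_code": code.strip().upper(),
    }


def normalise_modern(record):
    """Hypothetical modern platform event: dict with an ISO 8601 timestamp."""
    return {
        "node": record["deviceName"].strip().lower(),
        "raised_at": datetime.fromisoformat(record["eventTime"]),
        "fault_code": record["alarmType"].strip().upper(),
    }


# Records from both sources end up in one list with identical keys.
unified = [
    normalise_modern({"deviceName": "OLT-7",
                      "eventTime": "2025-02-03T08:20:00",
                      "alarmType": "Port_Down"}),
    normalise_legacy("EXCH-014 | 03/02/2025 08:15 | los"),
]
unified.sort(key=lambda r: r["raised_at"])

for row in unified:
    print(row["raised_at"].isoformat(), row["node"], row["fault_code"])
```

Once legacy and modern events share a timeline and vocabulary, a model or an engineer can see that the legacy alarm preceded the modern one by five minutes, a relationship that stays hidden while the two sources remain separate.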

Getting these foundations right is not glamorous work, but it is the work that determines whether AI investment produces genuine operational improvement or simply adds another layer of technology to an already complex environment.

The Future of Fault Management Is Not Engineer-Free - It Is Engineer-Empowered

The goal of AI in telecoms fault management is not to eliminate the engineer. It is to ensure that when an engineer’s expertise is required, they arrive at the problem with the best possible information, the fastest possible context, and the institutional knowledge of every similar fault that has been resolved before.

The industry is moving in one direction. The operators who will lead on operational efficiency, SLA performance, and cost management over the next five years are those investing now in the data foundations, knowledge infrastructure, and AI capabilities that make that transition possible. For those still running predominantly manual diagnostic processes across legacy network environments, the cost of waiting is already accumulating - in engineer hours, in MTTR, and in the widening gap between their operational capability and that of more advanced peers.

VE3 helps telecoms and infrastructure operators build the data foundations and AI capabilities needed to transform network fault management. From automated fault triage to AI-powered knowledge platforms, we deliver outcomes in complex, legacy-heavy environments. Visit www.ve3.global for more details.