As Open RAN deployments gather pace, accurate timing and synchronization have emerged as a critical step toward consistent performance. It is a complex task: devices must be time-synchronized to a UTC reference via IEEE PTP/SyncE-based boundary clocks, slave clocks, and PRTC clocks with GNSS receivers, and operators need accurate, real-time monitoring in support of customer SLAs. As thousands of Open RAN devices are deployed, monitoring the network for associated failures and taking appropriate action will require AI/ML advancements. Early lessons from the field reveal a path forward.
Accurate and reliable synchronization has long been required to maintain critical telecom network capabilities, such as cell operations, cell coverage and efficiency, precise handover between cell towers, and so on. Telecom service providers can implement various methods to meet stringent phase and time synchronization requirements. Each method’s intent is to ensure synchronization of all nodes to the primary reference time clock (PRTC) source using the Global Positioning System (GPS) / Global Navigation Satellite System (GNSS). GNSS refers to a constellation of satellites providing signals from space that transmit positioning and timing data to GNSS receivers for time reference.
The location of the sync source may, however, vary depending on the network topology, cost, and application. Typically, telecom network nodes are synchronized using Precision Time Protocol (PTP) and Synchronous Ethernet (SyncE), implemented in the IEEE 1588 PTP grandmaster (PRTC source), boundary clocks, transparent clocks, and slave clocks within the network.
Open RAN is an industrywide initiative to develop open interfaces, disaggregate RAN into its individual network elements, and enable interoperability among hardware suppliers. The standardization of Open RAN is meant to give operators the flexibility to mix and match radio network components for best performance. This flexibility is not without complexity, and it makes synchronizing the network timing function across components even more critical. Specifically, because the applications that need synchronization are distributed across the front-haul (FH) and mid-haul (MH) network, including the distributed unit (DU) and radio unit (RU), the applications needing timing on the DU and RU must be kept synchronized to maintain good sector/cell key performance indicators (KPIs).
In a legacy radio access network (RAN), all end applications needing synchronization reside within a single eNodeB. In 3GPP split architectures (Open RAN), disaggregated applications need timing distributed over the network, including DU, RU, and centralized unit (CU). That timing distribution increases the operator’s burden of managing the synchronization networks.
Eyes on KPIs
With the evolution of IEEE PTP, the radio network’s dependency on GNSS systems has significantly reduced. GNSS is, however, still needed as the PRTC as defined in ITU-T G.8272 and ITU-T G.8272.1 specifications, depending on the type of deployment. This is true for Open RAN DU, which is a logical node hosting RLC/MAC/high-PHY layers based on a lower-layer functional split, and RU, which is a logical node hosting Low-PHY layer and RF processing based on a lower-layer functional split (Figure 1). IEEE 1588 grandmaster clocks (PRTC clock source) are placed at the central data center within the network to source the clock to DUs and RUs in front-haul or mid-haul networks.
Every PRTC clock device with an integrated GNSS receiver or an intermediate boundary clock source that provides the time to the downstream network is susceptible to potential sync outages.
Depending on where the GNSS source is installed or placed, the impact on the entire network varies. For example, in an O-RAN Alliance-compliant network in the LLS-C1 configuration, where the DU acts as PTP grandmaster and sources the clock to RUs over the front-haul network, any GPS outage on the DU impacts both the DU and its connected RUs.
In contrast, when GPS is sourced at the RU, which has an integrated GPS receiver and acts as the PRTC sync source for its applications, an outage impacts only that RU. This also has implications for the PTP/SyncE KPIs distributed over the network from the PRTC source.
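The difference in impact between the two placements can be illustrated with a minimal Python sketch. The `Node` class, the node names, and the rule that a downstream node with its own GNSS receiver is unaffected are assumptions made for illustration, not part of any O-RAN specification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a network element in the timing tree."""
    name: str
    has_gnss: bool = False  # node hosts its own GNSS receiver (acts as PRTC)
    downstream: list["Node"] = field(default_factory=list)  # nodes taking time from this one


def impacted_by_gnss_outage(source: Node) -> list[str]:
    """Names of nodes that lose their time reference when `source` loses GNSS.

    Assumption: a downstream node with its own GNSS receiver keeps running
    from that receiver, so the walk does not descend into it.
    """
    impacted = [source.name]
    for child in source.downstream:
        if not child.has_gnss:
            impacted.extend(impacted_by_gnss_outage(child))
    return impacted


# Case 1 from the text: the DU is grandmaster for its connected RUs.
du = Node("DU-1", has_gnss=True,
          downstream=[Node("RU-1"), Node("RU-2"), Node("RU-3")])

# Case 2 from the text: the RU has an integrated GNSS receiver.
ru_with_gnss = Node("RU-4", has_gnss=True)

print(impacted_by_gnss_outage(du))            # ['DU-1', 'RU-1', 'RU-2', 'RU-3']
print(impacted_by_gnss_outage(ru_with_gnss))  # ['RU-4']
```

A real deployment would also have to account for holdover performance and backup clock sources, which this sketch deliberately leaves out.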
All timing and sync configurations and provisioning for a cloud-based deployment happen over a network. Outages are increasing, and networks need to be monitored closely to take corrective actions. This necessitates a central, real-time, accurate monitoring mechanism over the cloud network to retrieve the timing KPIs, build intelligence to estimate the outages based on key metrics, and take reactive and corrective steps to avoid impacting cell KPIs and increase cell availability.
Root causes of sync-based outages
As Open RAN networks proliferate, operators need to understand the primary causes of sync outages, all of which require real-time tracking to detect. In our work in Japan and other countries, we have identified:
- GNSS signal jamming, both intentional and unintentional
- GNSS signal spoofing, including of multiple constellations and on multiple frequency bands
- Signal blockages and multipath errors caused by tree canopies in rural areas or large glass structures in urban areas
- Ionospheric effects and geographical issues
- Hardware faults in the connections to GNSS receivers, such as surge arrestor issues due to lightning or cable faults
- Poor weather conditions
- Leap second warnings, including the need to inform all applications of upcoming leap second additions/deletions
- PTP packet drops, clock quality degradations, and clock advertisements
- SyncE clock quality degradations
There are also security concerns to consider. Modern networks provide vital infrastructure for business, mission, and society-critical applications which are of national concern. Between July 2020 and June 2021, the telecom industry was the most targeted industry with regard to GNSS security threats, with 40% of attacks versus 10% for the next-highest industry vertical (Source: EUROCONTROL EVAIR).
Detect and analyze timing and sync failures
Based on our experience, network operators can take the following steps to initiate mitigating actions for timing and synchronization failures, including those related to GNSS security threats:
- Develop a mechanism for the O-Cloud interface to detect and analyze GNSS signal jamming, interference, or spoofing conditions before the conditions deteriorate.
- Analyze bad weather conditions that can interfere with GNSS signals based on GNSS outage history detected on one DU. This data can be used to predict and anticipate how these conditions might impact neighboring and co-located DUs, allowing an operator to more quickly initiate corrective actions.
- Similarly, analyze GNSS error conditions and predict how these conditions might potentially impact neighboring and co-located DUs to initiate corrective actions.
- Detect and analyze packet timing signal failure (PTSF) conditions and initiate mitigating actions. Predict PTSF conditions and take corrective actions, such as moving to an alternate clock source, rather than waiting for the faults to actually happen and impact the network.
This is not meant to be an exhaustive list. The results of these mechanisms need to be reported over the O-Cloud interface, which will require contributions to the WG6 O-Cloud API spec for the synchronization plane (S-plane), as the existing datasets are not sufficient.
AI and ML for timing and synchronization
A multi-layered approach to integrating artificial intelligence (AI) and machine learning (ML) into timing and synchronization should be considered for precise, effective, optimized handling of timing and synchronization events.
For example, a solution to timing outages can be divided into multiple layers and leverage AI/ML mechanisms to do local learning per DU/RU, per set of DUs/RUs pertaining to a given CU, or per group of O-CUs (see Figure 2). Such a solution needs methods to define the interactions between these layers in real time and non-real time, based on event, alarm, or error type. In practice an operator would:
- Define detection and mitigation algorithms for local sync failures within DU/RUs.
- Analyze what information to expose to rApps/xApps so that they can drive intelligent mitigation actions and policies at the Non-RT RIC (SMO) or Near-RT RIC.
- Build automation and AI/ML-based algorithms for sync failure detection and analytics as part of the 5G core or evolved packet core (EPC).
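The layered handling described above can be sketched as a simple routing rule that decides which layer should act on a sync event. This is only an illustrative policy. The `SyncEvent` fields, the event names, and the routing criteria (number of affected nodes and whether a real-time reaction is needed) are assumptions made for this example and are not taken from O-RAN specifications.

```python
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    LOCAL = "DU/RU local handler"
    NEAR_RT_RIC = "Near-RT RIC (xApp)"
    NON_RT_RIC = "Non-RT RIC / SMO (rApp)"


@dataclass(frozen=True)
class SyncEvent:
    """Hypothetical description of a timing/sync event or alarm."""
    kind: str              # e.g. "gnss_signal_loss", "ptp_packet_drop"
    affected_nodes: int    # how many DUs/RUs report the condition
    needs_realtime: bool   # must mitigation happen in near real time?


def route(event: SyncEvent) -> Layer:
    """Pick the layer that should handle the event (assumed policy)."""
    if event.affected_nodes <= 1:
        return Layer.LOCAL            # isolated fault: handle on the node
    if event.needs_realtime:
        return Layer.NEAR_RT_RIC      # multi-node and urgent
    return Layer.NON_RT_RIC           # multi-node trend: analytics and policy


events = [
    SyncEvent("ptp_packet_drop", affected_nodes=1, needs_realtime=True),
    SyncEvent("gnss_signal_loss", affected_nodes=12, needs_realtime=True),
    SyncEvent("synce_quality_degradation", affected_nodes=40, needs_realtime=False),
]
for event in events:
    print(f"{event.kind}: {route(event).value}")
```

Keeping the rule in one small function makes the interaction between layers explicit and easy to replace with a learned policy later.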
Conclusion
Open RAN brings operators the promise of increased revenue streams, lower capex and TCO, and diversity advantages. Reaping these benefits will depend in part on leveraging AI/ML technologies in timing and synchronization solutions to manage sync networks in Open RAN more efficiently and to help operators improve KPIs and the customer experience.
Sudhee says
This is one of the best articles I have come across in recent times. Keep up the good work. Keep exploring.
Ramana says
Thanks Sudhee
DJ72 says
Is this dependent on NTP to cascade timing out to Cell Site RUs if Grandmaster Clock/GNSS fails (see NTP in diagram above)?
Ramana says
No, NTP is not considered an equivalent backup for GM failures. NTP in the picture is shown for controllers, which don't need PTP time sync, while PTP is always needed for sync on DU(s) and RU(s).
Geo Vadakkan says
Ramana Reddy, I highly value your dedication and thorough analysis of the KPI challenges encountered by Open RAN operators. I concur that achieving consistent performance in Open RAN deployments requires accurate synchronization, connecting devices to UTC through IEEE PTP/SyncE-based clocks. As AI/ML advancements oversee numerous Open RAN devices, ensuring network reliability and responding to failures become imperative.
Ramana Reddy says
Hi Geo,
Thanks for going through the article. I concur with your views. Regards-Ramana
Sunil says
Well-articulated and easy to visualize. Thank you. Hoping AI/ML technologies would Detect & Analyze – Grasp, with that RIC algos could take corrective actions.
Ramana Reddy says
Hi Sunil,
Thanks. Yes RIC seems to be the way forward to bring in some intelligence to improve Sync KPIs. Regards-Ramana
Amit Palkar says
Excellent Article, Ramana … The security part along with the option to have RIC (with AI/ML models) will definitely play a part in mitigating the threats which you have mentioned.
Ramana Reddy says
Hi Amit,
Thanks for going through the article. I agree with your views. Regards-Ramana