NETINT Technologies

Ultra-Low Latency Cloud Video Applications

The Case for Ultra-low Latency Performance

Emerging video services require encoding technology that can scale economically to any number of users. Cloud streaming is extremely demanding, presenting a combination of technical challenges. Unlike delivering complex web pages where content can be sourced from many locations, streamed video can be aggregated in the cloud and packaged for rapid delivery. High video quality and low latency must be achieved together to provide users with a sense that they are experiencing a local, native application.

Emerging cloud streaming applications become increasingly viable as lower latency targets are achieved.

The instantaneous cloud is a new phase in computing where cloud technologies evolve to deliver native experiences to any smartphone, PC, or XR display. Delivering a sense of local application presence, or immersive virtual reality (VR) total sensory presence, requires low latency and high visual quality to ensure users have parity or better in their cloud-based application experience. The cloud can provide more general compute and 3D graphics performance than smartphones or low-performance PC clients. However, driving the visual experience with low latency is a relatively expensive and challenging problem to solve from the cloud. Moving these experiences closer to users can lower latency by avoiding long runs, network hops, and network congestion.

Network Topology

Making the instantaneous cloud a reality requires servers to be placed near users, within regional points of presence or cell tower base stations, reaching client devices over fast, low latency networks including fiber to the home and 5G. Traditionally, cloud servers in large, centralized data centers have delivered applications with relatively high latency, and these high latencies remain today. Real-world local networks (~100 miles) can deliver just under 20 ms latency under ideal conditions, but latencies of 40 ms to 80 ms are more typical. Target network latencies for 5G are below 10 ms nationwide (US) and below 5 ms for regional data centers.

Deploying 5G infrastructure and placing servers in a regional data center, within a fiber run in the same city as the central offices or base stations of network service providers, can reduce network latency significantly. This placement provides virtually the same latency as placing servers at the base station, but without the associated cost. Partnerships that improve connectivity and network performance between regional data centers and internet service providers are critical to delivering low latency applications. Network service providers, local data centers, and cloud service providers must coordinate to reduce latency.

Network service providers (which in the US include AT&T, Verizon, T-Mobile, Comcast, Charter Spectrum, and Lumen CenturyLink) seeking to avoid the expense of base station deployments can use network central offices for deploying servers. Often hosting legacy services and infrastructure, these central offices and points of presence must evolve to support large-scale, high-performance application services. Network service providers are in a perfect storm of soaring demand and pressure to upgrade. The deployment of 5G, COVID-19, and high network utilization are putting pressure on the resources available to deploy additional data center infrastructure.

Local ISPs (Internet service providers) and data centers can improve connectivity to mobile and home network base stations, lowering the latency of these connections by deploying fiber and tightening network relationships. Agnostic to both network providers and cloud service providers, they are well situated to become a deployment hub for cloud servers.

Cloud application providers with a high level of vertical integration, such as Amazon, Google, and Microsoft, already have low latency Content Distribution Networks (CDNs), making them ideally situated to provide cloud services at low latency all the way to the wired/wireless network service providers. Amazon’s low latency CDNs and the Twitch service can be used to provide cloud video services and cloud gaming. Google has created a local presence and runs fiber in many cities, giving it the opportunity to optimize for latency in its data centers. Microsoft has launched game streaming to augment Xbox subscriptions and extend those subscriptions to any client platform, including Android and iOS.

Cloud Video Requirements

The emerging cloud streaming experiences poised for growth have common technical requirements. First, they require visuals of high quality and resolution. Second, they require low latency, so that user input can be rapidly incorporated into the visual output. Third, they require scalability such that users can access these services reliably and service providers can offer them economically.

High Visual Quality

Users have become accustomed to crisp, high-quality images on their smartphones and HD screens. Game graphics are particularly sensitive to visual artifacts from compression, and they feature fast movement that reduces the effectiveness of frame-to-frame compression. Achieving lower latency has required sacrificing visual quality and/or bitrate, and the step function improvements offered by emerging video codecs including AV1 come with commensurate increases in latency and compute requirements. Today, game streaming at 1080p using H.265 or VP9 video encoding requires about 3 GB/hour, or roughly 7 Mb/s on average, with peaks over 15 Mb/s. In the future, cutting-edge AAA games threaten to make server hardware obsolete quickly. Running new titles will require switching to more modern graphics hardware, and user expectations shaped by native PC gameplay will require 3D rendering, resolution, and video encoding quality to be at parity with increasingly capable native PC and game console gameplay.
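The relationship between the hourly data volume and the average bitrate quoted above can be checked with a short calculation. This is a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 Mb = 10^6 bits); the function names are illustrative and not part of any product API.

```python
def gb_per_hour_to_mbps(gb_per_hour: float) -> float:
    """Convert an hourly data volume in decimal GB to an average bitrate in Mb/s."""
    bits_per_hour = gb_per_hour * 1e9 * 8      # bytes -> bits
    return bits_per_hour / 3600 / 1e6          # per second, then bits -> megabits


def mbps_to_gb_per_hour(mbps: float) -> float:
    """Convert an average bitrate in Mb/s to an hourly data volume in decimal GB."""
    bits_per_hour = mbps * 1e6 * 3600
    return bits_per_hour / 8 / 1e9             # bits -> bytes -> gigabytes


if __name__ == "__main__":
    print(f"3 GB/hour is about {gb_per_hour_to_mbps(3.0):.2f} Mb/s on average")
    print(f"7 Mb/s is about {mbps_to_gb_per_hour(7.0):.2f} GB/hour")
    print(f"A sustained 15 Mb/s peak would be {mbps_to_gb_per_hour(15.0):.2f} GB/hour")
```

Under these unit assumptions, 3 GB/hour works out to a little under 7 Mb/s, which is consistent with the rounded figures in the text.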

Low Latency Video

The raw latency of video encoding is a function of the complexity of the target video codec and the performance of the underlying processor. Today’s ASICs are the lowest latency encoders available, at 4 ms (720p) to 8 ms (1080p) using H.265, a complex and computationally expensive codec. If this is combined with an ideal network latency of 5 ms for local loops, a plentiful latency budget remains for complex interactive applications. Game streaming to home and mobile devices has become a beachhead for consumer interactive application streaming, and the network is proving critical to the user experience.
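The budget arithmetic can be sketched as follows. The encode and network figures come from the paragraph above; the 20 ms end-to-end target is an assumption borrowed from the VR figure cited later in this article, and the network figure is treated as a single lump sum because the text does not say whether it is one-way or round-trip. The helper name is hypothetical.

```python
# Figures taken from the text: ASIC H.265 encode latency and ideal local-loop latency.
ENCODE_MS = {"720p": 4.0, "1080p": 8.0}
NETWORK_MS = 5.0

# Assumption: a 20 ms end-to-end target, used here only for illustration.
TOTAL_BUDGET_MS = 20.0


def remaining_budget_ms(total_ms: float, encode_ms: float, network_ms: float) -> float:
    """Return the time left for everything else (rendering, decode, display)."""
    return total_ms - encode_ms - network_ms


if __name__ == "__main__":
    for resolution, encode_ms in ENCODE_MS.items():
        left = remaining_budget_ms(TOTAL_BUDGET_MS, encode_ms, NETWORK_MS)
        print(f"{resolution}: {left:.1f} ms left of {TOTAL_BUDGET_MS:.0f} ms")
```

Whatever remains has to cover rendering, client-side decoding, and display, so the usable headroom depends on the application and on the target chosen.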

A wide range of low latency use cases is driving the emergence of streaming architectures and video processing technologies. Games that require fast response, quick pointing precision, or easily noticed visual feedback suffer when latencies begin to exceed several frames; at 60 FPS, frames are 16.7 ms apart. For emerging augmented reality (AR) and immersive VR, tracking user movements and translating camera pose into a real-time visual experience requires sub-frame motion to photon latency. Movements and head/pose tracking must be incorporated as soon as possible, ideally in the next frame of visual output. This next frame requirement creates the often-cited “20 ms motion to photon” requirement for immersive VR: 20 ms is just under two frame intervals at 90 frames per second (about 11.1 ms each). In virtual reality, the term presence describes the user’s sensation that they are in a particular space as if it were reality. To deliver presence in cloud-based VR, every step in the computing and networking architecture must work together to achieve sub 20 ms latency with high-quality, high-resolution, high-framerate visuals.
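The frame-time arithmetic behind these figures is easy to reproduce. The sketch below is illustrative only; the function names are hypothetical, and it simply divides one second by the frame rate and compares a latency budget against the resulting interval.

```python
def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


def frames_within(budget_ms: float, fps: float) -> float:
    """How many frame intervals fit inside a latency budget."""
    return budget_ms / frame_interval_ms(fps)


if __name__ == "__main__":
    for fps in (60, 90):
        print(f"{fps} FPS: {frame_interval_ms(fps):.2f} ms per frame")
    print(f"20 ms at 90 FPS spans {frames_within(20.0, 90):.2f} frame intervals")
    print(f"Three frames at 90 FPS take {3 * frame_interval_ms(90):.1f} ms")
```

Running the same helpers with other frame rates shows how quickly the per-frame budget shrinks as displays move to higher refresh rates.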

Variables that normally work against one another must be tamed simultaneously: visual quality vs. bitrate and latency vs. visual quality can no longer be opposing forces. Demand for lower latency will continue far into the future, with everything from immersive VR to cloud-powered virtual agents requiring a fast response to user input.

Deploying 5G front and back ends and using compute resources closer to the user are critical. From a server perspective, dynamic resource allocation should allow better scalability in achieving parity with native experiences, but the high-quality displays in today’s smartphones create pressure on both resolution and visual quality. Achieving high quality with low latency remains a gap, one that high-speed ASIC encoding aims to close, improving the experience on mobile devices and complementing low latency 5G networks.

To achieve the scale and economics required, video processing will need to scale in the cloud, and processing inefficiencies once tolerable now need to be eliminated.

New streaming architectures need to reduce latency by placing compute closer to users, and to reduce upstream bandwidth by processing application data and video near those users, minimizing backhaul traffic to central data centers. The technologies linking the data center, regional point of presence, wireless or wired network base stations, and users must be tightly integrated. This tight integration requires network service providers to work together with application service providers to ensure the quality of service for user applications.
