AWS Network Performance Management Problems and Solutions

Cloud-based online services and applications face the challenge of optimizing end user experience in the face of ever-growing bandwidth demand. This is further complicated by network-related aspects of application performance such as latency, congestion and jitter, which are inherent to the architecture of the internet.

These issues can have a very negative impact on end user experience and can vary widely depending on the customer’s geographical location or the ISP/network that they use. End users on the US East Coast may have a great experience, while those in Europe see long latencies and very slow performance of your web shop or application. Similarly, an end user at one ISP may experience a very responsive service, whereas a neighbor down the street on a different ISP is confronted with a very sluggish website with long page loading times due to slow delivery of ads in the pages.

In this article we discuss AWS network performance management problems and solutions.

Key problems in AWS network management

1. Business problem

Network performance is essentially a black box for online service providers, who have little to no visibility into performance metrics like latency and congestion. As of today, (Net) DevOps personnel have to manually diagnose network performance issues and redirect network traffic to avoid these problems. This is not an exact science and is mostly reactive in nature. Putting in place the hardware capabilities to optimize the network-related aspects of cloud application performance is also costly and complex. The absence of cost-effective, automated tools that improve these performance metrics in any meaningful way adds to the problem.

The business impact however is very real:

  • Delayed Ad serving, bidding for Adtech
  • Lower conversion rates in eCommerce
  • Customer support issues
  • Lower return rates
  • Increased churn rates
  • Higher bandwidth costs
  • Reduced business/service competitiveness
  • Losing out on potential sales
  • End user quality of service suffers

Network providers have a vested interest in BGP route selection. Not all routes through the internet cost the same. BGP route selections are often influenced by business interests of network providers and their wish to control next hop selection.

ISPs often choose to route traffic through the network paths that bring them the most financial benefit, rather than choosing paths based on network performance metrics. There have been documented cases of large ISPs intentionally creating congestion in some network nodes in order to charge service providers premium rates for non-congested paths. In effect, they create a lesser version of the internet so that they can charge more for the internet as usual.

2. Technical problem

The internet is a huge mesh of complex interconnected networks. It uses two groups of routing protocols to determine the path of traffic through the various networks: Interior Gateway Protocols (IGPs) for intradomain routing, and BGP for interdomain routing between Autonomous Systems (ASes). One way to understand network performance issues is to look at the way in which internet traffic is routed by BGP.

BGP serves as the standardized routing protocol of the internet. It was designed in the early days of the internet with a focus on network reachability and stability. However, it is not very smart when it comes to routing traffic to optimize performance-related metrics like latency, congestion and packet loss. In addition, with the explosive growth of the internet it has become very hard to analyze, manage and troubleshoot.

BGP works by exchanging routing and reachability information between autonomous systems on the internet. BGP makes routing decisions based on a number of criteria, including reachability and the AS_PATH attribute. This basically translates into choosing network routes which are reachable and have the lowest number of AS hops. BGP does not have the capacity to evaluate different network routes based on performance metrics like latency, congestion and packet loss. Therefore, these crucial performance-related metrics are left out when making routing decisions. As a result, network traffic often suffers from high latency, congestion and packet loss.
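To make the point concrete, the following minimal Python sketch contrasts a hop-count based choice with a latency based choice. It is a deliberate simplification: real BGP applies several tie-breakers (such as local preference) before and after AS_PATH length, and the `Route` type, provider names, AS numbers and latency figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str       # hypothetical transit provider name
    as_path: tuple      # AS numbers the route traverses (documentation ASNs)
    latency_ms: float   # measured out of band; BGP itself never sees this


def shortest_as_path(routes):
    """Simplified BGP-style choice: fewest AS hops wins."""
    return min(routes, key=lambda r: len(r.as_path))


def lowest_latency(routes):
    """Performance-aware choice: lowest measured latency wins."""
    return min(routes, key=lambda r: r.latency_ms)


routes = [
    Route("transit-a", (64500, 64501), 180.0),
    Route("transit-b", (64502, 64503, 64504), 45.0),
]

print(shortest_as_path(routes).next_hop)  # transit-a
print(lowest_latency(routes).next_hop)    # transit-b
```

In this invented example the route with fewer AS hops is four times slower, yet a selection based on AS_PATH length alone would still prefer it.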

GlobalDots’ solution

GlobalDots probes upwards of 600,000 network prefixes in real time and collects performance data for every path through the network. This data is then processed and analyzed to determine the best path through the network, with the lowest latency, congestion, bandwidth cost and packet loss. (Net) DevOps can create rules to automate network traffic routing and, in the process, minimize network performance issues.
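One way to picture such a rule is as a set of weights over the measured metrics, with the lowest weighted score winning. The sketch below is only an illustration of that idea and not GlobalDots' actual rule format; `PathMetrics`, `score`, `best_path`, the weights and all the numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathMetrics:
    provider: str         # hypothetical transit provider name
    latency_ms: float
    loss_pct: float
    cost_per_mbps: float


def score(m, weights):
    """Weighted sum of the metrics; lower is better."""
    return (weights["latency_ms"] * m.latency_ms
            + weights["loss_pct"] * m.loss_pct
            + weights["cost_per_mbps"] * m.cost_per_mbps)


def best_path(paths, weights):
    """Return the path with the lowest weighted score."""
    return min(paths, key=lambda m: score(m, weights))


paths = [
    PathMetrics("transit-a", latency_ms=40.0, loss_pct=0.5, cost_per_mbps=2.0),
    PathMetrics("transit-b", latency_ms=70.0, loss_pct=0.1, cost_per_mbps=0.5),
]

# Two hypothetical rules: one favors latency, the other favors cost.
latency_rule = {"latency_ms": 1.0, "loss_pct": 10.0, "cost_per_mbps": 0.0}
cost_rule = {"latency_ms": 0.1, "loss_pct": 10.0, "cost_per_mbps": 20.0}

print(best_path(paths, latency_rule).provider)  # transit-a
print(best_path(paths, cost_rule).provider)     # transit-b
```

The same measurements lead to different paths depending on which metric the rule emphasizes, which is why the rules need to reflect each application's priorities.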

The Cloud/AWS deployment places the GlobalDots appliance between the virtual infrastructure and the transit providers. The connection to the virtual infrastructure is a physical connection (i.e. AWS Direct Connect over 1000BASE-LX or 10GBASE-LR). Each customer is connected to the GlobalDots premises using a unique VLAN identifier.

Within the GlobalDots premises, there is a virtual routing table for each application that reflects the specific performance requirements of the customer. GlobalDots collects RIB (Routing Information Base) data from routeviews.org. A RIB is the database in which a BGP speaker stores the routing information it receives from its peers.
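As a rough mental model of per-application routing tables, the Python sketch below keeps one small table per application and resolves a destination with longest-prefix matching. It uses only the standard `ipaddress` module; the `VirtualRoutingTable` class, the application names, the provider names and the documentation prefixes are hypothetical and do not describe the real appliance.

```python
import ipaddress


class VirtualRoutingTable:
    """Toy routing table: maps prefixes to next hops for one application."""

    def __init__(self):
        self._routes = {}

    def add(self, prefix, next_hop):
        self._routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, address):
        """Return the next hop of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        return self._routes[max(matches, key=lambda net: net.prefixlen)]


# One table per application, each with its own preferences.
tables = {"web-shop": VirtualRoutingTable(), "ad-bidder": VirtualRoutingTable()}
tables["web-shop"].add("198.51.100.0/24", "transit-a")
tables["web-shop"].add("198.51.100.128/25", "transit-b")
tables["ad-bidder"].add("198.51.100.0/24", "transit-b")

print(tables["web-shop"].lookup("198.51.100.200"))  # transit-b
print(tables["web-shop"].lookup("198.51.100.10"))   # transit-a
print(tables["ad-bidder"].lookup("198.51.100.10"))  # transit-b
```

Keeping the tables separate means the same destination can be reached over different providers depending on which application is sending the traffic.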

Figure: architecture of the system, including components such as AWS Direct Connect.

Next, GlobalDots probes all prefixes in the RIB for specific metrics like latency, congestion, packet loss and bandwidth cost. All of this performance data is processed and analyzed by a Spark cluster, and an optimized routing policy is generated for the metrics that the customer has indicated as requirements. The policy is generated by matching the performance attributes of each destination network (prefix) against the customer’s requirements.

Once prefixes have been matched to the customer’s performance requirements, the customer can opt to override the best path selection of the Border Gateway Protocol. Detailed analytic reports are generated and presented to the customer through the front-end dashboard, providing end-to-end visibility into network performance.
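The matching and override steps can be sketched as follows. This is an assumed, simplified model rather than the actual policy engine: the `Requirement` and `Probe` types, the `build_overrides` function, the thresholds and the probe values are invented for illustration, and the prefixes come from the documentation address ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    max_latency_ms: float
    max_loss_pct: float


@dataclass(frozen=True)
class Probe:
    prefix: str
    provider: str
    latency_ms: float
    loss_pct: float


def meets(probe, req):
    """True if the probed path satisfies the customer's thresholds."""
    return (probe.latency_ms <= req.max_latency_ms
            and probe.loss_pct <= req.max_loss_pct)


def build_overrides(probes, bgp_default, req):
    """Return {prefix: provider} where the BGP default path fails the
    requirement but another probed path meets it."""
    by_prefix = {}
    for p in probes:
        by_prefix.setdefault(p.prefix, []).append(p)

    overrides = {}
    for prefix, candidates in by_prefix.items():
        acceptable = [p for p in candidates if meets(p, req)]
        default = bgp_default.get(prefix)
        # Keep BGP's choice if it is good enough or nothing better exists.
        if not acceptable or any(p.provider == default for p in acceptable):
            continue
        overrides[prefix] = min(acceptable, key=lambda p: p.latency_ms).provider
    return overrides


probes = [
    Probe("192.0.2.0/24", "transit-a", 120.0, 0.2),
    Probe("192.0.2.0/24", "transit-b", 40.0, 0.1),
    Probe("198.51.100.0/24", "transit-a", 30.0, 0.0),
    Probe("198.51.100.0/24", "transit-b", 25.0, 0.0),
]
bgp_default = {"192.0.2.0/24": "transit-a", "198.51.100.0/24": "transit-a"}
req = Requirement(max_latency_ms=80.0, max_loss_pct=1.0)

print(build_overrides(probes, bgp_default, req))
# {'192.0.2.0/24': 'transit-b'}
```

In this sketch only the first prefix is overridden, because the path BGP selected for the second prefix already satisfies the requirement. Leaving acceptable defaults alone is a design choice that keeps the number of overrides small.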

Benefits include:

  • Network route optimization
  • Latency optimization
  • Bandwidth cost optimization
  • Avoid network congestion
  • Reduce packet loss
  • Real-time network analysis

Conclusion

Managing AWS network performance poses specific problems that you need to solve if you want to get the most out of your deployment. If you have any questions about how we can help you optimize your cloud costs and performance, contact us today to discuss your performance and security needs.
