A Content Delivery Network (CDN) is an interconnected system of strategically placed servers that provides content to a large number of users by duplicating that content across those servers and delivering it from the server closest to each user. A CDN can deliver identical static and dynamic content throughout the world to many users at precisely the same time. Testing that the content is indeed identical and in sync across the entire network presents many challenges, and those challenges grow with the size of the network and the variety of the content. Because CDNs are often large and complicated systems in and of themselves, ensuring proper functionality for the many end users is a daunting and complicated process with potentially catastrophic consequences if done incorrectly. In this white paper, we will explore the process of testing a CDN, including the individual facets that must be considered and their importance to the network as a whole.
Figure 1 – via https://www.business.att.com/content/productsub-category/images/ICDS_15831-diagram.jpg
The CDN testing process comprises both typical web-testing processes and processes specific to the networks themselves. Because of the diverse nature of these systems, it can be hard to pinpoint exactly what will be necessary in testing them; that said, the following discussion covers many of the components involved in testing CDNs. Many of these components are interrelated and interdependent; the system may demand multiple types of testing to assess one aspect, such as user permissions. As we explore these methods, it may be necessary to cross-reference a handful of systems across multiple types of testing; this is true of the testing process itself, which creates a web of interwoven processes within which the network is tested.
This type of testing depends upon the type of content your application distributes, and each type of content (static pages, dynamic pages, downloadable video, streaming video (live and previously recorded), audio (downloads and streaming), images, text, etc.) will require a different testing approach. To illustrate the complexity of content-specific testing, we will use the example of a network that delivers video content to users. Video content can typically be delivered through a number of means: live streaming, on-demand streaming, downloadable video files, and more. Video streaming is complicated by multiple protocols, formats, and playback devices. On-demand video is often delivered using adaptive bitrate (ABR) streaming, which splits the video into chunks (usually about 5-10 seconds in length) and then serves those chunks in various formats and bitrates to match the user's video player. The video can be delivered over wired internet, WiFi, 3G, and 4G networks, and the user can start playing the video at any point they want, skipping around as they watch. It is important to test that the same video is delivered seamlessly to various devices in various formats over various networks, and that users can control playback of the video as they would with a DVR.
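As a concrete illustration of how such a check might start, the following sketch walks an HLS master playlist, confirms that each bitrate variant's playlist can be fetched, and verifies that the first media segment of each variant is reachable. The URL, the variant layout, and the use of HLS are assumptions made for illustration, not details of any particular network.

```python
# A minimal ABR smoke test, assuming the CDN exposes an HLS master playlist
# at a hypothetical URL; paths and names are illustrative.
import urllib.parse
import requests

MASTER_URL = "https://cdn.example.com/video/12345/master.m3u8"  # hypothetical

def variant_uris(master_text, base_url):
    """Yield absolute URIs for each bitrate variant listed in the master playlist."""
    lines = master_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF") and i + 1 < len(lines):
            yield urllib.parse.urljoin(base_url, lines[i + 1].strip())

def check_variant(uri):
    """Fetch a variant playlist and confirm its first media segment is reachable."""
    playlist = requests.get(uri, timeout=10)
    playlist.raise_for_status()
    segments = [l for l in playlist.text.splitlines() if l and not l.startswith("#")]
    assert segments, f"no media segments listed in {uri}"
    first_segment = urllib.parse.urljoin(uri, segments[0])
    head = requests.head(first_segment, timeout=10)
    assert head.status_code == 200, f"segment unreachable: {first_segment}"

if __name__ == "__main__":
    master = requests.get(MASTER_URL, timeout=10)
    master.raise_for_status()
    for uri in variant_uris(master.text, MASTER_URL):
        check_variant(uri)
        print("OK", uri)
```

In practice the same check would be repeated from clients on different networks and device profiles, since the point of ABR testing is that every variant works everywhere.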
Live streaming video presents a collection of unique challenges for testers. Live video can't be cached; the CDN must still ingest the live stream, decode the video and audio, and then reformat it to match the user's possible playback devices and network speeds. The user's physical location presents an additional challenge when the live content is, for example, a broadcast of a concert in New York. The data must travel a sizeable distance before it reaches users in California who want to watch the concert; will the network stream to them with a built-in latency that depends on their distance from the main server? Is there only one source delivering all of the packets for the CDN through a very high-bandwidth pipe, or does it feed repeaters or reflectors that are closer to the users on the west coast? Can the CDN administrator change the configuration settings depending on the expected viewership of a specific live feed? The CDN must be tested for how it reacts to all possible configurations before the content goes live, indeed before it is even determined which configuration the network itself will use. One major concern with live video streaming is what the server will do if the event is very popular and demand exceeds its capacity. Will it reject additional users rather than degrade the experience for current users, will it automatically spin up new repeaters, or will it build in additional delay and deliver 'almost-live' content? The strategy for handling a large load on the system must be tested, because excessive latency (caused by having too many users on the system), as well as variations in latency from moment to moment, can cause annoying jitter in the streaming audio and/or video, resulting in a poor user experience.
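One piece of that strategy can be probed with a simple latency and jitter measurement under concurrent load. The sketch below requests a hypothetical live-stream chunk URL from many threads and reports the mean latency and its variation; a real load test would use a purpose-built tool and far more users, so treat this only as an illustration of what to measure.

```python
# A minimal latency/jitter probe under concurrent load; the chunk URL and the
# user counts are illustrative assumptions.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
import requests

CHUNK_URL = "https://cdn.example.com/live/event-42/chunk-latest.ts"  # hypothetical
CONCURRENT_USERS = 50
REQUESTS_PER_USER = 10

def one_user(_):
    """Simulate one viewer repeatedly fetching the latest chunk."""
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        start = time.monotonic()
        r = requests.get(CHUNK_URL, timeout=10)
        latencies.append(time.monotonic() - start)
        assert r.status_code == 200, f"unexpected status {r.status_code}"
    return latencies

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        results = list(pool.map(one_user, range(CONCURRENT_USERS)))
    all_latencies = [l for user in results for l in user]
    print(f"mean latency  : {statistics.mean(all_latencies) * 1000:.1f} ms")
    print(f"jitter (stdev): {statistics.pstdev(all_latencies) * 1000:.1f} ms")
```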
Figure 2 – via https://technowide.net/wp-content/uploads/2012/09/CDN.png
A related concern is testing permissions tied to physical location, such as area-specific content that must be blocked due to regional rights restrictions; for example, a sports game that cannot be viewed live in the area surrounding the stadium, to encourage local enthusiasts to attend the game itself instead of watching online from their homes. Testers must make sure that these restrictions are observed, and there are two main methods for doing so. The first is to have testers (or remotely accessible devices) in the actual physical locations try to access the content in the above-mentioned simulated "live" environment, and the second is to use a program that spoofs the client's location to attempt to fool the system.
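A hedged sketch of the second approach is shown below. It assumes the test environment honors a client-IP override header (X-Forwarded-For is used purely as an example; the actual mechanism, and whether one exists at all, depends on the CDN) and checks that a blacked-out stream is rejected for an in-region address and served for an out-of-region one.

```python
# A minimal geo-restriction check; the URL, the override header, the IPs, and
# the expected status codes are all illustrative assumptions.
import requests

BLACKOUT_URL = "https://cdn.example.com/live/game-7/master.m3u8"  # hypothetical
CASES = {
    "inside blackout zone":  ("198.51.100.10", 403),  # documentation-range IPs
    "outside blackout zone": ("203.0.113.25", 200),
}

if __name__ == "__main__":
    for label, (client_ip, expected_status) in CASES.items():
        r = requests.get(BLACKOUT_URL,
                         headers={"X-Forwarded-For": client_ip},
                         timeout=10)
        assert r.status_code == expected_status, (
            f"{label}: expected {expected_status}, got {r.status_code}")
        print(f"{label}: OK ({r.status_code})")
```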
Another example of content-specific testing would be an online radio station, such as Pandora. Most of these services display song information while a track is playing, such as artist biographies and album art pulled from third-party content providers. That information must be refreshed immediately when the user changes channels or when a song ends and a new one starts playing. Testers need to make sure that the system manages content synchronization so that the matching content goes out to users at precisely the same time, everywhere. This relates to the network's push and pull of content and to making sure that the content gets where it needs to go within milliseconds. This is often accomplished by placing as many CDN servers as close to users as possible, and by making sure that the content reaches all of the machines and that they all get updated at exactly the same time.
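One way to spot-check that synchronization is to poll the same metadata endpoint on several edge servers and compare the payloads. The host names and endpoint below are illustrative, and comparing content hashes in this way is only a sketch of the idea.

```python
# A minimal synchronization check across edge servers, assuming a hypothetical
# "now playing" metadata endpoint that every edge should serve identically.
import hashlib
import requests

EDGE_HOSTS = [  # illustrative host names
    "edge-us-east.cdn.example.com",
    "edge-us-west.cdn.example.com",
    "edge-eu-west.cdn.example.com",
]
PATH = "/radio/channel-9/now-playing.json"

def payload_hash(host):
    """Fetch the metadata from one edge and return a hash of the raw payload."""
    r = requests.get(f"https://{host}{PATH}", timeout=10)
    r.raise_for_status()
    return hashlib.sha256(r.content).hexdigest()

if __name__ == "__main__":
    hashes = {host: payload_hash(host) for host in EDGE_HOSTS}
    assert len(set(hashes.values())) == 1, f"edges out of sync: {hashes}"
    print("all edges serving identical metadata")
```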
It can be hard to set up a test environment for this type of CDN; often testing is done on the live system during off hours, using content within that system which users do not yet have access to. The environment may change from client to client, as the specifications of what to test and how to test it will change depending on the customer's needs. In addition, the client may change their system's functional requirements over time as the business evolves, so testers must take into account all of the ways information can be shuffled around the system. For example, it used to be that client information was pulled in by the user's system only when it was necessary, but this could cause as much as a ten-second delay for the end user. Therefore, most modern systems have the CDN push information from a queue, which is much faster and provides a better user experience.
Network simulation testing relates to content-specific testing as described above, but it is performed in a non-production, or non-live, environment. We have had success using the Amazon Cloud to set up machines in different areas to build different networks manually. However, tools exist to simulate this set-up, such as Cisco's Packet Tracer.
Pull content is content that the user requests when they want it, such as weather information relevant to their zip code. Content that is dynamic and changing continuously is often "pushed" to the end user (or to a repeater near the end user so it is available immediately), such as a stock market ticker. The first step in testing push and pull content is to determine which data within the system is being pushed and which is being pulled, and to ensure that it is possible to test both within the system's constraints. This must be considered beforehand as well as during the testing process.
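A simple starting point for that inventory is to look at the cache-related headers each kind of content comes back with. The sketch below assumes standard Cache-Control and Age headers are exposed (many CDNs also add a vendor-specific hit/miss header whose name varies); the URLs are illustrative.

```python
# A minimal first-pass inventory of cache behavior for different content types;
# URLs are illustrative assumptions.
import requests

URLS = [
    "https://cdn.example.com/weather/98101.json",    # pull content by zip code
    "https://cdn.example.com/ticker/stock-feed.json", # continuously pushed data
    "https://cdn.example.com/images/logo.png",        # static asset
]

if __name__ == "__main__":
    for url in URLS:
        r = requests.get(url, timeout=10)
        cache_control = r.headers.get("Cache-Control", "<none>")
        age = r.headers.get("Age", "<none>")
        print(f"{url}\n  Cache-Control: {cache_control}\n  Age: {age}")
```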
A CDN can have five, fifty, five hundred, or even thousands of servers; how does content get from one machine to another? No matter the size of the network, answering this question will always be necessary for an effective and efficient network. For example, if the network has 500 machines, there could be five nodes, each of which pushes to a hundred machines; there could be repeaters that are automatically set up to propagate the data; or the data could be propagated based on usage (as soon as one user pulls specific content, it is cached and stored on that server for the next user). The CDN administrator can also assign specific nodes and schedule replication for specific pieces of content.
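Whatever the replication scheme, testers usually want to measure how long new content takes to reach every edge. The sketch below polls each edge for a known version marker (an ETag value here) after the content has been published to the origin by whatever mechanism the client uses; the host names, path, and marker are all assumptions for illustration.

```python
# A minimal propagation-delay measurement across a set of edge servers; all
# names, paths, and the version marker are illustrative assumptions.
import time
import requests

EDGE_HOSTS = ["edge-%02d.cdn.example.com" % i for i in range(1, 6)]
PATH = "/catalog/prices.json"
EXPECTED_ETAG = '"v2024-06-01"'  # marker attached to the newly published copy
POLL_INTERVAL = 5                # seconds between polls
TIMEOUT = 600                    # give up after 10 minutes

def wait_for_version(host):
    """Return the seconds it took for the new version to appear, or None."""
    start = time.monotonic()
    while time.monotonic() - start < TIMEOUT:
        r = requests.head(f"https://{host}{PATH}", timeout=10)
        if r.headers.get("ETag") == EXPECTED_ETAG:
            return time.monotonic() - start
        time.sleep(POLL_INTERVAL)
    return None

if __name__ == "__main__":
    for host in EDGE_HOSTS:
        elapsed = wait_for_version(host)
        status = f"{elapsed:.0f} s" if elapsed is not None else "never arrived"
        print(f"{host}: {status}")
```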
Figure 3 – via https://www.business.att.com/content/productsub-category/images/ICDS_15831-diagram.jpg
How a network is set up depends on how time-dependent the data is and how the end user will utilize it. One page of data may be dynamic but might only update once a day; this page may not go to its specific server until someone in the area requests it, at which point it loads for that user and is cached, ready for the next person. If data does not reach a server until the page is requested, then testers must request the page from near each individual content server. There are often multiple schemes present in a CDN, and each needs a different testing process to assess all of the potential defects. Ideally, the testers are able to set up and test all of the possible configurations for replicating the content. Time-sensitive data that is not presented to all users at precisely the same time can cause many issues. Imagine an auction site where bidders from all over the world are bidding against each other for a prized object: what happens during the last few seconds of bidding if the users are seeing different prices? A user may attempt to put in a bid that is immediately rejected because he is not seeing the actual price. The users will start to distrust the auction site and stop using it.
If the data is not frequently accessed, it can be loaded into a cache when one person needs it, and then it will be there for everyone. For each system, it must be determined what can be cached and when, or whether it should ever be cached at all, and all possibilities must be tested; if a page never caches, does that page always update with a new, fresh copy? The caching method works well for some types of data, but obviously there are types which cannot take advantage of caching. One example would be stock market data, which needs to be replicated and refreshed immediately, all the time; testers for this type of data would need to ensure that it is never cached and that the parameters which have been set up for it are met. This is a question of determining how the system is set up and making sure that it processes data exactly as it is supposed to.
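A hedged sketch of such a "never cached" check appears below. It assumes the endpoint is supposed to carry a no-store or no-cache directive and to report an Age of zero (or no Age header) from every edge; the URL is illustrative.

```python
# A minimal "never served from cache" check for fast-changing data; the URL
# and the exact cache-control policy are illustrative assumptions.
import requests

TICKER_URL = "https://cdn.example.com/ticker/stock-feed.json"  # hypothetical

if __name__ == "__main__":
    for attempt in range(3):
        r = requests.get(TICKER_URL, timeout=10)
        cache_control = r.headers.get("Cache-Control", "")
        age = int(r.headers.get("Age", "0"))
        assert "no-store" in cache_control or "no-cache" in cache_control, (
            f"unexpected Cache-Control: {cache_control!r}")
        assert age == 0, f"stale copy served from cache (Age: {age})"
    print("ticker responses never served from cache")
```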
Edges in a CDN are literally just that: the physical edge of the network that is closest to the end user of the content. A user could sit exactly between two servers, sometimes pulling content from one server and sometimes from the other (if one server is busy, for example); in this situation, it is easy for one server to get out of sync with the other, providing content to the end user which may or may not be the most current. If an edge server does not have a particular piece of requested content, it may go to the next closest edge server for the information, or to the nearest distribution node, or, if neither of those servers has the information, all the way back to the main content server. Understanding how that content gets delivered is important in designing test scenarios to verify that end users are always getting the correct content as quickly as possible.
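It can help to record which tier actually served each request. The sketch below reads a hit/miss header and the Via chain; the header name (X-Cache here) and its values vary by vendor, so both are assumptions to adapt to the network under test.

```python
# A minimal probe of which cache tier served a request; header names and the
# URL are illustrative assumptions that vary by CDN vendor.
import requests

URL = "https://cdn.example.com/pages/daily-report.html"  # hypothetical

if __name__ == "__main__":
    for attempt in range(1, 4):
        r = requests.get(URL, timeout=10)
        tier = r.headers.get("X-Cache", "<no cache header>")
        path = r.headers.get("Via", "<no Via header>")
        print(f"attempt {attempt}: status={r.status_code} X-Cache={tier} Via={path}")
```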
It is also important to duplicate or back up the servers in a particular area to make sure they always provide accurate content delivery; if a server goes down in Seattle, the next nearest one may be in California, but if the Seattle and California servers are not in sync with each other, the data the end user receives could be completely unusable. For example, on an online shopping network, if a user is looking at a product when the server goes down, the new server may not have received the most current price list, so the user could be seeing an inaccurate price for the items they are shopping for. It is important to test these scenarios and even to force the servers to have out-of-sync data, removing one of the servers and verifying that the business rules for handling these situations are followed. While there may be a requirement that all of these systems serve up the same data at precisely the same time, the only way to verify that is to have constant monitoring in production; testing the scenario where this actually does not happen is just as important.
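A hedged sketch of the consistency half of that test is shown below; taking the primary server offline is an infrastructure step outside the script, and the host names, path, and payload are illustrative.

```python
# A minimal consistency check between a primary edge and its failover target;
# host names and the price endpoint are illustrative assumptions.
import requests

PRIMARY = "edge-seattle.cdn.example.com"
FAILOVER = "edge-california.cdn.example.com"
PATH = "/store/product/991/price.json"

def fetch_price(host):
    """Fetch the price payload from one specific edge server."""
    r = requests.get(f"https://{host}{PATH}", timeout=10)
    r.raise_for_status()
    return r.json()

if __name__ == "__main__":
    primary_price = fetch_price(PRIMARY)
    failover_price = fetch_price(FAILOVER)
    assert primary_price == failover_price, (
        f"servers out of sync: {primary_price} vs {failover_price}")
    print("primary and failover edges agree on price data")
```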
This type of testing was briefly discussed earlier; it relates to whether certain content is freely available while other content is protected or encrypted from user access. An example of on-demand, pay-for-use content would be a system like Netflix. Unless you have a Netflix subscription, you should not be able to stream content from Netflix. Things to test would include verifying that multiple users cannot log in with the same Netflix account, or from multiple devices at once (streaming over your phone's 4G network will use a different protocol than streaming to your television over the internet). You would also want to test logging in from different locations that would hit different CDN servers: does the login authentication go back to one main server, or does each CDN server contain information about who is logged in at any given time?
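A hedged sketch of a concurrent-session check follows. The login endpoint, playback endpoint, field names, and expected rejection codes are all hypothetical; the point is only to show the shape of the test, logging in as if from two devices and confirming that the second simultaneous stream is refused.

```python
# A minimal concurrent-session limit check; every endpoint, field name, and
# status code here is a hypothetical placeholder, not a real service's API.
import requests

AUTH_URL = "https://api.example.com/v1/login"          # hypothetical
STREAM_URL = "https://cdn.example.com/play/title-77"   # hypothetical
ACCOUNT = {"user": "qa-account@example.com", "password": "not-a-real-password"}

def start_stream(session_token):
    """Request playback using one device's session token."""
    return requests.get(STREAM_URL,
                        headers={"Authorization": f"Bearer {session_token}"},
                        timeout=10)

if __name__ == "__main__":
    # Log in twice, as if from two different devices.
    tokens = [requests.post(AUTH_URL, json=ACCOUNT, timeout=10).json()["token"]
              for _ in range(2)]
    first = start_stream(tokens[0])
    second = start_stream(tokens[1])
    assert first.status_code == 200, "first stream should be allowed"
    assert second.status_code in (403, 409), (
        f"second concurrent stream should be rejected, got {second.status_code}")
    print("concurrent-stream limit enforced")
```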
Protection of content is not always replicated to all servers automatically, so the testing process should include ensuring that all servers receive the updates that keep access data consistent, i.e., who has access and who does not. That way, someone who is not allowed to listen in New York is also denied service when they travel to Virginia, et cetera.
When testing a CDN, it is vital to remember that there is no single method that will work for all networks. Because of the size and complexity of CDNs, and the large number of people typically using them with varying degrees of access to the network's content, careful consideration must be given to the test process. All parameters of a system must be considered while designing the tests and test processes; each network must be tested differently based on its requirements, expected content, and user base. To properly test a system of this complexity, it is vital to work with a group that has extensive knowledge of testing in general, as well as deep expertise with CDNs in particular. Leaving the testing process to someone less qualified can result in a poor user experience, decreased ROI, and potentially even leaks of private data that proper testing would have prevented, among other damaging consequences.