CDN Logs Are Priceless. How Can You Better Use Them?

Dr. Eduardo Rocha Senior Solutions Engineer & Security Analyst @ GlobalDots

CDN is Where Most of Your Web Traffic Goes

A content delivery network (CDN) is one of the most important tools for optimizing the performance of heavy-traffic websites and applications deployed in cloud environments. CDNs offer many benefits, including faster page load times, reduced bandwidth usage, and improved availability. They work by caching content on servers worldwide strategically placed for optimal performance.

While CDNs have long been used to serve websites and web applications, they are now more commonly used for serving streaming media such as videos, music, and games. In this context, CDNs have a wider scope of data distribution than traditional web caching: they serve large volumes of data that are usually bandwidth-intensive or latency-sensitive (that is, traffic associated with high-value data). The term ‘web content delivery’ is also used to refer to the use of CDNs to deliver content to web browsers.

CDN Logs

CDN logs are vital data points for understanding a website’s performance and security posture and for handling emerging issues. A CDN log shows you how many times your site was accessed, which domains visit your site most often, and which files are being downloaded. Logs give both CDN providers and their customers valuable insight into the quality of the CDN service: they can be used to identify bandwidth issues, changes in latency, and other problems that might affect CDN performance. A CDN provider should know what is happening on its network and how it affects customer satisfaction. Logs must be monitored regularly so that latency spikes and other potential issues are caught before they affect users.

Collecting and Categorizing CDN Logs

CDN logs can be accessed through the provider’s API or its website. Most CDN websites will have a log in the footer. The API is well suited to monitoring the performance of the CDN, and performance log reports can also be delivered to S3 or to a custom log service.

CDN log files can be used to characterize the workload of each class of traffic. To show the effect of caching on the performance of a system, we divide the log data into classes.

For example, requests for live streams, whether watched over the Internet or on a TV set, have names that contain string patterns such as “live,” “tv,” and the channel name. Video-on-demand (VoD) requests are the remaining chunk and manifest file requests: their names do not contain any string pattern related to the live streaming class, but they do have extensions such as “.dash,” “.ts,” “.m3u8,” and “.mpd.” These records can then be regrouped according to whether they were created during distribution.
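The classification rule described above can be sketched in a few lines of Python. This is only an illustration: the marker strings and extensions are the ones mentioned in the text, while the function name, the three class labels, and the sample paths are assumptions made for the example. Real naming schemes differ between CDNs and packagers.

```python
from urllib.parse import urlparse

# Marker strings and extensions taken from the description above.
# Real deployments would also match on channel names.
LIVE_MARKERS = ("live", "tv")
MEDIA_EXTENSIONS = (".dash", ".ts", ".m3u8", ".mpd")


def classify_request(url: str) -> str:
    """Return 'live', 'vod', or 'website' for a requested URL (hypothetical labels)."""
    path = urlparse(url).path.lower()
    if not path.endswith(MEDIA_EXTENSIONS):
        return "website"
    # A media request counts as live if any path segment carries a live marker.
    segments = path.split("/")
    if any(marker in segment for segment in segments for marker in LIVE_MARKERS):
        return "live"
    return "vod"


for sample in (
    "/live/channel5/index.m3u8",
    "/vod/movies/123/segment_4.ts",
    "/assets/logo.png",
):
    print(sample, "->", classify_request(sample))
```

Substring matching is deliberately crude here. In practice you would anchor the markers to specific path positions to avoid false positives.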

Website records can be divided into smaller parts to analyze them in more detail. For dynamic content, requests that are served from the cache are recorded as “HIT,” while on each “MISS” the CDN falls back to its standard distribution mechanism to fetch the content. Content whose requests never record a “MISS” is treated as packaging content. All other content is organized by category into the non-packaging classes.
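A common first metric to derive from the “HIT” and “MISS” values is the cache hit ratio. The sketch below assumes the cache status of each request has already been extracted into a list of strings. The function name and the sample data are illustrative, not part of any CDN’s API.

```python
from collections import Counter


def hit_ratio(cache_statuses):
    """Share of HITs among requests marked HIT or MISS (other statuses are ignored)."""
    counts = Counter(status.upper() for status in cache_statuses)
    considered = counts["HIT"] + counts["MISS"]
    return counts["HIT"] / considered if considered else 0.0


# Synthetic sample: three hits and one miss.
print(hit_ratio(["HIT", "HIT", "MISS", "hit"]))
```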

The table below lists the records collected from system logs over 7 days. Live streaming requests account for more than 94.5% of the total.

Table: Number of Records per Class

Service            Is Packaging    Number of Records
Live Streaming     No              152697608
Live Streaming     Yes             129278868
Video-On-Demand    No              2069393
Website            No              14301252
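The live streaming share quoted above can be checked directly from the table. The snippet below only restates the table’s figures. The variable names are arbitrary.

```python
# Record counts copied from the table, keyed by (service, is_packaging).
records = {
    ("Live Streaming", "No"): 152_697_608,
    ("Live Streaming", "Yes"): 129_278_868,
    ("Video-On-Demand", "No"): 2_069_393,
    ("Website", "No"): 14_301_252,
}

total = sum(records.values())
live = sum(count for (service, _), count in records.items() if service == "Live Streaming")
live_share = live / total

print(f"total records: {total}")
print(f"live streaming share: {live_share:.1%}")
```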

Information Recorded on CDN Logs

Web Access Log

Depending on your CDN, you can change the format of your logs, potentially to something more amenable to log analysis, such as JSON. Still, the example below gives an overview of the information usually available in CDN logs and will help you analyze them.

127.0.1.1 username [02/JAN/2022:13:56:35 +0000] "GET /my_image.gif HTTP/2.0" 200 159 1388

IP Address (127.0.1.1): The source IP address from which the user requested the data. Here you may see some DNS error messages coming from the same IP address, which can mean that someone has spoofed your URL or is using your IP address inappropriately.

Username (username): Some providers will attempt to decode the Authorization header in the request and pull the username from it. For example, if Basic authentication is used, the username is Base64-encoded together with the password. If you observe any suspicious behavior, you might be able to trace it back to an account you can block.

Timestamp (02/JAN/2022:13:56:35 +0000): As the name suggests, this part of the log specifies when the request was sent. It is one of the most commonly used values when plotting log data on graphs, for instance to detect sudden peaks of traffic.

Request Line (GET /my_image.gif HTTP/2.0): The request line contains the HTTP method, such as GET or POST, the resource the user requested, and the HTTP version. In this example a GET request was issued, which indicates that the user was most likely requesting something from the server. With POST, by contrast, the user is sending something to the server. The request line also lets you itemize which resources were requested and which HTTP version was used.

HTTP Status (200): A status code that starts with 2 is a general-purpose return value confirming that the request succeeded. The remaining digits narrow down the exact outcome, and codes in other classes, such as 4xx and 5xx, indicate client and server errors.

Latency (159): Latency is a hugely important indicator and a crucial metric to track. When latency fluctuates, there is a chance that your end users will notice some slowdown in response. Stable or slightly declining latency indicates that the CDN is running smoothly.

Response size (1388): The response size is commonly disregarded, but it is essential to performance. If your endpoint returns a very large response body, this can create a great deal of extra work for your server. Tracking the response size can help you understand whether your application is overburdened.
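To make the fields above concrete, here is a minimal sketch that parses the sample line into a dictionary and prints it as JSON. It assumes the exact field order of the example and plain ASCII double quotes around the request line. Real CDN log formats vary by provider, and the field names chosen here (`ip`, `latency`, `size`, and so on) are illustrative.

```python
import json
import re
from datetime import datetime

# Field order follows the sample line: ip, username, [timestamp],
# "request line", status, latency, response size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<username>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<latency>\d+) (?P<size>\d+)'
)


def parse_line(line):
    """Parse one access-log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Normalize the timestamp to ISO 8601 so it sorts and plots easily.
    parsed = datetime.strptime(record["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    record["timestamp"] = parsed.isoformat()
    for key in ("status", "latency", "size"):
        record[key] = int(record[key])
    return record


sample = '127.0.1.1 username [02/JAN/2022:13:56:35 +0000] "GET /my_image.gif HTTP/2.0" 200 159 1388'
print(json.dumps(parse_line(sample), indent=2))
```

Returning `None` for lines that do not match, rather than raising, keeps a batch job running when a few malformed lines appear. Whether that is the right choice depends on how strict your pipeline needs to be.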

How Can CDN Logs Benefit Your Organization?

  • CDN logging allows a company to keep logs and determine where its content is going. This is beneficial because it provides insight into where the audience is located. The data in a CDN log, including browser information, geographical location, and industry, can be used to optimize a website for its users. The resulting insight can be used to engage customers and provide better customer service.
  • CDN logging is used for content optimization. When a site uses a CDN, less content has to be delivered from the origin, which benefits the client. In addition, the CDN logs provide information about the site’s performance, including page load time, browser information, and client information.
  • CDN logging is a relatively inexpensive yet effective way for a company to maintain analytics. Using CDN logs, a company can gain a competitive edge and better understand its customers’ behaviors.

The Value of CDN Logs in Security Issues

Hackers target CDNs because of their high visibility and the amount of data they hold. CDN logs record all requests made from the user’s browser, including the IP address from which each request is made, so we can analyze the logs after an attack to find out what the hackers did and where they were located when the attack occurred. CDN Log Protection is a way to avoid this type of attack by monitoring CDN log files: it watches the log files for unusual activity and prevents them from being uploaded to the CDN. CDN Log Protection is a free application that can help you help your customers while preventing hacking attempts.

How CDN Logs Help against DDoS Attacks

In recent years, the world has constantly been grappling with major disruptions to digital services. DDoS attacks have been a major factor in these disruptions. They work by generating many fake requests that clog up the servers and force them to shut down. Companies effectively using CDN logs can better detect DDoS attacks and divert traffic from a server under attack.
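One simple signal that can be derived from CDN logs is the request rate per client IP. The sketch below flags any IP that exceeds a threshold within a single one-minute bucket. The threshold, the function name, and the sample data are assumptions for illustration. Real DDoS detection typically combines many more signals and has to cope with attack traffic spread across many source addresses.

```python
from collections import Counter
from datetime import datetime, timedelta


def suspicious_ips(records, max_per_minute=100):
    """Return the IPs that exceed max_per_minute requests in any one-minute bucket.

    records is an iterable of (ip, timestamp) pairs taken from parsed log lines.
    """
    per_ip_minute = Counter(
        (ip, ts.replace(second=0, microsecond=0)) for ip, ts in records
    )
    return sorted({ip for (ip, _), count in per_ip_minute.items() if count > max_per_minute})


# Synthetic data for illustration only: one noisy client and one normal client.
start = datetime(2022, 1, 2, 13, 56, 0)
sample = [("203.0.113.7", start + timedelta(milliseconds=100 * i)) for i in range(300)]
sample += [("198.51.100.4", start + timedelta(seconds=10 * i)) for i in range(6)]

print(suspicious_ips(sample))
```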

How CDN Logs Help against Malware

Malware attacks can bring even the most powerful of websites to their knees. CDN logs can help thwart attempts by cybercriminals to cripple networks by gaining control of web servers. Such logs can identify malicious activity and help providers shut off access to infected servers.

Solutions For Efficient CDN Observability

CDNs are the lifeblood of the Internet. They transfer data from remote servers to end users, speeding up load times and minimizing latency. However, making sure that they do so consistently can be a challenge. To overcome this obstacle, CDNs have developed many solutions that provide observability into their services and keep access easy and transparent. Several challenges remain, however, especially because many CDNs are highly distributed, which imposes numerous limitations on observability solutions.

Furthermore, since virtually all CDNs use multiple cloud providers, the solutions often have to be customized for different providers to ensure transparency and maintain a consistent look-up across all services. We can address these problems by introducing new solutions and techniques.

#1: View CDN Logs at Source

The straightforward option is to incorporate the observability data directly into the CDN’s infrastructure, so the logs can be viewed and visualized on the CDN platform itself. This approach requires in-depth knowledge of observability and its various aspects, such as dependency management, debugging, reporting, and security. This way, both the CDN and its clients can use the observability data.

#2: View CDN Logs in Your Cloud Provider’s Monitoring Solution

Another option is to use a cloud provider’s native observability solution. This approach will not require customization for different providers but instead may require knowledge of the provider’s native implementation language and the existing features and capabilities. However, there is still a possibility that some types of data will remain inaccessible using this approach.

#3: Streaming CDN Data to a 3rd-Party Solution

There are many robust logging and monitoring solutions available that integrate with most of your business applications. However, CDNs produce so many logs that your solution of choice must have exceptional data ingestion capabilities. If you want events to be reflected in real time, the solution shouldn’t depend on indexing. Finally, if you don’t want your costs to skyrocket, it should also offer tiered storage pricing for the masses of data that are not frequently searched.

Another option is to host a custom open-source project that uses the observability data from the CDN, which may provide additional functionality and improve the existing solutions. It also comes at a cost: hiring a team to build and operate such an infrastructure.

Why Don’t Organizations Utilize Native CDN logging?

Viewing CDN logs at source might seem to be the simplest option listed above. But this isn’t really the case, as it comes with great costs.

  • Latency – Native CDN logs, as displayed in the CDN dashboard, are, surprisingly, not in real time. The 3–4-minute latency means that a service or security issue is detected only after it has already affected your website. This, obviously, isn’t soon enough.
  • Limited visibility – Native CDN logs will only flag issues if the integrated bot manager flags them. So, if your issue has nothing to do with bots, or if there’s a false negative on your bot manager’s end, chances are it will go unnoticed.
  • Low readability – Visualization isn’t the strongest side of CDN logs. Filtering is limited, and the visualization cannot be customized: you have to work with a narrow, predefined set of charts. As a result, these logs are hard to consume and derive insights from when viewed at their source.

Why Traditional 3rd Party Logging Doesn’t Cut It

Due to the restricted troubleshooting capabilities of native CDN monitoring solutions, most teams export their CDN logs to more robust 3rd-party log management and monitoring systems. Unfortunately, traditional logging platforms have additional associated challenges.

CDNs produce a high volume of data, and traditional monitoring tools must index all of this data before they can analyze it and provide insights. Alternatively, they can use a more BI-oriented approach, in which a complex big data operation is needed to analyze the data and the results are far from real time. Indexing all of this data is not only cost-prohibitive but also inefficient for pinpointing crucial information and alerting on critical events, while the BI approach is complex and requires constant maintenance.

To get around the challenges of high data volumes, teams reduce coverage by ingesting only part of the data – either by blocking low-level log statements or with sampling. Unfortunately, this means that only a partial view into performance is being monitored and any insights that could be gleaned from those logs will be lost. Needless to say, with incomplete coverage you have no actual way to verify how well your CDN is working for you and not a clue as to how much you will pay by the end of the billing period.

What to Look for in 3rd Party CDN Logging in 2022

Worthwhile 3rd party logging solutions are all about effective data ingestion and visualization.

Some leverage a proprietary streaming technology to ingest and analyze event data without relying on indexed storage. This enables users to ingest and monitor large data volumes with zero latency and without worrying about exponential cost increases or performance issues.
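The general idea of analyzing events as they arrive, rather than indexing them first, can be illustrated with a small rolling aggregate. This is not any vendor’s technology, only a generic sketch. The class name, the window size, and the choice of 5xx responses as the tracked signal are assumptions.

```python
from collections import deque


class RollingErrorRate:
    """Track the share of 5xx responses over the last `window` requests.

    Each event is processed once, in constant time, and nothing is indexed.
    """

    def __init__(self, window=1000):
        self.flags = deque(maxlen=window)
        self.errors = 0

    def observe(self, status):
        # Account for the entry that is about to be evicted from a full window.
        if len(self.flags) == self.flags.maxlen:
            self.errors -= self.flags[0]
        is_error = 1 if status >= 500 else 0
        self.flags.append(is_error)
        self.errors += is_error
        return self.errors / len(self.flags)


tracker = RollingErrorRate(window=4)
for status in (200, 500, 200, 200, 200, 200):
    print(status, tracker.observe(status))
```

An alert rule could then fire whenever the returned rate crosses a chosen threshold, without waiting for a batch job or an index refresh.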

Other traits we would like to see include:

  • Grouping similar requests to reduce the amount of data shown
  • Real-time visualization
  • Kibana integration
  • Data enrichment that allows you to flag suspicious events even if they are not recognized as bots at the origin.
  • Alert setting
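As an illustration of the first item in the list above, grouping similar requests, the sketch below collapses purely numeric path segments into a placeholder so that thousands of distinct URLs reduce to a handful of templates. The `{id}` placeholder, the function names, and the sample paths are assumptions. Production tools use far more elaborate clustering.

```python
import re
from collections import Counter

# A path segment made only of digits is treated as an identifier.
ID_SEGMENT = re.compile(r"/\d+(?=/|$)")


def template(path):
    """Replace numeric path segments with a placeholder."""
    return ID_SEGMENT.sub("/{id}", path)


def group_requests(paths):
    """Count requests per path template, most frequent first."""
    return Counter(template(path) for path in paths).most_common()


paths = ["/video/123/segment/45", "/video/9/segment/1", "/img/logo2.png"]
for name, count in group_requests(paths):
    print(count, name)
```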

How CDN Logs are Saving Costs

It is estimated that CDN Logs saves customers around 5%-10% per month on their bandwidth costs. This is because CDN Logs uses a custom algorithm to send logs to the most efficient location for storage based on the importance and volume of data.

Every company has a different set of needs and wants when it comes to storing data, and this is why CDN Logs can suit any business’ needs. The most important aspect of CDN Logs is that it is cost-effective. This is because businesses are not required to pay for storage facilities in traditional log stores. In addition, companies do not have to spend time and money developing their own data collection tools. CDN Logs is backed by a 100% uptime guarantee, which shows that our customers are completely confident that their logs will be stored reliably and securely. 

Conclusion

Logging solutions can save hours when you are trying to troubleshoot problems. The best way to evaluate a logging solution is to start with an assessment of your organization’s needs and then choose a solution that fits them. You will want to ask yourself a few things when assessing logging solutions. How many different logs do you need to capture? What types of logs would you like to capture? And can they be captured in a central location that is easy for you to access?

The decision between using a logging service and creating your own log files requires research and evaluation. You need to ensure that you have the right tools in place first. Logging solutions differ in their functions, features, limitations, and capabilities.

GlobalDots CDN & Observability solutions offer scalable, customizable logs that surface issues in real time and can be searched in seconds. They offer a comprehensive suite of enterprise-class services and 24/7 availability and support.

If your website uses a CDN and you want to get the most benefits out of it using logs to ensure performance & security, contact us for a free assessment.