I keep hearing terms like Cloudflare, Akamai, Fastly - but what do they really do behind the scenes, and why do companies rely on them so heavily?
Amazon could build one ginormous warehouse in the middle of the US. It would probably be super efficient at getting things in and out because it wouldn't have to worry about sending surfboards to California and extra snowblowers to Wisconsin, but only in the fall/winter. That being said, roads around that building would be congested and shipping times would be terrible.
Websites are similar. You want smaller, regional operations closer to the user. Instead of every website building its own data centers close to everyone, the company behind it can hire those providers to host copies of its content all around the world. This means those 100 gig Call of Duty downloads don't have to go all the way from the US to Australia thousands of times. The CDN downloads the file once to the Australian server and all the local downloads are pointed there. That way we don't overload long-distance links with redundant data, and people who want to download other things still have bandwidth available.
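To put rough numbers on that, here is a back-of-the-envelope sketch. Only the 100 GB size comes from the example above; the figure of 5,000 local downloads is made up purely for illustration.

```python
# Back-of-the-envelope: traffic over the long-distance link, with and without
# a regional copy. LOCAL_DOWNLOADS is a hypothetical number.
GAME_SIZE_GB = 100
LOCAL_DOWNLOADS = 5_000


def long_haul_traffic_gb(size_gb: int, downloads: int, use_cdn: bool) -> int:
    """Return how many GB have to cross the long-distance link."""
    if use_cdn:
        # The CDN pulls the file across once, then serves every download locally.
        return size_gb
    # Without a CDN, every single download crosses the link separately.
    return size_gb * downloads


without_cdn = long_haul_traffic_gb(GAME_SIZE_GB, LOCAL_DOWNLOADS, use_cdn=False)
with_cdn = long_haul_traffic_gb(GAME_SIZE_GB, LOCAL_DOWNLOADS, use_cdn=True)
print(f"without CDN: {without_cdn:,} GB, with CDN: {with_cdn:,} GB")
# prints "without CDN: 500,000 GB, with CDN: 100 GB"
```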
When you host a website, only a small percentage of the content actually needs to be updated in real time. CDN stands for content delivery network. These providers run a network of servers all over the world that store the parts that don't change often, so those parts are physically closer to the end user when they request a page. That means a user in Malaysia can start downloading all of the things that make the website look nice from a server in Malaysia, while the main content comes from wherever the site itself is hosted.
To give a more concrete example, there is dynamic content and then there is static content.
Dynamic content is, for example, this very comment I am typing right now. The people who created Reddit did not write it, but they provided me with the means to write it. So this comment gets stored in a database, and there's a lot of programming code involved in storing it and getting it back out. That code requires processing power to retrieve the comment as well as to display it.
By contrast, static content is things like the Reddit logo. That's just a plain gif or png file and it almost never changes. However, static content can actually be pretty huge. It ranges from small files like stylesheets (the files defining your website's colours, where images go and so on) to images and huge files such as videos of full movies.
So in order to save processing power and bandwidth, the dynamic content servers that run the code are left to deal with that, and the static content is served from CDN servers.
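A minimal sketch of that split, assuming the decision is made purely on the file extension. The extension list and the `served_by` name are made up for illustration, and the path is assumed to have no query string; real setups configure this in the CDN or the web server.

```python
from pathlib import PurePosixPath

# Hypothetical set of extensions we treat as static content.
STATIC_EXTENSIONS = {".png", ".gif", ".jpg", ".css", ".js", ".mp4"}


def served_by(path: str) -> str:
    """Return "cdn" for static-looking paths and "origin" for everything else."""
    suffix = PurePosixPath(path).suffix.lower()
    return "cdn" if suffix in STATIC_EXTENSIONS else "origin"


print(served_by("/static/logo.png"))  # prints "cdn"
print(served_by("/comments/123"))     # prints "origin"
```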
Even with content that changes often it can be worth using a CDN - for example we have a web API that gets very bursty request patterns. Putting a CDN in front with a pretty short TTL on the cache can really help to reduce the load at origin during the spikes. Of course this pattern isn't appropriate for every endpoint (some might *really* need the bang-up-to-date data) - but in many cases it's reasonable.
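Here is a rough sketch of why a short TTL helps with bursts. It is a toy in-memory cache, not how any particular CDN is implemented; `TTLCache`, `slow_origin` and the `/api/scores` path are all made-up names.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Tiny in-memory cache that reuses a response for ttl_seconds."""

    def __init__(self, fetch_from_origin: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps a key to (expiry time, cached body).
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: the origin is not contacted
        body = self._fetch(key)  # missing or expired: one trip to the origin
        self._store[key] = (now + self._ttl, body)
        return body


origin_calls: List[str] = []


def slow_origin(path: str) -> str:
    """Stand-in for the real API; records every request that reaches it."""
    origin_calls.append(path)
    return f"response for {path}"


cache = TTLCache(slow_origin, ttl_seconds=5)
for _ in range(1000):  # a burst of 1000 identical requests
    cache.get("/api/scores")
print(len(origin_calls))  # prints 1
```

The trade-off is the one mentioned above: anyone asking within those 5 seconds may get data that is up to 5 seconds old.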
Those are distributed caches, though. Not arguing, I just don't know whether I would call those CDNs. Unless they actually cache locally stored, now static data. Which has always blown my mind, as we're returning to the older days of the internet with products like MS ISA Server (Internet Security and Acceleration Server).
These things are all delivering some kind of HTTP response - the question is where they get it from in order to deliver it. I would say that caches are components of a CDN... but it's just definitions really. If you're thinking static assets, then conceptually those are cached items with a long TTL. In principle that's not really any different from some API response JSON blob that's been dynamically generated at origin and only lives for a couple of minutes in the CDN's caches.
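To make the "same thing, different TTL" point concrete, here is a small sketch using the standard HTTP `Cache-Control` header. The one-year and two-minute values and the `cache_control` helper are just illustrative choices, not anything a specific CDN requires.

```python
ONE_YEAR = 365 * 24 * 60 * 60  # seconds
TWO_MINUTES = 2 * 60           # seconds


def cache_control(max_age_seconds: int) -> str:
    """Build a Cache-Control value that lets shared caches keep the response."""
    return f"public, max-age={max_age_seconds}"


# A static asset and a dynamically generated API response differ only in
# how long the cache is allowed to keep them.
logo_headers = {"Content-Type": "image/png",
                "Cache-Control": cache_control(ONE_YEAR)}
api_headers = {"Content-Type": "application/json",
               "Cache-Control": cache_control(TWO_MINUTES)}

print(logo_headers["Cache-Control"])  # prints "public, max-age=31536000"
print(api_headers["Cache-Control"])   # prints "public, max-age=120"
```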
I just want to address the second part of your question, as other answers have covered the rest very well. Companies rely on them for two reasons. First, they improve user experience (faster load and download times), and a better user experience usually raises the chances of that person buying from you or enquiring into your business. Second, they add resilience: a single server hosting a popular website for every global visitor will struggle to cope with the demand, but multiple servers (the CDN) share the load across multiple locations, so you're less likely to get overloaded by lots of visitors.
So let's say I host my website on my computer at home (not a great plan, but it's easier to visualize). Every person who visits takes up a little chunk of my home computer's processing power. Everything's hunky dory, until too many people visit at the same time. Suddenly, my computer crashes, and the site goes down for everyone.
So I instead break it up a bit: I put copies of my site on several other computers, and people get distributed around to them, so there are never really too many people on any one computer at a time (if one of the computers suddenly gets a ton of visitors, the excess gets dumped onto one of the other computers). Now, since I'm putting it out on different computers anyway, I might as well also have people go to the computer that's closest to them, which also improves the website's speed.
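The "dump the excess onto another computer" idea can be sketched roughly like this. The `Server` class, the capacities and the spill-over rule (pick whichever machine has the most free room) are assumptions for illustration, not how any real provider does it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    capacity: int    # hypothetical max number of simultaneous visitors
    active: int = 0  # visitors currently being served


def assign(preferred: Server, servers: List[Server]) -> Server:
    """Send a visitor to the preferred server, or spill over if it is full."""
    if preferred.active < preferred.capacity:
        chosen = preferred
    else:
        # Overflow: pick the server with the most free room.
        chosen = max(servers, key=lambda s: s.capacity - s.active)
        if chosen.active >= chosen.capacity:
            raise RuntimeError("every server is full")
    chosen.active += 1
    return chosen


home = Server("home", capacity=2)
copy_a = Server("copy-a", capacity=2)
copy_b = Server("copy-b", capacity=2)
pool = [home, copy_a, copy_b]

# Four visitors all prefer "home"; the third and fourth get moved elsewhere.
names = [assign(home, pool).name for _ in range(4)]
print(names)  # prints ['home', 'home', 'copy-a', 'copy-b']
```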
Most websites don't really need anything more than the basics, so having Cloudflare or something like it is a nice easy way to prevent short overloads (such as the Reddit hug of death), and it has some speed benefits as well. As soon as you're dealing with hundreds of thousands of views daily though, you have to have a system set up so that if you suddenly go viral and have a billion views, your website doesn't explode.
Essentially, a CDN (content delivery network) takes your site and puts copies of it all around the world, then directs any browser that connects to a nearby copy. That way, people from, say, Australia and France can access it much more quickly than if it were only available on one server in the Midwest US.
By spreading out copies of the site, they also make it much harder for your site to go down due to an issue with your own server. (Note that this also means that a CDN outage can be devastating, since everyone is being directed to copies that are no longer reachable - see the couple of major outages in the past few years.)
It is a specialized service to help the internet work better.
Most people create websites and focus on content. You might create blogs, videos etc. But you don't really care about how that content gets to your end user - from your standpoint, just host it on a server and connect it to the internet. The rest is a black box.
Well the CDN is part of that black box. It makes copies of your content and hosts it in various regions (with a lot of rules about updating the copies). So when a user requests your content, the CDN automatically routes their request to one of the nearby copies rather than back to your server. This reduces the workload on your server as well as reducing long distance traffic on the internet. On top of that people who view your content get it faster with less delay.
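A toy version of "route the request to a nearby copy", assuming we simply pick the copy with the shortest straight-line distance to the user. The three locations and the `nearest_edge` name are made up; real CDNs generally make this decision with DNS or anycast routing and actual network conditions rather than a map.

```python
import math
from typing import Tuple

# Hypothetical copy locations as (latitude, longitude) in degrees.
EDGES = {
    "chicago": (41.88, -87.63),
    "paris": (48.86, 2.35),
    "sydney": (-33.87, 151.21),
}


def distance_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))  # 6371 km is Earth's mean radius


def nearest_edge(user_location: Tuple[float, float]) -> str:
    """Return the name of the copy closest to the user."""
    return min(EDGES, key=lambda name: distance_km(user_location, EDGES[name]))


print(nearest_edge((-37.81, 144.96)))  # a user in Melbourne: prints "sydney"
print(nearest_edge((45.76, 4.84)))     # a user in Lyon: prints "paris"
```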
By paying for CDN services, websites get better customer performance without having to do all that copying and rerouting themselves and bearing the cost of setting up all these mirror sites. The CDN, because it is specialized, can handle multiple different websites and become very efficient.
Physical copies of a newspaper like the New York Times are delivered all over the United States and to many other parts of the world. They are not all physically printed in New York and then shipped from there. Instead, the content is sent to printers all over the world, who print their own copies and deliver them locally. Otherwise it could take several days to have a daily newspaper delivered to you, and by then it would be stale or outdated.
You can think of CDN as kind of the same thing for the internet.
If you have servers distributed across the world, you need something like a CDN to 'direct' users to the right servers - otherwise it would be pretty painful if you're an EMEA user and get directed to an APAC server (latency will be much higher).
CDNs host content more locally for the users.
Example: If I were to own a server for my website in Japan, every user that visits the site from the US would have to download all the pictures and text through a long undersea cable, and the site would feel slow. To avoid that, and because I might also have visitors from Europe or maybe Africa, I can buy a service from a CDN provider to host copies of my site all over the world. Now whenever someone visits my site, instead of their request going all the way to Japan, it goes to the geographically closest copy of the site, or of parts of its content.
tldr: CDNs are globally available copies so websites load faster.
Also far higher resilience than a website hosted in a single location.
Also, and to OP's question "why are they critical", most CDNs also provide extra functionality tied to resilience.
For example, all 3 examples OP posted offer some form of DDoS (distributed denial of service) protection so that an attacker can't just get upset at you and spend a couple bucks aiming a botnet at shutting your website down.
They offer these services not only for the content that CDNs typically host (HTML, stylesheets, images, and other non-changing resources). In many cases they will also provide an API gateway, so that they can protect you against DDoS for your dynamic content as well, which wouldn't normally benefit from a CDN.
For example, Cloudflare API Shield.
Are CDNs and CNDs the same thing?
No, it's a typo.
Thanks, all clear now
Oops fixed. CDN is just content delivery network.
Thank you for your explanation