Do you often clear the cache in your web browser? Years ago, you may have cleared your Netscape or Internet Explorer cache to free up scarce hard drive space. Public-access computers in libraries, for example, often set their web browsers not to cache at all, out of concern that sensitive user information would be stored locally where malicious individuals could access it. However, did you know that caching is central to mission-critical services and content distribution on the modern Internet? Before clearing your cache, consider the many benefits it offers to today’s users.
Caching: The basics
When you visit a website, your web browser downloads static content, such as images or text. Because most static content changes rarely, your web browser files it in a cache, a storage area that keeps data locally. For example, Amazon is not likely to change its logo anytime soon, so instead of downloading the logo again every time you open a new page, your browser loads the image from the locally stored file. This speeds up page loads significantly and lessens the load on Amazon’s web servers.
Refreshing the cache
Each item placed into the cache has a time to live (TTL) indicating the lifetime of the cached item. A downloaded image, for example, may have a TTL of 30 days.
When you visit a website, your browser performs the following operations:
- If the page references an image or other static content, your web browser checks the local cache for it first.
- If the content in the cache has not expired (meaning its TTL has not reached 0), your web browser displays the copy from local storage.
- If the content has expired, or is not in the cache at all, your web browser fetches it from the server again and replaces the stale cached copy with the fresh one.
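The steps above can be sketched as a small TTL cache. This is a minimal illustration, not how any particular browser is implemented: the `TTLCache` class, the `fake_download` function, and the example URL are all hypothetical, and a real browser derives freshness from HTTP response headers rather than from a TTL passed in by the caller.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """A tiny in-memory cache whose entries expire after a time to live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so expiry can be demonstrated without waiting.
        self._clock = clock
        # Maps a URL to (content, absolute expiry time in seconds).
        self._entries: Dict[str, Tuple[bytes, float]] = {}

    def get(self, url: str, fetch: Callable[[str], bytes], ttl_seconds: float) -> bytes:
        entry = self._entries.get(url)
        if entry is not None:
            content, expires_at = entry
            if self._clock() < expires_at:
                return content  # Fresh hit: serve the local copy.
        # Miss or expired: download again and replace only this entry.
        content = fetch(url)
        self._entries[url] = (content, self._clock() + ttl_seconds)
        return content


# Usage with a fake clock and a fake download function (both hypothetical).
DAY = 24 * 60 * 60
now = [0.0]
downloads = []


def fake_download(url: str) -> bytes:
    downloads.append(url)
    return b"logo-bytes"


cache = TTLCache(clock=lambda: now[0])
logo = "https://example.com/logo.png"
cache.get(logo, fake_download, ttl_seconds=30 * DAY)  # miss: downloads
cache.get(logo, fake_download, ttl_seconds=30 * DAY)  # hit: no download
now[0] = 31 * DAY                                     # 31 days later
cache.get(logo, fake_download, ttl_seconds=30 * DAY)  # expired: downloads again
print(len(downloads))  # 2
```

Note that an expired item only causes that one item to be downloaded again; the rest of the cache is untouched.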
Caches also exist on the server side of web applications. Google, Amazon, Netflix, and other large web applications cache their multimedia and content on dedicated servers located around the globe, often keeping the most frequently requested data in fast random access memory (RAM). When you stream a movie from Netflix, you are most likely accessing a nearby node in the company’s content delivery network, selected for you by request-routing techniques such as DNS-based steering or anycast.
Web application caches are distributed across the Internet and often receive updates by querying other caching nodes close to them. For example, a Netflix cache located near Boston, MA may query a cache located in Providence, RI for content updates, rather than querying central servers located far away.
The following is a representation of how caching works from a user’s perspective when accessing a service like Netflix.
- The user logs in to Netflix and requests to stream a movie.
- The Domain Name System (DNS) directs the request to the closest Netflix caching server, and the movie begins streaming.
- At the same time, the Netflix caching server determines if the entirety of the film’s content is cached at that point of presence. Often, only pieces of data are stored at certain caching points. If the requested content is not stored, the caching server finds another nearby caching server and queries it for the requested data.
- The caching server continues to query other caching servers, cascading through the network until it has retrieved the entire film, and then returns the data to the user.
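The cascading lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `CacheNode` class, the node names, and the content key are hypothetical, and nothing here reflects how Netflix actually implements its caching servers.

```python
from typing import Dict, List, Optional, Set


class CacheNode:
    """A hypothetical caching server that can ask nearby peers for content."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}
        self.peers: List["CacheNode"] = []

    def lookup(self, key: str, visited: Optional[Set[str]] = None) -> Optional[bytes]:
        visited = set() if visited is None else visited
        if self.name in visited:
            return None  # This node was already asked; avoid loops.
        visited.add(self.name)
        if key in self.store:
            return self.store[key]  # Local hit.
        # Local miss: cascade the query to nearby peers, one at a time.
        for peer in self.peers:
            data = peer.lookup(key, visited)
            if data is not None:
                self.store[key] = data  # Keep a local copy for next time.
                return data
        return None


# Usage: Boston asks Providence, which asks a far-away central server.
boston = CacheNode("boston")
providence = CacheNode("providence")
central = CacheNode("central")
boston.peers = [providence]
providence.peers = [central]
central.store["movie/part-1"] = b"video-bytes"

print(boston.lookup("movie/part-1"))         # b'video-bytes'
print("movie/part-1" in boston.store)        # True
print("movie/part-1" in providence.store)    # True
```

After the first request, both nearby nodes hold a copy, so later viewers in the same region no longer need to reach the central server.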
Caching software for web applications
One of the most widely used caching tools for web applications is Memcached, a key-value store that holds data entirely in memory. A pool of Memcached servers works as a distributed cache: the servers do not coordinate with one another, and the client library decides which server holds each key. Facebook, for example, relies heavily on Memcached to keep frequently requested data in memory so that it can be returned to users quickly.
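One common way for a cache client to spread keys across several servers is to hash each key and use the result to pick a server. The sketch below illustrates that idea only. It does not use Memcached or any Memcached client library: the `ShardedCache` class and the server names are hypothetical, and plain Python dictionaries stand in for the cache servers.

```python
import hashlib
from typing import Dict, List, Optional


class ShardedCache:
    """Spreads keys across several stand-in cache servers by hashing the key."""

    def __init__(self, server_names: List[str]) -> None:
        if not server_names:
            raise ValueError("at least one server is required")
        self._names = list(server_names)
        # Each dictionary stands in for one in-memory cache server.
        self._servers: Dict[str, Dict[str, bytes]] = {n: {} for n in self._names}

    def server_for(self, key: str) -> str:
        # Hash the key and map it onto one of the servers.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def set(self, key: str, value: bytes) -> None:
        self._servers[self.server_for(key)][key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._servers[self.server_for(key)].get(key)


# Usage: the same key always maps to the same server.
cache = ShardedCache(["cache-a", "cache-b", "cache-c"])
cache.set("user:42:profile", b"cached profile data")
print(cache.get("user:42:profile"))  # b'cached profile data'
print(cache.server_for("user:42:profile") == cache.server_for("user:42:profile"))  # True
```

A plain modulo mapping like this one moves most keys to a different server whenever the server list changes; production clients often use consistent hashing to limit that reshuffling.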
How caching impacts performance
Web applications serve terabytes of multimedia per hour to millions of users every day. The sheer volume of data served makes caching necessary; it would be impossible for web applications like Netflix to rely on a single data center at one point of presence. Caching makes real-time multimedia delivery practical, and it has been essential to the growth of mobile Internet use. Without it, latency-sensitive services such as networked first-person shooter games would be far harder to scale. Performance gains are due not only to widely available consumer broadband, but also to advances in caching technology that distribute data globally and place multimedia close to the user.
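A rough way to see why placing content close to the user matters is to estimate the average response time as a weighted mix of cache hits and cache misses. The figures below (a 90% hit ratio, 10 ms from a nearby cache, 200 ms from a distant data center) are made-up assumptions for illustration, not measurements.

```python
def average_latency_ms(hit_ratio: float, cache_ms: float, origin_ms: float) -> float:
    """Expected latency when a fraction of requests is served from the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits pay the cache latency; misses pay the origin latency.
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * origin_ms


# Hypothetical figures: 90% of requests hit a nearby cache.
with_cache = average_latency_ms(0.9, cache_ms=10.0, origin_ms=200.0)
without_cache = average_latency_ms(0.0, cache_ms=10.0, origin_ms=200.0)
print(round(with_cache, 1))     # 29.0
print(round(without_cache, 1))  # 200.0
```

Under these assumed numbers, the average request completes in roughly 29 ms instead of 200 ms, and only one request in ten reaches the distant data center at all.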
Old habits die hard
If you are thinking of clearing your cache, reconsider. Modern caches use compression and are far less prone to the fragmentation problems of the past. Advances in computer hardware allow many caches to hold their most-used content in memory, so it can be returned quickly instead of being read from disk. Today, routinely clearing your cache will usually just slow down your browsing. Have you streamed a movie or played a game online recently? Thank the caches, both in your browser and across the network, that serve your multimedia content.