A very shocking thing is spreading like wildfire over on Tumblr right now. There is a viral post titled “SOPA Emergency IP list” which lists “details” on how to “access” various web sites in the event that SOPA and/or PIPA pass and the DNS system becomes compromised as a result.
Before getting into that post in detail, let me explain some basic pieces of information first.
An IP Address: Think of this like your phone number. Every phone has a unique phone number you can call. Likewise, every computer on the public internet has a public numerical IP address which uniquely identifies it and allows other computers to “call” it. These numbers are usually written as 4 groups of numbers, separated by dots, each ranging from 0 to 255. (Note: this article covers IPv4 only; IPv6 will be discussed another day.)
Domain Name System: DNS for short. This is the process of converting human-readable, text-based “domain names” into numerical IP addresses. As an example, you’re currently reading this post from Darkain.com. In order to do so, in the background, your computer made a DNS request which converted that name into the numerical IP address 18.104.22.168. This process is known as “Name Resolution” and is performed by a “DNS Server”.
Okay, now that you know what IP addresses and DNS name resolution are, let’s get down to business with this scary trend and what it REALLY means. The US SOPA/PIPA bills threaten to break the ability of DNS servers to properly resolve certain domain names. There are plenty of documents circulating the internet which describe these bills far better than I ever could. Those bills are not the focus of this article. The purpose, rather, is to address the false ideas that are spreading across the internet because of them.
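To make the “4 groups of numbers from 0 to 255” idea concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function names `is_ipv4` and `to_number` are my own, and 192.0.2.1 is a reserved documentation address, not a real site.

```python
import ipaddress


def is_ipv4(text: str) -> bool:
    """Return True if text is a valid dotted IPv4 address (four groups, each 0-255)."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ipaddress.AddressValueError:
        return False


def to_number(text: str) -> int:
    """The four groups are just a readable spelling of one 32-bit number."""
    return int(ipaddress.IPv4Address(text))
```

For example, `is_ipv4("192.0.2.1")` is true, while `is_ipv4("256.1.1.1")` is false because 256 does not fit in a group.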
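If you want to watch name resolution happen yourself, the following is a rough sketch that asks your operating system's resolver for the addresses behind a name. The helper name `resolve` is made up for this example, and the answers you get depend on your own DNS server, location, and the moment you ask.

```python
import socket


def resolve(name: str) -> list:
    """Ask the system resolver for the IPv4 addresses behind a name."""
    infos = socket.getaddrinfo(
        name, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr),
    # and for IPv4 the sockaddr is an (ip, port) pair.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("darkain.com")` performs a real DNS lookup. Passing something that is already an IP address simply hands it back, since there is nothing to look up.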
This is a list that is spreading like wildfire across Tumblr right now. As of this post there are over 36,000 reblogs of it. My big concern is that whoever published this list initially did not do their homework one bit in terms of how the internet and the DNS system actually function together to help deliver a reliable end user experience.
Let’s take a look at the very first IP address on this list. According to what they’re publishing, we should simply be able to put that IP address into the browser and get the same web page as we would if we typed in the name of the site ourselves, right? After all, I just explained that DNS names are simply translated into IP addresses, and that it is really the IP address which gets used, so this should all work perfectly as normal, right?
!!! WRONG !!!
Go visit http://22.214.171.124/ and this is what you’ll find. It is a “NOT FOUND” page! But wait, what gives? This is the same IP address your browser connects to when you visit http://www.tumblr.com, so why would it return a different page?
What is really going on here? Well it turns out that web browsers, when connecting to a web server, pass along the name of the web site as well. So after DNS name resolution has occurred and the connection to the IP address has been made, the domain name is used a second time in accessing the particular web site. This can be seen in the highlighted “host” field which is passed along to the web server.
So why is the name being used twice? Why do the IP address AND the domain name of a web site matter to the web server? With the IP address alone we can already contact the web server which has the web site we want on it! Well, it turns out that to help the internet grow, HTTP/1.1 introduced this “host” setting to allow multiple domains, and therefore multiple web sites, to reside on a single IP address. As you can see from this screen shot, both the Darkain.com and Repost.me domains reside on the same public IP address.
So then what is going on with Tumblr? They only have one site on that IP address, right? WRONG! They have to manage multiple domains through it. An example of this would be my photo blog: http://photoblog.darkain.com/ – This address resolves to the same IP address as the Tumblr server because they’re the ones hosting this service for me. How do the Tumblr servers know which blog should be presented to end users when their IP address is accessed? Yeah, that’s right. This is exactly what the “host” field is for when the web browser connects to the web server!
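To show where that “host” field lives, here is a stripped-down sketch of the text a browser sends once it has connected. Real browsers send many more header lines; `build_request` is a made-up helper that keeps only the parts that matter here.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a bare-bones HTTP/1.1 GET request for the given site name."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",       # the site name, sent again after DNS is done
        "Connection: close",
        "",                    # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Typing a bare IP address into the address bar means the browser puts that IP address into the `Host` line instead of the site name, and that is all the server has to go on.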
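Here is a toy model of what a shared web server does with that “host” field. The site table and the `serve` function are invented for illustration and are not how Tumblr or my own server is actually configured, but the lookup idea is the same.

```python
# Hypothetical table of sites that all live on one IP address.
SITES = {
    "darkain.com": "Welcome to Darkain.com",
    "repost.me": "Welcome to Repost.me",
}


def serve(host_header: str):
    """Pick a site based on the Host header, as a name-based virtual host would."""
    # Drop an optional ":port" suffix and ignore letter case.
    name = host_header.split(":")[0].strip().lower()
    if name in SITES:
        return 200, SITES[name]
    # A bare IP address matches no configured site name.
    return 404, "NOT FOUND"
```

Here `serve("Darkain.com")` finds a site, while `serve("192.0.2.10")` gets the same kind of “NOT FOUND” answer described above, because the server has no site registered under that name.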
The main reason for this post is how troubled I am by the person who made that initial posting on Tumblr. They took the effort and time to track down and compile this vast list of IP addresses of several major web sites around the world. What they failed to do was to test even the very first IP address on their own list to see if it would even work.
UPDATE 2012-01-20 9:00AM: There is now a new list traveling around Tumblr. They’ve updated Tumblr’s URL to also include “dashboard” as a part of the URL: 126.96.36.199/dashboard – The problem with this? Try visiting that page. It is a redirect page to http://www.tumblr.com/dashboard which again is the named version of the web site. Following that redirect requires DNS name resolution of tumblr.com all over again, making the page impossible to reach without access to the DNS servers for this domain.
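The redirect problem can be checked mechanically. This is a small sketch, with a made-up function name `needs_dns`, that looks at the address a redirect points to and reports whether following it would require a name lookup. The 192.0.2.10 address in the usage note is a reserved documentation address.

```python
import ipaddress
from urllib.parse import urlsplit


def needs_dns(redirect_url: str) -> bool:
    """Return True if following this redirect target requires resolving a name."""
    host = urlsplit(redirect_url).hostname
    if host is None:
        # A relative redirect stays on the current connection target.
        return False
    try:
        ipaddress.ip_address(host)
        return False  # the target is already a numeric address
    except ValueError:
        return True   # the target is a name, so DNS is needed again
```

`needs_dns("http://www.tumblr.com/dashboard")` is true, which is exactly why the “dashboard” trick does not help, while `needs_dns("http://192.0.2.10/dashboard")` is false.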
!!! TO COMPLICATE MATTERS EVEN WORSE !!!
Described above is only ONE issue with simply using IP addresses to access websites should SOPA/PIPA break the DNS system. Here are some more real-world examples, though given in far less detail for the time being. Each of these topics will be expanded upon in future “server optimization” articles written on this site.
1) Elastic IP Addresses – This is the process of having a non-static IP address. This is common for home users who want to access their computers while on the road, and who may set up a service such as DynDNS to manage this for them. This is now becoming a commonplace idea in the commercial sector as well with the emergence of services such as Amazon Cloud Computing. With services such as these, IP addresses often change to reflect various server changes over time.
2) Geographical IP Addresses – An optimization employed by larger corporations is to set up data centers with servers as close to end users as possible. As an example, Google recently opened a data center in the Seattle, Washington area. My current home is in the Tacoma area, approximately 35 miles south of there. Before this data center went into place, my ping times with the Google servers were in the range of 50-75ms. With this new location online, I now consistently get ping times around 20ms. What does this mean? Different IP addresses for different locations! How does this work? Companies such as Google keep geographical profiles of where their servers are located, as well as profiles of where the IP addresses requesting DNS name resolution are physically located. The DNS results that they return are chosen to give the requesting user the shortest and quickest route possible.
3) DNS Based Load Balancing – Some services exist on multiple IP addresses. The IP address a DNS server returns to you for a given domain name may be different from the one it returns to the next person who requests it. Both IP addresses have end-point servers with the same content, but because you and the other user were given different IP addresses, the server resource load is split between the two physical end-point servers.
Now start combining all three of these together. Yes, this is indeed a very real thing, and becoming a much more commonplace practice than you might think. Because a DNS record may be updated at any time, new servers with mirrored content can come online or go offline at any time. With cloud computing and elastic IPs, these addresses can easily change from day to day. To server administrators, most of this process is fully automated and eases scalability and disaster recovery. To the end user this means a seamless experience as remote servers are upgraded or repaired.
One example of all of this would be the IP address ranges for Google Apps servers. Note that they list both current AND former address ranges on this page. Also please note that these ranges are large blocks which contain over 4,000 addresses each. Source: http://support.google.com/postini/bin/answer.py?hl=en&answer=141669
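Combining the three ideas can be modeled in a few lines. This is only a toy: the `ToyDNS` class, the region names, and the addresses (all taken from reserved documentation ranges) are invented, and real DNS servers are far more involved. It does show why a single hard-coded IP address goes stale.

```python
import itertools


class ToyDNS:
    """A pretend DNS server with per-region, rotating, replaceable answers."""

    def __init__(self):
        self._pools = {}

    def update(self, name, region, ips):
        """Replace the record for a name in a region (the 'elastic' part)."""
        self._pools[(name, region)] = itertools.cycle(list(ips))

    def answer(self, name, region):
        """Return the next address for the asker's region (geo + load balancing)."""
        return next(self._pools[(name, region)])


dns = ToyDNS()
dns.update("example.com", "seattle", ["192.0.2.1", "192.0.2.2"])
dns.update("example.com", "london", ["198.51.100.1"])
```

Two Seattle users asking one after the other get 192.0.2.1 and 192.0.2.2, a London user gets 198.51.100.1, and after a later `dns.update("example.com", "seattle", ["203.0.113.7"])` none of the old Seattle addresses are handed out any more. A static list would have captured only one of those answers.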
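For a sense of scale, address blocks are usually written with a “/number” suffix that sets their size. The Google page's exact notation is not reproduced here, so the /20 below is my assumption: it is the smallest such block that holds more than 4,000 addresses. The 10.0.0.0 range is a private placeholder, not one of Google's ranges.

```python
import ipaddress


def block_size(cidr: str) -> int:
    """Count how many addresses a block such as '10.0.0.0/20' contains."""
    return ipaddress.ip_network(cidr).num_addresses
```

`block_size("10.0.0.0/20")` is 4096, so a list with one IP address per web site covers a vanishingly small slice of where a large service may actually live.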