Most of us have heard the term ‘cloud computing’ used a lot lately. It seems to be one of the more popular buzzwords in IT jargon. The main question for most IT professionals is “how can I use the cloud to increase service availability or decrease my cost?” Of course, there are numerous other questions about security and privacy in the cloud. Where is my data actually located? Who really has access to it? What are the real differences between public and private clouds? Let’s take a few minutes and explore some of the real (and mythical) issues surrounding the cloud.
As in all academic discussions, we should begin with a few definitions (to ensure we are all using the same jargon). According to Webopedia, “cloud computing is a type of computing that relies on sharing computing resources rather than having local servers or personal devices to handle applications.” If you will allow me to oversimplify, cloud computing is using the cloud (used synonymously with the Internet) to connect to data or applications located somewhere else in the world. The term thin client refers to an application or device that connects to a remote server, which performs the majority of the processing and storage. Thick clients (also called fat clients) perform most of the processing locally but still store data on a remote server. Most of us use both thin and thick clients to access cloud-based information on a daily basis. Email makes this easy to see.
Browser-based webmail (Yahoo Mail, Gmail, and Outlook Web Access) is considered a thin-client application because most of the work is done by remote mail servers and the results are sent to the client via a web browser. Microsoft Outlook and Mac Mail, on the other hand, are considered thick clients because the application runs locally on your PC and uses remote servers only to store email. In both cases, email is stored remotely on some distant mail server and content is delivered via the cloud to the end user.
Although email is arguably the most widely used application in the cloud (who doesn’t use Gmail or Yahoo Mail these days?), most current cloud discussions seem to focus on cloud-based storage (Dropbox, Box, Carbonite, Mozy, etc.). So why are cloud-storage solutions such a hot topic? The answer is simple. They are cheap and provide high availability!
Most cloud-based storage solutions are cheaper than traditional on-site storage due to economies of scale. Wikipedia defines economies of scale as the “cost advantages that enterprises obtain due to size, throughput, or scale of operation, with cost per unit of output generally decreasing with increasing scale as fixed costs are spread out over more units of output.” Cloud-storage providers are buying much larger disk arrays and using complicated deduplication systems to reduce the cost of storage per byte. That allows them to offer virtually unlimited amounts of storage for a low monthly fee. As the cost of storage decreases over time, their model becomes increasingly profitable. They are also able to offer high availability through the use of global data centers that are cost-prohibitive for smaller organizations. For example, Dropbox uses over 10,000 physical servers to store the data for over 200,000,000 customers (http://www.datacenterknowledge.com/archives/2013/10/23/how-dropbox-stores-stuff-for-200-million-users/).
I often use the analogy of warehouses to explain this concept. A person who rents a storage locker often pays a monthly or yearly rate to store their “stuff.” A local business is currently advertising 5’x5’ lockers for $38/month. That equates to $1.52/square foot. Less than a mile from that storage locker, you can rent a 20,000 square foot warehouse for $5,500/month. That equates to about $0.28/square foot. Similarly, cloud-based storage providers are not buying storage by the gigabyte (GB) or even terabyte (1,000 GB). They are buying petabytes (1,000,000 GB) and even exabytes (1,000,000,000 GB). Although it is difficult to imagine that much data, they are pooling millions of customers and charging each of them for their respective little slice of storage.
Although this storage model can save the customer a lot of money, it also introduces very serious questions about privacy and security. Where exactly is my data being kept? Who else may have access to it? What happens if I stop paying my monthly fee? Will the cloud-storage company sell off my data like they do on Storage Wars? All very good questions…
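For readers who like to check the math, the locker-versus-warehouse comparison above comes down to dividing the monthly price by the space rented. Here is a minimal sketch in Python using only the figures quoted above; the function name cost_per_unit is my own and purely illustrative.

```python
def cost_per_unit(monthly_price: float, units: float) -> float:
    """Return the monthly price divided by the number of units rented."""
    return monthly_price / units


# Figures quoted in the article: a 5' x 5' locker and a 20,000 sq ft warehouse.
locker_rate = cost_per_unit(38.00, 5 * 5)
warehouse_rate = cost_per_unit(5_500.00, 20_000)

print(f"Locker:    ${locker_rate:.2f} per square foot")
print(f"Warehouse: ${warehouse_rate:.3f} per square foot")
print(f"The locker costs {locker_rate / warehouse_rate:.1f}x more per square foot")
```

The warehouse works out to $0.275 per square foot (which the article rounds to 28 cents), so the small renter pays roughly five and a half times more per square foot. The same division applies to storage bought by the petabyte rather than the gigabyte, although actual provider pricing is not something this sketch tries to model.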
Many vendors have tried to address these concerns through vague policies that stop just short of promising actual privacy. Dropbox, for example, promises that “security of your data is our highest priority,” yet that same policy states that “employees may access file metadata (e.g., file names and locations) when they have a legitimate reason” (https://www.dropbox.com/help/27/en). Many privacy advocates are understandably wary of wording that leaves so much open to interpretation. Other cloud-storage vendors (like Carbonite and Mozy) offer client-side encryption of data before it is transmitted to the vendor’s systems. This feature means that although your data does reside on someone else’s server, you are, in theory, the only person who can access it. As the volume of worldwide data grows, many experts consider cloud-based storage the only scalable solution for many businesses. My best advice is to choose your cloud-based storage provider carefully. Remember that your organization’s data is probably one of your most valuable resources, and you should not trust it to anyone you haven’t thoroughly vetted.