It is hard to say which of the two buzzwords tops IT news today: Big Data or Cloud. It is much easier to see how the two can play nicely together.
Big Data symbolizes the explosion of computer-generated data, which is said to double every year and to outgrow the capacity of IT data centers. Trying to cope with ever-increasing data volumes and make sense of the data, companies find themselves in desperate need of high-scale data aggregation, processing and analysis tools.
I won’t dare give yet another definition of something as multifaceted as Cloud. Let me instead summarize a few of its inherent capabilities, each representing a different level of the cloud stack:
– On-demand resource provisioning
This is a key attribute of IaaS clouds, where you can request CPU, memory and storage resources based on application demand and release them when they are no longer needed. It is by far the most important benefit IT professionals get from the Cloud today.
– Scalable data processing
Thanks to the boom in open source technologies based on MapReduce and their various commercial implementations, scalable data processing is becoming part of the cloud development platform, or PaaS. It lets application vendors harness the seemingly unlimited resources of the cloud to perform complex computational tasks on the data.
– Rich data analytics services
The underlying cloud infrastructure has enabled a plethora of cloud services, collectively known as SaaS, that use on-demand resources and scalable data processing algorithms to exploit domain-specific knowledge and provide actionable insight into data of many different kinds.
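To make the scalable data processing idea a bit more concrete, here is a minimal single-process sketch of the MapReduce model in Python. It only illustrates the map, shuffle and reduce steps; a real framework would spread them across many machines. The log format ("host status bytes") and all function names are assumptions made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(log_line: str) -> Iterator[Tuple[str, int]]:
    """Emit (status_code, 1) for one line of a hypothetical 'host status bytes' log."""
    parts = log_line.split()
    if len(parts) == 3:  # silently skip lines that do not match the assumed format
        yield parts[1], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as a framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = [
        "web01 200 512",
        "web02 500 128",
        "web01 200 2048",
        "malformed line",
    ]
    print(run_job(sample))
```

Because each call to `map_phase` looks at one line only and each call to `reduce_phase` looks at one key only, both steps can be handed out to as many workers as the cloud is willing to provision.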
Looking at all these features of the cloud, it becomes apparent that the cloud has a lot to do with big data and that big data can be managed well in the cloud. Here are just a few examples of how big data problems can be solved effectively in the cloud.
1. Cloud-based data migration services are a natural fit, especially if the migrated data itself finds a new home in the cloud.
OnDemand Migration for Email is a cloud-based service from Quest Software that automates the migration of large on-premises email workloads, such as Microsoft Exchange mailboxes, to cloud-based email systems like Microsoft Office 365. The service relies on the elasticity of Microsoft Azure to give its customers predictable project deadlines, controlled costs and ease of migration. Since the migrated data ultimately settles in a secure cloud email system, this largely alleviates concerns about the security of data entrusted to the cloud service for the duration of the migration.
2. Application performance monitoring services like Quest Software’s Project Lucy exploit the multi-tenant nature of cloud-based services to define the “gold standard” of application performance and pinpoint performance degradation long before it adversely affects users. Project Lucy correlates application performance metrics and configuration snapshots collected from its entire customer base, something that would never be possible for application performance monitoring solutions isolated within a customer’s own data center. Cloud skeptics have little to be concerned about: no personally identifying data leaves organizational boundaries, and only averaged application performance metrics and anonymous configuration options are sent to the cloud.
3. SIEM solutions like Dell SecureWorks also make good use of cloud technology for the threat monitoring use case. A SIEM needs to reduce the overwhelming amounts of logs generated by applications, systems and network devices in order to detect and respond to security threats. There are two ways to do that: monitoring for patterns of known malicious activity, and continuously evaluating user behavior profiles to spot anomalies such as advanced persistent threats. Both tasks are very resource intensive and prone to false positives. A cloud-based SIEM can leverage dynamic resource provisioning and cross-customer threat correlation to significantly reduce the risk of false positives while ensuring adequate resources to deal with spikes in log volumes.
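The cross-customer correlation behind the last two examples can be sketched in a few lines of Python. This is not how Project Lucy or SecureWorks actually work; it is a toy illustration under my own assumptions: each tenant sends only one averaged, anonymized response time, the baseline is the median across tenants, and a tenant is flagged when it sits more than three median absolute deviations away. The tenant labels, the metric and the threshold are all hypothetical.

```python
from statistics import median
from typing import Dict, List


def find_degraded(avg_response_ms: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Return tenants whose averaged metric deviates strongly from the shared baseline.

    avg_response_ms maps an anonymous tenant label to that tenant's averaged
    response time in milliseconds (hypothetical metric).
    """
    values = list(avg_response_ms.values())
    if not values:
        return []

    # The baseline is built from everyone's data, which no single tenant could do alone.
    baseline = median(values)
    mad = median(abs(value - baseline) for value in values)
    if mad == 0:
        return []  # all tenants look alike, so nothing stands out

    return sorted(
        tenant
        for tenant, value in avg_response_ms.items()
        if abs(value - baseline) > threshold * mad
    )


if __name__ == "__main__":
    metrics = {
        "tenant-a": 120.0,
        "tenant-b": 125.0,
        "tenant-c": 118.0,
        "tenant-d": 130.0,
        "tenant-e": 122.0,
        "tenant-f": 310.0,
    }
    print(find_degraded(metrics))
```

The median is used instead of the mean so that a single badly behaving tenant does not drag the baseline toward itself. The more tenants contribute, the more trustworthy the baseline becomes, which is exactly the advantage a multi-tenant cloud service has over a tool confined to one data center.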
Don’t get me wrong. Cloud is not a panacea for all big data issues. There are many factors to consider carefully before letting your data rest on the shoulders of the mighty cloud: data privacy and ownership, data retention costs and cloud provider SLAs, to name just a few. However, there are quite a few cases where cloud-based services can help you manage and make sense of your data, and I think it is safe to say that we’ll see more of those services in the future.