There is an interesting survey result published by Karmasphere that points to the shortage of big data professionals in the industry and the advantages of leveraging a *self-service* tool like Karmasphere Analyst for big data analytics.
I would like to share my views on this, based on my experiences in this space since my first post on Karmasphere in 2010.
- Self-service big data analytics tools are predominantly suited to processing static big data. They are not suitable for processing and analyzing data sets that are dynamic (changing from time to time), or for high-velocity (near) real-time data sets.
- Self-service tools of the Karmasphere kind are a boon to enterprises, small to mid-size companies, or individuals that want to leverage big data assets available internally or on the cloud (through big data infomediaries), gain insights from such data, and apply them to the advantage of their business. For instance, here are a few key things Karmasphere helps solve:
– It significantly cuts the time needed to process and analyse large data sets on a Hadoop cluster.
– The cost of assembling and maintaining a full-fledged team to handle big data processing is quite high. Such a team primarily includes the big data project manager or architect, Linux/Hadoop administrators, and Hive developers. With a tool like Karmasphere, which has automated many key aspects of the big data processing lifecycle, the resource overheads are significantly reduced.
– Karmasphere Analyst as a tool focuses on processing big data on Hive, a data warehouse that runs on top of HDFS/Hadoop and automatically converts SQL-like programs into Hadoop MapReduce jobs. This enables a company to repurpose existing SQL developer skills into Hive/HQL skills, or to hire people with good SQL/RDBMS skills from the market.
– Tools of the Karmasphere kind are a boon to small companies and talented individuals, typically data analysts or data scientists, who want to focus on solving the “big data problem”, which is all about gaining valuable insights from, or maximising the value of, massive unstructured data sets.
They do not need to worry about the IT infrastructure plumbing, Hadoop clusters, job flow distribution, and so on.
– One of the interesting facts about Karmasphere Analyst is that it is available on a pay-per-use basis on the Amazon AWS cloud, which lowers the barriers for anyone to process big data, be it an individual like me or a Fortune 2000 enterprise! For instance, when I first used Karmasphere Analyst, it took me less than an hour to set up processing of a 1 GB data set on a 10-node Hadoop cluster on the AWS cloud. (Of course, you would have to install the Karmasphere tools on your laptop or PC, which can take 1 to 2 hours for an initial one-time setup.)
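To make the point about Hive turning SQL-like programs into MapReduce jobs a little more concrete, here is a minimal sketch in plain Python of the map, shuffle, and reduce steps that a simple `GROUP BY` count conceptually breaks down into. It is only an illustration: the `page_views` table, the `page` column, and all function names are my own assumptions, and this is not the actual plan that Hive or Karmasphere Analyst generates.

```python
from collections import defaultdict

# Hypothetical HiveQL query that this sketch mirrors (table and column are assumptions):
#   SELECT page, COUNT(*) FROM page_views GROUP BY page;


def map_phase(rows):
    """Emit one (key, 1) pair per input row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["page"], 1


def shuffle_phase(pairs):
    """Collect emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Aggregate each key's values; COUNT(*) becomes a sum of ones."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(rows):
    """Chain the three phases in a single process, purely for illustration."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    page_views = [{"page": "/home"}, {"page": "/about"}, {"page": "/home"}]
    print(run_query(page_views))
```

On a real cluster the map and reduce steps run in parallel across many nodes, which is exactly the plumbing that a SQL developer writing HQL does not have to think about.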
Other vendors and services are emerging in this space, such as Google BigQuery and Datameer, which makes this a very interesting time for all those seeking to leverage big data.
Finally, all said and done, applying these tools and techniques in the big data space has its caveats, which can differ from one user scenario to another. For this reason, big data technology initiatives and projects have to be carefully planned and orchestrated.
Disclosure: I have no bias towards a product like Karmasphere Analyst, nor was this post written under anyone's influence. It is based purely on my own experiences, and I picked Karmasphere Analyst as a representative example because it is one of the mature tools at this point in time.