Brand Analytics on the GCE Cloud

At TEXTIENT Analytics, we are now trying out Google Compute Engine (GCE) to run our complex and heavy analytics pipeline. We recently ran this brand analytics on a standard single-CPU n1 instance on GCE. It looks very promising in terms of performance and the cost of the run. Instance boot times are much faster than those of the well-known cloud providers we have come across.

Posted in big data, Cloud Computing

Exploring the Feelings Around @Rackspace Staying Independent!

TEXTIENT’s cloud-based analytics reveals the feelings around Rackspace, one of the biggest cloud service providers, deciding to stay independent after scouting for strategic options including partnerships and acquisition.

Read more at…  

Posted in big data, Cloud Computing

What feelings are driving Scotland’s big vote on Independence?

As Scotland’s day of destiny arrives, what feelings are driving Scotland’s big vote on independence?

Read more on this analysis at

* TEXTIENT Analytics is run on Digital Ocean, AWS and GCE Cloud

Posted in big data

The Moods in New York on Easter Day

Happy Easter to all. The rising joyful moods at #newyork, the happier side of #big data, can be viewed here. This is part of the analytics on a smaller data set of Twitter streams from 16 to 20 April ’14.



Disclaimer: Data accuracy is not guaranteed.

Views are the author's own. Data and charts, if used in the article, have been sourced from available information in the public domain and have not been authenticated by any statutory authority.

Posted in Cloud Computing

Xplenty Expands Coverage to all Amazon Web Services’ Regions

Customers using Amazon CloudFront can now benefit from Xplenty to parse and process their log files, all within the Xplenty design environment

Tel Aviv, Israel – March 4, 2014 – Xplenty, provider of the innovative Hadoop-as-a-service platform, Amazon Web Services (AWS) Technology Partner in the AWS Partner Network, and seller on the AWS Marketplace, now offers its big data processing technology directly to customers in all AWS Regions. Xplenty is now available to customers from AWS’ Regions in South America (São Paulo), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Tokyo). This adds to the existing Xplenty locations of U.S. East (N. Virginia), U.S. West (N. California and Oregon) and EU (Ireland).

Xplenty technology provides Hadoop processing on the cloud via a coding-free design environment, ensuring businesses can quickly and easily benefit from the opportunities offered by big data without having to invest in hardware, software or related personnel.

Meanwhile, users of the Amazon CloudFront content delivery network can now use Xplenty to analyze their log files. New predefined templates let users parse and process Amazon CloudFront logs easily. The processing engine transforms structured and semi-structured big data and easily scales to petabytes as data requirements grow, allowing companies to better understand their customers.

One company already using Xplenty to gain better insight into their customers is WalkMe. “We have customers from a wide range of industries and verticals – including banks, financial institutions, retail services, tourism, leading software vendors and more – all of which use WalkMe to simplify their customers’ online experience. By using Xplenty to break down our log files, we’re able to gain valuable insights into our customer needs and preferences,” says Nir Nahum, VP of R&D at WalkMe. “With the easy-to-use GUI, we just designate the file location for processing, and it automatically sets up the template and runs.”

Xplenty is available within the global AWS Marketplace to customers seeking to integrate a Hadoop-as-a-Service platform to solve their big data processing challenges.

“Big data is shaping the way companies of all sizes develop new products and identify new opportunities to increase their efficiency,” said Brian Matsubara, Head of Global Technology Alliances, Amazon Web Services.  “By bringing their Big Data analysis tools to the AWS cloud, Xplenty is giving customers an innovative approach to solve their business challenges.  Xplenty leverages the AWS global platform to provide scalable Big Data solutions to customers around the world.”

“As a cloud-based service provider, we offer organizations of any size the opportunity to learn more about their customers, further personalize their services, and increase their bottom lines, all by enabling their big data analyses,” says Yaniv Mor, co-founder and CEO of Xplenty. “Why shouldn’t everyone gain by using the data they are paying to store anyway?”

About Xplenty

Xplenty was founded by data professionals for data professionals to deliver on the promise of big data. Xplenty’s true big data solution provides ROI almost immediately by uncovering valuable business insights, translating into higher revenues and increased competitiveness. Xplenty delivers a coding-free, cloud-based Hadoop-as-a-Service platform that transforms structured, unstructured, and semi-structured data into usable information in the AWS, Rackspace and Softlayer environments. Our goal is to make Hadoop accessible and cost-effective for everybody.

 Media Contact

Amy Kenigsberg
K2 Global Communications
tel: +972-9-794-1681 (+2 GMT)
mobile: +972-524-761-341
U.S.: +1-913-440-4072 (+7 ET)

All product and company names herein may be trademarks of their registered owners.

Posted in Cloud Computing

Cloud Assets, Health & Analytics

The SaaS solutions provided by CloudHealth Technologies, a new service provider on the block, are interesting.

A holistic view of the cloud environment, especially when you have deployed hundreds of servers and store terabytes of data across multiple storage locations, drives cloud management innovation, which is at the heart of CloudHealth Technologies. Their services combine cloud analysis technology with the integration of cloud business services such as provisioning, performance, monitoring, and financials to support a holistic approach to IT management of your cloud ecosystem. According to them, their services offer actionable business insights that help answer the tough cloud management questions, so that one can align cloud assets and operations with business needs. For instance:

  • What is my cloud usage per customer, service, or project?
  • How can I correlate my cloud performance to my customer service level requirements?
  • How can I correlate revenue per customer to COGS?
  • How can I determine the optimal number and types of reserved instances based on business needs?
  • How can I get a consolidated view of all of my accounts, services, regions, and asset detail?

At the outset, it seems to be an interesting service, especially for companies that have a number of Amazon cloud assets to streamline and manage. Their website doesn’t provide information about pricing.

Posted in Uncategorized

Digital Ocean is Awesome!

Where in the world today can you get an SSD storage machine on a Tier-1 network with 1 TB of data transfer pre-packaged for $5? Only at Digital Ocean. It is probably one of the most interesting clouds I have come across and tried recently. One of the important things to consider about Digital Ocean’s service is the high quality of disk IOPS and network I/O that comes at an unbeatable price.

IOPS-intensive workloads in the AWS cloud are pretty expensive, plus you have to shell out additionally for the network I/O.

A decent-sized SME web application workload that runs on a $5 or $10 Digital Ocean server can cost at least $20 on AWS (four times the $5 price, that is, 300% more), and a similar amount on a Rackspace Cloud Server.
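As a quick sanity check on that comparison, the arithmetic can be sketched in a few lines of Python. The prices are the ones quoted above; the helper name percent_more is purely illustrative.

```python
def percent_more(base_price, other_price):
    """Return how much more other_price is than base_price, in percent."""
    return (other_price - base_price) / base_price * 100


# Prices quoted in the post: $5 or $10 on Digital Ocean, at least $20 on AWS.
print(percent_more(5, 20))   # 300.0
print(percent_more(10, 20))  # 100.0
```

So the $20 figure is four times the $5 price (300% more) and twice the $10 price (100% more).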

My objective here is not to compare Digital Ocean against AWS or Rackspace. When hiring, I never compare a bright engineer in his 20s with a go-getter attitude against a veteran engineering manager in his late 30s or early 40s; many a time in a start-up or an SME business, an innovative engineer in his 20s is good enough for the business.

Digital Ocean’s user interface is slick. They have easy-to-use APIs, which may not be comprehensive but are good. There are third-party wrappers and client libraries. You can take server snapshots, and they have a backup service.

Digital Ocean’s support has been terrific. You perhaps have to experience it yourself to see what I mean.

As I am writing this, I just saw a message in their console that their Amsterdam region has run out of capacity and no new droplets can be created! I can imagine they are growing very rapidly, and I hope their value will hold as this growth continues.

Digital Ocean is definitely a new service to think about seriously and has been truly awesome.

Posted in Cloud Computing

Mining Twitter Sentiments within an Hour of the Earthquake in Northern California Today

An earthquake with a magnitude of about 5.7 on the Richter scale has just struck Northern California today. I quickly ran one of the Twitter mining and sentiment analysis programs that I had developed to get various metrics of the sentiment.

This test drive used a fairly small representative sample of about 1500 tweets.

Here is a video of my program run

To interpret the results:

Tweets Sentiment Vibe [score in the range -1.0 to 1.0]: 2%.

At 2%, this indicates that sentiment is low, as expected for a grave incident.

Tweets’ Objective Perceptions: 5%. This indicates that the perceptions are more subjective than objective, which is understandable since many tweeters are not expected to be at ground zero and it is still midnight in the US.

Tweets’ Degree of Certainty: 81%. This indicates that the tweets reflect the seriousness and certainty of the content related to the topic.

Tweets’ Positivity: 4%. This reflects a low level of positivity: something seems suspect or obviously wrong, leaning towards negativity.

The metric below reflects the mood of the tweets. The belief mood is significantly higher at 1357, meaning “it is indeed happening and is a fact”, as against a probable or imaginary mood:
(‘The *Belief Mood* is :-‘, 1357)
(‘The *Probable or Imaginary Mood* is :-‘, 7)
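To put the two mood counts on a common scale, the share of belief-mood tweets can be computed from the reported figures. This is a minimal sketch in Python that uses only the two counts printed above; the variable names are illustrative and are not taken from the original program.

```python
# Counts as reported by the program run above.
belief_mood = 1357
probable_or_imaginary_mood = 7

# Share of mood-bearing tweets that express belief rather than speculation.
belief_share = belief_mood / (belief_mood + probable_or_imaginary_mood)

print(f"{belief_share:.1%}")  # 99.5%
```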

These are sample metrics with representative results from just 1500 tweets with certain parameter thresholds. However, my belief is that the tweet metrics and their correlation do perhaps reflect the state of moods, perceptions and sentiments on the searched text ‘Northern California’ affected by an earthquake (which was trending high on Twitter).

This software was not run on the cloud, but when I do a sentiment analysis on a larger scale from the Twitter firehose, I plan to use AWS DynamoDB and Elasticsearch integrated with my core sentiment analyser software.

Posted in Cloud Computing

Machine Learning & Twitter Recommendation: Should I Go & Watch the Gippi or the Aurangazeb Movie?

I wanted to decide whether to go and watch the Gippi or the Aurangazeb movie. Both Bollywood movies were released recently. For fun, I wanted to choose based on Twitter trends, specifically on the verbs “Watch” and/or “Book” appearing in the tweets, which is an implicit way people describe or recommend an action in their tweets.

In order to decide this, I did the following:

1. Mine Twitter with the hashtags #Gippi and #AURANGAZEB (about 1500 tweets).

2. Each tweet is parsed linguistically (NLP) and the verbs are extracted (e.g. Watch, Book).

* Here is a sample tweet from Twitter [image]

3. The tweets are then vectorised for supervised machine learning training. For this training, the feature vectors are the verbs along with the tweet text, and the labels that I applied (generated) were ‘GO-GIPPI’ and ‘GO-AURANGAZEB’.

4. A machine learning algorithm based on K-Nearest Neighbours with Manhattan distance was trained on this data. [I also tried Euclidean distance.]

5. Once trained, I applied the test inputs ‘Watch’ and then ‘Book’ to the trained model and asked the algorithm to classify each as ‘GO-GIPPI’ or ‘GO-AURANGAZEB’ (each input is a verb expected in a tweet that implies a recommendation).

6. To my surprise, for both test verbs, Watch and Book (that is, where people write Watch ‘X’ movie or Book ‘X’ movie), the algorithm classified and recommended ‘GO-AURANGAZEB’ as the result [see the results box of my program].
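The steps above can be sketched roughly as follows. This is not the original program: it is a minimal, self-contained Python illustration of K-Nearest Neighbours with Manhattan distance over bag-of-words vectors. The training tweets are made up, and the names TRAINING_TWEETS, vectorise, manhattan and recommend are hypothetical; only the two labels come from the post.

```python
from collections import Counter

# Made-up example tweets (assumption), labelled the way the post describes.
TRAINING_TWEETS = [
    ("watch aurangazeb this weekend", "GO-AURANGAZEB"),
    ("book tickets for aurangazeb now", "GO-AURANGAZEB"),
    ("must watch aurangazeb", "GO-AURANGAZEB"),
    ("gippi songs are cute", "GO-GIPPI"),
    ("gippi trailer looks fun", "GO-GIPPI"),
]

# Fixed vocabulary built from every word seen in the training tweets.
VOCABULARY = sorted(
    {word for text, _ in TRAINING_TWEETS for word in text.lower().split()}
)


def vectorise(text):
    """Turn a text into a word-count vector over VOCABULARY."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in VOCABULARY]


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def recommend(query, k=3):
    """Classify the query by majority vote among its k nearest tweets."""
    query_vector = vectorise(query)
    ranked = sorted(
        TRAINING_TWEETS,
        key=lambda item: manhattan(vectorise(item[0]), query_vector),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


print(recommend("Watch"))  # GO-AURANGAZEB
print(recommend("Book"))   # GO-AURANGAZEB
```

With toy data like this, short tweets naturally sit closer to a one-word query, so the output only illustrates the mechanics and says nothing about the real tweets.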


So, the Twitter-based recommendation algorithm indicated that I should go and watch the AURANGAZEB movie!

I then wanted to see whether this recommendation by my machine learning, predictive analytics program could be quantified.

To my surprise, from this website, the box office collections of Aurangazeb were way higher than Gippi’s box office collections, which is perhaps a direct reflection that the film is doing well and more people are going to watch it. Isn’t it?

As you can see from the picture below [excerpt of website screenshot], Aurangazeb had grossed Rs 14.7 Cr as against Gippi’s Rs 4.5 Cr.


Next, I sampled a public recommendation from the Yahoo Answers website that again pointed to “AURANGAZEB” as the best movie!

So, from this public information, I validated what my program predicted from the tweets: its recommendation to go to “AURANGAZEB”, which I am planning to see soon to really check!

As I look to mine large volumes of Twitter data and apply machine learning algorithms to it to make complex predictions, I am going to need more storage, RAM and processing power, and the cloud will be the right place to make this happen. Obviously Amazon AWS is my choice!

Posted in Uncategorized

Amazon AWS’s role in my Data Science pursuits

In a matter of two months, I climbed to the top 1% of Kaggle by solving some very interesting problems provided by leading organisations, applying various machine learning techniques to complex data. If you don’t know Kaggle, it is a global platform that connects machine learning scientists and engineers with organisations that want to solve their data science problems in the form of competitions.

While I did have some exposure to AI-related areas several years ago, I am neither a real data scientist holding a PhD, nor a post-doc researcher, nor an industry veteran working in the field of analytics; I have simply been learning and working on some of the connected areas of late. When I started at Kaggle, I quickly realised that solving complex machine learning problems in the true sense is not for the weak-hearted! I am one of those getting stronger with every weekend hack these days, and it’s been an exciting, intellectually rewarding journey!

Several times over these weekend pursuits, I had to run algorithms on machines with very high capacity, and I had to do it at the lowest cost. Amazon AWS has so far helped me address both these problems with its high-memory XL and spot instances, combined with the ability to quickly launch different sets of pre-baked machine learning runtimes through AWS machine images and CloudFormation deployment.

In essence, AWS is significantly helping me leap forward in my data science pursuits.


Posted in Uncategorized