Showing posts from September, 2015

Meet Cloudera’s Apache Spark Committers

From the Cloudera Blog, with thanks for this valuable post.
The super-active Apache Spark community is exerting a strong gravitational pull within the Apache Hadoop ecosystem. I recently had the opportunity to ask Cloudera’s Apache Spark committers (Sean Owen, Imran Rashid [PMC], Sandy Ryza, and Marcelo Vanzin) for their perspectives on how the Spark community has worked and is working together, and on the work to be done via the One Platform initiative to make the Spark stack enterprise-ready. Recently, Apache Spark has become the most active project in the Apache Hadoop ecosystem (measured by number of contributors/commits over time), if not the entire ASF. Why do you think that is? Owen: Partly because of scope: Apache Spark has been many sub-projects under an umbrella from the start, some large and complex in their own right, and has tacked on several more in just the last six months. Culture is another reason for it; even small changes are tracked i…

How Impala Scales for Business Intelligence: New Test Results

From the Cloudera Blog, with thanks to Yanpei Chen, Alan Choi, Dileep Kumar, David Rorke, Silvius Rus, and Devadutta Ghat
Impala, the open source MPP query engine designed for high-concurrency SQL over Apache Hadoop, has seen tremendous adoption across enterprises in industries such as financial services, telecom, healthcare, retail, gaming, government, and advertising. Impala has unlocked the ability to use business intelligence (BI) applications on Hadoop; these applications support critical business needs such as data discovery, operational dashboards, and reporting. For example, one customer has proven that Impala scales to 80 queries/second, supporting 1,000+ web dashboard end-users with sub-second response time. Clearly, BI applications represent a good fit for Impala, and customers can support more users simply by enlarging their clusters.
Cloudera’s previous testing already established that Impala is the clear winner among analytic SQL-on-Hadoop alternatives, and we will provide additio…

Big Data Trendz