
Sunday, 31 May 2015

Security, Hive-on-Spark, and Other Improvements in Apache Hive 1.2.0

Apache Hive 1.2.0, although not a major release, contains significant improvements.
The Apache Hive community recently moved to a more frequent, incremental release schedule. We previously covered the Apache Hive 1.0.0 release and explained how it was renamed from 0.14.1, with only minor feature additions since 0.14.0.
Shortly thereafter, Apache Hive 1.1.0 was released (renamed from Apache Hive 0.15.0), which included more significant features, among them Hive-on-Spark.
Last week, the community released Apache Hive 1.2.0. Although a narrower release than Hive 1.1.0, it nevertheless contains improvements in the following areas:

New Functionality

  • Support for Apache Spark 1.3 (HIVE-9726), enabling dynamic executor allocation and impersonation
  • Support for integration of Hive-on-Spark with Apache HBase (HIVE-10073)
  • Support for numeric partition columns with literals (HIVE-10313, HIVE-10307)
  • Support for Union Distinct (HIVE-9039)
  • Support for specifying column list in insert statement (HIVE-9481)
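
Two of the items above, Union Distinct and the column list in insert statements, are easiest to see with a small example. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a Hive cluster. The table names are hypothetical, and the HiveQL shown in the comments is approximate and should be checked against the Hive language manual. SQLite spells the duplicate-removing union as plain UNION, which matches what UNION DISTINCT does in Hive.

```python
import sqlite3

# SQLite is only a stand-in for Hive here; all table names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE page_views_2014 (user_id INTEGER, page TEXT)")
cur.execute("CREATE TABLE page_views_2015 (user_id INTEGER, page TEXT)")
cur.executemany(
    "INSERT INTO page_views_2014 VALUES (?, ?)", [(1, "home"), (2, "docs")]
)
cur.executemany(
    "INSERT INTO page_views_2015 VALUES (?, ?)", [(1, "home"), (3, "blog")]
)

# Union Distinct (HIVE-9039). Approximate HiveQL:
#   SELECT user_id, page FROM page_views_2014
#   UNION DISTINCT
#   SELECT user_id, page FROM page_views_2015;
# The row (1, 'home') exists in both tables but appears once in the result.
union_rows = cur.execute(
    "SELECT user_id, page FROM page_views_2014 "
    "UNION "
    "SELECT user_id, page FROM page_views_2015 "
    "ORDER BY user_id"
).fetchall()

# Column list in insert statement (HIVE-9481). Approximate HiveQL:
#   INSERT INTO users (id, name) VALUES (1, 'ana');
# Columns left out of the list (country here) are filled with NULL.
cur.execute("CREATE TABLE users (id INTEGER, name TEXT, country TEXT)")
cur.execute("INSERT INTO users (id, name) VALUES (1, 'ana')")
user_rows = cur.execute("SELECT id, name, country FROM users").fetchall()

print(union_rows)  # [(1, 'home'), (2, 'docs'), (3, 'blog')]
print(user_rows)   # [(1, 'ana', None)]
```

Before 1.2.0, Hive supported only UNION ALL, so removing duplicates required wrapping the union in a SELECT DISTINCT subquery.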

Performance and Optimizations

Security

Usability and Stability

For a larger but still incomplete list of features, improvements, and bug fixes, see the release notes. (Most of the Hive-on-Spark JIRAs are missing from the list.)
The most important improvements and fixes above (such as those involving security) are already available in CDH 5.4.x releases. As another example, CDH users have been testing the Hive-on-Spark public beta since its first release, as well as improvements made to that beta in CDH 5.4.0.
We’re looking forward to working with the rest of the Apache Hive community to drive the project continually forward in the areas of SQL functionality, performance, security, and stability!

How-to: Use Parquet with Impala, Hive, Pig, and MapReduce

Source: Cloudera Blog
The CDH software stack lets you use your tool of choice with the Parquet file format, offering the benefits of ...