What version of Hadoop is in Windows Azure HDInsight?
HDInsight supports multiple Hadoop cluster versions that can be deployed at any time. Each version choice provisions a specific version of the Hortonworks Data Platform (HDP) distribution and the set of components contained within that distribution.
Cluster version 2.1
The default cluster version used by Windows Azure HDInsight is 2.1. It is based on the Hortonworks Data Platform version 1.3.0 and provides Hadoop services with the component versions itemized in the following table:
|Component ||Version |
|Apache Hadoop ||1.2.0 |
|Apache Hive ||0.11.0 |
|Apache Pig ||0.11 |
|Apache Sqoop ||1.4.3 |
|Apache Oozie ||3.2.2 |
|Apache HCatalog ||Merged with Hive |
|Apache Templeton ||Merged with Hive |
|Ambari ||API v1.0 |
Cluster version 1.6
Windows Azure HDInsight cluster version 1.6 is also available. It is based on the Hortonworks Data Platform version 1.1.0 and provides Hadoop services with the component versions itemized in the following table:
|Component ||Version |
|Apache Hadoop ||1.0.3 |
|Apache Hive ||0.9.0 |
|Apache Pig ||0.9.3 |
|Apache Sqoop ||1.4.2 |
|Apache Oozie ||3.2.0 |
|Apache HCatalog ||0.4.1 |
|Apache Templeton ||0.1.4 |
|SQL Server JDBC Driver ||3.0 |
Select a version when provisioning an HDInsight cluster
When creating a cluster through the HDInsight PowerShell cmdlets or the HDInsight .NET SDK, you can choose a version using the “Version” parameter.
If you use the Quick Create option, the cluster is created with version 2.1, the default. If you use the Custom Create option in the Windows Azure Portal, you can choose the version to deploy from the HDInsight Version drop-down on the Cluster Details page.
The following table lists the versions of HDInsight currently available, the corresponding Hortonworks Data Platform (HDP) versions that they use, and their release dates. When known, their deprecation dates will also be provided.
|HDInsight version||HDP version||Release date|
|HDI 1.6 ||HDP 1.1 ||10/28/2013 |
|HDI 2.1 ||HDP 1.3 ||10/28/2013 |
A note on support for each version
A “Support Window” refers to the period of time during which an HDInsight cluster version is supported by Microsoft Customer Support and bound by this SLA. An HDInsight cluster is outside the Support Window if the Support Expiration Date for its version is earlier than the current date. A list of supported HDInsight cluster versions may be found in the table above. The Support Expiration Date for a given HDInsight version (denoted as version X) is calculated as the later of:
- Formula 1: Add 180 days to the date HDInsight version X was released
- Formula 2: Add 90 days to the date HDInsight version X+1 (the subsequent version after X) is made available in the Windows Azure Management Portal.
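The two formulas above can be sketched as a small calculation. This is only an illustration: the function names are hypothetical, and returning Formula 1 alone when no subsequent version exists is an assumption, since the rule only says the expiration is the later of the two dates.

```python
from datetime import date, timedelta
from typing import Optional


def support_expiration(release_x: date, release_next: Optional[date] = None) -> date:
    """Return the later of Formula 1 and Formula 2 for HDInsight version X.

    release_x    -- date version X was released
    release_next -- date version X+1 became available in the portal, if known
    """
    formula_1 = release_x + timedelta(days=180)
    if release_next is None:
        # Assumption: with no subsequent version yet, only Formula 1 can be
        # evaluated, so this is the earliest possible expiration date.
        return formula_1
    formula_2 = release_next + timedelta(days=90)
    return max(formula_1, formula_2)


def is_within_support_window(expiration: date, today: date) -> bool:
    """A cluster is outside the Support Window once the expiration date has passed."""
    return today <= expiration
```

For example, a version released on 10/28/2013 gives `support_expiration(date(2013, 10, 28))` equal to April 26, 2014 under Formula 1. If a subsequent version were made available on a hypothetical date of March 1, 2014, Formula 2 would give May 30, 2014, and that later date would apply.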
Additional notes on versioning
The SQL Server JDBC Driver is used internally by HDInsight and is not used for external operations. If you wish to connect to HDInsight using ODBC, please use the Microsoft Hive ODBC driver. For more information on using Hive ODBC, see Connect Excel to HDInsight with the Microsoft Hive ODBC Driver.
The default Hadoop distribution in Windows Azure HDInsight version 2.1 is based on the Hortonworks Data Platform 1.3.0.
The Hadoop distribution in Windows Azure HDInsight version 1.6 is based on the Hortonworks Data Platform 1.1.0.
The component versions associated with HDInsight cluster versions may change in future updates to HDInsight. One way to determine which components are available and to verify which versions a cluster is using is to log in to the cluster using remote desktop and examine the contents of the “C:\apps\dist\” directory.
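If Python is available on the cluster node, the directory check could be scripted along these lines. This is a rough sketch: it assumes that each component is installed in a subdirectory named `<component>-<version>` (for example `hive-0.11.0`), which this article does not state, and the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, Iterable

# Assumed naming pattern: "<component>-<version>", where the version starts
# with a digit. Directories that do not match are ignored.
_NAME_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)$")


def parse_component_dirs(dir_names: Iterable[str]) -> Dict[str, str]:
    """Map component name to version string for names matching the assumed pattern."""
    versions: Dict[str, str] = {}
    for entry in dir_names:
        match = _NAME_RE.match(entry)
        if match:
            versions[match.group("name")] = match.group("version")
    return versions


def installed_components(dist_dir: str = r"C:\apps\dist") -> Dict[str, str]:
    """List the subdirectories of dist_dir and parse them into component versions."""
    root = Path(dist_dir)
    if not root.is_dir():
        # Not running on a cluster node, or the layout differs.
        return {}
    return parse_component_dirs(p.name for p in root.iterdir() if p.is_dir())
```

Calling `installed_components()` in a remote desktop session would return a dictionary of whatever matches the assumed pattern. Compare the result against the directory listing itself, since the actual folder names may differ.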