Administer HDInsight using the Cross-Platform Command-Line Interface
In this article, you learn how to use the Cross-Platform Command-Line Interface to manage HDInsight clusters. The command-line tool is implemented in Node.js. It can be used on any platform that supports Node.js, including Windows, Mac, and Linux.
The command-line tool is open source. The source code is managed in GitHub at https://github.com/WindowsAzure/azure-sdk-tools-xplat.
This article only covers using the command-line interface from Windows. For a general guide on how to use the command-line interface, see How to use the Windows Azure Command-Line Tools for Mac and Linux. For comprehensive reference documentation, see Windows Azure command-line tool for Mac and Linux.
Before you begin this article, you must have the following:
The command-line interface can be installed using Node.js Package Manager (NPM) or Windows Installer.
To install the command-line interface using NPM
- Browse to www.nodejs.org.
- Click INSTALL and follow the instructions using the default settings.
- Open Command Prompt (or Windows Azure Command Prompt, or Developer Command Prompt for VS2012) from your workstation.
Run the following command in the command prompt window.
npm install -g azure-cli
If you get an error saying the NPM command is not found, verify that the following paths are in the PATH environment variable:
C:\Program Files (x86)\nodejs;C:\Users\[username]\AppData\Roaming\npm
Run the following command to verify the installation:
azure hdinsight -h
You can use the -h switch at different levels to display the help information. For example:
azure hdinsight -h
azure hdinsight cluster -h
azure hdinsight cluster create -h
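The PATH check mentioned in the NPM installation steps can be automated. The following is a minimal sketch in Python, not part of the command-line interface itself. The helper names `required_dirs` and `missing_path_entries` are hypothetical, and the two directories are the ones listed above with the `[username]` placeholder filled in from the `USERNAME` environment variable.

```python
import os


def required_dirs(username):
    """Return the two directories the article lists, for the given user name."""
    return [
        r"C:\Program Files (x86)\nodejs",
        r"C:\Users\{}\AppData\Roaming\npm".format(username),
    ]


def missing_path_entries(path_value, required, sep=";"):
    """Return the required directories that are absent from a PATH string."""
    def norm(entry):
        # Windows paths are case-insensitive; ignore quotes and trailing slashes.
        return entry.strip().strip('"').rstrip("\\/").lower()

    present = {norm(e) for e in path_value.split(sep) if e.strip()}
    return [d for d in required if norm(d) not in present]


if __name__ == "__main__":
    user = os.environ.get("USERNAME", "[username]")
    for d in missing_path_entries(os.environ.get("PATH", ""), required_dirs(user)):
        print("Not on PATH:", d)
```

If the script prints nothing, both directories are already on PATH and the npm error likely has another cause.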
To install the command-line interface using Windows Installer
- Browse to http://www.windowsazure.com/en-us/downloads/.
- Scroll down to the Command line tools section, and then click Cross-platform Command Line Interface and follow the Web Platform Installer wizard.
Download and import Windows Azure account publishsettings
Before using the command-line interface, you must configure connectivity between your workstation and Windows Azure. Your Windows Azure subscription information is used by the command-line interface to connect to your account. This information can be obtained from Windows Azure in a publishsettings file. The publishsettings file can then be imported as a persistent local config setting that the command-line interface will use for subsequent operations. You only need to import your publishsettings once.
The publishsettings file contains sensitive information. It is recommended that you delete the file or take additional steps to encrypt the user folder that contains the file. On Windows, modify the folder properties or use BitLocker.
To download and import publishsettings
- Open a Command Prompt.
Run the following command to download the publishsettings file.
azure account download
The command shows the instructions for downloading the file, including a URL.
Open Internet Explorer and browse to the URL listed in the command prompt window.
- Click Save to save the file to the workstation.
From the command prompt window, run the following command to import the publishsettings file:
azure account import <file>
For example, if the publishsettings file was saved to the C:\HDInsight folder on the workstation, pass the path to the file in that folder as <file>.
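Because the publishsettings file contains sensitive information and only needs to be imported once, the import and the recommended cleanup can be combined. The following Python sketch is one possible way to do that, under stated assumptions: the function name `import_publishsettings` and its `run` and `delete_after` parameters are hypothetical, and the only CLI call used is `azure account import <file>` from the steps above.

```python
import os
import shutil
import subprocess


def import_publishsettings(path, run=subprocess.run, delete_after=True):
    """Run `azure account import <file>`, then delete the file on success."""
    # npm installs the launcher as azure.cmd on Windows; shutil.which resolves
    # it through PATHEXT. Fall back to the bare name if it is not found.
    azure = shutil.which("azure") or "azure"
    # check=True raises if the import fails, so the file is kept in that case.
    run([azure, "account", "import", path], check=True)
    if delete_after and os.path.exists(path):
        os.remove(path)
```

The `run` parameter exists only so that the command can be replaced with a stub when trying the sketch without the command-line interface installed.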
Provision an HDInsight cluster
HDInsight uses a Windows Azure Blob Storage container as the default file system. A Windows Azure storage account is required before you can create an HDInsight cluster.
After you have imported the publishsettings file, you can use the following command to create a storage account:
azure account storage create [options] <StorageAccountName>
The storage account must be located in the same data center as the HDInsight cluster. Currently, you can only provision HDInsight clusters in the following data centers:
- US East
- US West
- Europe North
For information on creating a Windows Azure storage account using Windows Azure Management portal, see How to Create a Storage Account.
If you already have a storage account but do not know the account name and account key, you can use the following commands to retrieve the information:
-- Lists storage accounts
azure account storage list
-- Shows a storage account
azure account storage show <StorageAccountName>
-- Lists the keys for a storage account
azure account storage keys list <StorageAccountName>
For details on getting the information using the management portal, see the How to: View, copy and regenerate storage access keys section of How to Manage Storage Accounts.
The azure hdinsight cluster create command creates the container if it doesn’t exist. If you choose to create the container beforehand, you can use the following command:
azure storage container create --account-name <StorageAccountName> --account-key <StorageAccountKey> [ContainerName]
Once you have the storage account and the blob container prepared, you are ready to create a cluster:
azure hdinsight cluster create --clusterName <ClusterName> --storageAccountName <StorageAccountName> --storageAccountKey <storageAccountKey> --storageContainer <StorageContainer> --nodes <NumberOfNodes> --location <DataCenterLocation> --username <HDInsightClusterUsername> --clusterPassword <HDInsightClusterPassword>
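Because the create command takes many options, it can help to assemble it programmatically and review it before running it. The following Python sketch builds the argument list using only the option names shown above. The function name `cluster_create_args` and all example values (cluster name, storage names, credentials) are hypothetical placeholders, and the location string is taken from the data center list in this article.

```python
import subprocess


def cluster_create_args(cluster_name, storage_account, storage_key,
                        container, nodes, location, username, password):
    """Build the argument list for `azure hdinsight cluster create`."""
    if int(nodes) < 1:
        raise ValueError("nodes must be a positive integer")
    return [
        "azure", "hdinsight", "cluster", "create",
        "--clusterName", cluster_name,
        "--storageAccountName", storage_account,
        "--storageAccountKey", storage_key,
        "--storageContainer", container,
        "--nodes", str(int(nodes)),
        "--location", location,
        "--username", username,
        "--clusterPassword", password,
    ]


# Hypothetical example values; replace them with your own.
args = cluster_create_args(
    cluster_name="mycluster",
    storage_account="mystorage",
    storage_key="PLACEHOLDER_KEY",
    container="mycontainer",
    nodes=4,
    location="US East",
    username="admin",
    password="PLACEHOLDER_PASSWORD",
)
# list2cmdline quotes arguments that contain spaces, such as the location.
print(subprocess.list2cmdline(args))
```

Printing the command first, rather than running it directly, makes it easy to confirm that values containing spaces are quoted correctly before pasting the line into a command prompt.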
Provision an HDInsight cluster using a configuration file
Typically, you provision an HDInsight cluster, run jobs on it, and then delete the cluster to reduce cost. The command-line interface gives you the option to save the configuration to a file so that you can reuse it each time you provision a cluster.
azure hdinsight cluster config create <file>
azure hdinsight cluster config set <file> --clusterName <ClusterName> --nodes <NumberOfNodes> --location "<DataCenterLocation>" --storageAccountName "<StorageAccountName>.blob.core.windows.net" --storageAccountKey "<StorageAccountKey>" --storageContainer "<BlobContainerName>" --username "<Username>" --clusterPassword "<UserPassword>"
azure hdinsight cluster config storage add <file> --storageAccountName "<StorageAccountName>.blob.core.windows.net"
azure hdinsight cluster config metastore set <file> --type "hive" --server "<SQLDatabaseName>.database.windows.net" --database "<HiveDatabaseName>" --user "<Username>" --metastorePassword "<UserPassword>"
azure hdinsight cluster config metastore set <file> --type "oozie" --server "<SQLDatabaseName>.database.windows.net" --database "<OozieDatabaseName>" --user "<SQLUsername>" --metastorePassword "<SQLPassword>"
azure hdinsight cluster create --config <file>
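The configuration-file commands above are run in a fixed order: create the file, set the basic options, optionally add storage accounts, and then create the cluster from the file. The following Python sketch generates that sequence as a list of commands. It is an illustration only: the function name `config_workflow`, the file name `cluster.cfg`, and all example values are hypothetical, and the option names are copied from the commands in this section. The metastore commands are omitted for brevity.

```python
import subprocess


def config_workflow(config_file, settings, extra_storage=()):
    """Return the ordered commands for provisioning from a configuration file."""
    base = ["azure", "hdinsight", "cluster"]
    commands = [base + ["config", "create", config_file]]

    # One `config set` call carrying every basic option, e.g. clusterName.
    set_cmd = base + ["config", "set", config_file]
    for option, value in settings.items():
        set_cmd += ["--" + option, str(value)]
    commands.append(set_cmd)

    # One `config storage add` call per additional storage account.
    for account in extra_storage:
        commands.append(base + ["config", "storage", "add", config_file,
                                "--storageAccountName", account])

    commands.append(base + ["create", "--config", config_file])
    return commands


# Hypothetical example values.
cmds = config_workflow(
    "cluster.cfg",
    {"clusterName": "mycluster", "nodes": 4},
    ["extrastorage.blob.core.windows.net"],
)
for cmd in cmds:
    print(subprocess.list2cmdline(cmd))
```

Only the last command provisions anything, so the configuration file can be prepared once and reused for later clusters.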
List and show cluster details
Use the following commands to list and show cluster details:
azure hdinsight cluster list
azure hdinsight cluster show <ClusterName>
Delete a cluster
Use the following command to delete a cluster:
azure hdinsight cluster delete <ClusterName>
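The provision, run, and delete pattern described earlier can be wrapped so that the cluster is deleted even when a job fails. The following Python sketch shows one way to do this with a context manager. The name `temporary_cluster` and its `run` parameter are hypothetical, and the sketch assumes the delete command completes without interactive input, which this article does not confirm.

```python
import shutil
import subprocess
from contextlib import contextmanager


@contextmanager
def temporary_cluster(name, config_file, run=subprocess.run):
    """Create a cluster from a configuration file and delete it on exit."""
    # Resolve azure.cmd on Windows; fall back to the bare command name.
    azure = shutil.which("azure") or "azure"
    # If creation fails, check=True raises here and no delete is attempted.
    run([azure, "hdinsight", "cluster", "create", "--config", config_file],
        check=True)
    try:
        yield name
    finally:
        # Runs whether the body succeeded or raised, so the cluster is removed.
        run([azure, "hdinsight", "cluster", "delete", name], check=True)
```

A caller would place the job submission inside a `with temporary_cluster("<ClusterName>", "<file>"):` block, where the cluster name must match the one stored in the configuration file.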
In this article, you have learned how to perform different HDInsight cluster administrative tasks. To learn more, see the following articles: