Explore
A conceptual overview of the components of HDInsight.
Learn the fundamentals of working with the HDInsight service.
Windows HDInsight Server provides a local development environment for the Windows Azure HDInsight Service. Get started with the HDInsight Server Developer Preview, an on-premises implementation of HDInsight.
Plan
Learn which Hadoop components and versions are included in HDInsight.
HDInsight comes with interactive consoles for JavaScript and Hive that you can use as an alternative to remoting into the head node of a Hadoop cluster. Learn how to use the consoles to enter and evaluate expressions, and then query and display the results of a MapReduce job immediately.
Learn how HDInsight works with data stored in Windows Azure Blob Storage, and when to store data in HDFS rather than in Windows Azure Blob Storage.
Upload
Learn how to upload and access data in HDInsight using Azure Storage Explorer, the interactive console, the Hadoop command line, or Sqoop.
Analyze
One key feature of Microsoft’s Big Data Solution is solid integration of Apache Hadoop with Microsoft Business Intelligence (BI) components. A good example is Excel's ability to connect to the Windows Azure storage account that contains the data associated with your HDInsight cluster. This topic walks you through using the Data Explorer to import HDInsight data into Excel.
Data can also be imported from HDInsight into Excel using the Hive ODBC driver.
Apache Mahout is a library for building scalable machine learning applications. Recommender engines are among the most immediately recognizable machine learning applications in use today. In this tutorial, you download data from the Million Song Dataset site and use it to create song recommendations for users based on their past listening habits.
In this tutorial, you query, explore, and analyze data from Twitter using the Apache Hadoop-based HDInsight Service for Windows Azure and a complex Hive example.
This tutorial shows you how to use the HDInsight service to process data stored in Windows Azure Blob Storage and move the results to a Windows Azure SQL Database.
Learn how to use the Hadoop .NET SDK to execute Hive jobs on HDInsight.
Manage
Learn how to monitor an HDInsight cluster and view Hadoop job history through the Windows Azure Management Portal.
Learn how to use the WebHCat REST API to provide metadata management and remote job submission to your Hadoop cluster.
Learn how to use the Windows Azure Management Portal and the HDInsight Dashboard to work with the HDInsight service.
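As a rough illustration of the WebHCat REST API mentioned above, the following Python sketch builds the URL for a WebHCat (Templeton) v1 resource such as `status`. The `/templeton/v1` base path and the `user.name` query parameter come from WebHCat itself; the helper name `webhcat_url`, the host name, the user name, and the default `https` scheme are hypothetical placeholders, so substitute the values for your own cluster. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode, urlunsplit


def webhcat_url(cluster_host, resource, user_name, scheme="https", port=None):
    """Build a WebHCat (Templeton) v1 URL for the given resource.

    cluster_host, user_name, scheme, and port are placeholders for
    illustration; use the endpoint details of your own cluster.
    """
    # Append the port only when the caller supplies one.
    netloc = cluster_host if port is None else f"{cluster_host}:{port}"
    # All WebHCat v1 resources live under the /templeton/v1 base path.
    path = "/templeton/v1/" + resource.strip("/")
    # WebHCat identifies the calling user through the user.name parameter.
    query = urlencode({"user.name": user_name})
    return urlunsplit((scheme, netloc, path, query, ""))


if __name__ == "__main__":
    # Hypothetical host and user, shown only to illustrate the URL shape.
    print(webhcat_url("example-cluster.example.net", "status", "admin"))
```

Issuing a GET request against the `status` URL is a simple way to check that the WebHCat endpoint is reachable before submitting jobs through it.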