Bright HDInsight 7: Customise Your Cluster

HDInsight, as a clustered, cloud-backed resource, offloads the management overhead of cluster maintenance, tuning, patching and other sysadmin-type operations. The flip side is that, in exchange for shedding that overhead, you accept a lower level of control. You cannot elevate your user context to administrative privileges. You cannot expect tweaks applied after a cluster has been created to be permanent: certain operations inside Azure cause “reimaging” events, where cluster members are returned to their initial state. This is in line with all clouds offering PaaS compute. So that this loss of control does not hamper users of HDInsight, Microsoft has supplied a comprehensive cluster customization capability at provision time.

Using this capability, you can change core Hadoop configuration settings in:

  • core-site.xml
  • mapred-site.xml
  • hdfs-site.xml

These should always be used in preference to making manual changes after connecting over RDP, since manual changes are lost whenever a node is reimaged.

The code for achieving this is as straightforward as in the previous examples. Simply create objects to represent the configuration settings required at provision time, and submit them with the provisioning request. Again using PowerShell as our exemplar language, we can achieve this with four simple commands.

# Settings destined for core-site.xml
$coreConfig = @{
    "io.compression.codec" = "org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.DefaultCodec, org.apache.hadoop.io.compress.BZip2Codec";
    "io.sort.mb" = "2048";
}

# Settings destined for mapred-site.xml
$mapredConfig = New-Object 'Microsoft.WindowsAzure.Management.HDInsight.Cmdlet.DataObjects.AzureHDInsightMapReduceConfiguration'
$mapredConfig.Configuration = @{
    "mapred.tasktracker.map.tasks.maximum" = "4";
}

# Build the cluster definition and attach the custom settings
$clusterConfig = New-AzureHDInsightClusterConfig -ClusterSizeInNodes 64
$clusterConfig = $clusterConfig | Add-AzureHDInsightConfigValues -Core $coreConfig -MapReduce $mapredConfig

# Provision the cluster; $clusterCreds, $location and $clusterName are assumed to be defined already
$clusterConfig | New-AzureHDInsightCluster -Credential $clusterCreds -Location $location -Name $clusterName -Verbose -ErrorAction Stop
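
Each key/value pair supplied this way corresponds to a property entry in the matching site file. As a rough illustration only, the sketch below renders a hashtable in the `<property>` layout that Hadoop's *-site.xml files use. The function name `ConvertTo-HadoopPropertyXml` and the `$coreExample` variable are hypothetical and are not part of the HDInsight cmdlets. HDInsight applies the real settings for you at provision time, so this is just a way to picture the result, not a description of how the service does it.

```powershell
# Hypothetical helper (an assumption for illustration, not an HDInsight cmdlet):
# renders a hashtable of settings in the <property> layout of Hadoop *-site.xml files.
function ConvertTo-HadoopPropertyXml {
    param(
        [System.Collections.IDictionary] $Settings
    )

    $lines = @('<configuration>')
    # Sort the keys so the output order is stable regardless of hashtable ordering.
    foreach ($key in ($Settings.Keys | Sort-Object)) {
        # Escape XML special characters in both the name and the value.
        $name  = [System.Security.SecurityElement]::Escape([string]$key)
        $value = [System.Security.SecurityElement]::Escape([string]$Settings[$key])
        $lines += "  <property><name>$name</name><value>$value</value></property>"
    }
    $lines += '</configuration>'
    return ($lines -join "`n")
}

# Example input mirroring one of the core-site.xml settings used earlier.
$coreExample = @{ 'io.sort.mb' = '2048' }
$xml = ConvertTo-HadoopPropertyXml -Settings $coreExample
Write-Output $xml
```

Printing the rendered text for a hashtable before you provision can be a handy sanity check that property names and values are spelled as intended.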

Happy Hadooping!
