Automated Big Data Stack Cloud Deployment and Configuration

I am happy to announce that we recently open-sourced, under the BSD license, the tools I developed and used for my research at UCSB to automatically deploy and configure a full big data stack in the cloud. The tools automate the deployment and configuration of the Apache Mesos cluster manager and the Spark, Hadoop, Storm, Kafka, and Hama data processing engines on a Eucalyptus private cloud. High availability mode is also supported (a fully functioning Zookeeper cluster and the Mesos master/secondary masters are set up automatically).

These tools have been thoroughly tested with Mesos, Spark, MapReduce, and Storm for the specific versions listed in the README file. They also provide the option to deploy a standalone Spark cluster on Eucalyptus if you don’t need Apache Mesos. The only prerequisites are a running Eucalyptus cloud and root access to your cluster. Everything else is easily configurable in the scripts, and you only need to run a simple command with a few arguments and wait until everything is done for you!

If you want to use the tools on Amazon EC2 you will need to change the connector (notice, though, that if you only care about Spark/Mesos deployment on EC2, a better starting point might be this github repo instead). Similarly, if you need to use more recent versions you’ll need to modify a couple of lines in the configuration files. I’ll try to support any reasonable requests, but in general you are on your own 🙂

Happy deployments!!!

 


Most Useful Hadoop Commands

This guide is not meant to be comprehensive, and I am not trying to list simple, very frequently used commands like hadoop fs -ls that are very similar to what we use in a Linux shell. If you need a comprehensive guide you can always check the Hadoop commands manual. Instead, I am listing some more “special” commands that I use frequently to manage a Hadoop cluster (HDFS, MapReduce, ZooKeeper, etc.) in the long term and to detect and fix problems. This is a work-in-progress post – I intend to update it with new commands based on how often I am using them on my clusters. Feel free to add a comment with commands you find useful for administrative tasks.

HDFS

  1. Set the dfs replication recursively for all existing files
    hadoop dfs -setrep -w 1 -R /
  2. Create a report to check HDFS health
    hadoop dfsadmin -report
  3. Check HDFS file system
    hadoop fsck /
  4. Run the cluster balancer – make sure blocks are distributed in a balanced way across the slaves.
    sudo -u hdfs hdfs balancer
  5. Use after removing or adding a datanode
    hadoop dfsadmin -refreshNodes
    sudo -u mapred hadoop mradmin -refreshNodes
  6. Make the namenode leave safe mode (it often enters safe mode because there is not enough disk space to support your desired replication factor)
    hadoop dfsadmin -safemode leave
  7. Display the datanodes that store a particular file with name “filename”.
    hadoop fsck /file-path/filename -files -locations -blocks

    Sample output:

     FSCK started by root (auth:SIMPLE) from /10.2.31.174 for path /spark-1.2.1-bin-2.3.0-mr1-cdh5.1.2.tgz at Wed May 27 17:31:15 PDT 2015
     /spark-1.2.1-bin-2.3.0-mr1-cdh5.1.2.tgz 186192499 bytes, 2 block(s): OK
     0. BP-323016323-10.2.31.174-1424856295335:blk_1073799439_60141 len=134217728 repl=3 [ip1:50010, ip2:50010, ip3:50010]
     1. BP-323016323-10.2.31.174-1424856295335:blk_1073799440_60142 len=51974771 repl=3 [ip1:50010, ip2:50010, ip3:50010]
     Status: HEALTHY
     Total size: 186192499 B
     Total dirs: 0
     Total files: 1
     Total symlinks: 0
     Total blocks (validated): 2 (avg. block size 93096249 B)
     Minimally replicated blocks: 2 (100.0 %)
     Over-replicated blocks: 0 (0.0 %)
     Under-replicated blocks: 0 (0.0 %)
     Mis-replicated blocks: 0 (0.0 %)
     Default replication factor: 3
     Average block replication: 3.0
     Corrupt blocks: 0
     Missing replicas: 0 (0.0 %)
     Number of data-nodes: 6
     Number of racks: 1
     FSCK ended at Wed May 27 17:31:15 PDT 2015 in 1 milliseconds
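
    If you want to check HDFS health from a script, you can grep the fsck output for the “Status: HEALTHY” line shown above – a minimal sketch:

    hadoop fsck / | grep -q "Status: HEALTHY" && echo "HDFS is healthy" || echo "HDFS needs attention"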

Map Reduce

  1. List active map-reduce jobs
    hadoop job -list
  2. Kill a job (use the job ID shown by the -list command)
    hadoop job -kill job_id
  3. Get jobtrackers state (active – standby)
    sudo -u mapred hadoop mrhaadmin -getServiceState jt1

    – where jt1 is the name of one of the jobtrackers as configured in your mapred-site.xml file.
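
    To quickly see which JobTracker is active and which is standby, you can loop over both names – a minimal sketch where jt1 comes from above and jt2 is a placeholder for the second JobTracker name in your mapred-site.xml:

    for jt in jt1 jt2; do
      echo -n "$jt: "
      sudo -u mapred hadoop mrhaadmin -getServiceState $jt
    done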

Zookeeper

  1. Initialize the High Availability state on zookeeper
    hdfs zkfc -formatZK
  2. Check mode of each zookeeper server:
    echo srvr | nc localhost 2181 | grep Mode
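
    To check the mode of every server in the quorum from a single machine, you can loop over the hostnames – a minimal sketch where zk1, zk2, and zk3 are placeholders for your ZooKeeper hosts:

    for zk in zk1 zk2 zk3; do
      echo -n "$zk: "
      echo srvr | nc $zk 2181 | grep Mode
    done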

Apache Hama on Mesos

This post describes how you can set up Apache Hama to work with Apache Mesos. In other posts I also describe how you can set up Apache Hadoop (http://wp.me/p3pTv0-j) and Apache Spark (http://wp.me/p3pTv0-c) to work on Mesos.

*The instructions have been tested with Mesos 0.20.0 and Hama 0.7.0. My cluster is a Eucalyptus private cloud (version 3.4.2) with an Ubuntu 12.04.4 LTS image, but the instructions should work for any cluster running Ubuntu, or even for different Linux distributions after some small changes.

Prerequisites:

  • I assume you have already set up HDFS CDH 5.1.2 on your cluster. If not, follow my post here
  • I also assume you have already installed Mesos. If not, follow the instructions here

Installation Steps:

  • IMPORTANT: The current git repo has a bug working with CDH5 on Mesos. If you compile with this version you will get an error similar to the following:
    ERROR bsp.MesosExecutor: Caught exception, committing suicide.
    java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.LocalFileSystem not found
            at java.util.ServiceLoader.fail(ServiceLoader.java:231)
            at java.util.ServiceLoader.access$300(ServiceLoader.java:181)
            at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:365)
            at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
            at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
            at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
            at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
            at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
            at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
            at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
            at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
            at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:339)
            at org.apache.hama.bsp.GroomServer.deleteLocalFiles(GroomServer.java:483)
            at org.apache.hama.bsp.GroomServer.initialize(GroomServer.java:321)
            at org.apache.hama.bsp.GroomServer.run(GroomServer.java:860)
            at org.apache.hama.bsp.MesosExecutor$1.run(MesosExecutor.java:92)
    
  • SOLUTION: Thanks to the Hama community, and in particular Jeff Fenchel, the bug was quickly fixed. You can get a patched Hama 0.7.0 version from Jeff’s github repo here or from my forked repo here in case Jeff deletes it in the future. When the patch is merged I will update the links.

So now we are good to go:

  1.  Remove any previous version of Hama from your HDFS:
    $ hadoop fs -rm /hama.tar.gz
  2. Build Hama for the particular HDFS and Mesos version you are using:
    $ mvn clean install -Phadoop2 -Dhadoop.version=2.3.0-cdh5.1.2 -Dmesos.version=0.20.0 -DskipTests
  3. Put it on HDFS (be careful with the naming of your tar file – it has to match the name used in your configuration file):
    $ hadoop fs -put dist/target/hama-0.7.0-SNAPSHOT.tar.gz /hama.tar.gz
  4. Move to the distribution directory:
    $ cd dist/target/hama-0.7.0-SNAPSHOT/hama-0.7.0-SNAPSHOT/
  5. Make sure the LD_LIBRARY_PATH or MESOS_NATIVE_LIBRARY environment variable points to your Mesos installation libraries. This can be under /usr/lib/mesos (the default) or wherever you specified when installing Mesos. For example:
    $ export LD_LIBRARY_PATH=/root/mesos-installation/lib/

    If you don’t set them correctly then you will get an ugly stack trace like this one on your bspmaster logs and the bspmaster won’t start:

    2014-11-08 01:23:50,646 FATAL org.apache.hama.BSPMasterRunner: java.lang.UnsatisfiedLinkError: Expecting an absolute path of the library:
            at java.lang.Runtime.load0(Runtime.java:792)
            at java.lang.System.load(System.java:1062)
            at org.apache.mesos.MesosNativeLibrary.load(MesosNativeLibrary.java:50)
            at org.apache.mesos.MesosNativeLibrary.load(MesosNativeLibrary.java:79)
            at org.apache.mesos.MesosSchedulerDriver.<init>(MesosSchedulerDriver.java:61)
            at org.apache.hama.bsp.MesosScheduler.start(MesosScheduler.java:63)
            at org.apache.hama.bsp.MesosScheduler.init(MesosScheduler.java:46)
            at org.apache.hama.bsp.SimpleTaskScheduler.start(SimpleTaskScheduler.java:273)
            at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:523)
            at org.apache.hama.bsp.BSPMaster.startMaster(BSPMaster.java:500)
            at org.apache.hama.BSPMasterRunner.run(BSPMasterRunner.java:46)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
            at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
            at org.apache.hama.BSPMasterRunner.main(BSPMasterRunner.java:56)
  6. Configure Hama to work with Mesos. I found that the instructions on the Hama wiki are missing a few things. For example, they do not describe adding the fs.default.name property, and without it you will get an error like this on the master when running your code:
    Error reading task output: http://euca-x-x-x-x.eucalyptus.race.cs.ucsb.edu:40015/tasklog?plaintext=true&taskid=attempt_201411051242_0001_000000_0&filter=stdout

    and an ugly stack trace on the Groom Server executor log like this one:

    14/11/05 12:43:08 INFO bsp.GroomServer: Launch 1 tasks.
    14/11/05 12:43:38 WARN bsp.GroomServer: Error initializing attempt_201411051242_0001_000000_0:
    java.io.FileNotFoundException: File file:/mnt/bsp/system/submit_6i82le/job.xml does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
    at org.apache.hadoop.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1913)
    at org.apache.hama.bsp.GroomServer.localizeJob(GroomServer.java:707)
    at org.apache.hama.bsp.GroomServer.startNewTask(GroomServer.java:562)
    at org.apache.hama.bsp.GroomServer.access$000(GroomServer.java:86)
    at org.apache.hama.bsp.GroomServer$DispatchTasksHandler.handle(GroomServer.java:173)
    at org.apache.hama.bsp.GroomServer$Instructor.run(GroomServer.java:237)
    14/11/05 12:43:39 INFO bsp.GroomServer: Launch 1 tasks.
    14/11/05 12:43:40 INFO bsp.MesosExecutor: Killing task : Task_0
    14/11/05 12:43:40 INFO ipc.Server: Stopping server on 31000
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 0 on 31000: exiting
    14/11/05 12:43:40 INFO ipc.Server: Stopping IPC Server listener on 31000
    14/11/05 12:43:40 INFO ipc.Server: Stopping IPC Server Responder
    14/11/05 12:43:40 INFO ipc.Server: Stopping server on 50001
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 1 on 50001: exiting
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 3 on 50001: exiting
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 4 on 50001: exiting
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 0 on 50001: exiting
    14/11/05 12:43:40 INFO ipc.Server: IPC Server handler 2 on 50001: exiting
    14/11/05 12:43:40 INFO ipc.Server: Stopping IPC Server listener on 50001
    14/11/05 12:43:40 INFO ipc.Server: Stopping IPC Server Responder
    14/11/05 12:43:42 WARN mortbay.log: /tasklog: java.io.IOException: Closed
    
    

    To be fair, you can find all the configuration you need in the Hama configuration instructions, but it’s not all on one page.

    So, to save you from some of the issues you might encounter configuring Hama, the hama-site.xml configuration in my case looks like this:

    <configuration>
     <property>
     <name>bsp.master.address</name>
     <value>euca-10-2-112-10.eucalyptus.internal:40000</value>
     <description>The address of the bsp master server. Either the
     literal string "local" or a host[:port] (where host is a name or
     IP address) for distributed mode.
     </description>
     </property>
     <property>
     <name>bsp.master.port</name>
     <value>40000</value>
     <description>The port master should bind to.</description>
     </property>
     <property>
     <name>bsp.master.TaskWorkerManager.class</name>
     <value>org.apache.hama.bsp.MesosScheduler</value>
     <description>Instructs the scheduler to use Mesos to execute tasks of each job
     </description>
     </property>
     <property>
     <name>fs.default.name</name>
     <value>hdfs://euca-10-2-112-10.eucalyptus.internal:9000</value>
     <description>
     The name of the default file system. Either the literal string
     "local" or a host:port for HDFS.
     </description>
     </property>
     <property>
     <name>hama.mesos.executor.uri</name>
     <value>hdfs://euca-10-2-112-10.eucalyptus.internal:9000/hama.tar.gz</value>
     <description>This is the URI of the Hama distribution
     </description>
     </property>
    
     <!-- Hama requires one cpu and memory defined by bsp.child.java.opts for each slot.
     This means that a cluster with bsp.tasks.maximum.total set to 2 and bsp.child.java.opts set to -Xmx1024m
     will need at least 2 cpus and 2048m of memory. -->
    
     <property>
     <name>bsp.tasks.maximum.total</name>
     <value>2</value>
     <description>This is an override for the total maximum tasks that may be run.
     The default behavior is to determine a value based on the available groom servers.
     However, if using Mesos, the groom servers are not yet allocated.
     So, a value indicating the maximum number of slots available in the cluster is needed.
     </description>
     </property>
    
     <property>
     <name>hama.mesos.master</name>
     <value>zk://euca-10-2-112-10.eucalyptus.internal:2181/mesos</value>
     <description>This is the address of the Mesos master instance.
     If you're using Zookeeper for master election, use the Zookeeper address here (i.e.,zk://zk.apache.org:2181/hadoop/mesos).
     </description>
     </property>
     <property>
     <name>bsp.child.java.opts</name>
     <value>-Xmx1024m</value>
     <description>Java opts for the groom server child processes.
     </description>
     </property>
    
     <property>
     <name>bsp.system.dir</name>
     <value>${hadoop.tmp.dir}/bsp/system</value>
     <description>The shared directory where BSP stores control files.
     </description>
     </property>
     <property>
     <name>bsp.local.dir</name>
     <value>/mnt/bsp/local</value>
     <description>local directory for temporal store.</description>
     </property>
     <property>
     <name>hama.tmp.dir</name>
     <value>/mnt/hama/tmp/hama-${user.name}</value>
     <description>Temporary directory on the local filesystem.</description>
     </property>
     <property>
     <name>bsp.disk.queue.dir</name>
     <value>${hama.tmp.dir}/messages/</value>
     <description>Temporary directory on the local message buffer on disk.</description>
     </property>
     <property>
     <name>hama.zookeeper.quorum</name>
     <value>euca-10-2-112-10.eucalyptus.internal</value>
     <description>Comma separated list of servers in the ZooKeeper Quorum.
     For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
     By default this is set to localhost for local and pseudo-distributed modes
     of operation. For a fully-distributed setup, this should be set to a full
     list of ZooKeeper quorum servers. If HAMA_MANAGES_ZK is set in hama-env.sh
     this is the list of servers which we will start/stop zookeeper on.
     </description>
     </property>
     <property>
     <name>hama.zookeeper.property.clientPort</name>
     <value>2181</value>
     <description>The port to which the zookeeper clients connect
     </description>
     </property>
    
    </configuration>
     
  7. You can also edit hama-env.sh if, for example, you want to write the logs to a different directory than the default (see the sketch below). *Note: You don’t need to ship the configuration to your slaves.
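    A minimal sketch of such a change, assuming the log location is controlled by the HAMA_LOG_DIR variable (check the comments in your own hama-env.sh for the exact variable names; the path is just an example):

    # in hama-env.sh – write Hama's logs to a larger disk
    export HAMA_LOG_DIR=/mnt/hama/logs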
  8. Now we are ready to run bspmaster:
    $ ./bin/hama-daemon.sh start bspmaster
    

    Make sure that the bspmaster started without problems by checking the log files

    $ less hama-root-bspmaster-$HOSTNAME.log 
    $ less hama-root-bspmaster-$HOSTNAME.out
    
  9. If everything went ok you are ready to run some examples:
    $ ./bin/hama jar hama-examples-0.7.0-SNAPSHOT.jar gen fastgen 100 10 randomgraph 2

    List the directories on your HDFS to make sure everything went fine. You should get something like this:

    Found 2 items
    -rw-r--r-- 3 root hadoop 2241 2014-11-07 16:49 /user/root/randomgraph/part-00000
    -rw-r--r-- 3 root hadoop 2243 2014-11-07 16:49 /user/root/randomgraph/part-00001

    These two files are two partitions of a graph with 100 nodes and 1K edges.

  10. Now run the PageRank example on the graph you just generated:
    $ ./bin/hama jar hama-examples-0.7.0-SNAPSHOT.jar pagerank randomgraph pagerankresult 4

    And again, if you list the result files in HDFS

    $ hadoop fs -ls /user/root/pagerankresult
    

    you should get something like this:

    Found 2 items
    -rw-r--r-- 3 root hadoop 1194 2014-11-05 16:02 /user/root/pagerankresult/part-00000
    -rw-r--r-- 3 root hadoop 1189 2014-11-05 16:02 /user/root/pagerankresult/part-00001

    You can find more examples to run on the Apache Hama website’s examples page here

Cloudera HDFS CDH5 Installation to use with Mesos

This post is a guide for installing Cloudera HDFS CDH5 on Eucalyptus (version 3.4.2) in order to use it later with Apache Mesos. The difference from installing HDFS for general use is that we don’t start the TaskTracker on the slave nodes; Mesos will start it each time a Hadoop job runs.

The steps you should take are the following:

  • Disable iptables firewall:
    $ ufw disable
  • Disable selinux:
    $ setenforce 0
  • Make sure instances have unique hostnames
  • Make sure the /etc/hosts file on each system has the IP addresses and fully-qualified domain names (FQDN) of all the members of the cluster.
    • hostname --fqdn
    • An example configuration is the following:
      • For the datanode:
        127.0.0.1 localhost
        10.2.24.25 euca-10-2-24-25.eucalyptus.internal
        x.x.x.x euca-10-2-24-25
        
      • For the namenode:
        127.0.0.1 localhost
        10.2.85.213 euca-10-2-85-213.eucalyptus.internal
        y.y.y.y euca-10-2-85-213
        
      • where x.x.x.x and y.y.y.y are the external IPs of your datanode and namenode respectively.
  • Be careful to create the directories that the datanode and namenode will use; these are set later in the configuration files inside the /etc/hadoop/conf.name directory (a sketch of the corresponding hdfs-site.xml entries follows below). Also remember to change their ownership to hdfs:hdfs
    • on data node:
      $ mkdir -p /mnt/cloudera-hdfs/1/dfs/dn /mnt/cloudera-hdfs/2/dfs/dn /mnt/cloudera-hdfs/3/dfs/dn /mnt/cloudera-hdfs/4/dfs/dn
      $ chown -R hdfs:hdfs /mnt/cloudera-hdfs/1/dfs/dn /mnt/cloudera-hdfs/2/dfs/dn /mnt/cloudera-hdfs/3/dfs/dn /mnt/cloudera-hdfs/4/dfs/dn
      • Typically each of the /1 /2 /3 /4 directories should be a different mounted device, though using Eucalyptus volumes for this might add some latency.
    • on the namenode:
      $ mkdir -p /mnt/cloudera-hdfs/1/dfs/nn /nfsmount/dfs/nn
      $ chown -R hdfs:hdfs /mnt/cloudera-hdfs/1/dfs/nn /nfsmount/dfs/nn
      $ chmod 700 /mnt/cloudera-hdfs/1/dfs/nn /nfsmount/dfs/nn
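      These paths then go into hdfs-site.xml – a minimal sketch assuming the CDH5 property names dfs.namenode.name.dir and dfs.datanode.data.dir (the older dfs.name.dir / dfs.data.dir names also work):

      <property>
       <name>dfs.namenode.name.dir</name>
       <value>file:///mnt/cloudera-hdfs/1/dfs/nn,file:///nfsmount/dfs/nn</value>
      </property>
      <property>
       <name>dfs.datanode.data.dir</name>
       <value>file:///mnt/cloudera-hdfs/1/dfs/dn,file:///mnt/cloudera-hdfs/2/dfs/dn,file:///mnt/cloudera-hdfs/3/dfs/dn,file:///mnt/cloudera-hdfs/4/dfs/dn</value>
      </property>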
  • Deploy the configuration to all nodes in the cluster (see the sketch below)
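    A minimal sketch of one way to do this with scp, assuming the configuration lives in /etc/hadoop/conf.mesos-cluster (as used below) and slave1, slave2 are placeholder hostnames:

    $ for host in slave1 slave2; do scp -r /etc/hadoop/conf.mesos-cluster root@$host:/etc/hadoop/; done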
  • Set alternatives to each node:
    $ update-alternatives --install /etc/hadoop/conf hadoop-conf /etc/hadoop/conf.mesos-cluster 50
    $ update-alternatives --set hadoop-conf /etc/hadoop/conf.mesos-cluster
  • Add the configuration to core-site.xml located under /etc/hadoop/ (see the sketch below)
    • Make sure it uses the hostname – not the IP address – of the NameNode
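    A minimal sketch of the relevant core-site.xml entry, assuming the namenode hostname from the example above and the default CDH port 8020 (adjust to your setup; older configs use the fs.default.name property instead of fs.defaultFS):

    <property>
     <name>fs.defaultFS</name>
     <value>hdfs://euca-10-2-85-213.eucalyptus.internal:8020</value>
    </property>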
  • Install cloudera CDH5: There are multiple ways to do this. Probably the easiest is the following:
    $ wget http://archive.cloudera.com/cdh5/one-click-install/precise/amd64/cdh5-repository_1.0_all.deb
    $ dpkg -i cdh5-repository_1.0_all.deb
    • On the master node:
      $ sudo apt-get update; sudo apt-get install hadoop-hdfs-namenode
    • On the slave nodes:
      $ sudo apt-get update; sudo apt-get install hadoop-hdfs-datanode
      $ sudo apt-get update; sudo apt-get install hadoop-client
    • Copy the default log4j configuration into your configuration directory:
      $ cp /etc/hadoop/conf.empty/log4j.properties /etc/hadoop/conf.name/log4j.properties

  • Upgrade or format the namenode (format a fresh install; upgrade an existing HDFS):
    $ service hadoop-hdfs-namenode upgrade
    $ sudo -u hdfs hdfs namenode -format
  • *If the storage directories change, HDFS should be reformatted
  • Start HDFS:
    • On master:
      $ service hadoop-hdfs-namenode start
    • On slave:
      $ service hadoop-hdfs-datanode start
  • Optionally: Configure your cluster to start the services after a system restart
    • On the master node:
      $ update-rc.d hadoop-hdfs-namenode defaults
      
      • If you have also set up ZooKeeper:
        $ update-rc.d zookeeper-server defaults
    • On the slaves:
      $ update-rc.d hadoop-hdfs-datanode defaults