Spark on Mesos Installation Guide

This post describes how you can set up Apache Spark to work with Apache Mesos. In another post I also describe how you can set up Apache Hadoop to work on Mesos. So by following these instructions you can have both Spark and Hadoop running on the same Mesos cluster.
*The instructions below have been tested with Mesos 0.20 and Spark 1.1.0. (Update 3/16/2016: I have also tested with Mesos 0.27.2 and Spark 1.6.1 on Ubuntu Trusty 14.04. Most steps work as described below just by changing the version numbers. I explicitly note where the compilation steps differ between the two versions.)


  • I assume you have already set up HDFS on your cluster. If not, follow my post here
  • I also assume you have already installed Mesos. If not, follow the instructions here

To run Spark with Mesos you should do the following:

  1. Download the Spark tar file
$ wget 
$ tar zxfv spark-1.1.0.tgz
$ cd spark-1.1.0
  1. Build Spark
    • (METHOD A)
      $ sbt/sbt assembly
    • (METHOD B)
      $ ./ --tgz

      Careful! If you use a different HDFS version than the default (1.0.4), build against that version. For example, for the Cloudera CDH5.1.2 that I described how to install in another post, use:

      $ ./ --tgz -Dhadoop.version=2.3.0-cdh5.1.2 -Dprotobuf.version=2.5.0 -DskipTests
      • Compiling with -Dhadoop.version=2.3.0-mr1-cdh5.1.2 won’t work anymore, as the Codehaus repo is no longer active and the files now hosted on Apache central have been renamed!
    • In the above example I specify the protobuf version to avoid the following error, which is thrown when trying to run Spark (this is no longer needed with the newer Spark 1.6+ versions):
      Exception in thread "main" java.lang.VerifyError: class org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
    • You can find a detailed list of how to build with different HDFS versions here
    • Careful with Spark 1.1! I recommend you compile with JAVA_HOME set to Java 6. Later it’s OK to set it back to Java 7. If you compile with Java 7 and then run executors with Java 6, you will get an error on the executors. Moreover, I’ve seen this error even with Java 7 on the executors. The worst part is that with a newer version of Spark (1.2.1) no error is thrown on the executors at all. You just see executors getting lost…
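Since the Java version used for compiling matters here, a quick check before building may save a rebuild. The following is only a minimal sketch and not part of my original setup: java_major is a hypothetical helper name, and it assumes that `java -version` prints a quoted version string such as "1.6.0_45".

```shell
#!/bin/sh
# Hypothetical helper: print the Java major version (6, 7, 8, ...) parsed
# from a version string such as 1.6.0_45.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;   # legacy scheme: 1.6.0_45 -> 6
    *)   echo "$1" | cut -d. -f1 ;;   # newer scheme: 11.0.2 -> 11
  esac
}

# Inspect the JDK that JAVA_HOME points at, if one is set.
# `java -version` writes to stderr, hence the 2>&1.
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
  raw=$("$JAVA_HOME/bin/java" -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$raw")
  if [ "$major" != "6" ]; then
    echo "Warning: JAVA_HOME provides Java '$major', not Java 6" >&2
  fi
fi
```

If the warning shows up when building Spark 1.1, point JAVA_HOME at a Java 6 JDK before running the build.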
  2. Tar the version you’ve built and put it into HDFS so it can be shipped to the executors
    $ tar czfv spark-1.1.0.tgz spark-1.1.0
    $ sudo -u hdfs hadoop fs -put /hdfsuser/spark-x.x.x.tgz /

    Careful! If you deploy the wrong .tgz file (such as the one you just downloaded instead of the one produced after running make-distribution), the tasks will fail. To debug, check your executor log files. In this case the error will look like this:

    ls: cannot access /tmp/mesos/slaves/20140913-135403-421003786-5050-24966-0/frameworks/20140914-153830-421003786-5050-27925-0001/executors/20140913-135403-421003786-5050-24966-0/runs/b17fb191-4db1-4aa7-8363-3ff0f12d88a3/spark-1.1.0/assembly/target/scala-2.10: No such file or directory
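To catch the wrong-tarball mistake before uploading, you could list the archive and look for the assembly jar. This is a rough sketch under an assumption: that a built Spark 1.x tarball contains a jar whose name starts with spark-assembly, while the plain source download does not. has_assembly_jar is a hypothetical helper name, not something Spark provides.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the tarball lists a Spark assembly jar.
# Assumption: a built Spark 1.x distribution ships spark-assembly-*.jar.
has_assembly_jar() {
  tar tzf "$1" 2>/dev/null | grep -q 'spark-assembly.*\.jar$'
}

# Usage: pass the tarball path as the first argument.
if [ -n "${1:-}" ]; then
  if has_assembly_jar "$1"; then
    echo "OK: $1 contains an assembly jar"
  else
    echo "WARNING: $1 has no assembly jar; it may be the source download" >&2
  fi
fi
```

Run it against the .tgz you are about to put into HDFS. A warning suggests you are about to ship the unbuilt download.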
  3. Modify spark-1.1.0/conf/ by adding the following lines of configuration:
    export MESOS_NATIVE_LIBRARY=/root/mesos-installation/lib/
    export SPARK_EXECUTOR_URI=hdfs://euca-10-2-24-25:9000/spark-1.1.0-bin-2.3.0.tgz
    export HADOOP_CONF_DIR=/etc/hadoop/conf.mesos-cluster/
  4. To run your applications with spark-submit without having to modify your code each time, edit the configuration file spark-1.1.0/conf/spark-defaults.conf by adding the following lines:
    # is the internal Eucalyptus IP of Master
    spark.master mesos://zk://
    # euca-10-2-24-25 is the hostname – put there whatever you get by running the command `hostname`
    spark.executor.uri hdfs://euca-10-2-24-25:9000/spark-1.1.0-bin-2.3.0.tgz
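Before submitting a job, it can be worth confirming that both settings are actually present in spark-defaults.conf. The sketch below is only an illustration: conf_value is a hypothetical helper, the default path assumes Spark was unpacked as spark-1.1.0 in the current directory (or that SPARK_HOME is set), and it only understands the whitespace-separated "key value" form used above.

```shell
#!/bin/sh
# Hypothetical helper: print the value stored under a key in a
# spark-defaults.conf-style file (whitespace-separated "key value" lines).
# Comment lines start with "#", so they never match a key.
conf_value() {
  awk -v key="$2" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }' "$1"
}

# Assumed location; adjust to wherever you unpacked Spark.
conf_file="${SPARK_HOME:-spark-1.1.0}/conf/spark-defaults.conf"

if [ -f "$conf_file" ]; then
  for key in spark.master spark.executor.uri; do
    value=$(conf_value "$conf_file" "$key")
    if [ -n "$value" ]; then
      echo "$key = $value"
    else
      echo "WARNING: $key is not set in $conf_file" >&2
    fi
  done
fi
```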

6 thoughts on “Spark on Mesos Installation Guide”

  1. I have set SPARK_EXECUTOR_URI=hdfs:///tmp/spark-1.1.1-bin-cdh4.tgz like this and verified that the URI is correct. I also tried changing the URI like this: SPARK_EXECUTOR_URI=hdfs://mesos-master:9000/tmp/spark-1.1.1-bin-cdh4.tgz

    Any job run from spark-shell that involves the slaves fails. I’m seeing the following error on the slaves: “Failed to fetch URIs for container”, so it seems that my slaves are unable to access the executor URI?

    Do you have any ideas?

    • Make sure you can access the file from the slave machine. It is probably a network-access or file-permissions issue. The URI should be in the second of the formats you provided, i.e. hdfs://ip-address:9000/filename.tgz
      Make sure that if you run hadoop fs -get hdfs://ip-address:9000/filename.tgz you can retrieve the file. This is the command that the executor will run internally. So if this command works your executor will be able to see the file correctly.

    • Have you resolved this issue? It sounds like you haven’t correctly configured the network communication between your slaves and masters. Make sure you can ssh to each of them; this will probably fix the issue.
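The check suggested in the first reply can be sketched as a small script to run on a slave machine. This is an illustration only: both function names are hypothetical, the pattern simply encodes the hdfs://host:port/path form recommended above, and the fetch step runs only if a hadoop command is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the URI has the hdfs://host:port/path form.
is_qualified_hdfs_uri() {
  printf '%s\n' "$1" | grep -Eq '^hdfs://[^/:]+:[0-9]+/.+'
}

# Hypothetical helper: check the form first, then try the fetch from the reply.
check_executor_uri() {
  if ! is_qualified_hdfs_uri "$1"; then
    echo "URI is not of the form hdfs://host:port/path: $1" >&2
    return 1
  fi
  if command -v hadoop >/dev/null 2>&1; then
    # Fetch into a throwaway directory so nothing is left behind.
    dest=$(mktemp -d) || return 1
    hadoop fs -get "$1" "$dest/"
    status=$?
    rm -rf "$dest"
    return "$status"
  fi
  echo "hadoop not found on PATH; only the URI form was checked" >&2
}
```

For example, calling check_executor_uri with the hdfs:///tmp/… form from the question is rejected by the form check, while the hdfs://mesos-master:9000/… form proceeds to the fetch.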

