Articles by Alex Rovner

Alex Rovner joined the Magnetic team as the Director of Data Engineering in February 2014. He has a long history of using Hadoop and other open source projects within its ecosystem. Alex is very bullish on Spark and believes that it will replace Hadoop MapReduce as the de facto data processing engine in the very near future.

  1. Installing Spark 1.5 on CDH 5.4

    If you have not tried processing data with Spark yet, you should. It’s the next happening framework, built around processing data up to 100x faster than Hadoop MapReduce while leveraging existing Hadoop components (HDFS and YARN). Since Spark is evolving rapidly, in most cases you will want to run the latest version released by the Spark community, rather than the version packaged with your Hadoop distribution. This guide will walk you through what it takes to get the latest version of Spark running on your cluster.