Apache Spark

Using Apache Spark for big data processing

Introduction to Spark

Over the last few years, Hadoop has emerged as the standard platform for big data processing. At a high level, Hadoop consists of distributed storage (HDFS) and distributed computing (MapReduce). However, MapReduce has a couple of limitations. Programming in MapReduce is difficult and requires chaining multiple MapReduce jobs in sequence for …
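To see where the chaining burden comes from, the shape of a single MapReduce job can be sketched in plain Python (an illustrative sketch only, not actual Hadoop API code; all function names here are made up for the example). Each additional transformation in a pipeline would need another job wired up like this one, which is what Spark's higher-level operations avoid.

```python
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) pairs, as a MapReduce mapper would.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Group values by key, mimicking the framework's shuffle step.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Sum the counts for each word, as a reducer would.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["Spark builds on Hadoop", "Hadoop pairs HDFS with MapReduce"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts["hadoop"])  # → 2
```

A multi-step pipeline (say, count words, then keep only the frequent ones, then sort) would mean feeding this job's output into a second and third job of the same shape, which is the sequencing overhead the paragraph above refers to.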
