MapReduce Tutorial

The MapReduce tutorial introduces basic and advanced MapReduce concepts. It covers MapReduce topics such as the MapReduce API, MapReduce Data Flow, the Word Count Example, the Character Count Example, etc. What is MapReduce? MapReduce is...
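As a rough illustration of the Word Count Example named in this excerpt, here is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python. It is not the Hadoop MapReduce API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and exist only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, mimicking the step between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big ideas", "data flows"]))
```

In a real cluster the map and reduce functions would run in parallel on different nodes, and the framework would perform the grouping step. This sketch only shows the shape of the data at each stage.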

Spark Tutorial

The Spark tutorial introduces basic and advanced Spark concepts. Spark is a unified analytics engine for large-scale data processing. It includes SQL, streaming, machine learning, and graph processing modules. Our Spark tutorial covers all of Apache Spark’s topics,...

Pig Data Types

This section covers the data types of Pig and how they handle concepts such as missing data. It also explains how to describe data to Pig. The data types of Pig can be divided into two categories: Scalar Data Types and Complex Data Types. Scalar Data Types: Pig...

HBase Installation

Java and Hadoop should be installed on your Linux machine before installing HBase. Download HBase from the URL below: $ cd /usr/local/ $ wget http://www.interior-dsgn.com/apache/hbase/stable/hbase-0.98.8-hadoop2-bin.tar.gz ...

Apache Spark Components

The Spark project consists of several closely integrated components. At its core, Spark is a computational engine capable of scheduling, distributing, and monitoring multiple applications. The main components of Spark are: Spark Core, Spark SQL, Spark...

Spark Architecture

Apache Spark has a well-layered architecture that ties together all of its layers and components. The two main components of the architecture are: RDD (Resilient Distributed Datasets) and DAG (Directed Acyclic Graph). RDD (Resilient Distributed...

Installing Hadoop on Ubuntu

Step 1) Download and install Hadoop. 1. First, create a Hadoop system user with the following commands: sudo addgroup hadoop_ and sudo adduser --ingroup hadoop_ hduser_. Now, enter your credentials...

Hive Data Type

The Hive data types are classified into major categories, which are discussed below. Primitive Data Type in Hive: Primitive data types are categorized into four types, which are listed below: Numeric Data Type, Date/Time Data Type, String Data Type, Miscellaneous Data Type. Numeric...

HBase Technology

HBase, together with HDFS and MapReduce, is one of the core components of Hadoop. The Apache Hadoop platform is a highly secure, enterprise-ready big data application, which is part of the Hortonworks Data Platform. Some of the biggest companies, such as the...

Apache Pig Run Modes

Apache Pig has two execution modes: Local Mode and MapReduce Mode. Local Mode: In local mode, Pig runs in a single JVM and accesses the local file system. This mode is only suitable for small data sets and for trying out Pig. All the available files are installed and...
