by admin | Mar 29, 2020 | Hadoop
MapReduce Tutorial: This MapReduce tutorial covers important and advanced MapReduce concepts. It covers MapReduce topics such as the MapReduce API, MapReduce Data Flow, the Word Count Example, the Character Count Example, etc. What is MapReduce? MapReduce is...
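To give a rough idea of the Word Count Example named in the excerpt above, here is a minimal sketch in plain Python. It is not the Hadoop MapReduce API and is not taken from the tutorial; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and only imitate the three phases in memory on a single machine.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted counts by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Illustrative input only; a real job would read its input from HDFS.
    print(word_count(["Hadoop stores data", "MapReduce processes data"]))
```

In a real cluster the map and reduce steps would run in parallel on different nodes and the framework would handle the shuffle; this sketch only shows how the data changes shape between phases.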
by admin | Mar 16, 2020 | Hadoop
This Spark tutorial covers important and advanced Spark concepts. Spark is a unified analytics engine for large-scale data processing. It includes SQL, streaming, machine learning, and graph processing modules. Our Spark tutorial covers all of Apache Spark’s topics,...
by admin | Mar 16, 2020 | Hadoop
Pig Data Types: This post covers Pig's data types and how they handle concepts such as missing data. It also helps us describe our data to Pig. Pig's data types fall into two categories: Scalar Data Types and Complex Data Types. Scalar Data Types: Pig...
by admin | Mar 16, 2020 | Hadoop
HBase Installation: Java and Hadoop must already be installed on your Linux machine. Download HBase from the URL below:
$ cd /usr/local/
$ wget http://www.interior-dsgn.com/apache/hbase/stable/hbase-0.98.8-hadoop2-bin.tar.gz ...
by mayankjtp | Mar 4, 2020 | Hadoop
The Spark project consists of several closely integrated components. At its core, Spark is a computational engine capable of scheduling, distributing, and monitoring multiple applications. The main components of Spark are: Spark Core, Spark SQL, Spark...
by mayankjtp | Mar 4, 2020 | Hadoop
Apache Spark has a well-layered architecture that ties together all of its layers and components. The two main components of the architecture are: RDD (Resilient Distributed Datasets) and DAG (Directed Acyclic Graph). RDD (Resilient Distributed...
by mayankjtp | Feb 28, 2020 | Hadoop
Step 1) Download and install Hadoop. 1. First, create a Hadoop system user with the following commands:
sudo addgroup hadoop_
sudo adduser --ingroup hadoop_ hduser_
Now, enter your credentials...
by mayankjtp | Feb 28, 2020 | Hadoop
Hive data types are classified into major categories, which are discussed below. Primitive Data Types in Hive: Primitive data types are divided into four types, listed below: Numeric Data Type, Date/Time Data Type, String Data Type, Miscellaneous Data Type. Numeric...
by mayankjtp | Feb 28, 2020 | Hadoop
HBase, together with HDFS and MapReduce, is one of the core components of Hadoop. Apache Hadoop is a highly secure, enterprise-ready big data platform that is part of the Hortonworks Data Platform. Some of the biggest companies, such as the...
by mayankjtp | Feb 28, 2020 | Hadoop
Apache Pig has two execution modes: Local Mode and MapReduce Mode. Local Mode: In local mode, Pig runs in a single JVM and accesses the local file system. This mode is suitable only for small data sets and for trying out Pig. All the available files are installed and...