Big Data/Big Insight

 

Understanding Big Data and Hadoop

 

 

Learning Objectives – In this module, you will understand Big Data, the limitations of existing solutions to the Big Data problem, how Hadoop solves that problem, the common Hadoop ecosystem components, the anatomy of a file write and read, and how the MapReduce framework works.

Hadoop Architecture and HDFS

Learning Objectives – In this module, you will learn the Hadoop HDFS architecture, the important configuration files in a Hadoop cluster, data loading techniques, how to set up a single-node Hadoop cluster, and how to work with local file system commands.

Sqoop

Why we need Sqoop, how to display the tables in a MySQL RDBMS from Sqoop, how to display the databases in a MySQL RDBMS from Sqoop, how to import all tables from a specific MySQL database into HDFS (Hadoop), how to import individual tables from MySQL into HDFS, how to export data from HDFS to MySQL, and how to import part of a table from MySQL into HDFS.

Hadoop MapReduce Framework

Learning Objectives – In this module, you will understand the Hadoop MapReduce framework and how MapReduce works on data stored in HDFS. You will learn about input splits in MapReduce, the Combiner and Partitioner, the file input formats available in Hadoop MapReduce, what type of key-value pair is generated when the file format is KeyValueTextInputFormat, whether the number of mappers and reducers can be set, how to implement a Word Count job in Hadoop, how to debug the Word Count job, the differences between the old and new MapReduce APIs, and the importance of the RecordReader in Hadoop.
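
As a preview of the Word Count job named above, the following is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It is not Hadoop code and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in sorted(grouped.items()))


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the course data sets.
    print(word_count(["big data big insight", "hadoop handles big data"]))
```

In a real Hadoop job the mapper and reducer run as separate tasks across the cluster and the framework performs the grouping; here everything runs in one process so the data flow is easy to follow.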

Hive

Learning Objectives – This module will help you understand Hive concepts, Hive data types, and loading and querying data in Hive.

Topics – Hive Background, About Hive, Hive vs. Pig, Hive Architecture and Components, Metastore in Hive, Limitations of Hive, Comparison with Traditional Database, Hive Data Types and Data Models, How to Load Data into a Hive Table in Hadoop, Partitions and Buckets, Hive Tables (Managed Tables and External Tables), Querying Data, Managing Outputs, Single and Multi-Table Insertion.

Pig

Learning Objectives – In this module, you will learn Pig, the types of use cases where Pig
applies, Pig Latin scripting, Pig running modes, Pig streaming, testing Pig scripts, Pig data types, shell and utility commands, Pig Latin relational operators, file loaders, the GROUP operator, the COGROUP operator, and joins and COGROUP.

HBase

What is HBase, HBase Model, HBase Read, HBase Write, HBase MemStore, HBase Installation, RDBMS vs. HBase, HBase Commands, HBase Example

Oozie and Hadoop Project

Learning Objectives – In this module, you will understand how multiple
Hadoop ecosystem components work together in a Hadoop implementation to solve Big Data
problems. We will discuss multiple data sets and the specifications of the project. This
module will also cover a Flume and Sqoop demo, the Apache Oozie workflow scheduler for
Hadoop jobs, and Hadoop Talend integration.

Topics – Flume and Sqoop Demo, Oozie, Oozie Components, Oozie Workflow,
Scheduling with Oozie, Pig, Hive, and Sqoop, Hadoop Project Demo.