Course Detail

Big Data Analytics

Big Data Analytics - Tvashtaa Data Solutions


Course Description

WHY BIG DATA?


  • What is Big Data?
  • Hadoop Architecture & Components
  • Hadoop Storage & File Formats

HDFS


  • HDFS Basics
  • File Storage
  • Fault Tolerance
  • Hadoop Processing – MapReduce

HIVE


  • What is Hive
  • Modeling in Hive and data loading process
  • Concepts of Partitioning, Bucketing
  • Hive data storage formats (ORC, RC, and Parquet)
  • Introduction to Hive QL and examples
  • Hive as an ELT tool
  • Performance tuning in Hive

MAPREDUCE


  • What Is MapReduce?
  • Basic MapReduce Concepts
  • Concepts of Mappers, Reducers
  • Combiners and Partitioning
  • Inputs and Output to MR Program
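
The mapper, combiner, partitioner, and reducer roles listed above can be sketched as a single-process word count in plain Python. This is only an illustrative simulation, not Hadoop's API: the function names (mapper, combiner, partitioner, reducer, run_job) and the CRC32-based partitioning are assumptions chosen for the example.

```python
from collections import defaultdict
from zlib import crc32


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Combine phase: pre-sum counts on the map side to cut shuffle traffic."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def partitioner(key, num_reducers):
    """Partitioning: a stable hash decides which reducer receives a key."""
    return crc32(key.encode("utf-8")) % num_reducers


def reducer(key, counts):
    """Reduce phase: sum all counts that arrived for one key."""
    return key, sum(counts)


def run_job(lines, num_reducers=2):
    """Hypothetical driver that wires the phases together in one process."""
    # One bucket per reducer; each maps a key to the list of values for it.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for line in lines:  # each line stands in for one input split
        for word, count in combiner(mapper(line)):
            buckets[partitioner(word, num_reducers)][word].append(count)
    result = {}
    for bucket in buckets:
        for key in sorted(bucket):  # reducers receive keys in sorted order
            word, total = reducer(key, bucket[key])
            result[word] = total
    return result


if __name__ == "__main__":
    print(run_job(["big data big ideas", "data moves to code"]))
```

Because every occurrence of a key hashes to the same bucket, each word is totalled by exactly one reducer, which is the guarantee partitioning provides in a real cluster.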

BIG DATA USING R


  • MapReduce Programming using Java and R
  • Hadoop with R

NOSQL


  • NoSQL in Hadoop

HBASE


  • HBase – Introduction
  • HBase Data Model, HBase Master
  • HBase Families & Components
  • Data Storage and Distribution

PIG LATIN


  • Basics of Pig and Why Pig?
  • Grunt
  • Pig’s Data Model
  • Writing Evaluation
  • Filter
  • Load & Store Functions
  • Benefits of Pig over SQL language
  • Input and Output formats to MR program

SQOOP


  • Sqoop Overview
  • Sqoop Exercises

OOZIE/FLUME/YARN


  • Oozie Overview
  • Oozie Workflows

[OPTIONAL]
CLOUDERA


  • Introduction to Cloudera Manager
  • Ambari Administration

SPARK/SCALA


  • What Is Spark? Basic Concepts
  • How Spark Differs from MapReduce
  • Working with Scala
  • Parallel Programming with Spark
  • Spark Streaming

HADOOP SECURITY


  • Security Overview
  • Knox Exercise
  • Access Control Labels

 

Institute Overview

Hyderabad, Telangana, India

About Us: Tvashtaa Data Solutions was formed with the objective of becoming a leading IT training provider with an international presence. There is an enormous need and potential to create the right eco...
