Hadoop for Developers (4 days) Training

Course Code

hadoopdev

Duration

28 hours (usually 4 days including breaks)

Prerequisites

  • comfortable with the Java programming language (most programming exercises are in Java)
  • comfortable in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)

Lab environment

Zero Install: There is no need to install Hadoop software on students' machines. A working Hadoop cluster will be provided for students.

Students will need the following:

  • an SSH client (Linux and Mac already include SSH clients; for Windows, PuTTY is recommended)
  • a browser to access the cluster (we recommend Firefox)

Overview

Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. In this course, developers learn about various components of the Hadoop ecosystem (HDFS, MapReduce, Pig, Hive and HBase).

    Course Outline

    Section 1: Introduction to Hadoop

    • Hadoop history, concepts
    • ecosystem
    • distributions
    • high-level architecture
    • Hadoop myths
    • Hadoop challenges
    • hardware / software
    • lab: first look at Hadoop

    Section 2: HDFS

    • Design and architecture
    • concepts (horizontal scaling, replication, data locality, rack awareness)
    • Daemons: NameNode, Secondary NameNode, DataNode
    • communications / heartbeats
    • data integrity
    • read / write path
    • NameNode High Availability (HA), Federation
    • labs: Interacting with HDFS

    Section 3: MapReduce

    • concepts and architecture
    • daemons (MRv1): JobTracker / TaskTracker
    • phases: driver, mapper, shuffle/sort, reducer
    • MapReduce version 1 and version 2 (YARN)
    • Internals of MapReduce
    • Introduction to Java MapReduce programs
    • labs: Running a sample MapReduce program
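
As a rough preview of the phases listed in this section, the sketch below walks a word count through map, shuffle/sort, and reduce steps in plain Java. It is only an illustration and an assumption about how the idea might be shown: it runs in a single JVM, does not use the Hadoop API, and the class and method names (WordCountSketch, map, shuffle, reduce, run) are hypothetical and not taken from the course material.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the MapReduce phases; it does NOT use the Hadoop API.
// All names here are hypothetical and chosen only for this sketch.
final class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every word in one input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle/sort phase: group all emitted values by key, with keys kept in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for a single key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wires the three phases together for a list of input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {data=2, hadoop=2, processes=1, stores=1}
        System.out.println(run(List.of("hadoop stores data", "hadoop processes data")));
    }
}
```

In a real cluster the map and reduce steps run as distributed tasks and the framework performs the shuffle/sort between them; the sketch only mirrors the data flow.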

    Section 4: Pig

    • Pig vs Java MapReduce
    • Pig job flow
    • Pig Latin language
    • ETL with Pig
    • Transformations & Joins
    • User-defined functions (UDF)
    • labs: writing Pig scripts to analyze data

    Section 5: Hive

    • architecture and design
    • data types
    • SQL support in Hive
    • Creating Hive tables and querying
    • partitions
    • joins
    • text processing
    • labs: processing data with Hive

    Section 6: HBase

    • concepts and architecture
    • HBase vs RDBMS vs Cassandra
    • HBase Java API
    • Time series data on HBase
    • schema design
    • labs: Interacting with HBase using the shell; programming with the HBase Java API; schema design exercise
