Practical Guide to setup Hadoop and Spark Cluster using CDH

Step by step instructions to setup Hadoop and Spark Cluster using Cloudera Distribution of Hadoop (Formerly CCA 131)
Rating: 3.7/5 (475 reviews)
7,465 students
Created by Durga Viswanatha Raju Gadiraju

CourseMarks Score®: 7.5
Freshness: 6.5
Feedback: 6.8
Content: 8.5

Platform: Udemy
Video: 20h 56m
Language: English
Next start: On Demand

Detailed Analysis

CourseMarks Score®

7.5 / 10

CourseMarks Score® helps students to find the best classes. We aggregate 18 factors, including freshness, student feedback and content diversity.

Freshness Score

6.5 / 10
This course was last updated on 6/2019.

Course content can become outdated quite quickly. After analysing 71,530 courses, we found that the highest rated courses are updated every year. If a course has not been updated for more than 2 years, you should carefully evaluate the course before enrolling.

Student Feedback

6.8 / 10
We analyzed factors such as the rating (3.7/5) and the ratio between the number of reviews and the number of students, which is a great signal of student commitment.

New courses are hard to evaluate because they have few or no student ratings, but the Student Feedback Score helps you find great courses even with fewer reviews.

Content Score

8.5 / 10
Video Score: 10.0 / 10
The course includes 20h 56m of video content. Courses with more video usually have a higher average rating. We have found that the sweet spot is 16 hours of video, which is long enough to teach a topic comprehensively but not overwhelming. Courses with more than 16 hours of video get the maximum score.
The average video length across 10 Cloudera courses on Udemy is 6 hours 57 minutes.
Detail Score: 10.0 / 10

Top online courses contain a detailed description of the course, a list of what you will learn, and a detailed description of the instructor.

Extra Content Score: 5.5 / 10

Tests, exercises, articles and other resources help students deepen their understanding of the topic.

This course contains:

0 articles.
0 resources.
0 exercises.
0 tests.

Description

Cloudera is one of the leading vendors of distributions related to Hadoop and Spark. As part of this Practical Guide, you will learn the step-by-step process of setting up a Hadoop and Spark cluster using CDH.
Install – Demonstrate an understanding of the installation process for Cloudera Manager, CDH, and the ecosystem projects.
• Set up a local CDH repository
• Perform OS-level configuration for Hadoop installation
• Install Cloudera Manager server and agents
• Install CDH using Cloudera Manager
• Add a new node to an existing cluster
• Add a service using Cloudera Manager
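As an illustration of the OS-level configuration item above, here is a minimal sketch that checks two kernel settings commonly adjusted before a Hadoop installation: swappiness and transparent huge pages. The threshold and the function names are assumptions made for this example and are not taken from the course; check the requirements for your own CDH release.

```python
from pathlib import Path

# Assumed threshold: a low vm.swappiness (1 to 10) is a commonly cited
# recommendation for Hadoop nodes. Adjust to your own policy.
MAX_SWAPPINESS = 10

SWAPPINESS_FILE = "/proc/sys/vm/swappiness"
THP_FILE = "/sys/kernel/mm/transparent_hugepage/enabled"


def swappiness_ok(raw: str, limit: int = MAX_SWAPPINESS) -> bool:
    """Return True if the contents of the swappiness file are within limit."""
    return int(raw.strip()) <= limit


def thp_disabled(raw: str) -> bool:
    """Return True if transparent huge pages are set to 'never'.

    The kernel marks the active mode with brackets,
    for example: "always madvise [never]".
    """
    return "[never]" in raw


def read_or_none(path: str):
    """Read a file if it exists, so the script also runs on non-Linux hosts."""
    p = Path(path)
    return p.read_text() if p.exists() else None


if __name__ == "__main__":
    checks = (
        ("swappiness", SWAPPINESS_FILE, swappiness_ok),
        ("transparent huge pages", THP_FILE, thp_disabled),
    )
    for label, path, check in checks:
        raw = read_or_none(path)
        if raw is None:
            print(f"{label}: {path} not found, skipped")
        else:
            print(f"{label}: {'OK' if check(raw) else 'needs attention'}")
```

The parsing functions take the file contents as strings, so they can be tried out on sample values without touching a real node.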
Configure – Perform basic and advanced configuration needed to effectively administer a Hadoop cluster
• Configure a service using Cloudera Manager
• Create an HDFS user’s home directory
• Configure NameNode HA
• Configure ResourceManager HA
• Configure proxy for Hiveserver2/Impala
Manage – Maintain and modify the cluster to support day-to-day operations in the enterprise
• Rebalance the cluster
• Set up alerting for excessive disk fill
• Define and install a rack topology script
• Install new type of I/O compression library in cluster
• Revise YARN resource assignment based on user feedback
• Commission/decommission a node
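To make the rack topology item above more concrete, here is a minimal sketch of such a script. Hadoop calls a topology script with one or more IP addresses or hostnames as arguments and reads one rack path per argument from standard output. The addresses, rack names and default rack below are hypothetical placeholders, not values from the course.

```python
#!/usr/bin/env python3
import sys

# Hypothetical host-to-rack mapping; replace with your own inventory.
RACKS = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Fallback for hosts that are not listed above (assumed name).
DEFAULT_RACK = "/default-rack"


def resolve_rack(host: str) -> str:
    """Return the rack path for a host, or the default rack if unknown."""
    return RACKS.get(host, DEFAULT_RACK)


def main(argv) -> None:
    # One rack path per argument, in the same order as the arguments.
    for host in argv:
        print(resolve_rack(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

A static dictionary keeps the sketch short. In practice the mapping is often loaded from a file so that it can change without editing the script.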
Secure – Enable relevant services and configure the cluster to meet goals defined by security policy; demonstrate knowledge of basic security practices
• Configure HDFS ACLs
• Install and configure Sentry
• Configure Hue user authorization and authentication
• Enable/configure log and query redaction
• Create encrypted zones in HDFS
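The HDFS ACL and encryption zone items above map to a handful of command-line calls. The sketch below only builds and prints those command lines and does not run them. The user, key name and paths are hypothetical, and the sketch assumes that ACLs are enabled on the NameNode and that a key management service is already configured.

```python
import shlex


def setfacl_cmd(path: str, user: str, perms: str) -> list:
    """Command that adds or modifies a named-user ACL entry on a path."""
    return ["hdfs", "dfs", "-setfacl", "-m", f"user:{user}:{perms}", path]


def getfacl_cmd(path: str) -> list:
    """Command that prints the ACL of a path."""
    return ["hdfs", "dfs", "-getfacl", path]


def create_zone_cmds(key_name: str, path: str) -> list:
    """Commands that create a key, an empty directory and an encryption zone."""
    return [
        ["hadoop", "key", "create", key_name],
        ["hdfs", "dfs", "-mkdir", "-p", path],
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", path],
    ]


if __name__ == "__main__":
    # Hypothetical user, key name and paths, for illustration only.
    commands = [
        setfacl_cmd("/data/sales", "alice", "r-x"),
        getfacl_cmd("/data/sales"),
        *create_zone_cmds("sales_key", "/data/secure"),
    ]
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review them before running anything against a live cluster.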
Test – Benchmark the cluster operational metrics, test system configuration for operation and efficiency
• Execute file system commands via HTTPFS
• Efficiently copy data within a cluster/between clusters
• Create/restore a snapshot of an HDFS directory
• Get/set ACLs for a file or directory structure
• Benchmark the cluster (I/O, CPU, network)
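File system commands via HTTPFS, listed above, are plain HTTP requests against a WebHDFS-compatible REST API. The sketch below builds such request URLs without sending them. The host name and user are hypothetical, port 14000 is the usual HttpFS default, and simple (non-Kerberos) authentication via the user.name parameter is assumed.

```python
from urllib.parse import quote, urlencode


def httpfs_url(host: str, path: str, op: str, user: str,
               port: int = 14000, **params) -> str:
    """Build a WebHDFS-style URL for an HttpFS endpoint.

    Extra keyword arguments become additional query parameters.
    """
    query = urlencode({"op": op, "user.name": user, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Hypothetical host and user, for illustration only.
    print(httpfs_url("httpfs.example.com", "/user/alice", "LISTSTATUS", "alice"))
    print(httpfs_url("httpfs.example.com", "/user/alice/new dir", "MKDIRS", "alice"))
```

LISTSTATUS is sent as a GET request and MKDIRS as a PUT request, for example with curl or any HTTP client.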
Troubleshoot – Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios
• Resolve errors/warnings in Cloudera Manager
• Resolve performance problems/errors in cluster operation
• Determine reason for application failure
• Configure the Fair Scheduler to resolve application delays
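For the Fair Scheduler item above, application delays are often addressed by giving queues different weights. The sketch below generates a small allocation file of the kind the YARN Fair Scheduler reads. The queue names and weights are hypothetical, and on a cluster managed by Cloudera Manager the same settings are normally edited through its resource pool configuration rather than by hand.

```python
import xml.etree.ElementTree as ET


def build_allocations(queues: dict) -> str:
    """Return Fair Scheduler allocation XML for a {queue name: weight} mapping."""
    root = ET.Element("allocations")
    for name, weight in queues.items():
        queue = ET.SubElement(root, "queue", name=name)
        # A queue with a higher weight receives a larger share of resources.
        ET.SubElement(queue, "weight").text = str(weight)
        ET.SubElement(queue, "schedulingPolicy").text = "fair"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical queues: batch ETL jobs get three times the share of ad hoc work.
    print(build_allocations({"etl": 3.0, "adhoc": 1.0}))
```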
Our Approach
• You will start by creating a Cloudera QuickStart VM (if you have a laptop with 16 GB RAM and a quad-core CPU). This will help you get comfortable with Cloudera Manager.
• You will be able to sign up for GCP and claim up to $300 in credit while the offer lasts. Credits are valid for up to a year.
• You will then get a brief overview of GCP and provision 7 to 8 virtual machines using templates. You will also attach an external hard drive to configure for HDFS later.
• Once the servers are provisioned, you will set up Ansible for server automation.
• You will set up a local repository for Cloudera Manager and Cloudera Distribution of Hadoop using packages.
• You will then set up Cloudera Manager with a custom database, and then Cloudera Distribution of Hadoop using the wizard that comes with Cloudera Manager.
• As part of setting up Cloudera Distribution of Hadoop, you will set up HDFS, learn HDFS commands, set up YARN, configure HDFS and YARN High Availability, learn about schedulers, set up Spark, transition to Parcels, set up Hive and Impala, set up HBase and Kafka, and more.

You will learn

✓ Learn Hadoop and Spark administration using CDH
✓ Provision a cluster on GCP (Google Cloud Platform) to set up a Hadoop and Spark cluster using CDH
✓ Set up Ansible for server automation to configure the prerequisites for a Hadoop and Spark cluster using CDH
✓ Set up an 8-node cluster from scratch using CDH
✓ Understand the architecture of HDFS, YARN, Spark, Hive, Hue and many more

Requirements

• Basic Linux Skills
• A 64-bit computer with a minimum of 4 GB RAM
• Operating system – Windows 10, Mac, or a Linux flavor

This course is for

• System administrators who want to understand the Big Data ecosystem and set up clusters
• Experienced Big Data administrators who want to learn how to manage Hadoop and Spark clusters set up using CDH
• Entry-level professionals who want to learn the basics and set up Big Data clusters

How much does the Practical Guide to setup Hadoop and Spark Cluster using CDH course cost? Is it worth it?

The course costs $14.99. There is currently a 40% discount on the original price of $24.99, so you save $10 if you enroll in the course now.
The average price of 10 Cloudera courses on Udemy is $15.30, so this course is 2% cheaper than the average Cloudera course.
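The percentages above can be reproduced with a short calculation. This is a minimal sketch using the prices quoted on this page; the function name is made up for the example.

```python
def percent_below(reference: float, value: float) -> float:
    """Return how far value lies below reference, as a percentage of reference."""
    return (reference - value) / reference * 100


# Prices quoted on this page, in US dollars.
original_price = 24.99
current_price = 14.99
average_price = 15.30

saving = round(original_price - current_price, 2)
discount = percent_below(original_price, current_price)
vs_average = percent_below(average_price, current_price)

print(f"Saving: ${saving:.2f}")
print(f"Discount on the original price: {discount:.0f}%")
print(f"Below the average Cloudera course price: {vs_average:.0f}%")
```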

Does the Practical Guide to setup Hadoop and Spark Cluster using CDH course have a money back guarantee or refund policy?

YES, Practical Guide to setup Hadoop and Spark Cluster using CDH has a 30-day money back guarantee. The 30-day refund policy is designed to allow students to study without risk.

Are there any SCHOLARSHIPS for this course?

Currently we could not find a scholarship for the Practical Guide to setup Hadoop and Spark Cluster using CDH course, but there is a $10 discount from the original price ($24.99). So the current price is just $14.99.

Who is the instructor? Is Durga Viswanatha Raju Gadiraju a SCAM or a TRUSTED instructor?

Durga Viswanatha Raju Gadiraju has created 15 courses and taught 235,985 students, receiving an average rating of 4.4 across 9,800 generally positive reviews. Based on the information available, Durga Viswanatha Raju Gadiraju is a TRUSTED instructor.
CEO at ITVersity and CTO at Analytiqs, Inc
20+ years of experience in executing complex projects using a vast array of technologies including Big Data and the Cloud.
ITVersity, Inc. is a US-based organization that provides quality training for IT professionals, and we have a track record of training hundreds of thousands of professionals globally.
Helping people build IT careers by providing the tools they need to upskill and cross-skill, such as high-quality material, labs, and live support, is paramount for our organization.
At this time our training offerings are focused on the following areas:
* Application Development using Python and SQL
* Big Data and Business Intelligence
* Cloud
* Data Warehousing, Databases
