Learning Path: Data Science With Apache Spark 2

This Learning Path begins with an introduction to Apache Spark. We first cover the basics of Spark, introduce SparkR, then look at the charting and plotting features of Python in conjunction with Spark data processing, and finally Spark's data processing libraries.
Created by Packt Publishing

CourseMarks Score®: 8.5
Freshness: 8.8
Feedback: N/A
Content: 7.7

Platform: Simpliv Learning
Price: $39.99
Video: 8h58m
Language: English
Next start: On Demand

Detailed Analysis

CourseMarks Score®

8.5 / 10

CourseMarks Score® helps students to find the best classes. We aggregate 18 factors, including freshness, student feedback and content diversity.

Freshness Score

8.8 / 10
This course was last updated on 05/2020.

Course content can become outdated quite quickly. After analysing 71,530 courses, we found that the highest rated courses are updated every year. If a course has not been updated for more than 2 years, you should carefully evaluate the course before enrolling.

Student Feedback

We analyzed factors such as the rating and the ratio between the number of reviews and the number of students, which is a great signal of student commitment. If a course does not yet have a rating, we exclude Feedback Score from the overall CourseMarks Score.

New courses are hard to evaluate because there are no or just a few student ratings, but Student Feedback Score helps you find great courses even with fewer reviews.

Content Score

7.7 / 10
Video Score: 7.6 / 10
The course includes 8h58m of video content. Courses with more video usually have a higher average rating. We have found that the sweet spot is 16 hours of video, which is long enough to teach a topic comprehensively but not overwhelming. Courses with more than 16 hours of video get the maximum score.
The average video length across the 166 Cloud Computing courses on Simpliv Learning is 1 hour 42 minutes.
Detail Score: 10.0 / 10

Top online courses contain a detailed description of the course, what you will learn, and a detailed description of the instructor.

Extra Content Score: 5.5 / 10

Tests, exercises, articles and other resources help students deepen their understanding of the topic.

This course contains:

0 articles.
0 resources.
0 exercises.
0 tests.

Description

Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework whose tools are equally useful to application developers and data scientists.

This Learning Path begins with an introduction to Apache Spark. We first cover the basics of Spark, introduce SparkR, then look at the charting and plotting features of Python in conjunction with Spark data processing, and finally Spark’s data processing libraries. We then develop a real-world Spark application. Next, we enable you to become comfortable and confident working with Spark for data science by exploring Spark’s data science libraries on a dataset of tweets.

Begin your journey into fast, large-scale, and distributed data processing using Spark with this Learning Path.

About the Authors

Rajanarayanan Thottuvaikkatumana

Rajanarayanan Thottuvaikkatumana, Raj, is a seasoned technologist with more than 23 years of software development experience at various multinational companies. He has lived and worked in India, Singapore, and the USA, and is presently based in the UK. His experience includes architecting, designing, and developing software applications. He has worked on various technologies including major databases, application development platforms, web technologies, and big data technologies. Since 2000, he has been working mainly in Java-related technologies, and does heavy-duty server-side programming in Java and Scala. He has worked on highly concurrent, highly distributed, high-transaction-volume systems. Currently he is building a next-generation Hadoop YARN-based data processing platform and an application suite built with Spark using Scala.

Raj holds one master’s degree in Mathematics, one master’s degree in Computer Information Systems and has many certifications in ITIL and cloud computing to his credit. Raj is the author of Cassandra Design Patterns – Second Edition, published by Packt.

When not working on the assignments his day job demands, Raj is an avid listener to classical music and watches a lot of tennis.

Eric Charles

Eric Charles has 10 years’ experience in the field of Data Science and is the founder of Datalayer (http://datalayer.io/docker), a social network for Data Scientists. He is passionate about using software and mathematics to help companies get insights from data.

His typical day includes building efficient processing with advanced machine learning algorithms, easy SQL, streaming and graph analytics. He also focuses a lot on visualization and result sharing.

He is passionate about open source and is an active Apache Member. He regularly gives talks to corporate clients and at open source events. He can be contacted on Twitter at @echarles.

Basic knowledge
Requires basic knowledge of either Python or R

You will learn

✓ Get to know the fundamentals of Spark 2.0 and the Spark programming model using Scala and Python
✓ Know how to use Spark SQL and DataFrames using Scala and Python
✓ Get an introduction to Spark programming using R
✓ Perform Spark data processing, charting, and plotting using Python
✓ Get acquainted with Spark stream processing using Scala and Python
✓ Be introduced to machine learning with Spark using Scala and Python
✓ Get started with graph processing with Spark using Scala
✓ Develop a complete Spark application
✓ Understand the Spark programming model and its ecosystem of packages for data science
✓ Obtain and clean data before processing it
✓ Understand Spark's machine learning algorithms well enough to build a simple pipeline
✓ Work with interactive visualization packages in Spark
✓ Apply data mining techniques on the available data sets
✓ Build a recommendation engine

This course is for

• Application developers, data scientists, and big data architects interested in harnessing the data processing power of Apache Spark will find this course very useful. Because the implementations are shown in Scala and Python, some programming knowledge of these languages is needed. This course is for anyone who wants to work with Spark on large and complex datasets. A basic knowledge of statistics and computational mathematics is expected.
• With the help of real-world use cases on the main features of Spark, this course offers an easy introduction to the framework. This practical, hands-on course covers the fundamentals of Spark needed to get to grips with data science through a single dataset. It is the next step on the learning curve for those who are comfortable with Spark programming and want to apply Spark to data science.

How much does the Learning Path: Data Science With Apache Spark 2 course cost? Is it worth it?

The course costs $39.99. There is currently an 80% discount on the original price of $199.99, so you save $160 if you enroll in the course now.
The average price of the 166 Cloud Computing courses on Simpliv Learning is $33.90, so this course is 18% more expensive than the average Cloud Computing course on the platform.
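
The percentages quoted above can be reproduced from the three listed prices. The snippet below is only a sketch of that arithmetic: the variable names are our own, and the rounding to whole percentages is an assumption about how the figures were presented.

```python
# Prices as quoted in the listing (USD).
current_price = 39.99
original_price = 199.99
average_price = 33.90  # average across the 166 Cloud Computing courses

# Amount saved and discount relative to the original price.
savings = original_price - current_price
discount_pct = savings / original_price * 100

# How much more this course costs than the platform average.
premium_pct = (current_price - average_price) / average_price * 100

print(f"Savings: ${savings:.2f}")                  # Savings: $160.00
print(f"Discount: {discount_pct:.0f}%")            # Discount: 80%
print(f"Premium over average: {premium_pct:.0f}%") # Premium over average: 18%
```

The 18% figure is measured against the average price, not the original price, which is why it differs from the discount percentage.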

Does the Learning Path: Data Science With Apache Spark 2 course have a money back guarantee or refund policy?

YES, Learning Path: Data Science With Apache Spark 2 has a 20-day money-back guarantee. The 20-day refund policy is designed to allow students to study without risk.

Are there any SCHOLARSHIPS for this course?

Currently we could not find a scholarship for the Learning Path: Data Science With Apache Spark 2 course, but there is a $160 discount from the original price ($199.99). So the current price is just $39.99.

Who is the instructor? Is Packt Publishing a SCAM or a TRUSTED instructor?

Packt Publishing has created 659 courses, and their reviews are generally positive. Packt Publishing has taught 27 students; its average review score and review count are not stated. Based on the information available, Packt Publishing is a TRUSTED instructor.
