Analyzing Large Data Sets with Apache Spark

Prepare for your exam with this training course. It includes a full set of instructional videos covering the knowledge needed to pass the certification exam.

What’s included

  • Lectures: 42
  • Duration: 4h 54m 45s

$14.99/24.99


Lectures
1. Introduction- 2m 16s
2. How to Use This Course- 1m 41s
3. [Activity] Getting Set Up: Installing Python, a JDK, Spark, and its Dependencies- 14m 50s
4. [Activity] Installing the MovieLens Movie Rating Dataset- 3m 35s
5. [Activity] Run your first Spark program! Ratings histogram example.- 4m 52s

Lectures
1. Introduction to Spark- 10m 11s
2. The Resilient Distributed Dataset (RDD)- 12m 17s
3. Ratings Histogram Walkthrough- 13m 33s
4. Key/Value RDDs, and the Average Friends by Age Example- 16m 13s
5. [Activity] Running the Average Friends by Age Example- 5m 39s
6. Filtering RDDs, and the Minimum Temperature by Location Example- 8m 10s
7. [Activity] Running the Minimum Temperature Example, and Modifying it for Maximums- 5m 8s
8. [Activity] Running the Maximum Temperature by Location Example- 3m 21s
9. [Activity] Counting Word Occurrences using flatMap()- 7m 28s
10. [Activity] Improving the Word Count Script with Regular Expressions- 4m 44s
11. [Activity] Sorting the Word Count Results- 7m 44s

Lectures
1. [Activity] Find the Most Popular Movie- 5m 52s
2. [Activity] Use Broadcast Variables to Display Movie Names Instead of ID Numbers- 8m 23s
3. Find the Most Popular Superhero in a Social Graph- 4m 29s
4. [Activity] Run the Script - Discover Who the Most Popular Superhero is!- 6m
5. Superhero Degrees of Separation: Introducing Breadth-First Search- 7m 54s
6. Superhero Degrees of Separation: Accumulators, and Implementing BFS in Spark- 6m 44s
7. [Activity] Superhero Degrees of Separation: Review the Code and Run it- 9m 14s
8. Item-Based Collaborative Filtering in Spark, cache(), and persist()- 10m 12s
9. [Activity] Running the Similar Movies Script using Spark's Cluster Manager- 10m 54s
10. [Exercise] Improve the Quality of Similar Movies- 2m 58s

Lectures
1. Introducing Elastic MapReduce- 5m 8s
2. [Activity] Setting up your AWS / Elastic MapReduce Account and Setting Up PuTTY- 9m 55s
3. Partitioning- 4m 21s
4. Create Similar Movies from One Million Ratings - Part 1- 5m 12s
5. [Activity] Create Similar Movies from One Million Ratings - Part 2- 11m 27s
6. Create Similar Movies from One Million Ratings - Part 3- 3m 28s
7. Troubleshooting Spark on a Cluster- 3m 43s
8. More Troubleshooting, and Managing Dependencies- 5m 47s

Lectures
1. Introducing Spark SQL- 6m 8s
2. Executing SQL commands and SQL-style functions on a DataFrame- 8m 16s
3. Using DataFrames instead of RDDs- 5m 52s

Lectures
1. Introducing MLlib- 8m 10s
2. [Activity] Using MLlib to Produce Movie Recommendations- 2m 56s
3. Analyzing the ALS Recommendations Results- 4m 53s
4. Using DataFrames with MLlib- 7m 31s
5. Spark Streaming and GraphX- 7m 36s

© 2025 All Rights Reserved passqueen.com.