dataset

EMR-for-data-engineers

airscholar

Overview

Name
EMR-for-data-engineers
Source
airscholar
Episodes
0
Robot count
0
Format
other
Description
This project demonstrates how to use Amazon Elastic MapReduce (EMR) to process large datasets with Apache Spark. It includes a Spark script for ETL (Extract, Transform, Load) operations, AWS command-line instructions for setting up and managing the EMR cluster, and a dataset for testing and de
Robots used
null

Links

HuggingFace dataset
null