dataset
EMR-for-data-engineers
airscholar
Overview
Name
EMR-for-data-engineers
Source
airscholar
Episodes
0
Robot count
0
Format
other
Description
This project demonstrates how to process large datasets with Apache Spark on Amazon EMR (Elastic MapReduce). It includes a Spark script for ETL (Extract, Transform, Load) operations, AWS command-line instructions for setting up and managing the EMR cluster, and a dataset for testing and de
Robots used
null
Links
HuggingFace dataset
null