This project demonstrates a structured approach to building an ETL (Extract, Transform, Load) job using Python, PySpark, and custom exceptions. The code is organised into separate modules for logging, ...
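As a rough illustration of how custom exceptions can structure an ETL job, here is a minimal sketch. It uses only the Python standard library and plain lists in place of PySpark DataFrames. Every name in it (`ETLError`, `ExtractError`, `TransformError`, `LoadError`, `run_job`, and the `name`/`amount` fields) is hypothetical and not taken from the project.

```python
import logging

logger = logging.getLogger("etl_sketch")  # hypothetical logger name


class ETLError(Exception):
    """Base class for every error raised by this sketch."""


class ExtractError(ETLError):
    """Raised when the source cannot be read."""


class TransformError(ETLError):
    """Raised when a record cannot be cleaned or converted."""


class LoadError(ETLError):
    """Raised when the sink rejects the data."""


def extract(rows):
    # Assumption: the "source" is an in-memory iterable of dicts.
    if rows is None:
        raise ExtractError("no source rows supplied")
    return list(rows)


def transform(rows):
    # Normalise the name and convert the amount to a float.
    try:
        return [
            {"name": r["name"].strip().title(), "amount": float(r["amount"])}
            for r in rows
        ]
    except (KeyError, ValueError, AttributeError) as exc:
        # Chain the low-level error so the original cause stays visible.
        raise TransformError(f"bad record: {exc!r}") from exc


def load(rows, sink):
    # Assumption: the "sink" is a plain list standing in for a table.
    if not isinstance(sink, list):
        raise LoadError("sink must be a list in this sketch")
    sink.extend(rows)
    return len(rows)


def run_job(rows, sink):
    """Run extract -> transform -> load; return True on success."""
    try:
        count = load(transform(extract(rows)), sink)
    except ETLError as exc:
        logger.error("ETL job failed: %s", exc)
        return False
    logger.info("ETL job loaded %d rows", count)
    return True
```

Catching the shared base class in one place lets the job log a failure without knowing which stage raised it, while each stage still reports a specific error type.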
Demonstrates simple Spark streaming with aggregation using PySpark. The Python Faker library is used to produce fake data files (CSV, JSON), which are written to a source location at intervals set ...
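The file-producing side of such a setup could be sketched as follows. To stay self-contained, this sketch swaps Faker for the standard library's `random` module and writes CSV only. The function names, the column names, the `NAMES` list, and the file naming scheme are all assumptions made for illustration.

```python
import csv
import random
import time
from pathlib import Path

# Assumption: a tiny fixed pool of names stands in for Faker's generators.
NAMES = ["Ada", "Grace", "Linus", "Margaret", "Dennis"]


def write_fake_batch(out_dir, batch_no, rows=5, rng=random):
    """Write one CSV file of fake records and return its path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"batch_{batch_no:04d}.csv"
    with path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["name", "amount"])  # hypothetical schema
        for _ in range(rows):
            writer.writerow([rng.choice(NAMES), round(rng.uniform(1, 100), 2)])
    return path


def produce(out_dir, batches=3, interval_s=1.0):
    """Drop `batches` files into `out_dir`, pausing between them."""
    paths = []
    for i in range(batches):
        paths.append(write_fake_batch(out_dir, i))
        if i < batches - 1:
            time.sleep(interval_s)  # the interval between file drops
    return paths
```

A streaming job would then watch the output directory as its file source and aggregate each new file as it arrives.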