To help the business plan new product offerings, your marketing team wants to use product reviews to gain greater insight into which “Home and Grocery” products are popular with customers on a state-by-state basis. The business users want to generate these reports more frequently, with a performance SLA of completing each report within seconds. They also want to write the analyzed data back to the data lake on Amazon S3 so that various analytical applications can use it.
To meet these business needs, the data engineering team has come up with the following data model.

For this lab, we will use the following datasets.
The customer and customer_address tables have already been created. As part of this lab, we will re-create these tables and load data from the 3 TB TPC-DS dataset.