Problem Scenario 10: You have been given a database named retail_db with the following details. It consists of 6 tables, and its data model is shown in the image.

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

1. Import the entire database in a file format that is well suited to analytical applications on Hadoop, i.e. one that groups your data in columns and can be queried using Impala.
Also, to save space, compress the data with the Snappy codec while importing.
2. In Impala, write a query that produces the 5 most popular product categories, and save the results to HadoopExam/best_categories.csv in HDFS.
3. In Impala, write a query that produces the top 10 revenue-generating products, and save the results to HadoopExam/best_products.csv in HDFS.
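
One possible way to approach the three steps is sketched below in Python. The script only assembles the command lines and queries and prints them; it does not run anything. Several things here are assumptions that the scenario does not state: the user name placeholder, the use of Parquet as the columnar format, the --hive-import option (so that the tables are registered in the metastore), the table and column names (order_items, products, categories and their fields, which should be checked against the data model image), and the reading of "most popular" as "appearing in the most order items".

```python
import posixpath
import shlex

JDBC_URL = "jdbc:mysql://quickstart:3306/retail_db"  # given in the scenario
DB_USER = "<username>"  # placeholder: credentials are not given in the scenario


def sqoop_import_command(jdbc_url=JDBC_URL, user=DB_USER):
    """Build the argv for importing all tables as Snappy-compressed Parquet."""
    return [
        "sqoop", "import-all-tables",
        "--connect", jdbc_url,
        "--username", user,
        "-P",                             # prompt for the password at run time
        "--as-parquetfile",               # columnar format that Impala can read
        "--compress",
        "--compression-codec", "snappy",
        "--hive-import",                  # assumption: register tables in the metastore
        "--num-mappers", "1",
    ]


# Assumption: popularity is measured by the number of order items per category.
# Table and column names are assumed, not taken from the scenario text.
TOP_CATEGORIES_SQL = """
SELECT c.category_name, COUNT(*) AS item_count
FROM order_items oi
JOIN products p ON oi.order_item_product_id = p.product_id
JOIN categories c ON p.product_category_id = c.category_id
GROUP BY c.category_name
ORDER BY item_count DESC
LIMIT 5
"""

# Assumption: revenue is the sum of order_item_subtotal per product.
TOP_PRODUCTS_SQL = """
SELECT p.product_id, p.product_name, SUM(oi.order_item_subtotal) AS revenue
FROM order_items oi
JOIN products p ON oi.order_item_product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY revenue DESC
LIMIT 10
"""


def impala_export_commands(sql, local_csv, hdfs_path):
    """Build argv lists that run a query, write a CSV locally, and copy it to HDFS."""
    query = " ".join(sql.split())  # collapse the query onto one line
    hdfs_dir = posixpath.dirname(hdfs_path)
    return [
        ["impala-shell", "-B", "--output_delimiter=,", "-q", query, "-o", local_csv],
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        ["hdfs", "dfs", "-put", "-f", local_csv, hdfs_path],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    print(shlex.join(sqoop_import_command()))
    exports = [
        (TOP_CATEGORIES_SQL, "best_categories.csv", "HadoopExam/best_categories.csv"),
        (TOP_PRODUCTS_SQL, "best_products.csv", "HadoopExam/best_products.csv"),
    ]
    for sql, local_csv, hdfs_path in exports:
        for argv in impala_export_commands(sql, local_csv, hdfs_path):
            print(shlex.join(argv))
```

After a Sqoop import with --hive-import, Impala usually needs an "invalidate metadata" statement before the new tables become visible. The impala-shell options -B and --output_delimiter produce plain delimited output, which is why the result is written to a local file first and then copied into HDFS.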

