In this second set of exercices you will have to develop two Machine Learning procedures. In both cases is mandatory to use Apache Spark 2.x and if you need any necessary library to manage data or to generate features you must use Apache MLlib (DataFrame version). Check the following aspects:
Problem 1:
Using the dataset, build a Machine Learning procedure to predict the price of houses having neighbourhood variables. The Boston House Price Dataset involves the prediction of a house price in thousands of dollars given details of the house and its neighborhood. You can download the data from here: https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.data. More info here: https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html.
Problem 2:
Using the dataset, build a Machine Learning procedure to classify if the return of a SONAR signal is a Rock or a Mine. You have all the data available at: https://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar%2C+Mines+vs.+Rocks%29.
Make sure that you use sonar.all-data dataset. Check the following aspects: