The goal is an inexpensive, on-demand scientific cluster. PiCloud.com achieved this a long time ago. These notes are simple exploratory efforts along those lines.
See notebook on "dream of own PiCloud."
Things to look into:
Specific links that look promising:
R, sparklyr: https://aws.amazon.com/blogs/big-data/running-sparklyr-rstudios-r-interface-to-spark-on-amazon-emr/
EMR guide: http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-what-is-emr.html