Machine Learning in Astronomy

  • A wide range of powerful "machine learning" algorithms have been developed by computer scientists and statisticians, in an attempt to automate the analysis of data, especially large datasets.
  • Machine learning has emerged from the world of engineering, where costly decisions have to be made: here, accuracy of prediction is at a premium, and academic understanding of the system is of secondary importance.
  • When will machine learning be most useful?

    • When you suspect the data can tell you more about the uncertainties than your physical assumptions can;

    • When the data are so large and/or complex that simple physical models are unlikely to yield meaningful results;

    • When you want to go one step further in your initial data exploration and visualization (we didn't get into unsupervised machine learning here, but exploration is a good way of thinking about this class of methods);

    • When you need to make reliable predictions (think: photo-z's) or decisions (think: quasar target selection) based on large and/or complex data. Much of industrial data science is occupied with this activity.

  • When might more thought be required, in an astrophysical context?

    • When you bring significant prior physical knowledge to the problem. Can you adapt a machine learning algorithm to accommodate your prior knowledge?

    • When the uncertainty on an inference (or prediction) is as important as the parameter value itself. Can you unpack the hidden assumptions in a machine learning algorithm, and then make sure the uncertainties correctly propagate the information you have?

  • Machine learning in astronomy is a growing field: there are some very interesting opportunities here!
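
To make the point about uncertainties on predictions concrete, here is a minimal sketch of one common way to attach an error bar to a machine learning prediction: refit the model on bootstrap resamples of the training set and look at the spread of the predictions. Everything in it is an assumption made for illustration: the data are simulated (a single toy "colour" feature and a noisy "redshift" target), the model is a simple polynomial fit, and the function name `bootstrap_predict` is hypothetical. It is not a real photo-z pipeline.

```python
import numpy as np


def bootstrap_predict(x_train, y_train, x_new, degree=3, n_boot=200, seed=0):
    """Hypothetical helper: polynomial fits to bootstrap resamples.

    Returns the mean and standard deviation of the predictions at x_new,
    taken over the n_boot refitted models.
    """
    rng = np.random.default_rng(seed)
    n = len(x_train)
    preds = np.empty((n_boot, len(x_new)))
    for b in range(n_boot):
        # Resample the training set with replacement.
        idx = rng.integers(0, n, size=n)
        # Refit the (assumed) polynomial model to this resample.
        coeffs = np.polyfit(x_train[idx], y_train[idx], degree)
        preds[b] = np.polyval(coeffs, x_new)
    return preds.mean(axis=0), preds.std(axis=0)


# Simulated toy data (an assumption, not real survey data).
rng = np.random.default_rng(42)
colour = rng.uniform(0.0, 2.0, size=500)
redshift = 0.5 * colour + 0.1 * np.sin(3.0 * colour) + rng.normal(0.0, 0.05, size=500)

# Predict for three new objects and report the bootstrap spread.
colour_new = np.array([0.5, 1.0, 1.5])
z_mean, z_std = bootstrap_predict(colour, redshift, colour_new)
for c, m, s in zip(colour_new, z_mean, z_std):
    print(f"colour = {c:.1f}: predicted z = {m:.3f} +/- {s:.3f}")
```

Note the hidden assumption in this sketch: the bootstrap spread only reflects how much the fitted model changes with the training sample. It does not include the intrinsic scatter of redshift at fixed colour, measurement errors on the new objects, or any mismatch between the training set and the objects being predicted, which is exactly the kind of thing that needs unpacking before the quoted uncertainty can be trusted.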