Questions (particularly for the fall):
For distributed computing, PySpark seems like a very good way to go, but a few questions remain:
I have found a Medical Image Registration Toolbox that works extremely well with some of the data I'll be working with. However, it is a MATLAB toolbox. Will it play nicely with the PySpark framework, or will it have to be rewritten? I was planning to work on this in the fall to handle the registration part of the pipeline.
MATLAB just introduced its own distributed/cloud computing options; should I try those instead?
What about doing the registration on a GPU instead of a cluster?