Winter Meeting Notes

  • People: Risa, Phil, David.
  • Date: 2017/04/17
  • Major Theme:
    • Make model/data/parameters more interesting/relevant for the community. How to best achieve that?
  • Assortment of Notes from beginning of meeting:

    • Photometric more relevant than spectroscopic.
    • Would be given: central_id, redshift, observed luminosity.
    • Cori, Knights Landing machine (at NERSC).
    • Could swap CLF to one that applies to both centrals and satellites.
  • Notes on future direction:

    • Centrals and total accuracy (main result for DESC note).
    • Explore satellite relation.
    • Run on Knights Landing.
  • More Discussion Notes:

    • 3d dark matter density, $P(L_c| M, z, d)$ where $d$ is the distance to the nearest cluster (one illustrative form is sketched after this list).
      • might need to go to 1 million objects.
    • Shapes of galaxies in different parts of halos ('galaxy environments') in the 3d mass map.
    • Can we tie to assembly bias?
    • (A lot of talk about the mass scaling relation and mass mapping.)
    • Could push the scatter S(M) down to low mass.
      • might not have enough constraining info in our model.
    • Risa mentioned relationship between:
      • halo mass
      • galaxy number density
      • galaxy luminosity
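
For concreteness, one illustrative way to write the extended conditional $P(L_c|M,z,d)$ above is a power-law mean relation with lognormal scatter; the functional form and the symbols $L_0$, $M_0$, and $f$ are assumptions for illustration, not anything fixed at the meeting: $$P(L_c \mid M, z, d) = \mathrm{LogNormal}\big(L_c;\, \mu(M,z,d),\, S(M)\big), \qquad \mu(M,z,d) = \ln L_0 + \alpha \ln \frac{M}{M_0} + f(z,d)$$ Here $f$ carries the redshift and environment dependence, and $S(M)$ is the mass-dependent scatter from the same discussion.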

Back of the Envelope Numbers

  • Number of halos in the field of view: 115919
  • Number of samples per halo: 100
  • Total number of integrations in one likelihood calculation: $2 \times 115919 = 231838$
  • Performance (C++ on a Mac laptop, Apple LLVM version 8.0.0 (clang-800.0.42.1), 2.4 GHz Intel Core i5, 8 GB 1600 MHz DDR3): $$\frac{375 \text{ seconds}}{50 \text{ likelihoods}} \approx 7 \text{ seconds / likelihood}$$ The computation is CPU bound $\implies$ it will scale close to linearly with the number of cores (might also see a big boost from a GPU, since it is primarily vectorized sequences of math operations).
  • Assuming we would require around 10,000 simple Monte Carlo hyper-parameter samples, and we can utilize 1,000 cores (e.g., roughly 63 16-core nodes) through MPI and OpenMP, then the computation should take approximately 1 minute (see the sketch after this list): $$\left(\frac{10,000\text{ samples}}{1,000 \text{ cores}}\right) \left(\frac{7 \text{ seconds}}{\text{sample}}\right) = 70 \text{ seconds/core} \approx 1 \text{ minute}$$
  • Doing MCMC would inhibit the capacity for parallelization, so it might be a lot slower. Doing MCMC on my laptop (utilizing both cores) would take about 10 hours: $$\frac{10,000 \text{ samples} \cdot 7 \text{ seconds}}{2 \text{ cores}} = 35,000 \text{ seconds} \approx 10 \text{ hours}$$
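
A minimal C++/OpenMP sketch of the scaling argument above: the hyper-samples are independent, so wall time is roughly (samples / cores) × (seconds per likelihood). The `likelihood` kernel and the hyper-sample values below are placeholders standing in for the real bigmali computation.

```cpp
// Embarrassingly parallel simple Monte Carlo over hyper-samples.
// Compile with: g++ -O2 -fopenmp scaling_sketch.cpp -o scaling_sketch
#include <cmath>
#include <cstdio>
#include <vector>
#include <omp.h>

// Placeholder for one ~7-second likelihood evaluation over all halos.
double likelihood(double alpha, double S, int n_halos) {
    double logL = 0.0;
    for (int k = 1; k <= n_halos; ++k)
        logL += std::log1p(std::fabs(std::sin(alpha * k + S)));  // dummy work
    return logL;
}

int main() {
    const int n_samples = 10000;  // simple Monte Carlo hyper-samples
    const int n_halos = 115919;   // halos in the field of view
    std::vector<double> logL(n_samples);

    // Each hyper-sample is independent, so the loop parallelizes cleanly.
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n_samples; ++i) {
        const double alpha = 1.0 + 1e-4 * i;  // placeholder hyper-sample
        const double S = 0.2;
        logL[i] = likelihood(alpha, S, n_halos);
    }
    std::printf("%d likelihoods on up to %d threads\n",
                n_samples, omp_get_max_threads());
    return 0;
}
```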

Miscellaneous

  • Is there anything else we should be doing in terms of LSST or DESC membership?
  • David will be out of town next Friday June 30th, but available the rest of the week.
  • Longer term investments in high performance computing, cosmology, gravitational lensing.
  • David's Monday-Thursday Desk:

Meeting Summary

  • Random items:
    • Yashar may have a C++/MPI emcee
    • Naming: hyper-parameter [samples/prior/posterior] => hyper-[samples/prior/posterior]
    • ZenHub for project management
    • Recommended MacKay's book and Schneider's Astrophysics and Cosmology book
  • Questions Phil Marshall would like to explore: the basic setup is an observer viewing a source through a sequence of central and satellite galaxies/halos.
    • Ingredients:
      • halos (shape, concentration)
      • galaxies (centrals, satellites)
      • membership uncertainty
      • filaments
    • Constraints:
      • position, galaxy photometry (brightness, color)
      • uncertain redshifts => uncertain, correlated luminosities
      • "observed clustering" => redMaPPer clusters, membership, cluster redshift, luminosity
        • redMaPPer is interesting because it gives us a set of new observables
      • weak lensing
  • Include weak lensing in a two-step inference (one way to write it down is sketched after this list)
    • first infer hyper-parameters and generate mass maps
    • then use the mass maps to further refine the hyper-parameters.
  • Potential application: denoising strong lenses
    • CFHTLS, publicly available weak lensing data
    • Rusu, Marshall paper 2017
    • Would need to do inference through the stellar-mass-halo-mass relation, but might be able to reuse the bigmali core
    • Might be worth applying Pangloss to this problem; reread Spencer's thesis with an eye toward mass-modelling accuracy.
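
One way to write down the two-step weak-lensing scheme above ("phot" and "WL" are shorthand for the photometric and weak-lensing data; the factorization is a sketch, not a settled design): $$\text{Step 1:}\quad P(\alpha, S \mid \text{phot}) \propto P(\text{phot} \mid \alpha, S)\, P(\alpha, S), \qquad P(M_k \mid \text{phot}) = \int P(M_k \mid \alpha, S, \text{phot})\, P(\alpha, S \mid \text{phot})\, d\alpha\, dS$$ $$\text{Step 2:}\quad P(\alpha, S \mid \text{phot}, \text{WL}) \propto P(\text{WL} \mid \text{mass maps}(\alpha, S, \text{phot}))\, P(\alpha, S \mid \text{phot})$$ Step 1 yields the hyper-posterior and the mass maps; step 2 treats that hyper-posterior as the new prior and refines it with the weak-lensing likelihood evaluated on the mass maps.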

Moving Forward

  • Stress the scaling
    • distribute bigmali computation
    • build hyper-posterior
    • confirm the hyper-posterior is consistent with the seed hyper-parameters when using masses sampled from the mass prior
    • get familiar with Sherlock/SLURM
  • Mass mapping
    • Use $P(\alpha, S|data)$ to make $P(M_k|data)$ and assess its accuracy; a numerical sketch of the last line appears after this list. Review the derivation: \begin{align*} P(M_k|data) &= \int P(M_k,M,\alpha,S|data)\,dM\,d\alpha\,dS\\ &\propto \int P(M_k|\alpha, S, data_k?)\,P(\alpha,S|data_k?)\,d\alpha\,dS\\ &\approx \frac{1}{N}\sum_{i=1}^{N} P(M_k|\alpha^{(i)}, S^{(i)}, data_k?), \quad (\alpha^{(i)}, S^{(i)}) \sim P(\alpha,S|data) \end{align*} Consider using the median or mean squared error to assess accuracy.
  • Read up on central/satellite luminosity relation in Reddick's thesis.
    • How can we incorporate into our model?
  • Explore information gain in weak lensing and mass-luminosity
    • KL divergence (a minimal sketch appears after this list)
    • Can we run with mass-luminosity alone, and then mass-luminosity plus weak lensing, to get a sense of the information gain from weak lensing alone?
  • Next meeting Thursday @ 3
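
A minimal numerical sketch of the last line of the mass-mapping derivation above: average the per-halo density $P(M_k|\alpha^{(i)}, S^{(i)}, data_k)$ over hyper-posterior samples on a grid of $M_k$. The lognormal per-halo likelihood, the Gaussian hyper-posterior, and every number below are toy stand-ins, not the bigmali model.

```cpp
// P(M_k | data) ~ (1/N) sum_i P(M_k | alpha_i, S_i, data_k) on a grid of M_k.
// Compile with: g++ -O2 mass_posterior_sketch.cpp -o mass_posterior_sketch
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

const double PI = 3.14159265358979323846;

// Toy stand-in for P(M_k | alpha, S, data_k): the observed luminosity is
// lognormal about alpha * ln(M) with scatter S (unnormalized as a function of M).
double per_halo_density(double M, double L_obs, double alpha, double S) {
    const double x = std::log(L_obs) - alpha * std::log(M);
    return std::exp(-0.5 * x * x / (S * S)) / (S * std::sqrt(2.0 * PI));
}

int main() {
    const double L_obs = 3.0e10;  // toy observed luminosity for halo k
    const int N = 1000;           // number of hyper-posterior samples

    // Toy hyper-posterior: alpha ~ N(1.0, 0.02), S ~ N(0.2, 0.01).
    std::mt19937 rng(42);
    std::normal_distribution<double> alpha_d(1.0, 0.02), S_d(0.2, 0.01);

    // Grid in ln(M); accumulate the Monte Carlo average over hyper-samples.
    std::vector<double> grid, post;
    for (double lnM = 22.0; lnM <= 26.0; lnM += 0.05)
        grid.push_back(std::exp(lnM));
    post.assign(grid.size(), 0.0);
    for (int i = 0; i < N; ++i) {
        const double alpha = alpha_d(rng);
        const double S = std::fabs(S_d(rng));
        for (std::size_t j = 0; j < grid.size(); ++j)
            post[j] += per_halo_density(grid[j], L_obs, alpha, S) / N;
    }

    // Report the mode of the (unnormalized) mass posterior.
    std::size_t best = 0;
    for (std::size_t j = 1; j < post.size(); ++j)
        if (post[j] > post[best]) best = j;
    std::printf("mode of P(M_k|data) near M = %.3e\n", grid[best]);
    return 0;
}
```

Accuracy could then be assessed by comparing this posterior's median (or its mean squared error) against true masses in a simulation, as the bullet above suggests.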
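
For the KL-divergence item, a sketch of the closed-form KL between two Gaussian approximations of a hyper-parameter's posterior, with and without weak lensing; the means and widths are made-up placeholders, purely to show the computation.

```cpp
// Information gain from weak lensing, proxied by KL(p_with || p_without)
// for 1-D Gaussian posterior approximations. Result is in nats.
// Compile with: g++ -O2 kl_sketch.cpp -o kl_sketch
#include <cmath>
#include <cstdio>

// Closed form: KL(N(mu1, s1^2) || N(mu0, s0^2)).
double kl_gauss(double mu1, double s1, double mu0, double s0) {
    return std::log(s0 / s1)
         + (s1 * s1 + (mu1 - mu0) * (mu1 - mu0)) / (2.0 * s0 * s0)
         - 0.5;
}

int main() {
    // Placeholder posteriors for alpha: without vs. with weak lensing.
    const double mu_without = 1.00, s_without = 0.05;
    const double mu_with = 1.01, s_with = 0.02;
    std::printf("information gain ~ %.3f nats\n",
                kl_gauss(mu_with, s_with, mu_without, s_without));
    return 0;
}
```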