Statistical overdensities along Bubble Rims

Here we visualize the spatial correlation between Bubbles and YSOs, following the analysis of Kendrew et al. 2012. The data files were created by ../scripts/trigger.py.

The punchline is that high-P bubbles show the strongest excess YSO density around $\theta = 1$ bubble radius. This is a stronger signal than the size-dependent behavior Kendrew et al. noted in their Figure 15, and it is driven by a very different subset of bubbles.
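
For orientation, $w(\theta)$ measures the fractional excess of YSOs at a separation $\theta$ (in units of bubble radius) relative to an unclustered distribution. The sketch below is only an illustration of that idea on toy data. It assumes a flat-sky geometry and the simple estimator DD/DR - 1, which may well differ from what ../scripts/trigger.py or Kendrew et al. 2012 actually use. The function names `normalized_separations` and `simple_w` are hypothetical.

```python
import numpy as np

def normalized_separations(bx, by, br, sx, sy):
    """Bubble-to-source separations in units of each bubble's radius.

    Flat-sky (Euclidean) approximation; returns one value per pair.
    """
    dx = sx[None, :] - bx[:, None]
    dy = sy[None, :] - by[:, None]
    return (np.hypot(dx, dy) / br[:, None]).ravel()

def simple_w(bx, by, br, sx, sy, rx, ry, bins):
    """Assumed estimator: w = (DD / DR) * (N_random / N_source) - 1 per bin."""
    dd, _ = np.histogram(normalized_separations(bx, by, br, sx, sy), bins=bins)
    dr, _ = np.histogram(normalized_separations(bx, by, br, rx, ry), bins=bins)
    with np.errstate(divide='ignore', invalid='ignore'):
        return dd / dr * (len(rx) / len(sx)) - 1.0

# Toy data: one bubble of radius 1 at the origin, sources on a ring at 1.2 radii,
# and a uniform random comparison catalog.
rng = np.random.default_rng(0)
bx, by, br = np.array([0.0]), np.array([0.0]), np.array([1.0])
phi = rng.uniform(0, 2 * np.pi, 100)
sx, sy = 1.2 * np.cos(phi), 1.2 * np.sin(phi)
rx, ry = rng.uniform(-3, 3, 10000), rng.uniform(-3, 3, 10000)
bins = np.linspace(0, 3, 7)
w = simple_w(bx, by, br, sx, sy, rx, ry, bins)
```

In this toy setup the only positive value of `w` is in the bin containing 1.2 radii; every other bin holds no sources and comes out at -1.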


In [4]:
%matplotlib inline

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['grid.linewidth'] = 1
matplotlib.rcParams['grid.color'] = '#666666'

figures = {}

def corr_plot(b, **kwargs):
    """Plot w(theta) as points, with a shaded band spanning w - dw to w + dw."""
    p, = plt.plot(b.theta, b.w, 'o', **kwargs)
    y1 = (b.w + b.dw).values
    y2 = (b.w - b.dw).values
    plt.fill_between(b.theta.values, y1, y2, color=p.get_color(), alpha=.4)

    
def remove_border(axes=None, top=False, right=False, left=True, bottom=True):
    """
    Minimize chartjunk by stripping out unnecessary plot borders and axis ticks
    
    The top/right/left/bottom keywords toggle whether the corresponding plot border is drawn
    """
    ax = axes or plt.gca()
    ax.spines['top'].set_visible(top)
    ax.spines['right'].set_visible(right)
    ax.spines['left'].set_visible(left)
    ax.spines['bottom'].set_visible(bottom)
    
    #turn off all ticks
    ax.yaxis.set_ticks_position('none')
    ax.xaxis.set_ticks_position('none')
    
    #now re-enable visibles
    if top:
        ax.xaxis.tick_top()
    if bottom:
        ax.xaxis.tick_bottom()
    if left:
        ax.yaxis.tick_left()
    if right:
        ax.yaxis.tick_right()

In [5]:
b0 = pd.read_csv('../data/cluster_all.csv')
b1 = pd.read_csv('../data/cluster_plow.csv')
b2 = pd.read_csv('../data/cluster_pmid.csv')
b3 = pd.read_csv('../data/cluster_phi.csv')

bin0 = pd.read_csv('../data/cluster_all_bin.csv')
bin1 = pd.read_csv('../data/cluster_plow_bin.csv')
bin2 = pd.read_csv('../data/cluster_pmid_bin.csv')
bin3 = pd.read_csv('../data/cluster_phi_bin.csv')

s0 = pd.read_csv('../data/cluster_all_s.csv')
s1 = pd.read_csv('../data/cluster_plow_s.csv')
s2 = pd.read_csv('../data/cluster_pmid_s.csv')
s3 = pd.read_csv('../data/cluster_phi_s.csv')

h0 = pd.read_csv('../data/cluster_all_nohii.csv')
h1 = pd.read_csv('../data/cluster_plow_nohii.csv')
h2 = pd.read_csv('../data/cluster_pmid_nohii.csv')
h3 = pd.read_csv('../data/cluster_phi_nohii.csv')

small = pd.read_csv('../data/cluster_small.csv')
med = pd.read_csv('../data/cluster_medium.csv')
large = pd.read_csv('../data/cluster_large.csv')

pyso = pd.read_csv('../data/cluster_ysopos.csv')

Bubble / YSO correlation, as a function of Bubble Probability


In [16]:
def corr_plot(b, **kwargs):
    p, = plt.plot(b.theta, b.w, 'o', **kwargs)
     
    y1 = (b.w + b.dw).values
    y2 = (b.w - b.dw).values
    plt.fill_between(b.theta.values, y1, y2, color=p.get_color(), alpha=.4)
    
fig, axes = plt.subplots(nrows=2, ncols=1, tight_layout=True, sharex=True, sharey=True,
                         figsize=(4, 6))

figures['trigger'] = fig

plt.sca(axes[0])
corr_plot(b0, label='All bubbles', color='k')
plt.legend(numpoints=1, frameon=False)
plt.xlim(0, 3)  # the x-axis is shared, so use the same range as the lower panel
plt.ylim(-2, 14)
plt.yticks([-3, 0, 3, 6, 9, 12])
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.annotate('a', xy=(.2, 12), fontsize=24)
remove_border()

loc, medc, hic = ['#737373', '#FE9929', '#1D91C0']

plt.sca(axes[1])
corr_plot(b3, label='P > 0.9', color=hic, zorder=10)
corr_plot(b2, label='0.5 < P < 0.9', color=medc, zorder=9)
corr_plot(b1, label='P < 0.5', color=loc, zorder=8)
corr_plot(pyso, label='Control', color='r')

plt.xlabel(r"$\theta (R_{\rm eff})$", fontsize=18)
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.yticks([-3, 0, 3, 6, 9, 12])

plt.legend(numpoints=1, frameon=False)
plt.xlim(0, 3)
plt.ylim(-2, 14)
plt.annotate('b', xy=(.2, 12), fontsize=24)
remove_border()


High-probability objects have the strongest YSO excesses at 0.5-1 radii. This is reminiscent of Figure 12 in Kendrew et al. 2012, which shows the formation of a "bump" around R=1 as YSOs are artificially positioned on bubble rims.

This again suggests that the P score is selecting out dynamically interesting objects, which are more correlated with YSOs than low-P objects.

Here's the entire catalog:


In [7]:
f = plt.figure()
corr_plot(b0, label='all', color='k')
plt.legend(numpoints=1, frameon=False)
plt.xlim(0, 3)
plt.ylim(-2, 14)
remove_border()


Shuffling position, size

Here's the version where we shuffle the lat, lon, and reff columns independently. This preserves the distributions of all three but destroys the correlation with YSOs. I think this establishes that the difference in the first figure above can't be due (only) to the different position/size distributions of the three categories.
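
As a rough illustration of the shuffling step, the sketch below permutes each column of a toy catalog independently. The helper name `shuffle_columns`, the seed, and the column names `lon`, `lat`, and `reff` are assumptions made for this example; the shuffled files loaded in this notebook may have been produced differently.

```python
import numpy as np
import pandas as pd

def shuffle_columns(df, columns, seed=0):
    """Return a copy of df with each listed column permuted independently.

    Each column keeps its own distribution, but the row-wise pairing
    between columns (and with anything on the sky) is destroyed.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in columns:
        out[col] = rng.permutation(out[col].to_numpy())
    return out

# Toy catalog with hypothetical column names
toy = pd.DataFrame({'lon': [10., 20., 30., 40.],
                    'lat': [-1., 0., 1., 2.],
                    'reff': [0.1, 0.2, 0.3, 0.4]})
shuffled = shuffle_columns(toy, ['lon', 'lat', 'reff'])
```

Because every column is permuted with a separate draw, the marginal distributions of position and size are unchanged while any real bubble/YSO association is lost.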


In [24]:
corr_plot(s1, label='P < 0.5')
corr_plot(s2, label='0.5 < P < 0.9')
corr_plot(s3, label='P > 0.9')
plt.legend(numpoints=1)
plt.xlabel(r"$\theta (R_{\rm eff})$", fontsize=18)
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.xlim(0, 3)
plt.ylim(-2, 14)


Out[24]:
(-2, 14)

Splitting by size

Compare this to Figure 15 of Kendrew et al. 2012. We mostly reproduce the same result.


In [25]:
corr_plot(b0, color='k', label='All MWP')
corr_plot(small, label='Largest 50%')
corr_plot(med, label='Largest 25%')
corr_plot(large, label='Largest 10%')
plt.legend(numpoints=1)
plt.xlabel(r"$\theta (R_{\rm eff})$", fontsize=18)
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.xlim(0, 3)
plt.ylim(-2, 6)


Out[25]:
(-2, 6)

This trend is similar to, but weaker than, the dependence on Bubble probability. Let's see to what extent the size categories and probability categories partition bubbles into similar groups.


In [26]:
bub = pd.read_csv('../data/pdr1.csv').dropna()
reff = np.sqrt(bub['a'] * bub['b'])

size_bins = np.percentile(reff, [0, 50, 75, 90, 100])
size_bins[0] = 0
p_bins = [0, .5, .9, 1]


size_cat = pd.cut(reff, size_bins)
p_cat = pd.cut(bub.prob, p_bins)

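# .cat.codes gives the integer bin index (Categorical.labels is not available in current pandas)
ct = pd.crosstab(size_cat.cat.codes, p_cat.cat.codes, rownames=['Size'], colnames=['Prob'])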
ct


Out[26]:
Prob    0    1    2
Size
0     521  702  608
1     327  261  327
2     225  146  178
3     151  111  105

In [27]:
for i, row in ct.iterrows():
    plt.plot(1. * row / row.sum(), '-o', label=i)
plt.legend(title='Size cut', frameon=False)
plt.xlabel("Probability Cut")
plt.ylabel("Fraction")


Out[27]:
<matplotlib.text.Text at 0x106b775d0>

There is little correlation here. If anything, the magenta line suggests that a large bubble (size cut=3) is more likely to be low-probability than high-probability.
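
To put numbers on this, the fractions can be recomputed from the counts printed in Out[26]. This is a minimal sketch that rebuilds the table by hand rather than rerunning the crosstab; the variable names `counts` and `frac` are chosen here for illustration.

```python
import pandas as pd

# Counts copied from Out[26]: rows are size cuts, columns are probability cuts
counts = pd.DataFrame([[521, 702, 608],
                       [327, 261, 327],
                       [225, 146, 178],
                       [151, 111, 105]],
                      index=pd.Index([0, 1, 2, 3], name='Size'),
                      columns=pd.Index([0, 1, 2], name='Prob'))

# Divide each row by its total to get the fraction in each probability cut
frac = counts.div(counts.sum(axis=1), axis=0)
print(frac.round(3))
```

For the largest size cut, roughly 41% of bubbles fall in the lowest probability cut and roughly 29% in the highest.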

Excluding RMS sources that may be HII regions

This mostly weakens the signal, since we are throwing away most of the RMS sources.


In [28]:
corr_plot(h1, label='P < 0.5 (No HII)')
corr_plot(h2, label='0.5 < P < 0.9 (No HII)')
corr_plot(h3, label='P > 0.9 (No HII)')
plt.legend(numpoints=1)
plt.xlabel(r"$\theta (R_{\rm eff})$", fontsize=18)
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.xlim(0, 3)
plt.ylim(-2, 14)


Out[28]:
(-2, 14)

Dependence on bin size

This plot is noisier, but the order of the lines is preserved.


In [29]:
corr_plot(bin1, label='P < 0.5 (Smaller bin)')
corr_plot(bin2, label='0.5 < P < 0.9 (Smaller bin)')
corr_plot(bin3, label='P > 0.9 (Smaller bin)')
plt.legend(numpoints=1)
plt.xlabel(r"$\theta (R_{\rm eff})$", fontsize=18)
plt.ylabel(r"$w(\theta)$", fontsize=18)
plt.xlim(0, 3)
plt.ylim(-2, 14)


Out[29]:
(-2, 14)