Group Galaxy Catalog for the DR5 Gallery

The purpose of this notebook is to build a group catalog (using a simple friends-of-friends algorithm) from a diameter-limited (D25>5 arcsec) parent sample of galaxies defined and documented as part of the Legacy Survey Large Galaxy Atlas.

Preliminaries

Import the libraries we need, define the I/O path, and specify the desired linking length (in arcminutes) and the minimum D(25) of the galaxy sample.


In [1]:
import os
import numpy as np
import matplotlib.pyplot as plt

In [2]:
import astropy.units as u
from astropy.table import Table, Column
from astropy.coordinates import SkyCoord

In [3]:
import fitsio
from pydl.pydlutils.spheregroup import spheregroup

In [4]:
%matplotlib inline

In [5]:
LSLGAdir = os.getenv('LSLGA_DIR')

In [6]:
mindiameter = 0.25 # [arcmin]
linking_length = 2.5 # [arcmin]

Read the parent HyperLeda catalog.

We immediately throw out objects with objtype='g' in Hyperleda, which are "probably extended" and many (most? all?) have incorrect D(25) diameters. We also toss out objects with D(25)>2.5 arcmin and B>16, which are also probably incorrect.


In [7]:
suffix = '0.05'

In [8]:
ledafile = os.path.join(LSLGAdir, 'sample', 'leda-logd25-{}.fits'.format(suffix))
leda = Table.read(ledafile)

keep = (np.char.strip(leda['OBJTYPE']) != 'g') * (leda['D25'] / 60 > mindiameter)
leda = leda[keep]

keep = ['SDSS' not in gg and '2MAS' not in gg for gg in leda['GALAXY']]
#keep = np.logical_and( (np.char.strip(leda['OBJTYPE']) != 'g'), ~((leda['D25'] / 60 > 2.5) * (leda['BMAG'] > 16)) )
leda = leda[keep]
leda


Out[8]:
<Table length=862419>
GALAXYPGCRADECTYPEOBJTYPEMULTIPLED25BAPABMAGIMAGVHELIO
str28str10float64float64str4str2str1float32float32float32float32float32float32
PGC622563PGC06225630.00045-37.47607G17.70730.758578128.017.7515.84-999.0
PGC1982072PGC19820720.000632.1366G19.41560.74131136.517.78-999.0-999.0
PGC535833PGC05358330.0006-44.57789G20.33060.588844148.517.6317.14-999.0
PGC520795PGC05207950.00075-45.95405G20.33060.67608324.717.5415.5-999.0
PGC1961515PGC19615150.0007531.7311G23.34270.616595160.317.38-999.0-999.0
PGC228194PGC02281940.0009-80.24194G19.86790.707946175.017.7815.61-999.0
PGC124374PGC01243740.00135-41.42293G22.29210.954993-999.016.8215.0518488.0
PGC398935PGC03989350.0015-56.58832G15.78160.61659515.518.3316.0455701.0
PGC2058887PGC20588870.001835.06725G20.33060.724436118.117.36-999.0-999.0
PGC2008846PGC20088460.002432.70309G19.86790.60256122.018.42-999.051786.0
.......................................
PGC283463PGC0283463359.9967-68.7852G16.52540.77624733.517.5715.5-999.0
PGC1349333PGC1349333359.997158.56508G26.8010.912011-999.016.8-999.025183.0
PGC518798PGC0518798359.9973-46.11733G15.78160.562341107.018.1616.48-999.0
PGC1644789PGC1644789359.997621.19222G20.80420.891251127.017.4-999.0-999.0
PGC085920PGC0085920359.9977524.90753ScG30.07120.39810722.017.08-999.011181.0
PGC875298PGC0875298359.9979-17.92631G22.81140.562341175.517.64-999.0-999.0
PGC475212PGC0475212359.99865-49.59475G19.86790.52480785.017.4415.12-999.0
PGC309024PGC0309024359.9991-65.91558G16.52540.446684147.018.4517.42-999.0
PGC1046833PGC1046833359.99925-5.28219G23.88640.4265847.517.13-999.0-999.0
PGC129172PGC0129172359.9994-49.91808ScG38.73930.446684113.016.1814.62-999.0

In [9]:
fig, ax = plt.subplots()
ax.scatter(leda['RA'], leda['DEC'], s=1, alpha=0.5)


Out[9]:
<matplotlib.collections.PathCollection at 0x10876a630>

In [10]:
fig, ax = plt.subplots()
ax.hexbin(leda['BMAG'], leda['D25'] / 60, extent=(5, 20, 0, 20),
          mincnt=1)
ax.set_xlabel('B mag')
ax.set_ylabel('D(25) (arcmin)')


Out[10]:
<matplotlib.text.Text at 0x113981b38>

In [11]:
if False:
    these = (leda['RA'] > 200) * (leda['RA'] < 210) * (leda['DEC'] > 5) * (leda['DEC'] < 10.0)
    leda = leda[these]
    print(np.sum(these))

Run FoF with spheregroup

Identify groups using a simple angular linking length. Then construct a catalog of group properties.


In [12]:
%time grp, mult, frst, nxt = spheregroup(leda['RA'], leda['DEC'], linking_length / 60.0)


CPU times: user 6min 43s, sys: 3.11 s, total: 6min 46s
Wall time: 6min 47s

In [13]:
npergrp, _ = np.histogram(grp, bins=len(grp), range=(0, len(grp)))
nbiggrp = np.sum(npergrp > 1).astype('int')
nsmallgrp = np.sum(npergrp == 1).astype('int')
ngrp = nbiggrp + nsmallgrp

In [14]:
print('Found {} total groups, including:'.format(ngrp))
print('  {} groups with 1 member'.format(nsmallgrp))
print('  {} groups with 2-5 members'.format(np.sum( (npergrp > 1)*(npergrp <= 5) ).astype('int')))
print('  {} groups with 5-10 members'.format(np.sum( (npergrp > 5)*(npergrp <= 10) ).astype('int')))
print('  {} groups with >10 members'.format(np.sum( (npergrp > 10) ).astype('int')))


Found 713903 total groups, including:
  611300 groups with 1 member
  100532 groups with 2-5 members
  1790 groups with 5-10 members
  281 groups with >10 members

Populate the output group catalog

Also add GROUPID to parent catalog to make it easier to cross-reference the two tables. D25MAX and D25MIN are the maximum and minimum D(25) diameters of the galaxies in the group.


In [15]:
groupcat = Table()
groupcat.add_column(Column(name='GROUPID', dtype='i4', length=ngrp, data=np.arange(ngrp))) # unique ID number
groupcat.add_column(Column(name='GALAXY', dtype='S1000', length=ngrp))
groupcat.add_column(Column(name='NMEMBERS', dtype='i4', length=ngrp))
groupcat.add_column(Column(name='RA', dtype='f8', length=ngrp))  # average RA
groupcat.add_column(Column(name='DEC', dtype='f8', length=ngrp)) # average Dec
groupcat.add_column(Column(name='DIAMETER', dtype='f4', length=ngrp))
groupcat.add_column(Column(name='D25MAX', dtype='f4', length=ngrp))
groupcat.add_column(Column(name='D25MIN', dtype='f4', length=ngrp))

In [16]:
leda_groupid = leda.copy()
leda_groupid.add_column(Column(name='GROUPID', dtype='i4', length=len(leda)))
leda_groupid


Out[16]:
<Table length=862419>
GALAXYPGCRADECTYPEOBJTYPEMULTIPLED25BAPABMAGIMAGVHELIOGROUPID
str28str10float64float64str4str2str1float32float32float32float32float32float32int32
PGC622563PGC06225630.00045-37.47607G17.70730.758578128.017.7515.84-999.00
PGC1982072PGC19820720.000632.1366G19.41560.74131136.517.78-999.0-999.00
PGC535833PGC05358330.0006-44.57789G20.33060.588844148.517.6317.14-999.00
PGC520795PGC05207950.00075-45.95405G20.33060.67608324.717.5415.5-999.00
PGC1961515PGC19615150.0007531.7311G23.34270.616595160.317.38-999.0-999.00
PGC228194PGC02281940.0009-80.24194G19.86790.707946175.017.7815.61-999.00
PGC124374PGC01243740.00135-41.42293G22.29210.954993-999.016.8215.0518488.00
PGC398935PGC03989350.0015-56.58832G15.78160.61659515.518.3316.0455701.00
PGC2058887PGC20588870.001835.06725G20.33060.724436118.117.36-999.0-999.00
PGC2008846PGC20088460.002432.70309G19.86790.60256122.018.42-999.051786.00
..........................................
PGC283463PGC0283463359.9967-68.7852G16.52540.77624733.517.5715.5-999.00
PGC1349333PGC1349333359.997158.56508G26.8010.912011-999.016.8-999.025183.00
PGC518798PGC0518798359.9973-46.11733G15.78160.562341107.018.1616.48-999.00
PGC1644789PGC1644789359.997621.19222G20.80420.891251127.017.4-999.0-999.00
PGC085920PGC0085920359.9977524.90753ScG30.07120.39810722.017.08-999.011181.00
PGC875298PGC0875298359.9979-17.92631G22.81140.562341175.517.64-999.0-999.00
PGC475212PGC0475212359.99865-49.59475G19.86790.52480785.017.4415.12-999.00
PGC309024PGC0309024359.9991-65.91558G16.52540.446684147.018.4517.42-999.00
PGC1046833PGC1046833359.99925-5.28219G23.88640.4265847.517.13-999.0-999.00
PGC129172PGC0129172359.9994-49.91808ScG38.73930.446684113.016.1814.62-999.00

Groups with one member--


In [17]:
smallindx = np.arange(nsmallgrp)

In [18]:
ledaindx = np.where(npergrp == 1)[0]
groupcat['RA'][smallindx] = leda['RA'][ledaindx]
groupcat['DEC'][smallindx] = leda['DEC'][ledaindx]
groupcat['NMEMBERS'][smallindx] = 1
groupcat['GALAXY'][smallindx] = np.char.strip(leda['GALAXY'][ledaindx])
groupcat['DIAMETER'][smallindx] = leda['D25'][ledaindx] # [arcsec]
groupcat['D25MAX'][smallindx] = leda['D25'][ledaindx]   # [arcsec]
groupcat['D25MIN'][smallindx] = leda['D25'][ledaindx]   # [arcsec]

leda_groupid['GROUPID'][ledaindx] = groupcat['GROUPID'][smallindx]

Groups with more than one member--


In [19]:
bigindx = np.arange(nbiggrp) + nsmallgrp

In [20]:
coord = SkyCoord(ra=leda['RA']*u.degree, dec=leda['DEC']*u.degree)

In [21]:
def biggroups():
    for grpindx, indx in zip(bigindx, np.where(npergrp > 1)[0]):

        ledaindx = np.where(grp == indx)[0]
        _ra, _dec = np.mean(leda['RA'][ledaindx]), np.mean(leda['DEC'][ledaindx])
        d25min, d25max = np.min(leda['D25'][ledaindx]), np.max(leda['D25'][ledaindx])
    
        groupcat['RA'][grpindx] = _ra
        groupcat['DEC'][grpindx] = _dec
        groupcat['D25MAX'][grpindx] = d25max
        groupcat['D25MIN'][grpindx] = d25min

        groupcat['NMEMBERS'][grpindx] = len(ledaindx)
        groupcat['GALAXY'][grpindx] = ','.join(np.char.strip(leda['GALAXY'][ledaindx]))
        leda_groupid['GROUPID'][ledaindx] = groupcat['GROUPID'][grpindx]

        # Get the distance of each object from the group center.
        cc = SkyCoord(ra=_ra*u.degree, dec=_dec*u.degree)
        diameter = 2 * coord[ledaindx].separation(cc).arcsec.max()

        groupcat['DIAMETER'][grpindx] = np.max( (diameter*1.02, d25max) )

In [22]:
%time biggroups()


/usr/local/anaconda3/envs/LSLGA/lib/python3.6/site-packages/ipykernel_launcher.py:14: StringTruncateWarning: truncated right side string(s) longer than 1000 character(s) during assignment
  
CPU times: user 13min 13s, sys: 3.67 s, total: 13min 16s
Wall time: 13min 23s

In [23]:
leda_groupid


Out[23]:
<Table length=862419>
GALAXYPGCRADECTYPEOBJTYPEMULTIPLED25BAPABMAGIMAGVHELIOGROUPID
str28str10float64float64str4str2str1float32float32float32float32float32float32int32
PGC622563PGC06225630.00045-37.47607G17.70730.758578128.017.7515.84-999.00
PGC1982072PGC19820720.000632.1366G19.41560.74131136.517.78-999.0-999.0611300
PGC535833PGC05358330.0006-44.57789G20.33060.588844148.517.6317.14-999.01
PGC520795PGC05207950.00075-45.95405G20.33060.67608324.717.5415.5-999.0611301
PGC1961515PGC19615150.0007531.7311G23.34270.616595160.317.38-999.0-999.0611302
PGC228194PGC02281940.0009-80.24194G19.86790.707946175.017.7815.61-999.02
PGC124374PGC01243740.00135-41.42293G22.29210.954993-999.016.8215.0518488.03
PGC398935PGC03989350.0015-56.58832G15.78160.61659515.518.3316.0455701.0611303
PGC2058887PGC20588870.001835.06725G20.33060.724436118.117.36-999.0-999.04
PGC2008846PGC20088460.002432.70309G19.86790.60256122.018.42-999.051786.0611304
..........................................
PGC283463PGC0283463359.9967-68.7852G16.52540.77624733.517.5715.5-999.00
PGC1349333PGC1349333359.997158.56508G26.8010.912011-999.016.8-999.025183.00
PGC518798PGC0518798359.9973-46.11733G15.78160.562341107.018.1616.48-999.00
PGC1644789PGC1644789359.997621.19222G20.80420.891251127.017.4-999.0-999.00
PGC085920PGC0085920359.9977524.90753ScG30.07120.39810722.017.08-999.011181.00
PGC875298PGC0875298359.9979-17.92631G22.81140.562341175.517.64-999.0-999.00
PGC475212PGC0475212359.99865-49.59475G19.86790.52480785.017.4415.12-999.0713883
PGC309024PGC0309024359.9991-65.91558G16.52540.446684147.018.4517.42-999.00
PGC1046833PGC1046833359.99925-5.28219G23.88640.4265847.517.13-999.0-999.00
PGC129172PGC0129172359.9994-49.91808ScG38.73930.446684113.016.1814.62-999.00

In [24]:
groupcat


Out[24]:
<Table length=713903>
GROUPIDGALAXYNMEMBERSRADECDIAMETERD25MAXD25MIN
int32bytes1000int32float64float64float32float32float32
0PGC62256310.00045-37.4760717.707317.707317.7073
1PGC53583310.0006-44.5778920.330620.330620.3306
2PGC22819410.0009-80.2419419.867919.867919.8679
3PGC12437410.00135-41.4229322.292122.292122.2921
4PGC205888710.001835.0672520.330620.330620.3306
5PGC198587210.002432.2091936.995736.995736.9957
6PGC128352410.002555.442536.153636.153636.1536
7PGC58265210.003-40.7705620.330620.330620.3306
8PGC99369710.00315-9.2222616.525416.525416.5254
9PGC39911710.0033-56.5634117.304217.304217.3042
........................
713893PGC1089622,PGC10890262359.961525-2.5349776.209935.330629.3867
713894PGC603742,PGC124867,PGC1998723359.9805-38.9185666667119.39227.425322.8114
713895PGC756023,PGC7560942359.9613-27.364824.442824.442817.3042
713896PGC788643,PGC7887262359.9694-24.5588752.481525.594817.7073
713897PGC1172313,PGC1172431,PGC1173523,PGC0000014359.98398750.69087187.03630.071216.1492
713898PGC1012883,PGC10133232359.9718-7.784125109.60818.119716.9103
713899PGC700157,PGC7005202359.977275-31.847255101.92423.342715.7816
713900PGC1709703,PGC17096192359.987424.53516551.343316.910316.9103
713901PGC092825,PGC5790272359.9886-41.09153120.96138.739316.9103
713902PGC829476,PGC8291232359.987325-21.359945112.86616.149215.4224

In [25]:
ww = np.where(groupcat['NMEMBERS'] >= 2)[0]
fig, ax = plt.subplots()
ax.scatter(groupcat['RA'][ww], groupcat['DEC'][ww], s=1, alpha=0.5)


Out[25]:
<matplotlib.collections.PathCollection at 0x11faf0358>

In [26]:
groupfile = os.path.join(LSLGAdir, 'sample', 'leda-logd25-{}-groupcat.fits'.format(suffix))
print('Writing {}'.format(groupfile))
groupcat.write(groupfile, overwrite=True)


Writing /Users/ioannis/research/projects/LSLGA/sample/leda-logd25-0.05-groupcat.fits

In [27]:
ledafile_groupid = os.path.join(LSLGAdir, 'sample', 'leda-logd25-{}-groupid.fits'.format(suffix))
print('Writing {}'.format(ledafile_groupid))
leda_groupid.write(ledafile_groupid, overwrite=True)


Writing /Users/ioannis/research/projects/LSLGA/sample/leda-logd25-0.05-groupid.fits