What is the most important project that we need to download first?

I simply counted the occurrences of each word that looks like a project name in the report file.


In [1]:
from collections import Counter

In [2]:
filename = "report_8_nonALMACAL_priority.txt"

In [3]:
with open(filename, 'r') as ifile:
    wordcount = Counter(ifile.read().split())

In [4]:
list_of_project = []

for item in wordcount:
    # A project name is 14 characters long and ends with 'S', e.g. 2012.1.00453.S
    if len(item) == 14 and item[-1] == 'S':
        list_of_project.append([item, wordcount[item]])
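The length-and-last-letter test above accepts any 14-character word ending in 'S'. A stricter filter could be sketched with a regular expression. The pattern below is an assumption inferred from the names printed in this notebook (four-digit year, one character, five digits, final 'S'), and `count_project_codes` is a hypothetical helper, not part of the original analysis.

```python
import re
from collections import Counter

# Assumed pattern, inferred from names such as 2012.1.00453.S and 2016.A.00011.S:
# four-digit year, one digit or capital letter, five digits, final 'S'.
PROJECT_CODE = re.compile(r"\d{4}\.[0-9A-Z]\.\d{5}\.S")


def count_project_codes(text):
    """Count whitespace-separated tokens that fully match PROJECT_CODE."""
    return Counter(tok for tok in text.split() if PROJECT_CODE.fullmatch(tok))


# Small made-up sample; the last token is 14 characters long and ends in 'S',
# so the simple length test would accept it, but the pattern rejects it.
sample = "2012.1.00453.S J0001 2012.1.00453.S 2016.A.00011.S abcdefghijkl.S"
sample_counts = count_project_codes(sample)
print(sample_counts)
```

On the real report, comparing the number of matches with the count from the simple test would show whether any non-project words slipped through.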

In [5]:
sorted_project = sorted(list_of_project, key=lambda data: data[1])

In [6]:
print("Number of projects: ", len(sorted_project))


Number of projects:  633

In [7]:
sorted_from_large = list(reversed(sorted_project))

Because of the way the report is structured, the occurrence count cannot be used directly as a measure of priority.

For example, a large count may only mean that a project used short integrations and was observed many times. It may also cover just one object in one band (as with the first project in the list below).

I think the Cycle year is more important, because the number of antennas depends on the Cycle.
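One way to combine both criteria is to rank by year first and use the occurrence count only to break ties within a year. This is a minimal sketch, assuming the first four characters of the project name give the year. The `year_then_count` helper is hypothetical, and the four entries are copied from the counts printed in this notebook.

```python
# A few [name, count] pairs in the same layout as list_of_project.
projects = [
    ["2012.1.00453.S", 33],
    ["2015.1.01289.S", 32],
    ["2016.1.00567.S", 10],
    ["2015.1.00412.S", 13],
]


def year_then_count(entry):
    """Sort key: (year taken from the name, occurrence count)."""
    name, count = entry
    return (int(name[:4]), count)


# Newest year first; within a year, largest count first.
ranked = sorted(projects, key=year_then_count, reverse=True)
for entry in ranked:
    print(entry)
```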


In [15]:
# first 15 entries (largest count first)
for i in sorted_from_large[0:15]:
    print(i)


['2012.1.00453.S', 33]
['2015.1.01289.S', 32]
['2012.1.00139.S', 20]
['2012.1.00377.S', 13]
['2015.1.00412.S', 13]
['2012.1.00729.S', 13]
['2013.1.00020.S', 13]
['2012.1.00317.S', 13]
['2013.1.00700.S', 13]
['2015.1.01352.S', 12]
['2015.1.00144.S', 12]
['2015.1.00027.S', 11]
['2015.1.00932.S', 11]
['2015.1.01454.S', 11]
['2016.1.00567.S', 10]
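A ranking like the one above can also be read straight from a `Counter` with its `most_common` method, without building and reversing a separate list. The sketch below uses three of the counts printed above. Entries with equal counts may come out in a different order than with `reversed(sorted(...))`.

```python
from collections import Counter

# Three counts copied from the output above.
project_counts = Counter({
    "2012.1.00453.S": 33,
    "2015.1.01289.S": 32,
    "2012.1.00139.S": 20,
})

# most_common(n) returns the n entries with the largest counts, largest first.
top_two = project_counts.most_common(2)
print(top_two)
```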

Sorted by year (newest first)


In [9]:
sorted_project_year = sorted(list_of_project, key=lambda data: data[0])

In [10]:
sorted_from_new = list(reversed(sorted_project_year))

In [16]:
# first 15 entries (newest first)
for i in sorted_from_new[0:15]:
    print(i)


['2016.A.00011.S', 2]
['2016.A.00010.S', 2]
['2016.1.01609.S', 2]
['2016.1.01604.S', 3]
['2016.1.01567.S', 4]
['2016.1.01559.S', 2]
['2016.1.01552.S', 3]
['2016.1.01546.S', 2]
['2016.1.01541.S', 2]
['2016.1.01520.S', 6]
['2016.1.01515.S', 4]
['2016.1.01512.S', 6]
['2016.1.01495.S', 2]
['2016.1.01493.S', 2]
['2016.1.01481.S', 2]

Some project names contain an 'A', e.g. 2016.A.00011.S and 2016.A.00010.S. What does it mean?
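To see how common such names are, the second dot-separated field of each name could be tallied. This sketch does not explain what 'A' means; it only counts the values that appear. The four names are copied from the outputs above.

```python
from collections import Counter

# Four names copied from the outputs above.
names = [
    "2016.A.00011.S",
    "2016.A.00010.S",
    "2016.1.01609.S",
    "2015.1.01289.S",
]

# Tally the field between the first and second dot ('1', 'A', ...).
second_field = Counter(name.split(".")[1] for name in names)
print(second_field)
```

On the full list, the same tally over `[p[0] for p in list_of_project]` would show every value this field takes.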

