In [1]:
import fileinput as fi
I have already run grep from the command line using
grep -i -n "common" *.f > marcs_common_blocks.txt
The option -i makes the search case insensitive, and -n prefixes each match with the line number on which the search phrase occurs. I've also redirected the output to a file called marcs_common_blocks.txt for easy manipulation. All MARCS files have the .f Fortran extension, so this search should return all instances.
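For readers without grep at hand, the same search can be approximated in Python. This is only a minimal sketch and not part of the original workflow: the helper name grep_common is hypothetical, and it mimics just the -i and -n behaviour described above, producing lines in the same filename:line:content layout.

```python
import glob
import re

# Case-insensitive pattern, mirroring grep's -i option.
COMMON_RE = re.compile(r'common', re.IGNORECASE)


def grep_common(lines, filename):
    """Return 'filename:lineno:content' strings for lines containing 'common'.

    Hypothetical helper; line numbers start at 1, as with grep -n.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if COMMON_RE.search(line):
            hits.append('{}:{}:{}'.format(filename, lineno, line.rstrip('\n')))
    return hits


if __name__ == '__main__':
    # Scan every .f file in the current directory, like the *.f glob above.
    for path in sorted(glob.glob('*.f')):
        with open(path) as source:
            for hit in grep_common(source, path):
                print(hit)
```

Writing the printed lines to marcs_common_blocks.txt would give a file of the same shape as the grep output.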
Now, let's look at the file structure.
In [2]:
!head -n 5 marcs_common_blocks.txt
The basic structure is filename.f:##: followed by the contents of the line. Since older (fixed-form) Fortran requires statements to start in the 7th column, there is ample whitespace between the file information and the line content. The only exception is when a line is commented out, because the comment marker sits in the first column.
We can read the data in and separate it using the colon, :, as a delimiter.
In [3]:
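# Split on the first two colons only, so any colon in the Fortran source stays in the third field.
marcs_common_blocks = [line.split(':', 2) for line in fi.input('marcs_common_blocks.txt')]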
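Limiting the split to the first two colons keeps the line content intact even when the Fortran source itself contains a colon, for example in explicit array bounds. The sketch below illustrates this on a single made-up line; the helper name split_grep_line and the sample text are assumptions for illustration, not taken from MARCS.

```python
def split_grep_line(line):
    """Split 'filename:lineno:content' into three fields.

    maxsplit=2 stops after the second colon, so colons in the content survive.
    """
    filename, lineno, content = line.split(':', 2)
    return [filename, lineno, content]


# Made-up grep output line whose content contains a colon (array bounds).
sample = 'example.f:12:      common /blk/ x(1:10)\n'
print(split_grep_line(sample))
```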
Check to make sure we've achieved what we set out to do.
In [4]:
marcs_common_blocks[0]
Out[4]:
Now we need to figure out whether we can easily access common block names. They are always surrounded by / /, but we need to be careful to avoid irregular spacing. It is therefore advantageous to trim all whitespace in the third column before populating the list. We also want to strip newline characters and convert everything to lower case. In addition, let us skip commented lines and focus only on active common blocks. Comments are indicated by c, !, or * in the first column.
In [5]:
common_block_names = [entry[2].rstrip('\n').lower().replace(' ', '')
                      for entry in marcs_common_blocks if entry[2][0].lower() not in ['c', '!', '*']]
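To make the normalization step easier to inspect, it can be wrapped in a small function and tried on individual lines. This is a sketch under assumptions: the name normalize_content is hypothetical, the sample lines are made up, and the extra guard against empty content is an addition that the cell above does not have.

```python
COMMENT_MARKERS = ('c', '!', '*')


def normalize_content(content):
    """Return the lower-cased, space-free content, or None for comments.

    A line counts as a comment when its first character is c, C, ! or *.
    Empty content is also skipped, which avoids an IndexError on content[0].
    Only spaces are removed; tab characters would be left in place.
    """
    if not content or content[0].lower() in COMMENT_MARKERS:
        return None
    return content.rstrip('\n').lower().replace(' ', '')


# Made-up examples: an active declaration and a commented-out one.
print(normalize_content('      COMMON /blk/ X, Y\n'))
print(normalize_content('c     common /old/ a, b\n'))
```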
With commented entries removed, every common block declaration can be identified by its initial c character. Filtering on it removes any unwanted entries that spuriously ended up in the list. Then we extract the common block names by taking whatever lies between the first and last /.
In [6]:
common_block_names = [entry[entry.find('/') + 1:entry.rfind('/')] for entry in common_block_names
                      if entry[0].lower() == 'c']
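The slicing between the first and last slash can be checked on a couple of made-up entries. In this sketch the function name between_slashes is hypothetical and, unlike the cell above, it returns None when fewer than two slashes are present. The second example shows that a line declaring two blocks yields a compound string rather than a clean name.

```python
def between_slashes(entry):
    """Return the text between the first and last '/' of a normalized entry."""
    start = entry.find('/')
    end = entry.rfind('/')
    if start == -1 or start == end:
        # Zero or one slash: there is no enclosed name to return.
        return None
    return entry[start + 1:end]


# Made-up normalized entries (lower case, spaces removed).
print(between_slashes('common/blk/x,y'))          # one block on the line
print(between_slashes('common/blk/x,y/other/z'))  # two blocks on one line
```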
Check whether we've isolated common block names.
In [7]:
common_block_names[0], common_block_names[50], common_block_names[-1]
Out[7]:
Remove duplicates from the list.
In [8]:
common_block_names = list(set(common_block_names))
Here's a full listing.
In [9]:
common_block_names.sort()
common_block_names
Out[9]:
There are clearly some issues related to programming styles. Most repeated occurrences are the result of the user "closing" the common block or including multiple common blocks on a single line. Let's remove those with some brute force tactics.
In [10]:
second_round_names = [entry[entry.rfind('/') + 1:] for entry in common_block_names if entry.rfind('/') != -1]
In [11]:
second_round_names
Out[11]:
Only one entry has three common block names, but luckily the third name is already indexed, so we can move on. Next, keep the entries from the original list that contain just a single common block name, that is, those with no remaining slash.
In [12]:
first_round_names = [entry for entry in common_block_names if entry.rfind('/') == -1]
Combine the two lists and remove duplicate entries.
In [13]:
common_block_names = list(set(first_round_names + second_round_names))
In [14]:
common_block_names.sort()
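The two-pass clean-up above can also be approximated with a regular expression that collects every slash-delimited name on a line in one go. This is a swapped-in technique, not the method used in this notebook. The sketch assumes that block names consist only of letters, digits, and underscores and that entries have already been lower-cased and stripped of spaces; the name block_names is hypothetical.

```python
import re

# A block name is taken to be a run of letters, digits or underscores
# enclosed in slashes; this is an assumption about the naming style.
NAME_RE = re.compile(r'/([a-z0-9_]+)/')


def block_names(entry):
    """Return every common block name found in a normalized entry."""
    return NAME_RE.findall(entry)


# Made-up normalized entries: two blocks on one line, and a "closed" block.
print(block_names('common/blk/x,y/other/z'))
print(block_names('common/blk/x,y/'))
```

Applying set() and sorted() to the combined results would then give the deduplicated, ordered list in one step.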
Now we are in a position to create a table of contents for our common blocks. For easier viewing, we will write it in both plain text and markdown.
First, a test to get the formatting right.
In [15]:
key = common_block_names[0]
print(key.upper())
for entry in marcs_common_blocks:
    if entry[2].lower().find(key) != -1:
        print("\t {:16s} on line: {:4s}".format(entry[0], entry[1]))
That seems to be quite reasonable. Now for all keys,
In [16]:
for key in common_block_names:
    print(key.upper())
    for entry in marcs_common_blocks:
        if entry[2].lower().find(key) != -1:
            print("\t {:16s} on line: {:4s}".format(entry[0], entry[1]))
That clearly works, so let's output that information to a plain text file.
In [17]:
# The with block closes the file automatically, even if a write fails.
with open('common_block_index.txt', 'w') as plaint:
    for key in common_block_names:
        plaint.write(key.upper() + '\n')
        for entry in marcs_common_blocks:
            if entry[2].lower().find(key) != -1:
                plaint.write("\t {:30s} on line: {:4s} \n".format(entry[0], entry[1]))
        plaint.write('\n')
And in markdown for easy reading online.
In [18]:
# Same index as above, with each block name written as a level-2 markdown heading.
with open('common_block_index.md', 'w') as markd:
    for key in common_block_names:
        markd.write('## ' + key.upper() + '\n')
        for entry in marcs_common_blocks:
            if entry[2].lower().find(key) != -1:
                markd.write("\t {:30s} on line: {:4s} \n".format(entry[0], entry[1]))
        markd.write('\n')
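The nested loops above scan every grep line once per key and match by substring, so a short name can also match inside a longer one. A possible alternative, sketched here under stated assumptions, builds the index in a single pass with a dictionary keyed by slash-delimited names. The function name build_index, the name pattern, and the sample rows are all hypothetical, and unlike the loops above this sketch skips commented lines and requires each row to have exactly three fields.

```python
import re
from collections import defaultdict

# Assumed naming style: letters, digits or underscores between slashes.
NAME_RE = re.compile(r'/([a-z0-9_]+)/')


def build_index(rows):
    """Map each block name to a list of (filename, lineno) pairs.

    rows is a list of [filename, lineno, content] lists, as produced by
    splitting the grep output on its first two colons.
    """
    index = defaultdict(list)
    for filename, lineno, content in rows:
        # Skip empty content and lines commented out in the first column.
        if not content or content[0].lower() in ('c', '!', '*'):
            continue
        squeezed = content.lower().replace(' ', '')
        # Only lines that actually start a common statement are indexed.
        if not squeezed.startswith('common'):
            continue
        for name in NAME_RE.findall(squeezed):
            index[name].append((filename, lineno))
    return index


# Made-up rows for illustration only.
rows = [
    ['a.f', '10', '      COMMON /blk/ x, y\n'],
    ['b.f', '22', '      common /blk/ x, y /other/ z\n'],
    ['b.f', '23', 'c     common /old/ q\n'],
]
for name, places in sorted(build_index(rows).items()):
    print(name.upper())
    for filename, lineno in places:
        print("\t {:30s} on line: {:4s}".format(filename, lineno))
```

The same dictionary could feed both the plain text and the markdown writers, so the grep data would only need to be scanned once.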