This notebook shows how to use disciplines for the analysis of a particular theory. Here we will work with the theory of emergence.
First, let's import the theory.
In [1]:
from disciplines.theory import emergence
We can get basic information about the theory from its docstring.
In [2]:
print(emergence.__doc__)
The docstring above is written manually; in future editions it will be generated from the data associated with the module.
Now let's see which authors are associated with the theory.
In [3]:
emergence.authors
Out[3]:
Both author and authors work in the same way.
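As an illustration, such an alias could be defined inside the module like this; the value is a placeholder for illustration, not the module's actual contents:
In [ ]:
# hypothetical module internals: both names bound to the same list
authors = ['John Ziman']   # placeholder value for illustration
author = authors           # alias, so both attributes behave identically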
You can get the list of concepts used within the theory like this. For the moment, the list of concepts was extracted manually by reading the theory itself. In future versions of disciplines we will try to automate the extraction of the main concepts from the text, most likely with NLP algorithms; I have seen some candidates, but cannot name one yet. So, what concepts are used in the theory of emergence? (For now we use only the concepts used by Ziman.)
In [4]:
emergence.concepts
Out[4]:
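The automatic extraction mentioned above is not implemented yet, but a minimal frequency-based sketch shows the direction; the stopword list is an ad-hoc assumption, and a real solution would use a proper NLP library:
In [ ]:
import re
from collections import Counter

def extract_candidate_concepts(text, n=5):
    # naive term extraction: tokenize, drop common stopwords,
    # and return the most frequent remaining words
    stopwords = {'the', 'a', 'an', 'of', 'and', 'in', 'to', 'is', 'are', 'that'}
    words = re.findall(r'[a-z]+', text.lower())
    counts = Counter(w for w in words if w not in stopwords and len(w) > 2)
    return [word for word, _ in counts.most_common(n)]

extract_candidate_concepts(emergence.theory)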
The concepts here should also be defined. Two types of definitions should be available:
a) definitions given by the author, and
b) definitions based on general consensus use.
In both cases, term extraction is the technique to look into.
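A sketch of how the two definition types might be stored side by side; the structure and texts are placeholders, not actual disciplines data:
In [ ]:
# hypothetical layout: one entry per concept, with both definition types
definitions = {
    'discipline': {
        'author': 'definition as given by the author in the source text',
        'consensus': 'definition reflecting general consensus use',
    },
}
definitions['discipline']['author']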
In the theory of emergence of disciplines, as in any other theory, we have some idea about what kind of data we need. The theory's data requirements were extracted from the text itself. The data types can be accessed like this:
In [6]:
emergence.data
Out[6]:
Based on this list, scripts can understand what kind of data will have to be prepared from the database and other locations.
In this case we might need a network of citations and lists of journals, newsletters, and conferences.
Still, this list is problematic because it does not fully specify which aspects of the data we will need. For example, it does not state that we will need a list of conferences, but this can be deduced from the fact that we will need a list of conference participants.
Also, we do not know why we need the list of journals.
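To illustrate how a script might act on this list, here is a sketch of a dispatch table from each declared data type to a loader function; the loader names, and the assumption that emergence.data is a list of strings, are ours, not part of disciplines:
In [ ]:
# hypothetical dispatch table from data type to a preparation routine
def load_citation_network():
    return 'citation network placeholder'

def load_journal_list():
    return 'journal list placeholder'

loaders = {
    'citation network': load_citation_network,
    'journals': load_journal_list,
}

# prepare only the data types the theory actually declares
prepared = {name: load() for name, load in loaders.items()
            if name in emergence.data}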
The description of a particular theory can be retrieved.
In [5]:
print(emergence.theory)
Currently it is only the text that was used to formulate the claims. That is going to change.
In the future, a theory will be associated with many related texts and particular extracts from them. For now we have only this extract, as it is dense enough to continue with.
But before we continue, let's look at the description of our approach.
In [6]:
print(emergence.approach)
This part is written entirely by humans. It is the part that glues together all the hard-to-relate aspects of developing this theory. At this point I can hardly imagine it being produced automatically.
However, we plan to reach a stage where the text written here can be interpreted with NLP and the necessary actions proposed. Still, we have a long way to go until then. Or maybe not that long.
Every theory is made of small details. Every sentence or paragraph can be implemented as a small piece of code, and all of them should be stored as lists, dictionaries, functions, and objects. For now, let's look at the functions that are already available.
In [7]:
import inspect
all_functions = inspect.getmembers(emergence, inspect.isfunction)
[x[0] for x in all_functions]
Out[7]:
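To make the "every sentence as a little piece of code" idea concrete, here is a sketch of how a single claim of the theory might be stored; the claim text, keys, and function below are assumptions for illustration, not part of the module:
In [ ]:
# hypothetical encoding of one claim of the theory as data plus a function
claim = {
    'text': 'a discipline emerges when a cluster of authors starts citing each other',
    'concepts': ['discipline', 'citation'],
}

def check_claim(citation_network):
    # placeholder test of the claim against data; real logic would look
    # for densifying clusters in the citation network
    return citation_network is not None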
Let's get back to the data types that are required to test the theory.
In [8]:
emergence.data
Out[8]:
Now let's check whether our current database contains any of the required data.
In [9]:
from data import availability
availability('citation network')  # return all datasets that fulfill the criteria of a citation network
availability('journal')  # return all datasets that fulfill the criteria of a journal
# Problem: a more specific set is required; not a journal as such, but its special issues, in which a particular set of researchers publish
availability('newsletter')  # check whether a historical record of newsletters is available
availability('conference participants')  # check whether a historical record of conference participants is available
We have found that only the citation network and the journal hierarchy exist. Therefore we will continue the research based on the available information.
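For reference, availability() could be as simple as a lookup in a catalogue of dataset descriptions. This is a sketch under that assumption, not the actual implementation of the data module:
In [ ]:
# hypothetical catalogue mapping dataset names to the criteria they fulfill
catalogue = {
    'DBLP_Citation_2014_May': ['citation network', 'journal'],
}

def availability_sketch(criterion):
    # return the names of all datasets whose description matches the criterion
    return [name for name, kinds in catalogue.items() if criterion in kinds]

availability_sketch('citation network')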
In [10]:
path = r'd:\Desktop\DBLP_Citation_2014_May\publications.txt'  # raw string, so backslashes are kept literally
In [11]:
with open(path) as infile:
    count = 0
    mylist = []
    thelist = []
    for line in infile:
        count += 1
        if count < 8:
            # collect the first seven lines of the record
            mylist.append(line)
        else:
            # the eighth line separates records: close the record and reset
            thelist.append(mylist)
            mylist = []
            count = 0
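The same eight-line chunking can be written more compactly with itertools.islice. This sketch reproduces the loop above, dropping the separator line and ignoring a trailing partial record:
In [ ]:
from itertools import islice

def read_records(filename, size=8):
    # read size-line blocks; the last line of each block is a separator
    with open(filename) as infile:
        while True:
            block = list(islice(infile, size))
            if len(block) < size:
                break  # incomplete trailing block, ignored as in the loop above
            yield block[:-1]

records = list(read_records(path))  # equivalent to thelist above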
In [12]:
len(thelist)
Out[12]:
In [14]:
thelist[2]
Out[14]:
In [40]:
import os

path = r'd:\Desktop\DBLP_Citation_2014_May\domains'

def get_citations(filename):
    print('reading {}'.format(filename))
    with open(filename) as infile:
        mylist = []
        thelist = []
        for line in infile:
            mylist.append(line)
            if ' \n' in line:
                # a near-empty line separates records: drop it and close the record
                mylist.pop()
                thelist.append(mylist)
                mylist = []
    return thelist

list_of_lists = []
for x in os.listdir(path):
    link = os.path.join(path, x)
    citations = get_citations(link)
    list_of_lists.append(citations)
len(list_of_lists)
Out[40]:
In [24]:
# dispatch on the DBLP line prefixes within one record
references = []
for line in thelist[2]:
    if line.startswith('#*'):
        paperTitle = line
    elif line.startswith('#@'):
        Authors = line
    elif line.startswith('#t'):
        Year = line
    elif line.startswith('#c'):
        publication_venue = line
    elif line.startswith('#index'):
        index_id = line
    elif line.startswith('#%'):
        references.append(line)
    elif line.startswith('#!'):
        abstract = line
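The same dispatch can be wrapped into a reusable function that returns one dictionary per record; a sketch assuming the prefix conventions above:
In [ ]:
def parse_record(record):
    # map DBLP line prefixes to the fields of a single publication
    paper = {'references': []}
    for line in record:
        value = line[2:].strip()
        if line.startswith('#*'):
            paper['title'] = value
        elif line.startswith('#@'):
            paper['authors'] = value
        elif line.startswith('#t'):
            paper['year'] = value
        elif line.startswith('#c'):
            paper['venue'] = value
        elif line.startswith('#index'):
            paper['index'] = line[len('#index'):].strip()
        elif line.startswith('#%'):
            paper['references'].append(value)
        elif line.startswith('#!'):
            paper['abstract'] = value
    return paper

parse_record(thelist[2])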
In [1]:
emergence.what_emergence_state('social studies of science')
In [ ]:
emergence.is_emergence_state(1, 'social studies of science')
In [ ]:
network_of_citations = 'some graph'
In [ ]:
emergence.observe_nodal_points(network_of_citations, 3)
In [ ]:
person_list = ['John Peter', 'Pete Johner']
In [ ]:
emergence.get_conferences_organized_by_cluster(person_list)
In [ ]:
emergence.recreate_emerge('sociology')
In [ ]:
emergence.detect_emergences()
In [ ]:
emergence.get_special_issues_of_a_primary_journal('Science and Society Studies') # what is a primary journal