In [1]:
ls
In [2]:
import pandas as pd
In [3]:
pd.read_csv?
I want to use pandas, so it needs to be installed first.
In [4]:
join_df = pd.read_csv('join20140303.csv')
In [5]:
join_df.head()
Out[5]:
pandas.read_csv() alone is enough to read the CSV cleanly.
In [6]:
line_df = pd.read_csv('line20140303free.csv')
In [7]:
line_df.head()
Out[7]:
In [8]:
station_df = pd.read_csv('station20140303free.csv')
In [9]:
station_df.head()
Out[9]:
In [10]:
station_df.columns
Out[10]:
In [11]:
import networkx as nx
In [12]:
nx.__version__
Out[12]:
Let's use networkx as well.
In [13]:
line_df['line_cd']
Out[13]:
In [14]:
station_df[station_df['station_name'] == '東京']
Out[14]:
Tokyo Station appears 10 times, once per line. Does that mean 10 lines serve Tokyo Station?
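One way to check this is to compare the number of rows for a station name with the number of distinct line_cd values among them. A minimal sketch on a small made-up frame follows; sample_df and its values are hypothetical stand-ins, not the real contents of station20140303free.csv.

```python
import pandas as pd

# Hypothetical stand-in for station_df; the values are made up for illustration.
sample_df = pd.DataFrame({
    'station_name': ['東京', '東京', '東京', '品川'],
    'line_cd': [1, 2, 3, 1],
})

# Rows for one station name, and how many distinct lines they belong to.
tokyo_rows = sample_df[sample_df['station_name'] == '東京']
n_rows = len(tokyo_rows)
n_lines = tokyo_rows['line_cd'].nunique()
```

If the two counts agree, each row corresponds to a different line.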
In [15]:
line_in_tokyo = station_df[station_df['station_name'] == '東京']['line_cd']
print(line_in_tokyo)
In [16]:
[line_df[line_df['line_cd']==x]['line_name'] for x in line_in_tokyo]
Out[16]:
In [17]:
line_df[line_df['line_name'] == '東海道新幹線'].head()
Out[17]:
In [18]:
import re
re.match('hoge.*', 'hogehoge')
Out[18]:
In [19]:
re.match('.*東海道.*', '東海道本線 ')
Out[19]:
In [20]:
for line_name in line_df['line_name']:
    if re.match('.*東海道.*', line_name):
        print(line_name)
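The same substring filter can probably be written without an explicit loop by using the pandas string accessor. This is only a sketch: line_names below is a hypothetical stand-in for line_df['line_name'], with illustrative values.

```python
import pandas as pd

# Hypothetical stand-in for line_df['line_name']; the names are illustrative only.
line_names = pd.Series(['東海道新幹線', 'JR山手線', 'JR東海道本線(東京~熱海)'])

# Keep only the names containing the substring; regex=False matches it literally.
tokaido = line_names[line_names.str.contains('東海道', regex=False)]
matched = tokaido.tolist()
```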
In [21]:
line_df[line_df['line_name'] == 'JR東海道本線(東京~熱海)'].head()
Out[21]:
In [22]:
def line_name_cd(line_name):
    # Look up the line_cd values whose line_name matches exactly.
    matched_lines = line_df[line_df['line_name'] == line_name]['line_cd']
    if len(matched_lines) != 1:
        raise ValueError('Expected exactly one line named {}, found {}'.format(line_name, len(matched_lines)))
    return int(matched_lines.iloc[0])
A function that takes a line name and returns that line's ID.
In [23]:
line_name_cd('東海道新幹線')
Out[23]:
In [24]:
int(line_name_cd('JR東海道本線(東京~熱海)'))
Out[24]:
In [25]:
station_df[station_df['line_cd'] == line_name_cd('JR東海道本線(東京~熱海)')]
Out[25]:
In [26]:
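def stations_in_line(line_name):
    # Use the argument rather than a hard-coded line name.
    return station_df[station_df['line_cd'] == line_name_cd(line_name)]
A function that takes a line name and returns a DataFrame of the stations on that line. (Would it be better to take the line ID as input?)
In [27]:
stations_in_line('JR東海道本線(東京~熱海)')['station_name']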
Out[27]:
Adjacent stations are presumably the ones that sit next to each other in the output of stations_in_line. As for how to create the edges in networkx:
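One possible way to do this, sketched under the assumption that the station names are already listed in the order they appear along the line, is to pair each name with the next one and pass the pairs to add_edges_from. The names list below is hypothetical; in the notebook it would come from something like stations_in_line(...)['station_name'].

```python
import networkx as nx

# Hypothetical station names, assumed to be in order along the line.
names = ['東京', '新橋', '品川', '川崎']

G = nx.Graph()
# Pair each station with its successor: (東京, 新橋), (新橋, 品川), ...
G.add_edges_from(zip(names, names[1:]))
```

If the rows are not stored in line order, they would need to be sorted first, since this approach relies entirely on row order.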
In [ ]: