In [18]:
from selenium import webdriver
import pandas as pd
import time
Launch the Firefox web browser.
In [43]:
driver = webdriver.Firefox()
Using the driver's get method, we tell Firefox to visit the web page (shown below).
The page contains only an input box, a submit button, and a clear button.
In [ ]:
url = 'http://aps.unmc.edu/AP/prediction/prediction_main.php' # the url of the main page
driver.get(url)
In [48]:
from IPython.display import HTML ## for displaying the webpage in IFrame
In [50]:
HTML('<iframe src=http://aps.unmc.edu/AP/prediction/prediction_main.php width=700 height=500></iframe>')
Out[50]:
Read query sequences from an Excel file.
In [26]:
table = pd.read_excel('T_cell_epitope_positive.xlsx')
Then we iterate over the rows of the table (a DataFrame) to get an epitope ID and a query sequence from each row.
The query sequence is typed into the input box with the send_keys method, and the submit method sends the query to the server.
The script waits 5 seconds for the result page to load, saves its HTML source to an output file named after the epitope ID, and then proceeds to the next iteration.
In [27]:
for _, row in table.iterrows():
    epitope_id, seq = row.iloc[0], row.iloc[2]  # ID in the first column, sequence in the third
    driver.get(url)  # load the main page
    # find the input element; find_element(by, value) works in Selenium 3 and 4
    input_element = driver.find_element('name', 'input')
    input_element.send_keys(seq)  # enter the query sequence in the input box
    input_element.submit()  # submit the query sequence to the server
    time.sleep(5)  # wait for 5 seconds
    with open('data/t_cell_positive/%s.txt' % str(epitope_id), 'w') as op:
        op.write(str(driver.page_source))  # HTML source is saved to the output file
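The per-query steps in the loop above can also be packaged as one helper. The following is a minimal sketch, not the notebook's own code: save_prediction and its parameters are hypothetical names, and it assumes that driver is a Selenium WebDriver (or any object offering get, find_element, and page_source) and that the input box is still named input. Unlike the loop above, it creates the output folder if it does not exist yet.

```python
import time
from pathlib import Path


def save_prediction(driver, url, seq, out_path, wait_seconds=5):
    """Submit one query sequence and save the resulting page source.

    Hypothetical helper: `driver` is assumed to behave like a Selenium
    WebDriver (get, find_element, page_source).
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)  # create the output folder if needed

    driver.get(url)                             # load the main page
    box = driver.find_element('name', 'input')  # locate the input box by its name attribute
    box.send_keys(seq)                          # type the query sequence
    box.submit()                                # submit the enclosing form
    time.sleep(wait_seconds)                    # fixed pause, as in the loop above

    out_path.write_text(str(driver.page_source), encoding='utf-8')
    return out_path
```

With the driver and url defined earlier, a call might look like save_prediction(driver, url, seq, 'data/t_cell_positive/%s.txt' % epitope_id).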
We repeat the same procedure for two more Excel files. In these files the query sequence is in the second column.
In [ ]:
table = pd.read_excel('data/t_cell_epitope_negative.xlsx')
In [ ]:
url = 'http://aps.unmc.edu/AP/prediction/prediction_main.php'
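for _, row in table.iterrows():
    epitope_id, seq = row.iloc[0], row.iloc[1]  # ID in the first column, sequence in the second
    driver.get(url)  # load the main page
    # find the input element; find_element(by, value) works in Selenium 3 and 4
    input_element = driver.find_element('name', 'input')
    input_element.send_keys(seq)  # enter the query sequence in the input box
    input_element.submit()  # submit the query sequence to the server
    time.sleep(5)  # wait for 5 seconds
    with open('data/t_cell_negative/%s.txt' % str(epitope_id), 'w') as op:
        op.write(str(driver.page_source))  # save the HTML source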
In [44]:
table = pd.read_excel('data/B_cell_epitope_negative.xlsx')
In [45]:
url = 'http://aps.unmc.edu/AP/prediction/prediction_main.php'
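for _, row in table.iterrows():
    epitope_id, seq = row.iloc[0], row.iloc[1]  # ID in the first column, sequence in the second
    driver.get(url)  # load the main page
    # find the input element; find_element(by, value) works in Selenium 3 and 4
    input_element = driver.find_element('name', 'input')
    input_element.send_keys(seq)  # enter the query sequence in the input box
    input_element.submit()  # submit the query sequence to the server
    time.sleep(3)  # wait for 3 seconds
    with open('data/b_cell_negative/%s.txt' % str(epitope_id), 'w') as op:
        op.write(str(driver.page_source))  # save the HTML source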
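The three loops differ only in the input file, the column holding the sequence, the output folder, and the pause. One way to capture that, sketched here with hypothetical names (Job, run_job, fetch), is to describe each run as data and pass its rows to one generic loop. The sketch assumes each row is a plain sequence, such as a tuple from table.itertuples(index=False), and that fetch(seq, wait) returns the HTML of the result page; neither is part of the notebook itself.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Iterable, Sequence


@dataclass(frozen=True)
class Job:
    excel_file: str   # workbook holding the query sequences
    seq_col: int      # position of the sequence column (the ID is column 0)
    out_dir: str      # folder that receives one .txt file per epitope
    wait: float       # seconds to pause after submitting


# Values copied from the three cells above.
JOBS = [
    Job('T_cell_epitope_positive.xlsx', 2, 'data/t_cell_positive', 5),
    Job('data/t_cell_epitope_negative.xlsx', 1, 'data/t_cell_negative', 5),
    Job('data/B_cell_epitope_negative.xlsx', 1, 'data/b_cell_negative', 3),
]


def run_job(job: Job, rows: Iterable[Sequence], fetch: Callable[[str, float], str]) -> int:
    """Save one result page per row and return the number of files written."""
    out_dir = Path(job.out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for row in rows:
        epitope_id, seq = row[0], row[job.seq_col]
        html = fetch(str(seq), job.wait)  # hypothetical callable wrapping the browser steps
        (out_dir / f'{epitope_id}.txt').write_text(html, encoding='utf-8')
        count += 1
    return count
```

Keeping the browser interaction behind the fetch argument means the loop can be tried with a stand-in function before it is pointed at the real server.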
Quit Firefox.
In [ ]:
driver.quit()
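If a cell raises an error before this point, the browser window stays open. A small context manager can guarantee the quit() call. The sketch below is an illustration with a hypothetical name (managed_driver); it takes a factory such as webdriver.Firefox as an argument, so it does not depend on a particular browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()  # runs even if the body raised an exception
```

A possible use is "with managed_driver(webdriver.Firefox) as driver:" followed by the scraping loops in the body of the with block.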