In-Class Coding Lab: Scrape NASDAQ Stock Quotes

For this lab we will walk you through scraping stock data from the NASDAQ website: http://www.nasdaq.com/

When you're done, you'll be tasked with writing a program which, given a NASDAQ symbol, retrieves the stock's name, price, and percent change.

While we work through the example, we will use Amazon.com's stock symbol: amzn

For a list of NASDAQ stocks to try with the completed program, see here: http://www.cnbc.com/nasdaq-100/

Our plan

Here's our plan for a given stock symbol, for example amzn:

  1. Use Requests to get the HTML from this page: http://www.nasdaq.com/symbol/amzn
  2. Use BeautifulSoup4 to extract data from the HTML, and save the data we extract into a dict.
  3. Print the stock info from the dict.

We will write each step as its own function below.


In [ ]:
import requests
from bs4 import BeautifulSoup
import time

Use Requests to get HTML

Here's the code:


In [ ]:
symbol = "amzn"
url = 'http://www.nasdaq.com/symbol/' + symbol
response = requests.get(url)
if response.ok:
    print(response.text)
else:
    print("Error retrieving " + url)

It looks like it works, so we should refactor it into a function and then test again. We pass in the symbol as input and return what we would print as output.


In [ ]:
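def get_html_from_nasdaq(symbol):
    """Return the HTML of the NASDAQ quote page for symbol, or an error message."""
    url = 'http://www.nasdaq.com/symbol/' + symbol
    response = requests.get(url)
    if response.ok:
        return response.text
    else:
        # Note: on failure the caller receives this message instead of HTML.
        return "Error retrieving " + url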

Let's try it out. It should return the page's HTML:


In [ ]:
html = get_html_from_nasdaq('amzn')
html
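
One thing to notice about get_html_from_nasdaq: when the request fails, it returns an error message in the same slot where the HTML would go, so the caller cannot easily tell the two apart. Here is a minimal sketch of one alternative, which returns None on failure instead. The name get_html_or_none, the fetch parameter, and the FakeResponse class are all hypothetical; they are not part of the lab, and fetch exists only so the sketch can run without a network connection.

```python
NASDAQ_URL = "http://www.nasdaq.com/symbol/"  # base URL used throughout this lab


def get_html_or_none(symbol, fetch):
    """Return the page HTML for symbol, or None if the request failed.

    fetch is a hypothetical parameter: any callable that takes a URL and
    returns an object with .ok and .text, such as requests.get.
    """
    url = NASDAQ_URL + symbol
    response = fetch(url)
    if response.ok:
        return response.text
    return None  # the caller can test for None instead of parsing an error string


class FakeResponse:
    """Stand-in for a Requests response, so the sketch runs offline."""

    def __init__(self, ok, text):
        self.ok = ok
        self.text = text


# Simulate one successful request and one failed request.
good = get_html_or_none("amzn", lambda url: FakeResponse(True, "<html>" + url + "</html>"))
bad = get_html_or_none("amzn", lambda url: FakeResponse(False, ""))
print(good)
print(bad)
```

With the real library, the call would look like get_html_or_none("amzn", requests.get), followed by a check such as "if html is None" before any parsing.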

Use BeautifulSoup4 to extract data from the site

Next we want to take the HTML and extract the meaningful bits. This is the part that requires time and patience. You'll need to use a browser's developer tools to find the important CSS selectors so you can retrieve the data.

For simplicity's sake, we've done this for you. Feel free to open http://www.nasdaq.com/symbol/amzn in your browser's developer tools and locate each of these three selectors:


In [ ]:
soup = BeautifulSoup(html, "lxml")
name = soup.select("div#qwidget_pageheader h1")[0].text
price = soup.select("div#qwidget_lastsale")[0].text
change = soup.select("div#qwidget_percent")[0].text
print(name, price, change)
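
To see how these selectors behave without depending on the live site, here is a small sketch that runs them against a hand-written HTML fragment. The fragment is invented for illustration: only the three ids come from the lab, the values are copied from the sample run in this lab, and the real page may be structured differently. The sample_ variable names are also made up so they do not overwrite html or soup.

```python
from bs4 import BeautifulSoup

# Invented fragment for illustration; only the ids come from the lab's selectors.
sample_html = """
<div id="qwidget_pageheader"><h1>Amazon.com, Inc. Common Stock Quote &amp; Summary Data</h1></div>
<div id="qwidget_lastsale">$852.53</div>
<div id="qwidget_percent">0.24%</div>
"""

# "html.parser" ships with Python, so this sketch does not need lxml installed.
sample_soup = BeautifulSoup(sample_html, "html.parser")

# select() returns a list of every match, which is why the lab indexes with [0].
matches = sample_soup.select("div#qwidget_pageheader h1")
sample_name = matches[0].text

# select_one() returns the first match, or None when nothing matches.
sample_price = sample_soup.select_one("div#qwidget_lastsale").text
missing = sample_soup.select_one("div#no_such_id")

print(len(matches), sample_name)
print(sample_price)
print(missing)
```

Because select() returns an empty list when nothing matches, indexing it with [0] raises an IndexError in that case. Checking the length of the list, or checking select_one() for None, is one way to guard against that.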

Python can return several values from a function at once (as a tuple), but in this course we prefer to bundle related values under descriptive names, so we will create a dictionary of these values first:


In [ ]:
soup = BeautifulSoup(html, "lxml")
name = soup.select("div#qwidget_pageheader h1")[0].text
price = soup.select("div#qwidget_lastsale")[0].text
change = soup.select("div#qwidget_percent")[0].text
stock = {
    "Name": name,
    "Price": price,
    "Change": change,
}

print(stock)
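
For comparison, here is a minimal sketch of both ways of handing several values back from a function. The function names and the demo_ variables are made up for illustration, and the values are copied from the sample run in this lab rather than scraped.

```python
def stock_as_tuple():
    # Values separated by commas in a return statement are packed into a tuple.
    return "Amazon.com, Inc. Common Stock Quote & Summary Data", "$852.53", "0.24%"


def stock_as_dict():
    # A dict labels each value, so the caller does not rely on position.
    return {
        "Name": "Amazon.com, Inc. Common Stock Quote & Summary Data",
        "Price": "$852.53",
        "Change": "0.24%",
    }


# With a tuple, the caller has to remember the order of the values.
demo_name, demo_price, demo_change = stock_as_tuple()

# With a dict, the caller asks for each value by name.
demo_stock = stock_as_dict()

print(demo_price, demo_stock["Price"])
```

Both calls carry the same information. The dictionary version is what the rest of this lab uses.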

Once again, it's time to refactor this into a function. We take html as input (it's the one thing we require to make this code work) and return stock as output (since it is what we print).


In [ ]:
def extract_stock_data(html):
    soup = BeautifulSoup(html, "lxml")
    name = soup.select("div#qwidget_pageheader h1")[0].text
    price = soup.select("div#qwidget_lastsale")[0].text
    change = soup.select("div#qwidget_percent")[0].text
    stock = {
        "Name": name,
        "Price": price,
        "Change": change,
    }

    return stock

And we should test out our new function:


In [ ]:
stock = extract_stock_data(html)
print(stock)
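
Text pulled out of HTML can carry stray spaces or newlines around it, depending on how the page is indented. If the printed dictionary shows that, one option is to trim each value before using it. This is only a sketch: clean_stock is a hypothetical helper, not part of the lab, and the raw values below are made up to mimic untidy text.

```python
def clean_stock(stock):
    """Return a copy of stock with whitespace trimmed from every value."""
    return {key: value.strip() for key, value in stock.items()}


# Made-up raw values that mimic text pulled out of indented HTML.
raw = {
    "Name": "\n  Amazon.com, Inc. Common Stock Quote & Summary Data ",
    "Price": " $852.53\n",
    "Change": "0.24% ",
}
cleaned = clean_stock(raw)
print(cleaned)
```

The original dictionary is left unchanged. clean_stock builds and returns a new one.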

Putting it all together

Now you need to put it all together. Write a program to:

  1. Input a stock symbol on the NASDAQ exchange.
  2. Get the HTML for the stock symbol from the NASDAQ website.
  3. Extract the stock data from the HTML.
  4. Print out the stock information.

The program should work like this:

Enter a stock symbol on the NASDAQ Exchange: amzn
Name: Amazon.com, Inc. Common Stock Quote & Summary Data
Price: $852.53
Change: 0.24%
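
As a hint for the last step, here is one possible way to turn a stock dictionary into lines shaped like the sample run above. The helper name format_stock is made up for illustration, and the example values are copied from the sample run rather than scraped. It relies on dictionaries keeping their insertion order, which Python guarantees from version 3.7 on.

```python
def format_stock(stock):
    """Build one 'Key: value' line per entry, in the dict's insertion order."""
    return "\n".join(key + ": " + value for key, value in stock.items())


example = {
    "Name": "Amazon.com, Inc. Common Stock Quote & Summary Data",
    "Price": "$852.53",
    "Change": "0.24%",
}
report = format_stock(example)
print(report)
```

A plain loop over stock.items() with one print per line would work just as well.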

In [ ]:
# todo write code here:

Metacognition

Please answer the following questions. This should be a personal narrative, in your own voice. Answer the questions by double clicking on the question and placing your answer next to the Answer: prompt.

  1. Record any questions you have about this lab that you would like to ask in recitation. It is expected you will have questions if you did not complete the code sections correctly. Learning how to articulate what you do not understand is an important skill of critical thinking.

Answer:

  2. What was the most difficult aspect of completing this lab? Least difficult?

Answer:

  3. What aspects of this lab do you find most valuable? Least valuable?

Answer:

  4. Rate your comfort level with this week's material so far.

1 ==> I can do this on my own and explain how to do it.
2 ==> I can do this on my own without any help.
3 ==> I can do this with help or guidance from others. If you choose this level please list those who helped you.
4 ==> I don't understand this at all yet and need extra help. If you choose this please try to articulate that which you do not understand.

Answer:

