HTML stands for 'Hypertext Markup Language', the format in which the text and elements of a webpage are written.
Parsing HTML in Python can be done with the Beautiful Soup module. It is a third-party module and must be installed via pip.
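For example, the package can typically be installed from a command line (the PyPI package is named beautifulsoup4, even though it is imported as bs4):

pip install beautifulsoup4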
In [2]:
    
import bs4
    
This module can be used in conjunction with requests to download and parse webpages.
For example, we can download a webpage from Amazon and find the price information on the page.
In [12]:
    
import bs4
import requests
res = requests.get('http://www.amazon.ca/Automate-Boring-Stuff-Python-Programming/dp/1593275994')
res.raise_for_status()
    
Since no exception was raised, we can now parse the webpage's text:
In [13]:
    
soup = bs4.BeautifulSoup(res.text)
    
    
This typically raises a warning about the parser, but it is not an exception. It can be resolved by passing a parser explicitly, e.g. bs4.BeautifulSoup(res.text, 'html.parser').
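For example, specifying Python's built-in parser explicitly silences the warning (assuming the res object from the previous cell):
In [ ]:
    
soup = bs4.BeautifulSoup(res.text, 'html.parser')
    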
We can now find elements in the page using CSS selectors and the .select() method.
In Chrome, we can copy an element's CSS selector via right-click > 'Inspect', then 'Copy > Copy selector' on the highlighted element in the developer tools.
In [14]:
    
# .select() returns a list of matching elements; the price element is at index 0
elems = soup.select('#buyNewSection > div > div > span > span')
print(elems[0])
print(elems[0].text)
    
    
The element was successfully found and its text extracted.
If we tie this all together, we can create a simple program to perform these steps.
In [23]:
    
import bs4, requests
def getAmazonPrice(productUrl):
    # Use requests to download a URL
    res = requests.get(productUrl)
    # Check for errors and raise an exception if the download failed
    res.raise_for_status()
    
    # Pass the HTML to Beautiful Soup
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    # Pass the price's CSS selector to .select(), which returns a list of all matching elements
    elems = soup.select('#buyNewSection > div > div > span > span')
    # Return the text of the first (and only) matching element, with surrounding whitespace stripped
    return elems[0].text.strip()
    
    
    
price = getAmazonPrice('http://www.amazon.ca/Automate-Boring-Stuff-Python-Programming/dp/1593275994')
print('The current price of \'Automate the Boring Stuff with Python\' on Amazon is ' + price)
    
    
This method can be used to automate web scraping without ever opening a browser. However, it may require try and except statements to handle special cases, since the CSS selector may not match on every page.
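A minimal sketch of this (reusing the getAmazonPrice() function defined above) wraps the call in a try/except block so the program reports a problem instead of crashing when the download fails or the selector matches nothing:
In [ ]:
    
try:
    price = getAmazonPrice('http://www.amazon.ca/Automate-Boring-Stuff-Python-Programming/dp/1593275994')
    print('The current price is ' + price)
except (requests.exceptions.RequestException, IndexError):
    # The download failed or the CSS selector matched no elements on the page
    print('Could not retrieve the price.')
    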
To summarize:
- Beautiful Soup is the BeautifulSoup module, which is imported as bs4.
- The bs4.BeautifulSoup() function returns a Soup object.
- The Soup object's .select() method can be passed a string of the CSS selector for an HTML tag.
- .select() returns a list of matching element objects.