A simple example of scraping the apiconsf.com site for its list of sponsors.


In [1]:
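# Note: lxml's cssselect() needs the separate "cssselect" package installed
from lxml.html import fromstring
import requests

url = "http://www.apiconsf.com/apiconsponsors"
# Fetch the page and parse the raw response bytes into an lxml element tree
doc = fromstring(requests.get(url).content)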

In [2]:
# Select every element with the CSS class "sponsor_bio" (one per sponsor)
doc.cssselect(".sponsor_bio")


Out[2]:
[<Element div at 0x1051339f0>,
 <Element div at 0x105133a48>,
 <Element div at 0x105133aa0>,
 <Element div at 0x105133af8>,
 <Element div at 0x105133b50>,
 <Element div at 0x105133ba8>,
 <Element div at 0x105133c00>,
 <Element div at 0x105133c58>,
 <Element div at 0x105133cb0>,
 <Element div at 0x105133d08>,
 <Element div at 0x105133d60>,
 <Element div at 0x105133db8>,
 <Element div at 0x105133e10>,
 <Element div at 0x105133e68>,
 <Element div at 0x105133ec0>,
 <Element div at 0x105133f18>,
 <Element div at 0x105133f70>]

In [3]:
# If we only want the sponsor names
print("\n".join(e.text for e in doc.cssselect(".bio_name")))


Evernote
MuleSoft
Concur
Getty Images
SecureKey
Orchestrate
SendGrid
3Scale
Restlet
Wit.AI
Nexmo
Eagle Eye Networks
Mojio
Microsoft Azure
Import.io
General Assembly
Hack Reactor
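
If lxml and cssselect are not available, the same name extraction can be approximated with the standard library alone. The following is a minimal sketch, assuming Python 3. It swaps in `html.parser` for lxml's CSS selectors, and it runs against a small inline sample rather than the live page. The sample markup (including the `h3` and `p` tags) and the names `SAMPLE_HTML`, `BioNameParser`, and `extract_names` are hypothetical, chosen only for illustration; only the class names `sponsor_bio` and `bio_name` come from the cells above.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: the tag names are assumptions, not the real page.
SAMPLE_HTML = """
<div class="sponsor_bio">
  <h3 class="bio_name">Evernote</h3>
  <p>Some description.</p>
</div>
<div class="sponsor_bio">
  <h3 class="bio_name">MuleSoft</h3>
</div>
"""


class BioNameParser(HTMLParser):
    """Collect the text of each element whose class list contains target_class."""

    def __init__(self, target_class="bio_name"):
        super().__init__()
        self.target_class = target_class
        self.names = []
        self._capture_tag = None  # tag name of the element currently captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        if self._capture_tag is None and self.target_class in classes:
            self._capture_tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._capture_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Simplification: a nested element with the same tag name would end
        # the capture early. That is acceptable for flat name elements.
        if tag == self._capture_tag:
            self.names.append("".join(self._buffer).strip())
            self._capture_tag = None


def extract_names(html, target_class="bio_name"):
    """Return the stripped text of every element carrying target_class."""
    parser = BioNameParser(target_class)
    parser.feed(html)
    parser.close()
    return parser.names


print("\n".join(extract_names(SAMPLE_HTML)))
```

To try this on the real page, the HTML could be fetched as in the first cell and passed to `extract_names` as a decoded string (for example `requests.get(url).text`). Whether the live markup matches this sample is not verified here.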