Scraping Tutorial https://www.youtube.com/watch?v=XjNm9bazxn8&index=5&list=WL



In [4]:

    
import requests
from bs4 import BeautifulSoup



In [20]:

    
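def trade_spider(max_pages):
    """Crawl gallery listing pages and process every gallery link found."""
    page = 1
    # The loop stops before max_pages, so trade_spider(3) fetches pages 1 and 2.
    while page < max_pages:
        url = "http://www.imagefap.com/gallery.php?type=1&gen=44&userid=&search=&page=" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        # Name the parser explicitly; this silences the bs4 "No parser was explicitly specified" warning.
        soup = BeautifulSoup(plain_text, "lxml")

        # Gallery title links carry the class "gal_title".
        for link in soup.find_all('a', {'class': 'gal_title'}):
            href = "http://www.imagefap.com" + link.get('href')
            # print(href)  # uncomment to list each gallery URL
            get_single_item_data(href)

        page += 1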
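
The cell above builds its URLs by string concatenation. As a hedged alternative, the sketch below builds the same listing URL with `urllib.parse.urlencode` and resolves relative hrefs with `urljoin`. The helper names `listing_url` and `absolute_url` are hypothetical and not part of the tutorial; the query parameters are copied from the URL used above.

```python
from urllib.parse import urlencode, urljoin

BASE_URL = "http://www.imagefap.com"  # site root, taken from the cell above


def listing_url(page):
    """Build the listing URL for one page (hypothetical helper, not in the original)."""
    # Same query parameters, in the same order, as the hard-coded URL string.
    params = {"type": 1, "gen": 44, "userid": "", "search": "", "page": page}
    return BASE_URL + "/gallery.php?" + urlencode(params)


def absolute_url(href):
    """Resolve a relative href against the site root (hypothetical helper)."""
    return urljoin(BASE_URL, href)


print(listing_url(1))
print(absolute_url("/gallery.php?gid=7083896"))
```

Keeping the parameters in a dict makes it easier to change one of them without editing a long string, and `urljoin` also copes with hrefs that are already absolute.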



In [19]:

    
trade_spider(3)



C:\ProgramData\Anaconda3\lib\site-packages\bs4\__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 193 of the file C:\ProgramData\Anaconda3\lib\runpy.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "lxml")

  markup_type=markup_type))



http://www.imagefap.com/gallery.php?gid=7083896
http://www.imagefap.com/profile/sexypanda50
http://www.imagefap.com/gallery.php?gid=7083823
http://www.imagefap.com/profile/Lambda21
http://www.imagefap.com/gallery.php?gid=7083758
http://www.imagefap.com/profile/SatinLover7
http://www.imagefap.com/gallery.php?gid=7083497
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7083356
http://www.imagefap.com/profile/maleheart
http://www.imagefap.com/gallery.php?gid=7082970
http://www.imagefap.com/profile/Dirndlwilderer
http://www.imagefap.com/gallery.php?gid=7082335
http://www.imagefap.com/profile/pomelo123
http://www.imagefap.com/gallery.php?gid=7082217
http://www.imagefap.com/profile/streetdogey
http://www.imagefap.com/gallery.php?gid=7082197
http://www.imagefap.com/profile/partners_in_porn
http://www.imagefap.com/gallery.php?gid=7081681
http://www.imagefap.com/profile/Another_Pervert
http://www.imagefap.com/gallery.php?gid=7080148
http://www.imagefap.com/profile/Serpentinestyle
http://www.imagefap.com/gallery.php?gid=7079751
http://www.imagefap.com/profile/fruhrhope
http://www.imagefap.com/gallery.php?gid=7079743
http://www.imagefap.com/profile/fruhrhope
http://www.imagefap.com/gallery.php?gid=7079730
http://www.imagefap.com/profile/fruhrhope
http://www.imagefap.com/gallery.php?gid=7079158
http://www.imagefap.com/profile/winston777
http://www.imagefap.com/gallery.php?gid=7079045
http://www.imagefap.com/profile/StaceyDoesPorn
http://www.imagefap.com/gallery.php?gid=7077956
http://www.imagefap.com/profile/Trinity55
http://www.imagefap.com/gallery.php?gid=7077493
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7077456
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7077396
http://www.imagefap.com/profile/scuttle
http://www.imagefap.com/gallery.php?gid=7076572
http://www.imagefap.com/profile/fruhrhope
http://www.imagefap.com/gallery.php?gid=7075966
http://www.imagefap.com/profile/baseball1003
http://www.imagefap.com/gallery.php?gid=7075474
http://www.imagefap.com/profile/Dirndlwilderer
http://www.imagefap.com/gallery.php?gid=7075167
http://www.imagefap.com/profile/zbima
http://www.imagefap.com/gallery.php?gid=7075144
http://www.imagefap.com/profile/zbima
http://www.imagefap.com/gallery.php?gid=7074557
http://www.imagefap.com/profile/Nagasaki2012
http://www.imagefap.com/gallery.php?gid=7073681
http://www.imagefap.com/profile/apf666_2
http://www.imagefap.com/gallery.php?gid=7073502
http://www.imagefap.com/profile/streetdogey
http://www.imagefap.com/gallery.php?gid=7073223
http://www.imagefap.com/profile/purdie
http://www.imagefap.com/gallery.php?gid=7072962
http://www.imagefap.com/profile/apf666_2
http://www.imagefap.com/gallery.php?gid=7072771
http://www.imagefap.com/profile/bluesmoker
http://www.imagefap.com/gallery.php?gid=7072708
http://www.imagefap.com/profile/lipian
http://www.imagefap.com/gallery.php?gid=7072572
http://www.imagefap.com/profile/exhib5959
http://www.imagefap.com/gallery.php?gid=7071892
http://www.imagefap.com/profile/babedu13
http://www.imagefap.com/gallery.php?gid=7071281
http://www.imagefap.com/profile/pomelo123
http://www.imagefap.com/gallery.php?gid=7070540
http://www.imagefap.com/profile/mimil37
http://www.imagefap.com/gallery.php?gid=7070290
http://www.imagefap.com/profile/Matezy
http://www.imagefap.com/gallery.php?gid=7070201
http://www.imagefap.com/profile/nacktimnetz
http://www.imagefap.com/gallery.php?gid=7070127
http://www.imagefap.com/profile/Windstorm17
http://www.imagefap.com/gallery.php?gid=7069425
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7069396
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7069212
http://www.imagefap.com/profile/Gemorrah
http://www.imagefap.com/gallery.php?gid=7069134
http://www.imagefap.com/profile/maj16
http://www.imagefap.com/gallery.php?gid=7068369
http://www.imagefap.com/profile/Serpentinestyle
http://www.imagefap.com/gallery.php?gid=7068229
http://www.imagefap.com/profile/Negro2000
http://www.imagefap.com/gallery.php?gid=7067024
http://www.imagefap.com/profile/vintage-love
http://www.imagefap.com/gallery.php?gid=7066989
http://www.imagefap.com/profile/fruhrhope
http://www.imagefap.com/gallery.php?gid=7066148
http://www.imagefap.com/profile/susiwil666
http://www.imagefap.com/gallery.php?gid=7066113
http://www.imagefap.com/profile/pilpil7



In [32]:

    
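def get_single_item_data(item_url):
    """Print the first link's text from each 'cnt_cats' div on a gallery page."""
    source_code = requests.get(item_url)
    plain_text = source_code.text
    # Name the parser explicitly; this silences the bs4 "No parser was explicitly specified" warning.
    soup = BeautifulSoup(plain_text, "lxml")

    for item_name in soup.find_all('div', {'id': 'cnt_cats'}):
        first_link = item_name.find('a')
        # Skip divs with no link or an empty link instead of raising an error.
        if first_link is None or not first_link.contents:
            continue
        print(first_link.contents[0])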
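
To try the extraction step without hitting the network, here is a minimal sketch that runs the same lookup on an inline HTML string. The markup in `SAMPLE_HTML` is made up for illustration and the real page layout may differ; `first_category_names` is a hypothetical helper, not part of the tutorial. It uses the built-in `"html.parser"` so it does not depend on lxml being installed.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a gallery page; the real layout may differ.
SAMPLE_HTML = """
<html><body>
  <div id="cnt_cats"><a href="/cat/1">Category A</a> <a href="/cat/2">Category B</a></div>
  <div id="other"><a href="/x">Ignored</a></div>
</body></html>
"""


def first_category_names(html):
    """Return the text of the first link in each div with id 'cnt_cats'."""
    soup = BeautifulSoup(html, "html.parser")  # explicit parser, so no warning
    names = []
    for div in soup.find_all("div", {"id": "cnt_cats"}):
        first_link = div.find("a")
        if first_link is not None:  # skip divs that contain no link
            names.append(first_link.get_text(strip=True))
    return names


print(first_category_names(SAMPLE_HTML))  # prints ['Category A']
```

Returning a list instead of printing inside the loop makes the function easy to check against a known snippet before pointing it at live pages.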



In [33]:

    
trade_spider(2)



Amateur
Big Tits
Mature
Voyeur
Amateur
Amateur
Amateur
Voyeur
Big Tits
Amateur
Downblouse
Masturbation
Masturbation
Amateur
Upskirt
Upskirt
Amateur
Voyeur
Voyeur
Big Tits
Fetish
Amateur
Amateur
Downblouse
Downblouse


