Scrapy is a Python framework for web scraping. In short, it combines almost everything we have learnt so far: requests, CSS selectors (BeautifulSoup), XPath (lxml), regular expressions (re), and even checking robots.txt or putting the scraper to sleep between requests.
Because Scrapy is a framework, one does not usually code inside a Jupyter Notebook. To mimic Scrapy's behavior inside the Notebook, we need a couple of additional imports that would not be required otherwise.
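As an aside, Scrapy ships with its own interactive console, which builds the same kind of response object for us. Assuming Scrapy is installed, it can be opened from a terminal:

    scrapy shell "http://quotes.toscrape.com/"

Inside the notebook we construct an equivalent response by hand instead.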
In [1]:
    
import requests
from scrapy.http import TextResponse
    
In [2]:
    
url = "http://quotes.toscrape.com/"
r = requests.get(url)
response = TextResponse(r.url,body=r.text,encoding="utf-8")
    
In [3]:
    
response
    
    Out[3]:
In [10]:
    
# first link on the page (the site heading) - CSS selector
response.css("a").extract_first()
    
    Out[10]:
In [13]:
    
# first link on the page (the site heading) - XPath equivalent
response.xpath("//a").extract_first()
    
    Out[13]:
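A quick note before moving on: newer Scrapy versions also expose .get() and .getall() as the recommended spellings of .extract_first() and .extract(); both pairs behave the same, so the calls below are interchangeable with the ones above (a minimal sketch, assuming a recent Scrapy release):

    response.css("a").get()       # same result as .extract_first()
    response.css("a").getall()    # same result as .extract()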
In [16]:
    
# author names - CSS (text of every <small> element)
response.css("small::text").extract()
    
    Out[16]:
In [17]:
    
# author names - XPath
response.xpath("//small/text()").extract()
    
    Out[17]:
In [19]:
    
# heading link, matched by its style attribute - CSS
response.css('a[style="text-decoration: none"]').extract()
    
    Out[19]:
In [20]:
    
# heading link - text only (::text pseudo-element)
response.css('a[style="text-decoration: none"]::text').extract()
    
    Out[20]:
In [21]:
    
# heading link - href attribute only (::attr(href))
response.css('a[style="text-decoration: none"]::attr(href)').extract()
    
    Out[21]:
In [23]:
    
# tag names - CSS
response.css("a[class='tag']::text").extract()
    
    Out[23]:
In [24]:
    
# tag URLs - CSS
response.css("a[class='tag']::attr(href)").extract()
    
    Out[24]:
In [28]:
    
# tag names - XPath
response.xpath("//a[@class='tag']/text()").extract()
    
    Out[28]:
In [30]:
    
# tag URLs - XPath
response.xpath("//a[@class='tag']/@href").extract()
    
    Out[30]:
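Selectors like these are usually combined so that each quote's text, author and tags stay together. A sketch of that pattern, relying on the fact that every quote on quotes.toscrape.com sits inside a <div class="quote"> block (the span.text class comes from the page's markup; the other selectors are the ones used above):

    # loop over each quote block and extract its parts relative to that block
    for quote in response.css("div.quote"):
        text = quote.css("span.text::text").extract_first()
        author = quote.css("small::text").extract_first()
        tags = quote.css("a.tag::text").extract()
        print(text, author, tags)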
In [7]:
    
response.css("title").extract_first()
    
    Out[7]:
In [9]:
    
response.css("title").re("title")
    
    Out[9]:
In [17]:
    
# regex with a capturing group to grab the text between the tags
response.css("title").re('.+>(.+)<.+')
    
    Out[17]:
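When only the first match is needed, .re_first() saves the indexing step; and for this particular case the ::text pseudo-element makes the regex unnecessary altogether (both calls below use only the API shown so far):

    # first regex match only, instead of a one-element list
    response.css("title").re_first('.+>(.+)<.+')

    # same result without any regex
    response.css("title::text").extract_first()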
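Finally, here is roughly what the same extraction looks like the way Scrapy is normally used: as a standalone spider class rather than notebook cells. This is a minimal sketch; the spider name, file name and output field names are our own choices, while the selectors are the ones developed above:

    import scrapy

    class QuotesSpider(scrapy.Spider):
        name = "quotes"
        start_urls = ["http://quotes.toscrape.com/"]

        def parse(self, response):
            # the same response.css() API as in the cells above
            for quote in response.css("div.quote"):
                yield {
                    "text": quote.css("span.text::text").extract_first(),
                    "author": quote.css("small::text").extract_first(),
                    "tags": quote.css("a.tag::text").extract(),
                }
            # follow the pagination link, if the page has one
            next_page = response.css("li.next a::attr(href)").extract_first()
            if next_page is not None:
                yield response.follow(next_page, callback=self.parse)

Saved as, say, quotes_spider.py, it can be run without a full project via scrapy runspider quotes_spider.py -o quotes.json.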