My transcripts scraper has some issues, so I am going to try to debug it here in a notebook.
Below is the error log from a crawl run with `-L ERROR`.
In [ ]:
(scrapy-env) C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv>scrapy crawl transcripts -L ERROR
2017-06-02 22:54:23 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/145954523576/episode-90-whos-a-good-boy-part-2#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 46, in parse_transcript
with open(transcript_path, 'w', encoding='utf-8') as f:
IOError: [Errno 22] Invalid argument: u'C:\\Users\\caleb\\Documents\\Data Science\\welcome-to-night-vale\\data\\transcripts\\Episode 90 - Who\u2019s a Good Boy? Part 2.txt'
2017-06-02 22:54:29 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/113433434311/heres-the-second-youtube-release-today-a-preview#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 37, in parse_transcript
).extract()[0]
IndexError: list index out of range
2017-06-02 22:54:30 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/117838916316/episode-67-best-of#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 46, in parse_transcript
with open(transcript_path, 'w', encoding='utf-8') as f:
IOError: [Errno 22] Invalid argument: u'C:\\Users\\caleb\\Documents\\Data Science\\welcome-to-night-vale\\data\\transcripts\\Episode 67 - [Best Of?].txt'
2017-06-02 22:54:31 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/102677600106/bonus-episode-2-what-of-the-sea#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 46, in parse_transcript
with open(transcript_path, 'w', encoding='utf-8') as f:
IOError: [Errno 22] Invalid argument: u'C:\\Users\\caleb\\Documents\\Data Science\\welcome-to-night-vale\\data\\transcripts\\Bonus Episode 2 - What of the Sea?.txt'
2017-06-02 22:54:32 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/99042573486/episode-55-the-university-of-what-it-is#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 49, in parse_transcript
f.write("\n\n".join(transcript))
TypeError: write() argument 1 must be unicode, not str
2017-06-02 22:54:33 [scrapy.core.scraper] ERROR: Spider error processing <GET http://cecilspeaks.tumblr.com/post/100041657976/the-thrilling-adventure-hourwelcome-to-night-vale#_=_> (referer: http://cecilspeaks.tumblr.com/)
Traceback (most recent call last):
File "c:\users\caleb\appdata\local\conda\conda\envs\scrapy-env\lib\site-packages\twisted\internet\defer.py", line 653, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "C:\Users\caleb\Documents\Data Science\welcome-to-night-vale\scrapy\wtnv\wtnv\spiders\transcripts.py", line 46, in parse_transcript
with open(transcript_path, 'w', encoding='utf-8') as f:
IOError: [Errno 2] No such file or directory: u'C:\\Users\\caleb\\Documents\\Data Science\\welcome-to-night-vale\\data\\transcripts\\The Thrilling Adventure Hour/Welcome to Night Vale Crossover.txt'
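
Reading the log above, four of the six failures come from `open(transcript_path, ...)`: three titles contain `?`, which Windows does not allow in file names, and one contains `/`, which is treated as a path separator. The `IndexError` comes from indexing `[0]` into an empty `.extract()` result. Here is a minimal sketch of two helpers that might address both. The names `sanitize_filename` and `first_or_default` are hypothetical and not part of the spider, and I am assuming the file name is built directly from the post title. The sketch does not handle Windows reserved names such as `CON`, and it does not cover the `write()` TypeError.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(title, replacement="_"):
    """Return `title` with characters Windows forbids in file names replaced."""
    cleaned = _INVALID_CHARS.sub(replacement, title)
    # Windows also rejects names that end in a space or a period.
    cleaned = cleaned.rstrip(" .")
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or "untitled"


def first_or_default(values, default=None):
    """Return the first item of a list (e.g. an .extract() result), or `default` if it is empty."""
    return values[0] if values else default


# Titles taken from the failing paths in the log above.
titles = [
    u"Episode 90 - Who\u2019s a Good Boy? Part 2",
    u"Episode 67 - [Best Of?]",
    u"Bonus Episode 2 - What of the Sea?",
    u"The Thrilling Adventure Hour/Welcome to Night Vale Crossover",
]
safe_names = [sanitize_filename(t) + u".txt" for t in titles]
```

With this, `Episode 67 - [Best Of?]` would become `Episode 67 - [Best Of_].txt`, and the crossover title would no longer be split into a directory and a file. For the empty-selector case, Scrapy selectors also offer `extract_first()`, which returns `None` instead of raising when nothing matches, so the helper may not be needed inside the spider itself.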