GetsDrawn DotCom

This is a Python script to generate the website GetsDrawn. It takes data from /r/RedditGetsDrawn and makes something awesome.

The script has evolved and been rewritten several times.

The first script for rgdsnatch was written after I got banned from posting my artwork on /r/RedditGetsDrawn. The plan was to create a new site that displayed stuff from /r/RedditGetsDrawn.

Currently it displays only the 25 most recent items on RedditGetsDrawn: the script looks at the newest 25 reference photos. It keeps only links that end in .jpg or .png and ignores every other link. Instead of ignoring those links, it needs to get the image (or, in some cases, images) from the link. The photos are always submitted from imgur. Keep picking out the direct i.imgur files, but take the remaining links and pass them through a Python imgur module that returns the .jpeg or .png files.

This is moving forward from rgdsnatch.py because I am stuck on it.

TODO

Fix the links that don't point to a png/jpeg file but to a web address. The script needs to get the images that are at that web address and embed them.

Display artwork submitted under the images.

Upload artwork to user. Sends them a message on redditgetsdrawn with links.

More pandas

Saves reference images to imgs/year/month/day/reference/username-reference.png

Saves art images to imgs/year/month/day/art/username-line-bw-colour.png

Creates index.html file with: Title of site and logo: GetsDrawn Last updated date and time.

Path of image file /imgs/year/month/day/username-reference.png. (This needs to be changed to just their username).

Save the reddit metadata of each photo to a .meta file named username-yrmnthday.meta, containing info such as author, title, upvotes and downvotes. The plan was to save it to the reference folder; currently the .meta files are saved to a meta folder, alongside art and reference.

Folder sorting system of files: websitename/index.html-style.css-imgs/YEAR(15)-MONTH(2)-DAY(4)/art-reference-meta

Inside art folder: Currently it generates USERNAME-line/bw/colour.png 50/50 white files. Maybe it should be getting art replies from reddit?

Inside reference folder: The reference folder is working decently. It creates USERNAME-reference.png / jpeg files.

Currently it saves username-line-bw-colour.png to the imgs folder. Instead, get it to save to imgs/year/month/day/usernames.png. The script checks the year/month/day and, if the folder isn't created, creates it. If the folder is there, exit. Maybe get the reference image and save it with the line/bw/color .pngs.
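A minimal sketch of the folder check described here, assuming the imgs/year/month/day/{reference,art,meta} layout from the notes. The function name `ensure_day_dirs` and its parameters are made up for illustration; they are not part of the script.

```python
import os
from time import gmtime, strftime


def ensure_day_dirs(base='imgs', when=None):
    """Create base/YY/MM/DD/{reference,art,meta} if missing and return the paths."""
    when = gmtime() if when is None else when
    day_path = os.path.join(base, strftime('%y', when),
                            strftime('%m', when), strftime('%d', when))
    paths = {}
    for name in ('reference', 'art', 'meta'):
        path = os.path.join(day_path, name)
        # makedirs also creates any missing year/month/day parent folders.
        if not os.path.isdir(path):
            os.makedirs(path)
        paths[name] = path
    return paths
```

Calling it a second time on the same day does nothing, which covers the "if the folder is there, exit" case without a separate branch.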

The script now filters the jpeg and png images and skips links to imgur pages. This needs to be fixed by getting the images from the imgur pages. It renames the image files to the redditor's username followed by a -reference tag (and ending with png of course). It opens these files with PIL and checks the sizes. It needs to resize the images that are larger than 800px down to 800px. These images need to be linked in the index.html instead of the imgur alternatives.
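A rough sketch of the resize step, assuming "larger than 800px" refers to the width (as in the commented-out basewidth snippet further down) and that the height should shrink by the same ratio. `scaled_size` and `shrink_reference` are hypothetical helper names. `Image.LANCZOS` is used as the resampling filter because newer Pillow releases no longer provide `Image.ANTIALIAS`.

```python
def scaled_size(size, max_width=800):
    """Return (width, height) scaled so width <= max_width, keeping the aspect ratio."""
    width, height = size
    if width <= max_width:
        return (width, height)
    ratio = max_width / float(width)
    return (max_width, int(round(height * ratio)))


def shrink_reference(path, max_width=800):
    """Resize the image at `path` in place when it is wider than max_width."""
    from PIL import Image  # imported here so scaled_size works without Pillow
    im = Image.open(path)
    new_size = scaled_size(im.size, max_width)
    if new_size != im.size:
        im.resize(new_size, Image.LANCZOS).save(path)
```

Keeping the size arithmetic in its own function makes it easy to check without touching any image files.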

Instead of linking to the jpeg/png files on imgur, this script downloads them to the server.

Filter images as they are downloaded: check whether it has been less than a certain time, or whether the image has been submitted before.

Extend the subreddits it gets data from: cycle through a list and run the script on each subreddit in it.

Browse certain days - Current day by default but option to scroll through other days.

Filters - male/female/animals/couples etc. A function that returns only male portraits. Tags to add to photos. Filter images with tags.


In [2]:
import os 
import requests
from bs4 import BeautifulSoup
import re
import json
import time
import praw
import dominate
from dominate.tags import * 
from time import gmtime, strftime
#import nose
#import unittest
import numpy as np
import pandas as pd
from pandas import *
from PIL import Image
from pprint import pprint
#import pyttsx
import shutil

In [3]:
gtsdrndir = ('/home/wcmckee/getsdrawndotcom')

In [4]:
os.chdir(gtsdrndir)

In [5]:
r = praw.Reddit(user_agent='getsdrawndotcom')

In [6]:
#getmin = r.get_redditor('itwillbemine')

In [7]:
#mincom = getmin.get_comments()

In [8]:
#engine = pyttsx.init()

#engine.say('The quick brown fox jumped over the lazy dog.')
#engine.runAndWait()

In [9]:
#shtweet = []

In [10]:
#for mi in mincom:
#    print mi
#    shtweet.append(mi)

In [11]:
bodycom = []
bodyicv = dict()

In [12]:
#beginz = pyttsx.init()

In [13]:
#for shtz in shtweet:
#    print shtz.downs
#    print shtz.ups
#    print shtz.body
#    print shtz.replies
    #beginz.say(shtz.author)
    #beginz.say(shtz.body)
    #beginz.runAndWait()
    
#    bodycom.append(shtz.body)
    #bodyic

In [14]:
#bodycom

In [15]:
getnewr = r.get_subreddit('redditgetsdrawn')

In [16]:
rdnew = getnewr.get_new()

In [17]:
lisrgc = []
lisauth = []

In [18]:
for uz in rdnew:
    #print uz
    lisrgc.append(uz)

In [19]:
gtdrndic = dict()

In [20]:
imgdir = ('/home/wcmckee/getsdrawndotcom/imgs')

In [21]:
artlist = os.listdir(imgdir)

In [22]:
from time import time

In [23]:
yearz = strftime("%y", gmtime())
monthz = strftime("%m", gmtime())
dayz = strftime("%d", gmtime())


#strftime("%y %m %d", gmtime())

In [24]:
imgzdir = ('imgs/')
yrzpat = (imgzdir + yearz)
monzpath = (yrzpat + '/' + monthz)
dayzpath = (monzpath + '/' + dayz)
rmgzdays = (dayzpath + '/reference')
imgzdays = (dayzpath + '/art')
metzdays = (dayzpath + '/meta')

repathz = ('imgs/' + yearz + '/' + monthz + '/' + dayz + '/')

In [25]:
metzdays


Out[25]:
'imgs/15/01/05/meta'

In [26]:
imgzdays


Out[26]:
'imgs/15/01/05/art'

In [27]:
repathz


Out[27]:
'imgs/15/01/05/'

In [28]:
def ospacheck():
    # Create the year folder (e.g. imgs/15) if it does not exist yet.
    if os.path.isdir(imgzdir + yearz):
        print 'its true'
    else:
        print 'its false'
        os.mkdir(imgzdir + yearz)

In [29]:
ospacheck()


its true

In [30]:
#if os.path.isdir(imgzdir + yearz) == True:
#    print 'its true'
#else:
#    print 'its false'
#    os.mkdir(imgzdir + yearz)

In [117]:
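# Use the path variables themselves, not their names as strings; the quoted
# names would create folders literally called 'monzpath', 'dayzpath', ...
lizmon = [monzpath, dayzpath, imgzdays, rmgzdays, metzdays]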

In [120]:
for liz in lizmon:
    # Parents come before children in lizmon, so os.mkdir is enough.
    if os.path.isdir(liz):
        print 'its true'
    else:
        print 'its false'
        os.mkdir(liz)


its false
its false
its false
its false
its false

In [36]:
fullhom = ('/home/wcmckee/getsdrawndotcom/')

In [38]:
#artlist

In [39]:
httpad = ('http://getsdrawn.com/imgs')

In [40]:
#im = Image.new("RGB", (512, 512), "white")
#im.save(file + ".thumbnail", "JPEG")

In [41]:
rmgzdays = (dayzpath + '/reference')
imgzdays = (dayzpath + '/art')
metzdays = (dayzpath + '/meta')

In [42]:
os.chdir(fullhom + metzdays)

In [47]:
metadict = dict()

If I save the data to the file, how am I going to get it to update as the post is archived? Such as the up and down votes.


In [55]:
for lisz in lisrgc:
    metadict.update({'up': lisz.ups})
    metadict.update({'down': lisz.downs})
    metadict.update({'title': lisz.title})
    metadict.update({'created': lisz.created})
    #metadict.update({'createdutc': lisz.created_utc})
    #print lisz.ups
    #print lisz.downs
    #print lisz.created
    #print lisz.comments

In [56]:
metadict


Out[56]:
{'created': 1420436236.0,
 'createdutc': 1420407436.0,
 'down': 0,
 'title': u"This is my favorite pic of my little ones, would anyone be interested in transforming it into color? I'd love to see your artistic interpretation!",
 'up': 2}

Need to save json object.

The dict is created but it isn't saving. I am looping through lisrgc twice; it should only require the one loop.

Cycle through lisr and append to dict/convert to json, and also cycle through the lisr.author meta folders, saving the json that was created.


In [77]:
for lisr in lisrgc:
    gtdrndic.update({'title': lisr.title})
    lisauth.append(str(lisr.author))
    # Write one .meta file per post. The current directory is already the
    # meta folder (see os.chdir above), so no loop over os.listdir() is
    # needed: that loop rewrote the same file once per existing entry and
    # wrote nothing at all when the folder was empty.
    rstrin = lisr.title.encode('ascii', 'ignore').decode('ascii')
    with open(str(lisr.author) + '.meta', "w") as f:
        f.write(rstrin)

In [75]:
#matdict


Out[75]:
{'created': 1420436236.0,
 'down': 0,
 'title': u"This is my favorite pic of my little ones, would anyone be interested in transforming it into color? I'd love to see your artistic interpretation!",
 'up': 2}

I have it creating a meta folder and writing username.meta files. It used to write 'test' in each file, but now it writes the title of the photo author's post (the username/image data). It should be writing more than the title: maybe upvotes/downvotes, subreddit, time published, etc.
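One possible way to write the richer metadata as JSON in a single loop. This is only a sketch: `write_meta` is a hypothetical helper, and it reads just the submission attributes already used in this notebook (author, title, ups, downs, created).

```python
import json
import os


def write_meta(post, meta_dir):
    """Write one <author>.meta JSON file for a post and return its path."""
    meta = {
        'author': str(post.author),
        'title': post.title,
        'up': post.ups,
        'down': post.downs,
        'created': post.created,
    }
    path = os.path.join(meta_dir, meta['author'] + '.meta')
    with open(path, 'w') as f:
        # json.dump escapes non-ASCII characters by default, so titles
        # do not need the encode('ascii', 'ignore') step.
        json.dump(meta, f, indent=1, sort_keys=True)
    return path

# Possible use in the notebook, one pass over the posts:
# for lisr in lisrgc:
#     write_meta(lisr, fullhom + metzdays)
```

Building a fresh dict per post avoids the problem with metadict above, where each iteration overwrites the previous post's values.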


In [62]:
#os.listdir(dayzpath)

Instead of creating these white images, why not download the art replies of the reference photo.


In [63]:
#for lisa in lisauth:
#    #print lisa + '-line.png'
#    im = Image.new("RGB", (512, 512), "white")
#    im.save(lisa + '-line.png')
#    im = Image.new("RGB", (512, 512), "white")
#    im.save(lisa + '-bw.png')

    #print lisa + '-bw.png'
#    im = Image.new("RGB", (512, 512), "white")
#    im.save(lisa + '-colour.png')

    #print lisa + '-colour.png'

In [64]:
os.listdir('/home/wcmckee/getsdrawndotcom/imgs')


Out[64]:
['getsdrawn-bw.png', '12', '15', '14']

In [65]:
#lisauth

I want to save the list of usernames that submit images as png files in a dir. Currently when I call the list of authors it returns Redditor(user_name='theusername'). I want to return 'theusername'. Once this is resolved I can add '-line.png' '-bw.png' '-colour.png' to each folder.
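A small sketch of the filename step. Wrapping the author in str() is what the loops in this notebook already do to get the bare username; `art_filenames` itself is a made-up helper name.

```python
def art_filenames(author):
    """Return the three art filenames for one submitter."""
    name = str(author)  # same conversion the notebook uses for lisr.author
    return [name + suffix for suffix in ('-line.png', '-bw.png', '-colour.png')]
```

For example, `art_filenames('theusername')` gives the line, bw and colour names for that user, ready to be joined onto the day's art folder.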


In [66]:
#lisr.author

In [67]:
namlis = []

In [68]:
opsinz = open('/home/wcmckee/visignsys/index.meta', 'r')
panz = opsinz.read()

In [69]:
os.chdir('/home/wcmckee/getsdrawndotcom/' + rmgzdays)

Filter the non jpeg/png links. Need to perform request or imgur api to get the jpeg/png files from the link. Hey maybe bs4?
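A sketch of the scraping idea. A parser only reads markup and does not fetch URLs, so the page has to be downloaded first (for example with requests) and its text handed over. To keep the example free of extra dependencies it swaps bs4 for the standard library's HTMLParser; `ImgSrcParser` and `image_sources` are hypothetical names.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            src = dict(attrs).get('src')
            if src:  # ignore missing or empty src values
                self.sources.append(src)


def image_sources(html):
    """Return the non-empty img src values found in an HTML string."""
    parser = ImgSrcParser()
    parser.feed(html)
    return parser.sources

# Possible use: image_sources(requests.get(rdz.url).text)
```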


In [83]:



/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:189: UserWarning: "http://m.imgur.com/uurbzet" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client to get the document behind the URL, and feed that document to Beautiful Soup.
  '"%s" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an HTTP client to get the document behind the URL, and feed that document to Beautiful Soup.' % markup)

In [130]:
from imgurpython import ImgurClient

In [136]:
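# Read the two values passed to ImgurClient below (client id and secret).
# strip() removes the trailing newline that would otherwise be sent along.
with open('/home/wcmckee/ps.txt', 'r') as opps:
    oprd = opps.read().strip()
with open('/home/wcmckee/ps2.txt', 'r') as opzs:
    opzrd = opzs.read().strip()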

In [140]:
client = ImgurClient(oprd, opzrd)

# Example request
#items = client.gallery()
#for item in items:
#    print(item.link)
    

#itz = client.get_album_images()


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-140-91dfeecf21cc> in <module>()
      7 
      8 
----> 9 itz = client.get_album_images()

TypeError: get_album_images() takes exactly 2 arguments (1 given)

In [102]:
linklis = []

I need to get the image ids from each url. Strip the http://imgur.com/ from the string. The gallery id is the random characters after it. If it's an album, a/ is added before the id. If there are multiple images, a comma is used to separate the ids.

Doesn't currently work.
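A sketch of the id extraction, based only on the three URL shapes that show up in the output below (single image, /a/ album, comma-separated images with a #fragment). `parse_imgur_url` is a hypothetical helper; other imgur URL forms are not handled.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2


def parse_imgur_url(url):
    """Return ('album', [id]) or ('image', [id, ...]) for an imgur page URL."""
    # urlparse separates out the '#1' fragment; path is e.g. '/a/lPqbx'.
    parts = [p for p in urlparse(url).path.split('/') if p]
    if not parts:
        return ('image', [])
    if parts[0] == 'a' and len(parts) > 1:
        return ('album', [parts[1]])
    # Several images can share one URL, separated by commas.
    return ('image', parts[-1].split(','))
```

An album id found this way could then be passed as the single argument that `client.get_album_images()` is complaining about in the traceback above.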


In [141]:
for rdz in lisrgc:
    if 'http://imgur.com' in rdz.url:
        print rdz.url
        #itz = client.get_album_images()
#        reimg = requests.get(rdz.url)
##        retxt = reimg.text
#        souptxt = BeautifulSoup(''.join(retxt))
#        soupurz = souptxt.findAll('img')
#        for soupuz in soupurz:
#            imgurl = soupuz['src']
#            print imgurl
#            linklis.append(imgurl)
            
            #try:
            #    imzdata = requests.get(imgurl)


http://imgur.com/SBaV275
http://imgur.com/pFHPdwE
http://imgur.com/qRDkoj6
http://imgur.com/a/lPqbx
http://imgur.com/xmmw9H0
http://imgur.com/ViCxsrS,lxpGIUQ#1
http://imgur.com/3RtgPGW
http://imgur.com/k2kzLZu
http://imgur.com/a/LTDJ9
http://imgur.com/KZqZncZ
http://imgur.com/a/xhfGD

In [111]:
linklis


Out[111]:
['//i.imgur.com/SBaV275.jpg',
 '',
 '//i.imgur.com/pFHPdwE.jpg',
 '',
 '//i.imgur.com/qRDkoj6.jpg',
 '',
 '//i.imgur.com/PzRhHTr.jpg',
 '//i.imgur.com/FrlJauY.jpg']

In [115]:
if '.jpg' in linklis:
    print 'yes'
else:
    print 'no'


no
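The check prints 'no' because `'.jpg' in linklis` asks whether the exact string '.jpg' is an element of the list, not whether any link ends in .jpg. A sketch of a per-link filter, which also drops the empty strings and adds a scheme to the protocol-relative links; `direct_image_links` is a made-up name and the choice of http: is an assumption.

```python
def direct_image_links(links, extensions=('.jpg', '.jpeg', '.png')):
    """Keep links that end in an image extension and give them a scheme."""
    cleaned = []
    for link in links:
        if not link.lower().endswith(extensions):
            continue  # skips '' and links to non-image pages
        if link.startswith('//'):
            link = 'http:' + link  # protocol-relative link, as scraped above
        cleaned.append(link)
    return cleaned
```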

In [70]:
for rdz in lisrgc:
    # Only direct image links are downloaded; imgur page links are skipped.
    if 'http://i.imgur.com' in rdz.url:
        print (rdz.url)
        response = requests.get(rdz.url, stream=True)
        # The file gets a .png name even when the source is a .jpg.
        with open(str(rdz.author) + '-reference.png', 'wb') as out_file:
            shutil.copyfileobj(response.raw, out_file)
        del response


http://i.imgur.com/LPJyRvI.jpg
http://i.imgur.com/IXscmtj.jpg
http://i.imgur.com/HDkkpfs.jpg
http://i.imgur.com/ENDI3AG.jpg
http://i.imgur.com/P4ZXyZu.jpg
http://i.imgur.com/fTznCbi.jpg
http://i.imgur.com/aDIiH6t.jpg
http://i.imgur.com/6NifkhZ.jpg
http://i.imgur.com/EqnCVT7.jpg
http://i.imgur.com/vJgRQ2n.jpg
http://i.imgur.com/tMvP7jP.jpg
http://i.imgur.com/EEae4eN.jpg
http://i.imgur.com/SQeDd69.jpg

In [55]:
apsize = []

In [56]:
aptype = []

In [57]:
basewidth = 600

In [58]:
imgdict = dict()

In [59]:
for rmglis in os.listdir('/home/wcmckee/getsdrawndotcom/' + rmgzdays):
    #print rmglis
    im = Image.open(rmglis)
    #print im.size
    imgdict.update({rmglis : im.size})
    #im.thumbnail(size, Image.ANTIALIAS)
    #im.save(file + ".thumbnail", "JPEG")
    apsize.append(im.size)
    aptype.append(rmglis)

In [60]:
#for imdva in imgdict.values():
    #print imdva
    #for deva in imdva:
        #print deva
     #   if deva < 1000:
      #      print 'omg less than 1000'
       # else:
        #    print 'omg more than 1000'
         #   print deva / 2
            #print imgdict.values
            # Needs to update imgdict.values with this new number. Must halve height also.

In [61]:
#basewidth = 300
#img = Image.open('somepic.jpg')
#wpercent = (basewidth/float(img.size[0]))
#hsize = int((float(img.size[1])*float(wpercent)))
#img = img.resize((basewidth,hsize), PIL.Image.ANTIALIAS)
#img.save('sompic.jpg')

In [62]:
#os.chdir(metzdays)

In [63]:
#for numz in apsize:
#    print numz[0]
 #   if numz[0] > 800:
#        print ('greater than 800')
#    else:
#        print ('less than 800!')

In [64]:
reliz = []

In [65]:
for refls in os.listdir('/home/wcmckee/getsdrawndotcom/' + rmgzdays):
    #print rmgzdays + refls
    reliz.append(rmgzdays + '/' + refls)

In [66]:
reliz


Out[66]:
['imgs/15/01/04/reference/clawz_nd_webz-reference.png',
 'imgs/15/01/04/reference/Jasperthecat77-reference.png',
 'imgs/15/01/04/reference/Xrayguy104-reference.png',
 'imgs/15/01/04/reference/trippedwire-reference.png',
 'imgs/15/01/04/reference/herooftime94-reference.png',
 'imgs/15/01/04/reference/OhDeBabies-reference.png',
 'imgs/15/01/04/reference/yumyumyoshi-reference.png',
 'imgs/15/01/04/reference/thaisun-reference.png',
 'imgs/15/01/04/reference/Resrey-reference.png',
 'imgs/15/01/04/reference/SinisterCanuck-reference.png',
 'imgs/15/01/04/reference/Marsinator-reference.png',
 'imgs/15/01/04/reference/zakkalaska-reference.png',
 'imgs/15/01/04/reference/jazzyghost-reference.png',
 'imgs/15/01/04/reference/Reptilebear-reference.png',
 'imgs/15/01/04/reference/WesternWaterTribe-reference.png',
 'imgs/15/01/04/reference/AidenXY-reference.png',
 'imgs/15/01/04/reference/mrsmomo-reference.png',
 'imgs/15/01/04/reference/SpaceFeline-reference.png',
 'imgs/15/01/04/reference/Jabald69-reference.png',
 'imgs/15/01/04/reference/harisshahzad98-reference.png',
 'imgs/15/01/04/reference/TinyB1-reference.png',
 'imgs/15/01/04/reference/crackettt-reference.png',
 'imgs/15/01/04/reference/seahorseVT-reference.png',
 'imgs/15/01/04/reference/the_master_blaster-reference.png',
 'imgs/15/01/04/reference/baconbreeder-reference.png']

In [67]:
aptype


Out[67]:
['clawz_nd_webz-reference.png',
 'Jasperthecat77-reference.png',
 'Xrayguy104-reference.png',
 'trippedwire-reference.png',
 'herooftime94-reference.png',
 'OhDeBabies-reference.png',
 'yumyumyoshi-reference.png',
 'thaisun-reference.png',
 'Resrey-reference.png',
 'SinisterCanuck-reference.png',
 'Marsinator-reference.png',
 'zakkalaska-reference.png',
 'jazzyghost-reference.png',
 'Reptilebear-reference.png',
 'WesternWaterTribe-reference.png',
 'AidenXY-reference.png',
 'mrsmomo-reference.png',
 'SpaceFeline-reference.png',
 'Jabald69-reference.png',
 'harisshahzad98-reference.png',
 'TinyB1-reference.png',
 'crackettt-reference.png',
 'seahorseVT-reference.png',
 'the_master_blaster-reference.png',
 'baconbreeder-reference.png']

In [68]:
opad = open('/home/wcmckee/ad.html', 'r')

In [69]:
opred = opad.read()

In [70]:
str2 = opred.replace("\n", "")

In [71]:
str2


Out[71]:
'<script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script><!-- header --><ins class="adsbygoogle"     style="display:inline-block;width:970px;height:250px"     data-ad-client="ca-pub-2716205862465403"     data-ad-slot="3994067148"></ins><script>(adsbygoogle = window.adsbygoogle || []).push({});</script>'

In [72]:
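from dominate.util import raw

doc = dominate.document(title='GetsDrawn')

with doc.head:
    link(rel='stylesheet', href='style.css')
    script(type ='text/javascript', src='script.js')
    # str(str2) was a no-op; raw() adds the ad snippet to the head unescaped.
    raw(str2)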
    
    with div():
        attr(cls='header')
        h1('GetsDrawn')
        p(img('imgs/getsdrawn-bw.png', src='imgs/getsdrawn-bw.png'))
        #p(img('imgs/15/01/02/ReptileLover82-reference.png', src= 'imgs/15/01/02/ReptileLover82-reference.png'))
        h1('Updated ', strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime()))
        p(panz)
        p(bodycom)
    
    

with doc:
    with div(id='body').add(ol()):
        for rdz in reliz:
            #h1(rdz.title)
            #a(rdz.url)
            #p(img(rdz, src='%s' % rdz))
            #print rdz
            p(img(rdz, src = rdz))
            p(rdz)


                
            #print rdz.url
            #if '.jpg' in rdz.url:
            #    img(rdz.urlz)
            #else:
            #    a(rdz.urlz)
            #h1(str(rdz.author))
            
            #li(img(i.lower(), src='%s' % i))

    with div():
        attr(cls='body')
        p('GetsDrawn is open source')
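        # Without href the a() tags render as plain text, not links.
        a('https://github.com/getsdrawn/getsdrawndotcom', href='https://github.com/getsdrawn/getsdrawndotcom')
        a('https://reddit.com/r/redditgetsdrawn', href='https://reddit.com/r/redditgetsdrawn')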

#print doc

In [73]:
docre = doc.render()

In [74]:
#s = docre.decode('ascii', 'ignore')

In [75]:
yourstring = docre.encode('ascii', 'ignore').decode('ascii')

In [76]:
indfil = ('/home/wcmckee/getsdrawndotcom/index.html')

In [77]:
# The with block closes the file even if the write fails.
with open(indfil, 'w') as mkind:
    mkind.write(yourstring)

In [78]:
#os.system('scp -r /home/wcmckee/getsdrawndotcom/ wcmckee@getsdrawn.com:/home/wcmckee/getsdrawndotcom')

In [79]:
#rsync -azP source destination

In [80]:
#updatehtm = raw_input('Update index? Y/n')
#updateref = raw_input('Update reference? Y/n')

#if 'y' or '' in updatehtm:
#    os.system('scp -r /home/wcmckee/getsdrawndotcom/index.html wcmckee@getsdrawn.com:/home/wcmckee/getsdrawndotcom/index.html')
#elif 'n' in updatehtm:
#    print 'not uploading'
#if 'y' or '' in updateref:
#    os.system('rsync -azP /home/wcmckee/getsdrawndotcom/ wcmckee@getsdrawn.com:/home/wcmckee/getsdrawndotcom/')

In [81]:
os.system('scp -r /home/wcmckee/getsdrawndotcom/index.html wcmckee@getsdrawn.com:/home/wcmckee/getsdrawndotcom/index.html')


Out[81]:
0

In [553]:
#os.system('scp -r /home/wcmckee/getsdrawndotcom/style.css wcmckee@getsdrawn.com:/home/wcmckee/getsdrawndotcom/style.css')
