RedTube JSON Python

This script takes data from the RedTube API and saves it as index.html. When I first wrote it I typed out the HTML tags by hand. That was before I found dominate, which I now use everywhere: it lets me see the output of my scripts rendered as a page. The possibilities are endless...


In [26]:
import os
import random
import requests
from bs4 import BeautifulSoup
import re
import json
import time
import dominate
from dominate.tags import *
from time import gmtime, strftime

Requests and json are the two main modules used here. Random can also be handy.


In [27]:
getprn = requests.get('http://api.redtube.com/?output=json&data=redtube.Videos.searchVideos&page=1')

A simple requests call fetches the JSON object. This could be any JSON API, not just RedTube.
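The query string in the request above carries three parameters (output, data and page). As a minimal sketch, assuming the API accepts other page numbers in the same way, the URL can be built from its parts instead of being typed out. The helper name search_url and the constant API_ROOT are made up for this example; nothing here touches the network.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

API_ROOT = 'http://api.redtube.com/'  # same host as the request above


def search_url(page):
    """Build the searchVideos URL for one results page (hypothetical helper)."""
    # A list of pairs keeps the parameters in the same order as the original URL.
    params = [
        ('output', 'json'),
        ('data', 'redtube.Videos.searchVideos'),
        ('page', page),
    ]
    return API_ROOT + '?' + urlencode(params)


first_page = search_url(1)
print(first_page)
```

The resulting string can be passed straight to requests.get, so fetching another page only means changing the argument.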


In [28]:
loaprn = json.loads(getprn.text)
#print loaprn

Parse the response text into a Python dict that you can work with.
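To see what json.loads does without calling the API, here is a small sketch that parses a hand-written string. The payload is made up and only mimics the nesting of the real response (a videos list whose entries each hold a video dict); all values are placeholders.

```python
import json

# Made-up payload with the same nesting as the API response; values are placeholders.
sample_text = (
    '{"videos": [{"video": {"title": "Example title", "views": 123, '
    '"tags": [{"tag_name": "Example"}]}}]}'
)

parsed = json.loads(sample_text)

print(type(parsed).__name__)   # JSON objects become Python dicts
print(len(parsed['videos']))   # JSON arrays become Python lists
print(parsed['videos'][0]['video']['views'])
```

requests can also do this step itself: calling getprn.json() parses the response body and returns the same kind of object.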


In [29]:
naoprn = loaprn[u'videos'][0]
print naoprn


{u'video': {u'rating': u'4.34', u'thumb': u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg', u'ratings': u'68', u'url': u'http://www.redtube.com/289337', u'views': 96974, u'video_id': u'289337', u'publish_date': u'2014-10-22 10:46:02', u'duration': u'1:00', u'title': u'Esperanza Gomez in Story Of A Call Girl', u'tags': [{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}], u'default_thumb': u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg'}}

Drill down: look at the first element of the videos list. You could cycle through the other elements by increasing the index.


In [30]:
ngeprn = naoprn[u'video']
print ngeprn


{u'rating': u'4.34', u'thumb': u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg', u'ratings': u'68', u'url': u'http://www.redtube.com/289337', u'views': 96974, u'video_id': u'289337', u'publish_date': u'2014-10-22 10:46:02', u'duration': u'1:00', u'title': u'Esperanza Gomez in Story Of A Call Girl', u'tags': [{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}], u'default_thumb': u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg'}

Drill down again, this time into video. Figuring out how to navigate JSON objects always takes a bit of trial and error; IPython is perfect for this.
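The two drill-down steps above can be folded into one call. This is only a sketch: dig is a hypothetical helper written for this example, and the sample data is a placeholder with the same nesting as the response.

```python
def dig(obj, *path):
    """Follow a sequence of dict keys and list indexes into nested JSON data."""
    for step in path:
        obj = obj[step]
    return obj


# Placeholder data with the same nesting as the response shown above.
sample = {'videos': [{'video': {'title': 'Example title', 'duration': '1:00'}}]}

first_video = dig(sample, 'videos', 0, 'video')
print(sorted(first_video.keys()))
print(dig(sample, 'videos', 0, 'video', 'title'))
```

A wrong step raises KeyError or IndexError at the point where the path breaks, which is useful when exploring an unfamiliar response in IPython.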

Individual Data!

This could be improved by having the program cycle through the fields below and save off each value. Maybe save them to a list?
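One way to sketch that idea is to loop over every entry in the videos list and pull out a fixed set of fields by name. The function summarize, its fields parameter and the sample entries are all made up for illustration.

```python
def summarize(videos, fields=('title', 'url', 'thumb')):
    """Return one list of values per video, in the order given by fields."""
    rows = []
    for entry in videos:
        video = entry['video']
        # .get() returns None for a missing field instead of raising KeyError.
        rows.append([video.get(name) for name in fields])
    return rows


# Placeholder entries shaped like the items in the API's videos list.
sample_videos = [
    {'video': {'title': 'First example',
               'url': 'http://example.com/1',
               'thumb': 'http://example.com/1.jpg'}},
    {'video': {'title': 'Second example',
               'url': 'http://example.com/2'}},
]

rows = summarize(sample_videos)
for row in rows:
    print(row)
```

Looking fields up by name rather than by position means the result does not depend on the order in which a dict happens to iterate.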


In [31]:
prnliz = []

In [32]:
for nge in ngeprn:
    prnliz.append(ngeprn[nge])
    print nge
    print len(nge)


rating
6
thumb
5
ratings
7
url
3
views
5
video_id
8
publish_date
12
duration
8
title
5
tags
4
default_thumb
13

In [33]:
prnliz


Out[33]:
[u'4.34',
 u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg',
 u'68',
 u'http://www.redtube.com/289337',
 96974,
 u'289337',
 u'2014-10-22 10:46:02',
 u'1:00',
 u'Esperanza Gomez in Story Of A Call Girl',
 [{u'tag_name': u'Anal Sex'},
  {u'tag_name': u'Big Ass'},
  {u'tag_name': u'Big Cock'},
  {u'tag_name': u'Big Tits'},
  {u'tag_name': u'Brunette'},
  {u'tag_name': u'Couple'},
  {u'tag_name': u'Kissing'},
  {u'tag_name': u'Latin'},
  {u'tag_name': u'Lingerie'},
  {u'tag_name': u'MILF'},
  {u'tag_name': u'Masturbation'},
  {u'tag_name': u'Oral Sex'},
  {u'tag_name': u'Position 69'},
  {u'tag_name': u'Titfuck'},
  {u'tag_name': u'Vaginal Masturbation'}],
 u'http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg']

In [34]:
for liz in prnliz:
    print liz


4.34
http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg
68
http://www.redtube.com/289337
96974
289337
2014-10-22 10:46:02
1:00
Esperanza Gomez in Story Of A Call Girl
[{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}]
http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg

In [35]:
tagprn = ngeprn[u'tags']
print tagprn


[{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}]

In [36]:
tagval = []

In [37]:
for ta in tagprn:
    print ta.values()
    tagval.append(ta.values())


[u'Anal Sex']
[u'Big Ass']
[u'Big Cock']
[u'Big Tits']
[u'Brunette']
[u'Couple']
[u'Kissing']
[u'Latin']
[u'Lingerie']
[u'MILF']
[u'Masturbation']
[u'Oral Sex']
[u'Position 69']
[u'Titfuck']
[u'Vaginal Masturbation']

In [38]:
for tag in tagval:
    print tag


[u'Anal Sex']
[u'Big Ass']
[u'Big Cock']
[u'Big Tits']
[u'Brunette']
[u'Couple']
[u'Kissing']
[u'Latin']
[u'Lingerie']
[u'MILF']
[u'Masturbation']
[u'Oral Sex']
[u'Position 69']
[u'Titfuck']
[u'Vaginal Masturbation']

In [39]:
derbprn = (tagprn, 'tag_name')
print derbprn


([{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}], 'tag_name')

In [40]:
for deb in derbprn:
    print deb


[{u'tag_name': u'Anal Sex'}, {u'tag_name': u'Big Ass'}, {u'tag_name': u'Big Cock'}, {u'tag_name': u'Big Tits'}, {u'tag_name': u'Brunette'}, {u'tag_name': u'Couple'}, {u'tag_name': u'Kissing'}, {u'tag_name': u'Latin'}, {u'tag_name': u'Lingerie'}, {u'tag_name': u'MILF'}, {u'tag_name': u'Masturbation'}, {u'tag_name': u'Oral Sex'}, {u'tag_name': u'Position 69'}, {u'tag_name': u'Titfuck'}, {u'tag_name': u'Vaginal Masturbation'}]
tag_name
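The tuple above pairs the whole list with the string 'tag_name' rather than pulling any values out of it. If the goal is a flat list of tag names, a list comprehension is one way to get there. This is a sketch with placeholder tags, not the notebook's own data.

```python
# Placeholder tags shaped like the entries in the 'tags' list.
sample_tags = [{'tag_name': 'Example A'}, {'tag_name': 'Example B'}]

# Pull the value stored under 'tag_name' out of each dict.
tag_names = [tag['tag_name'] for tag in sample_tags]

print(', '.join(tag_names))
```

Each item is then a plain string instead of a one-element list, which is easier to pass to p() or to join into a single line.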

Saving Data


In [44]:
doc = dominate.document(title='nsfw')

In [45]:
with doc.head:
    link(rel='stylesheet', href='style.css')
    script(type='text/javascript', src='script.js')

with doc:
    with div(id='header'):
        attr(cls='header')
        h1('nsfw')
        img(src='logo.gif')
        h2('warning: porn. get out now.')
        p(strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime()))
        a('about', href='http://brobeur.com/nsfw/about')
        a('contact', href='http://brobeur.com/nsfw/contact')
        a('blog', href='http://brobeur.com/wcmckee.com/wcmckee/output')

    with div(id='body'):
        # Look fields up by key; positions in prnliz depend on dict ordering
        h1(ngeprn[u'title'])
        img(src=ngeprn[u'thumb'])
        for tag in tagval:
            p(tag)

    with div(id='footer'):
        p(a('nsfw: warning porn - is open source', href='https://github.com/wcmckee/wcmckee-notebook'))

print doc


<!DOCTYPE html>
<html>
  <head>
    <title>nsfw</title>
    <link href="style.css" rel="stylesheet">
    <script src="script.js" type="text/javascript"></script>
  </head>
  <body>
    <div class="header" id="header">
      <h1>nsfw</h1>
      <img src="logo.gif">
      <h2>warning: porn. get out now.</h2>
      <p>Wed, 22 Oct 2014 09:20:47 +0000</p>
      <a href="http://brobeur.com/nsfw/about">about</a>
      <a href="http://brobeur.com/nsfw/contact">contact</a>
      <a href="http://brobeur.com/wcmckee.com/wcmckee/output">blog</a>
    </div>
    <div id="body">
      <h1>Esperanza Gomez in Story Of A Call Girl</h1>
      <img src="http://img.ec.cdn.redtubefiles.com/_thumbs/0000289/0289337/0289337_009m.jpg">
      <p>Anal Sex</p>
      <p>Big Ass</p>
      <p>Big Cock</p>
      <p>Big Tits</p>
      <p>Brunette</p>
      <p>Couple</p>
      <p>Kissing</p>
      <p>Latin</p>
      <p>Lingerie</p>
      <p>MILF</p>
      <p>Masturbation</p>
      <p>Oral Sex</p>
      <p>Position 69</p>
      <p>Titfuck</p>
      <p>Vaginal Masturbation</p>
    </div>
    <div id="footer">
      <p>
        <a href="https://github.com/wcmckee/wcmckee-notebook">nsfw: warning porn - is open source</a>
      </p>
    </div>
  </body>
</html>

In [46]:
os.chdir('/home/wcmckee/nsfw/')

In [47]:
# The with block closes the file even if the write fails
with open('index.html', 'w') as savPrn:
    savPrn.write(str(doc))

In [48]:
# Read the file back to check what was written
with open('index.html', 'r') as opPrn:
    for op in opPrn:
        print op


