run_this_first.ipynb

Download the following files from ftp://pi.super-computing.org/ and place them in the data/ folder (the path the code below reads from):

  • pi100m.hexbin.000
  • pi100m.hexbin.001
  • pi100m.hexbin.002
  • pi100m.hexbin.003
  • pi100m.hexbin.004
  • pi100m.hexbin.005
  • pi100m.hexbin.006
  • pi100m.hexbin.007
  • pi100m.hexbin.008
  • pi100m.hexbin.009

Each file holds 100 million decimal digits of pi, packed two digits per byte, so hex-encoding a byte yields its two digits as characters.

This script converts the files to decimal text files.
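As a small illustration of the conversion, here is a minimal sketch that hex-encodes a few bytes. The function name `packed_bytes_to_digits` and the 4-byte sample are hypothetical, chosen only to mirror the packing that the output further down suggests; they do not come from the downloaded files.

```python
import binascii


def packed_bytes_to_digits(raw):
    """Hex-encode raw bytes; every byte becomes two characters."""
    return binascii.hexlify(raw).decode('ascii')


# Hypothetical 4-byte sample, assumed to use the same packing as the data files
sample = bytes(bytearray([0x14, 0x15, 0x92, 0x65]))
digits = packed_bytes_to_digits(sample)
print(digits)  # 14159265
```

Because every byte expands to two characters, the text output is twice as long as the binary input.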


In [6]:
import binascii
import os

for i in range(10):
    infilename = 'data/pi100m.hexbin.%03d' % i
    outfilename = 'data/pi100m.dectxt.%03d' % i
    # Skip files that have already been converted
    if not os.path.isfile(outfilename):
        with open(infilename, 'rb') as fin:
            # Hex-encode the raw bytes: each byte becomes two digit characters
            # (binascii.hexlify works on both Python 2 and Python 3)
            pi_partial = binascii.hexlify(fin.read()).decode('ascii')
        print("Processing %s, %d digits, %s...%s" % (outfilename,
                                                     len(pi_partial),
                                                     pi_partial[:20],
                                                     pi_partial[-20:]))
        with open(outfilename, 'w') as fout:
            fout.write(pi_partial)


Processing data/pi100m.dectxt.000, 100000000 digits, 14159265358979323846...14970581120187751592
Processing data/pi100m.dectxt.001, 100000000 digits, 21505880957832796348...83204322022549381399
Processing data/pi100m.dectxt.002, 100000000 digits, 05651112987155305622...46957648307240885708
Processing data/pi100m.dectxt.003, 100000000 digits, 11455198452037407814...06783528489092823653
Processing data/pi100m.dectxt.004, 100000000 digits, 14966402168553092322...70523687343293261427
Processing data/pi100m.dectxt.005, 100000000 digits, 47342432766223352946...87101328869499653984
Processing data/pi100m.dectxt.006, 100000000 digits, 87093558701000064080...61891872423951424305
Processing data/pi100m.dectxt.007, 100000000 digits, 85880766885039792130...09659296170468252898
Processing data/pi100m.dectxt.008, 100000000 digits, 18283173292552189833...90805981695172461619
Processing data/pi100m.dectxt.009, 100000000 digits, 12265862781547280522...15171395115275045519

In [1]:
# Create 1000-character files for overlaps during tests:
# the first 1000 characters of each of files 001-009 are saved as <filename>.1K

filelist = ['data/pi100m.dectxt.%03d' % i for i in range(1, 10)]

for filename in filelist:
    # Read only the first 1000 characters instead of loading the whole file
    with open(filename, 'r') as current:
        string1K = current.read(1000)
    with open(filename + '.1K', 'w') as newfile:
        newfile.write(string1K)
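The notebook does not show how the .1K files are used later. One plausible use, sketched here purely as an assumption, is to append the head of the next file when searching a file, so that a digit sequence straddling the boundary is not missed. The helper `find_with_overlap` and the toy strings are hypothetical.

```python
def find_with_overlap(chunk, next_head, pattern):
    """Return the start indices of pattern for matches that begin inside chunk.

    next_head is the beginning of the following chunk (for example the
    contents of a .1K file), so matches crossing the boundary are found too.
    """
    if not pattern:
        return []
    # Only len(pattern) - 1 characters of the next chunk can belong to a
    # match that starts inside the current chunk.
    text = chunk + next_head[:len(pattern) - 1]
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


# Toy data standing in for the end of one file and the start of the next
chunk = "1415926535"
next_head = "8979323846"
print(find_with_overlap(chunk, next_head, "3589"))  # [8]
```

With a 1000-character overlap, this approach would cover patterns of up to 1001 characters.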
