This is the file structure for a FROG fingerprint in its binary file representation.

File Structure {
    File version: FROG file version in ASCII mode + hex mode (4 bytes, example: <.fg><0x01>)
    Signature: Organism-Ref-ID and Ref-version in ASCII (variable number of bytes enclosed in quotes, e.g. "GRCh38.p11")
    FROG Data
    End of File: padding bits and a quote character " (a few bits + 1 byte)
}
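A minimal sketch of how the first two header fields could be serialized, written for Python 3. The function name `build_header` and its parameters are hypothetical, and the quote is assumed to be the ASCII double quote (0x22); the notes above do not fix either detail.

```python
def build_header(version=1, reference="GRCh38.p11"):
    """Return the header bytes: '.fg' + one version byte + quoted signature.

    Assumptions (not confirmed by the notes): the version occupies a single
    raw byte and the signature is wrapped in ASCII double quotes.
    """
    if not 0 <= version <= 255:
        raise ValueError("version must fit in one byte")
    file_version = b".fg" + bytes([version])   # 3 ASCII bytes + 1 raw byte = 4 bytes
    signature = b'"' + reference.encode("ascii") + b'"'
    return file_version + signature


header = build_header()
print(header)        # b'.fg\x01"GRCh38.p11"'
print(len(header))   # 16
```

The FROG data and the end-of-file block would follow these bytes.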

FROG Data Structure {
    Genome Position: chromosome and position in ASCII (enclosed in quotes, e.g. "m.9210" or "ch1.9210")
    FROG Level: binary data mode
    FROG Attribute: binary data mode
    FROG Terms: binary data mode
}

End of File Structure {
    Padding data: 1 to 7 zero bits
    Padding info: first bit (YES/NO), next 3 bits for the number of padding bits, next 4 bits unused (filled with zeros)
    End-of-file character: quote "
}
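The padding-info byte described above can be sketched as follows (Python 3). The name `build_trailer` is hypothetical, and two details are assumptions rather than part of the notes: bits are counted from the most significant end of the byte, and the info byte is written just before the closing quote.

```python
def build_trailer(n_data_bits):
    """Return (padding_bit_count, trailer_bytes) for a payload of n_data_bits bits.

    Layout of the info byte (assumed MSB first):
      bit 7     -> 1 if padding was added, else 0
      bits 6..4 -> number of padding bits (0..7)
      bits 3..0 -> unused, always zero
    """
    pad = (-n_data_bits) % 8                   # zero bits needed to reach a byte boundary
    has_padding = 1 if pad else 0
    info = (has_padding << 7) | (pad << 4)
    return pad, bytes([info]) + b'"'           # info byte followed by the closing quote


print(build_trailer(156))   # (4, b'\xc0"')
print(build_trailer(160))   # (0, b'\x00"')
```

A reader could recover the true payload length by reading the info byte and discarding that many trailing bits from the last data byte.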

Cases:

  1. all 156 bits are zero
  2. all 156 bits are one
  3. write the binary file bitwise
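The three cases above can be exercised with a small sketch like the one below (Python 3). `pack_bits` is a hypothetical helper, and the most-significant-bit-first order within each byte is an assumption; 156 bits do not fill a whole number of bytes, so 4 zero bits are appended.

```python
import os
import tempfile


def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes, MSB first.

    Returns (data, pad), where pad is the number of zero bits appended
    to the tail to reach a byte boundary.
    """
    pad = (-len(bits)) % 8
    padded = bits + "0" * pad
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, pad


zeros, pad_zeros = pack_bits("0" * 156)   # case 1: all bits zero
ones, pad_ones = pack_bits("1" * 156)     # case 2: all bits one

# case 3: write the packed bits to a binary file
path = os.path.join(tempfile.mkdtemp(), "case.bin")
with open(path, "wb") as f:
    f.write(ones)

print(len(zeros), pad_zeros, os.path.getsize(path))   # 20 4 20
```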

Stepwise work:

  1. write initial code to read bits from a file
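One possible starting point for this step is sketched below (Python 3). `read_bits` is a hypothetical name, and the bits of each byte are assumed to be read most significant bit first.

```python
import os
import tempfile


def read_bits(path):
    """Return the contents of a file as a string of '0'/'1' characters."""
    with open(path, "rb") as f:
        data = f.read()
    # Iterating over bytes yields integers in Python 3; format each as 8 bits.
    return "".join(format(byte, "08b") for byte in data)


# Small demonstration with a two-byte temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(demo_path, "wb") as f:
    f.write(b"\xf0\x01")

print(read_bits(demo_path))   # 1111000000000001
```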

In [76]:
# file_structure
# frog_data_structure
# end_of_file

import sys
import numpy as np
import os
import binascii
# read for DNA level

one_byte = '1111'  # a string of four '1' characters (a bit string), not a packed byte
print(one_byte)
# ASCII to hex (commented out)
#f = binascii.b2a_hex(one_byte)
#print f


1111

In [71]:
#one_byte_hex = int(one_byte,2)
#print (one_byte_hex)
#'{0:08b}'.format(6)
bin(1)[2:].zfill(8)


Out[71]:
'00000001'

In [58]:
len(one_byte)


Out[58]:
4

In [37]:
sys.getsizeof([]) # size of the object in memory, in bytes


Out[37]:
72

In [38]:
sys.getsizeof('abc')


Out[38]:
40

In [42]:
binaryfile= "file.bin"

In [47]:
os.path.getsize(binaryfile) # get filesize on disk


Out[47]:
1

In [48]:
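with open('one_byte_file', 'wb') as f:
    # writes the four ASCII characters '1111' (4 bytes on disk), not a single byte;
    # the with-block closes the file, so no explicit close() is needed
    f.write(one_byte.encode('ascii'))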

In [73]:
#import pickle
#f = open('one_byte_pickle.p', 'wb')   # 'wb' instead 'w' for binary file
#pickle.dump(one_byte, f, -1)       # -1 specifies highest binary protocol
#f.close()

In [74]:
'{:<08d}'.format(190)


Out[74]:
'19000000'

In [81]:
'{:<08d}'.format(1111111) # left-aligns the decimal digits and zero-fills to width 8 (mimics bit padding)


Out[81]:
'11111110'

In [109]:
format(127, "<08b")


Out[109]:
'11111110'

In [110]:
'{:0<8}'.format(one_byte)


Out[110]:
'11110000'

In [122]:
with open('one_file', 'wb') as f:
    # binary mode expects bytes; this writes the 4 ASCII characters of '1111'
    f.write(one_byte.encode('ascii'))

In [126]:
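from array import array

bin_array = array("B")  # array of unsigned bytes
bits = "10111111111111111011110"

bits = bits + "0" * (32 - len(bits))  # Align bits to 32, i.e. add "0" to tail
for index in range(0, 32, 8):
    # take 8 bits and reverse them, so each byte is stored least significant bit first
    byte = bits[index:index + 8][::-1]
    bin_array.append(int(byte, 2))

with open("test.bnr", "wb") as f:
    # tofile() writes the raw bytes; bytes(bin_array) would write the repr string in Python 2
    bin_array.tofile(f)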
