The timeit()
command appears to have strict limitations in how you can use it within a Jupyter Notebook. For it to work most effectively:
timeit()
is configured to iterate)timeit()
is the only line in the test cell as shown in these examplestimeit()
and open more options for how to use it and related functions, check the documentation. This library was creatd in Python 2 and is compatible (and may have updated in Python 3.The example that follows illustrates something important. Using libraries and built-ins that function at a binary level under the covers can greatly boost performance in python and is generally highly recommended. Often, making your code "more Pythonic" using constructs that condense an idea down to as few lines of code as possible goes hand-in-hand with such performance boosting choices. But a word of caution. As is illustrated by the example that follows, this is not always the case. The second choice in this notebook is a great learning example for the use of the language constructs it illustrates, and it even got the most upvotes as a Python solution on www.hackerrank.com at the time of this writing. But a comparatively "less Pythonic" example that follows it, actually beats its performance by a significant factor using loops where appropriate and list comprehensions where beneficial. Understanding what our specific choices are doing under the covers is tantamount to finding the most efficient solution. When it is difficult to see the path, performance testing of different choices can also reveal the answer.
To understand the abbreviations in timeit performance metrics, see this wikipedia post.
In [1]:
# this example comes from www.hackerrank
# in the exercise, a string is fed into a function along with a number
# the number tells the function how long to make each substring that will be tested
# then the function carves the string into substrings removing all duplicate characters from each substring
# constraints indicate that the number passed in is always a factor of the length of the string
# by Python standards, this first solution is the inefficient way to do it using loops
# a quick and dirty solution just using loops ...
def removeDupes(s):
s, s2 = list(s), []
for i in range(len(s)):
if s[i] not in s2:
s2.append(s[i])
return "".join(s2)
def get_subStrs_of_uniqueChars(strng, k):
sLen, rtnVal = len(strng), []
for x in range(0, sLen, k):
rtnVal.append(removeDupes(strng[x:x+k]))
return rtnVal
In [2]:
# some tests to show what this code does:
def test_funcs_thisNB(fun):
# allows running the same tests with same input data on different versions of a function
inputs = [("AABCAAADA", 3), ("ABCDAABCAADGGHI", 3), ("AAADYYBCAADXXXI", 3),
("AAADYYBCAADXXXI", 5), ("AAADYYBCAADXXXI", 15)] # 15: substring len = str length test
for inptStr, subStrLen in inputs:
print("Input: %s, %d\nOutput: %s" %(inptStr, subStrLen, fun(inptStr, subStrLen)))
print("-"*72)
test_funcs_thisNB(get_subStrs_of_uniqueChars)
In [3]:
# a really big test - this will only have one subset string
longString = '''DWNFMNZZVQPKKGXESLQNIUNZFMIKEGCJDROREQQCIHMSNJYGWQVEKJDRXNDBVFMYZDRDTJFDQTVGEVMALJEYUJPRWTUSAHSZMMFHVMKOGHUKCGNNRRLMDDDBWYVYHQATCFAXRMLZTHMVPZIHSWTVZYZXYUVHKXCOCEOWQBXJJJGYKPFFLACMBBJBWGJIELZIQNGGOESXPYYCPDHCGJOJLXKJGTTMHSWZHDFWJZVAATCPZLSFVGOIGATOWOADHWCOBJMMLINLBRAAFUHADXKLYFZUTZZAXDRZODNZNAMPUPRZJZBMWNXUSYQLZROXUHYJMLKAOXRKMILVHOKEBJAVITGHKWGHDEQQSCSIZJSLRFJBTTHUEHPOCXYPWEWZLORFQLNQUHBOMMPFHWBMFQAIPAXLFVNSJGXBSKRORVCFHTKPPLDVDFFUGEILZVDKDCLXMFMFCQMJKYAZLFWOLDKRHSEIPJUVNISZNGGRXSCHQDIBKHRVKDOUYUENFBISJDTYJZPGTRPLWAOHJHETLTPJNWYVXJNGMHGVIVEEOTRLVGUENYAARPLHLKCKTRQFAYAJVGPKAGXXMRCCPEEJTQSHAWTVNMAQKCBGJQSLWPILGMPXQVIMLAVNWOIMAICNNDTYVLJTATEIIVHYRSMESJSRAAFALJNAOIYJVHCXCJFKGPLZHXEBJWUJYZLLKBLYJJJFSOCVXKFFBQFKQLLZJIKHJVVUWGUGSENKSPFRBNWCFDOXOAYXIIHTECNAKIICOVNGNUZQJXVPBJMRLLRVTYPZCCAMMKRAIEIVAHNKFIZISNBFYSBUSQTWSVJFGAHOGQLGXASEJRODHQIFKJBDZVZUSKBAMIOSYBBYCTENNTSULCBVMDANACJSOKTBTJVTNYTRUYGHTYDGAEDOJGEJINEYYZBTIXOXXKORKWADUDLXKOLTUPFEFLFFKGYUFOSDYIUIFVNBAYYKPMGLDNQIYXNIDNEJCYOCHKLOHARKBQUQCCBFPTQNSDXWTEHXCXZLHMBRNTBQJXGNAJUPFKFXQEVJICGLBHWLUYCJRFZCCHRCSLUZYZZOFUZOZHZAOXLIXPTQUUSXBJBWXXVVYUJERKSQRTSFQGQQVJGSEBRHMSDJQBEOXPUOZMGQHBWAHOQEXWWDZPNMIQVATAQRPLFRZOHHPFHYTXCTVBYVQLHACECXGVOYGWPHKYQBEZZZWDUUGTRXEAZIECGLXUJFSAOCYFFCEEEDIYZQRSNYSPGYTOLSLUXDWNIVUNZZSDECEDSVVIVPXENSSZKDVJJSWRNREOSWSWAWBVTYDRPCVCUPDETBNETLVIEZWWYQUYMXVIVYZKAWPWLUCGVRKODFYHHVGFNCFCADKXBLKEHZCUTFBRWLHBTGLADRHQTOSTRETVRDBACDVXIWQFJYICETEJKNZGEUBXAUUSXVSBZNBHJROTRXXVSEEFRGLXCOXCJRUINNMOANVMGMHYLGVDKCIEIVBKKAMVSJDHWPVYESKLETLRAGXKIHQSDUCPURKMAPTYFQWLJJYPCJHEQGRANHTSBXJYQVMQLIQSYNDHYCXCLGHDOYGBHZWLYFJPDXHOFYGENLNLPMPBUWGIWMMENKRORCDUZMIFMQLZCAMTNCWJADSZRGDHQWVHYACXMMEZFPAHQPCFTZOTCISVOXCEVZOTZQRMEVNJNNSFEVKXUATYJNTXNYEIZSCZKVNOSAYFPSMUNWRJZKHIYDHNDNVCFXBRUQGNQGUGYHCOFTXEGHOGKYTNLORTOVKKLSXCAUKBBMPIHOPNVDTFDMURDMKTHWDSPDWRXISAUJIEYXRTDNBIBXZELJZSIDNZGLQDUJFQSNUSNOOSBRAEOALALKUTNJSVVLAPUFHOVEHISXCVOCZCEKEQVZLKIGGFRGWLMEBHIKRCHTXVXWYCHEUEDHOONUTEDSRRWVYGFRINMFIJDINMNJQSRHIGDELGWEXSZXAFQKUCRCOUNDHCNZUGGFMKJZSFEQZFPBMGLGKCLAWYEFCTHYZNDNZMNSTTKSYZVLHGTUIEUHEAOGTVFULKIKXVESOONMQIZXPTTZXOGEOXKKUREFCMSBHWTXLGKBPLAGEUFEIOIYLUIFONNSBFTJBOGMVSPMEQUKKBQUPAUCVDILSVDTCYEDMNSJHHXNZRYLTOHKPCNMHVZZSCUUCBZOOTYVBXJCPJQKZXUQZHEGEDHYHEUJFUYVNYQQXBVMMLYMIVCICJQJMAHUEEFLYFGNFYEDCBROMQCULFEPOVACVKYBOEMOLSBRQHUSILJUDLPQSTGIQGLNSJOGPAUBUWUNDOHOBQIFCZVUVDDLMQBEBPNTSHWOFQBLGJZHBJODJMAERFRDXSKYKXTEGPSOFWZLFAVGJLMUXMYRTSWQKGQWFKAOZTCHPBSWDPEMBQHAEHRZZOPMWGICSJQREUYVXTRAKXOLOXOUFHUGXLSURDYLOQDSKDPHYIJJFXWVXMQEUMKRXFNPILCWBFQNLFWJNFUUFRQCDIGZUSRRXEIHSLFTQVJDCFNRLJMSAEUGMCFGUWXSDGBVRHQJEZNHHCAULMMOQIUCKBIEAIYDOAAHHQSLSHUZJVTXJHNAPJEZKPDKXCPNERWLIOYCWVDHSYGBFTDWCIVPZAZWERLIIHVSXWUTRXBJXJMCDQYHYVWXYYVCPINZPISMEOIXOLILUVQXLQHLMDKKBHOSPBSEKMSQAAQQLYCIVUHIKOTYSFITMZMBAGHMSCFUEVLQVPARJHBVWXUQCEJRDXUGDCSXGXSKSDCOUCHDKIAIFUYKBJBEHVKMZDMFCEQVJULDXUIJEIRMFPYGYZMHWZWYEIFGOXDXSOCPKMYRUPDBGBHECWOAVKYZSEFGBJFTZJIMVIDRXITFJCMNYANTKMUESAMTLTNMCXYXGBOFLJNWMZJMZXILLCPDEDYRWNEALCXTGNARXNOLOXZNWHYJMPOQUNISAMSNQRGWFJQEYEPNDOACYBLKSACNPMFPAAFRRLNWWFCVJTKNKMRINCUIEWXVJCNJCSAWFQSCVXZHQJWAVNILRCVWBSTKUGVZYYXEORILOHUFSQFPDQCVUXTVPOHMVDNTBKZRBHEQQZXIPEAVUCSRANMRBUDYXQSACTRGBVWRWVCOZCLWEDNGSZYVVDUUWOUYHNEKLCEHXGVZKGVQLIXFHXBFAXZWLUVUHBFUELCDTZCDIXTTFSBPPCUQZVMMPJGZLNTRAXUTXXYFUUACMBRCFNUEIISASBBFOWWRUQKRNJYKFYORCFVHUROEAGEUJHZAEVRANBTDMRPTRDLVKIDEATKABRWLYVLESCEHFXKUQBNIEYFOJIUJBFLDYHOWCBDWDHDKHQEXRSHWSNMBXJMBOXGOEUMIXPEAZKKGCRFVJPRDEGHBPTEFRKTXGFGDVMDWYQCAHJWSYPXEXEIMYMTRXMODUWHRJMNJCRMLAIFBZDFXHNLHAFYZTPCNLJEWYRGCISOLCTMBYTAGJONLVOKODOCQYINYATAKNQVPMJSKFVQQJGBEUNUZBWPBFEZHXATMSRBEAWPHRFXCNAIJNCIRBASIEUREWKQODUTGSINLPNPDNXOBBXSEZMOFGFJCSCRVWMBQURBJEQORNDUQCMXDBLIHTSMNUDISPMILFLWJBNAQSVIUJFXKRFUMZGZVLJPCVZPAKLLLYOBQLKMURLHITDUSJVOWFFYCFNFPAQBBEETPQGMIRVSKYPFJKVHRAHUFWZXWRAXWERNXZZHSWZFWOKHBFOSHVMPRNMQFOPBSIQRHQYAOYFLORSPYJKFEWUYKIQRWFSROIKXAJZRJGCXXWPVFZDMXZKJICAGHUXXFKXHTWYEDADCZSAETDQRCCCMEDTOXSNCCKMVHKBMNFQMXQQTTJMWLOISTDGTWTVYEJWNUXBHCRTCIJXBSJZGZKYTNGMJZJKFTGUPFVWKPPOXALAVWCDVMBQZHCLIOVQJDKYKIUUZLIWNTYIRCLNOODQXIDGYAYHDIFNSBKROUPBOQMFSZUJOABNIEVGGTNJCUYWXKOLFFMVVACQAXBQXCDHIZNOSCZWYZVVKLIRQUMOVOGXNHNKLSTTTIJMMKKKLFHXQPOJMAXJRDGEKVRVPKQJTBXFLHSYPZWHRKQDLPMESUIEPZBGMTPFWMMJWEINGEUXOLABCOFXIQDZPFFBAXIXJVGHBOUHUPEJCEMEUSBEKFDZMLDOILLTIUBKKVRFMWQOCCVZWWDGDJIPWLEEYRZHLATWYMDKKTBPVWOSURAYCINYTTEULEDYGWWEKZRVSUKQQAKKTKIWVYUQTZKGFNFLLDSWFJTZDFPVFBHYOQUJQRZJSMSXBZIPCALJLGJQLANQCVOQNLBDECOWOGTQFDFKGSTRACKLDXDHSTZFEAKIEZGTHBLNFSZLMSEMVQZAOCJIVIQCKAMOBTJJUUYBOXOCRVOOLPOZRXKPHARRADGEYPPVMPWAMNDFITUTLKVEKFTRFKJHQRNOGCJURIVFVALDVHYIRTNBAGTFTEPJVCZDFJYYTVEOXRTUYRDPNQRNYMVRQKALOCQVLOUEJYUGRPBPGGEVWXJUJEOBQQOGSFCDTWJFWFLNUOEDUIYSIJPTPDXHTLOMQQRMODRKIEZFTDKPNKIXUXRLAOTVBJHUBBGPEZBODCTYFDOUQYUKXLXZBSUEBEYCFERLFUZJWVHDAXYQXSCWFAVISSOUWNYDTSOZMPKLKTOLTODQIGNPGKZACOWYDVDYPRXBHJOTDDEWTIOCQBRWMTWQHTRMQUKFOKIVTXQWCVSVFHXVKRTYMRPTMGHCCRJQBSNVRETTBOOIVODHFYIRPXMEFVGJMSAQMNNDRGWUUMCRAHBHFJAXIOBPJJYYDYQPODSHLQCIDGBDOCNVNNUVCXKNJLLMLBCZGWHUNLCSRDVFIKDVYXTCVFPEQBSCEWDNTMHIXLARQXYYIBWGZPIWUZCNCURJRUWMHDUGOUXETWFBXBHYSRWMTYBVVSEONAAWFWCVSACLWJNVKWUCQQQJRUGMMLABNCZTACQSESFCBSXNRTRJMKSDEYPQLRUZVTUVXKPBFUDGPDWIWNRIXLNBLCUXVQWSJQPHCEKHBQOQTMARZRCZCPCNUWMRMKKYAZHFGRMHHCZCOZWOSYPUNRJJQYBEINCLMJQUDCBKFCPTBLJTLYPAQBKGZNKJAPUOYKJDNMQUPFNSSZODXFDPGNXHCKRCZLTZWEDLQTHHAUASVQXTVAKCQILSUCXTQQUOWZZMSGUUCWOAMLVHOFLEPXYJZXCRPZFLYEYTMUNRSERGPMPDTBJJYISZFXRVYWGWDGPRCFKUJDCAPSELTNUTXOUFNNCLMIKRRBKVGVQRYUSQMWBFLXALNXQDKSQYCAPTEARKXJCXDWNQUQXFPYQFVIIHAAGDDXYHZPRWYVUETJUNCRTRRLWOUGYWJGBMDCTDTMBUKXYDHSRJLMCFXBTTHTSSZTEFXXKRMLLWJJZSDQBQEDVCGRXOKRIKNNPKMZDYMOUXZWPEMSVRYQVEHUUSODEBQVLCURBIFXGHTXLIQGBOZYSGUNAISGJIBXMXOPGVPOEILQSBYVRXTKGOZIYTPIBQHPQXHYUWMZGXPZBPUSNQEVEDDEWUMZNVRDSYBNUNOAMDBNUWGJMKESQKWOGLQTIHYBHZQBOEDDJHQDDYNRLTLDFJTMUJFDSEGZFYCWEIZPPRUSSJLFFYIKHBWENEJFKRFRRHNVPMMEGIYYSLDXJNJTQIXDMGJYZOPQXDNPRZVZHWXBHCATRKMISLLGRUEQKTIKYVZQUUPDQPFATFTLRFTJQEPJBTBLPJVNGWFCRXFJMMJHUESLJLXBQOMTKPEZACOGYUIRTQBHCMOYRHMCSLGKZSDJKKKKMZRMVBEQRFXWRLWKSINNVTZVNFGXPRIDSBPPETFWACUUPQEIBRXYMWTABCXSTFVNINENILLLPFFEXLOAFLZRIURLWRERWZGHOKUXXHIMMPTLAHOHVPZDJSOGJSZIUGRISLHQSREGGZTJJHQEWRHIMXQXSPFOVWYQJGGDXMLGOFRXOJBNBKXPJNMBCURASQQCWYHVKVDACUXQEAFHKCWWPIZUERUXHKBFIIAUFEVHYUYEVFLHKJDZSDTYUPVBZWIHHIEMOBUPVUTQBHYLQDNKGGJCWEFXDNHKWLYMOSBJOXCQECDUFQFMZOQVUWUXJBJHOIUCCXOTWQLASOXXFELEUCBOYXOJZZRPJNUOMIHJYULQKIPPOBVIFYZDWPOXOFOASIOFSXQQRBIBMATADQJIRKNNZBMPJCPBNDIFCAYVCIXOISQNKBVBLKQMOECXHTAUZLZBLZXPIWDSQUGAXDCJOUVEZZBGTDCSODTADSPLOVFERLHOPJZFFVJEWMMPPOHESDGXVXKMUPQLBXCSIBXOYGUVSHMJVWNPZTMWTWINMBZPADHLEGBDOVAIEMRCIFRJAGGTEQJSTKHUNQHRWKWKFWSKLKOTRFDRNLNSBWMVGWRVMAOLKLVPJQBWARQTXVKMGZGHXTEFPXADXPOHCKZNCCKCUAVTVHFDJMKGHRMYOMBMDQVFCUTEXDITFFMCMRFXFREOISMZEONKGIRIFKOEPWXWBLYPEDNMXTAFLNGRDVB'''
get_subStrs_of_uniqueChars(longString, 6623)
Out[3]:
Performance Test One:
In [4]:
timeit(get_subStrs_of_uniqueChars(longString, 6623))
In [5]:
# hackerrank, most voted solution from discussion board (for Python)
# S, N = input(), int(input())
def hackrRank_getSubsOfUniqueChars_v1(S, N):
# modified slightly to integrate for this timeit test:
rtnVal = [] # changed print() to rtnVal
for part in zip(*[iter(S)] * N):
d = dict()
rtnVal.append(''.join([ d.setdefault(c, c) for c in part if c not in d ]))
return rtnVal
In [6]:
# tests to show it works the same as earlier examples
test_funcs_thisNB(hackrRank_getSubsOfUniqueChars_v1)
hackrRank_getSubsOfUniqueChars_v1(longString, 6623) # should only ouput one string
Out[6]:
Performance Test Two:
This one runs faster by a noticeable factor (6x faster). But the next one is even more extreme.
In [7]:
timeit(hackrRank_getSubsOfUniqueChars_v1(longString, 6623))
In [8]:
# a solution from the discussion board of hackerrank
def without_repetition(t):
s = set()
o = []
for c in t:
if c not in s:
s.add(c)
o.append(c)
return ''.join(o)
def hackrRank_getSubsOfUniqueChars_v2(s, w):
n = len(s)
rtnVal = []
for t in [s[i:i+w] for i in range(0, n, w)]:
rtnVal.append(without_repetition(t))
return rtnVal
In [9]:
# tests to show it works the same as earlier examples
test_funcs_thisNB(hackrRank_getSubsOfUniqueChars_v2)
hackrRank_getSubsOfUniqueChars_v2(longString, 6623) # should only ouput one string
Out[9]:
Performance Test Three:
This one runs faster by a much more extreme multiplier. Note that we have switched from ms to us in the output metric. If this function were called on big data, the performance boost would be significant (3 decimal places faster).
In [10]:
timeit(hackrRank_getSubsOfUniqueChars_v2(longString, 6623))
In [11]:
from collections import OrderedDict
def hackrRank_getSubsOfUniqueChars_v3(string, k):
rtnVal = []
for x in range(0,len(string),k):
rtnVal.append(''.join(list(OrderedDict.fromkeys(string[x:x+k]))))
return rtnVal
In [12]:
# tests to show it works the same as earlier examples
test_funcs_thisNB(hackrRank_getSubsOfUniqueChars_v3)
hackrRank_getSubsOfUniqueChars_v3(longString, 6623) # should only ouput one string
Out[12]:
Performance Test Four:
Ironically, though this one also tests in microseconds, this ordered dictionary solution that also looks "more Pythonic" then the previous example, is not able to beat the performance of the previous example. It only comes in second place.
In [13]:
timeit(hackrRank_getSubsOfUniqueChars_v3(longString, 6623))
In [1]:
def averageUniqueElements(array):
return sum(set(array)) / len(set(array))
def averageUniqueElements_v2(array):
arrSet = set(array)
return sum(arrSet) / len(arrSet)
In [2]:
timeit(averageUniqueElements([456789, 456789, 11111111, 111111111, 12111111111111, 78787878, 7878, 99999999, 99999999, 6]))
In [3]:
timeit(averageUniqueElements_v2([456789, 456789, 11111111, 111111111, 12111111111111, 78787878, 7878, 99999999, 99999999, 6]))
The comparison is in Microseconds and the difference is slight, but it proves the theory.