In [1]:
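# out-of-core line-counting function: reads one line at a time,
# so the whole file never has to fit in memory
# this function will break on an empty file: the loop body never runs,
# so line_number is never assigned and the return raises UnboundLocalError
# Fix it as an exercise ;-)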

def count_lines(file):
    with open(file, 'r') as f:
        for line_number, line in enumerate(f):
            pass
    return line_number + 1
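
One possible answer to the exercise, as a minimal sketch rather than the intended solution: summing over a generator avoids relying on the loop variable, so an empty file gives 0. The name count_lines_safe is hypothetical and does not appear elsewhere in this notebook.

```python
def count_lines_safe(file):
    """Count lines one at a time; an empty file yields 0."""
    with open(file, 'r') as f:
        # sum() over an empty iterator is 0, so nothing is left unassigned
        return sum(1 for _ in f)
```

Like count_lines, this never holds more than one line in memory at a time.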

In [2]:
# make a small test file - 10 lines long
with open('../test-data/line_count_test.txt', 'w') as f:
    for i in range(10):
        f.write('la la la\n')
        
# a second, much larger one (2,000,000 lines), just for fun
with open('../test-data/line_count_test_2.txt', 'w') as f:
    for i in range(2000000):
        f.write('wow, this is a big file\n')

In [3]:
# preview the test file - confirm 10 lines long
# (each line already ends in '\n' and print adds another, hence the blank lines)
with open('../test-data/line_count_test.txt', 'r') as f:
    for line in f:
        print(line)


la la la

la la la

la la la

la la la

la la la

la la la

la la la

la la la

la la la

la la la


In [5]:
# assert: if nothing happens, the condition is satisfied
assert count_lines('../test-data/line_count_test.txt') == 10, 'Error, wrong count returned'
assert count_lines('../test-data/line_count_test_2.txt') == 2000000, 'Error, wrong count returned'

In [6]:
# set path once and for all
WAR_AND_PEACE = '../data/war_and_peace.txt'

print('There are exactly %d lines in War and Peace.' % count_lines(WAR_AND_PEACE))


There are exactly 65007 lines in War and Peace.

In [7]:
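def find_first(file, name):
    # return the zero-based index of the first line containing name;
    # falls through and returns None if name never occurs
    with open(file, 'r') as f:
        for i, line in enumerate(f):
            if line.find(name) != -1:
                return i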
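
Because find_first counts from zero, its result is one less than the line number a text editor would show. A hedged variant, assuming 1-based numbering is what is wanted, could look like the sketch below. The name find_first_line_number is hypothetical.

```python
def find_first_line_number(file, name):
    """Return the 1-based number of the first line containing name, or None."""
    with open(file, 'r') as f:
        # start=1 makes enumerate count lines the way an editor does
        for line_number, line in enumerate(f, start=1):
            if name in line:
                return line_number
    return None  # explicit: name does not occur in the file
```

Callers that format the result with %d should check for None first, since a missing name would otherwise raise a TypeError.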

In [8]:
with open('../test-data/find_name_test.txt', 'w') as f:
    for i in range(5):
        f.write('la la la la\n')
    f.write('la la Bezukhov la\n')
    for i in range(4):
        f.write('la la la la\n')
    f.write('la la Bezukhova la\n')
    for i in range(4):
        f.write('la la la la\n')

In [9]:
with open('../test-data/find_name_test.txt', 'r') as f:
    for line in f:
        print(line)


la la la la

la la la la

la la la la

la la la la

la la la la

la la Bezukhov la

la la la la

la la la la

la la la la

la la la la

la la Bezukhova la

la la la la

la la la la

la la la la

la la la la


In [10]:
# test the function - should return 5 (zero-based index, i.e. the 6th line)
find_first('../test-data/find_name_test.txt', 'Bezukhov')


Out[10]:
5

In [11]:
names = ['Dokhturov', 'Vasili Kuragin', 'Langeron']

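# find_first returns a zero-based index, so the numbers printed here are
# one less than the line numbers a text editor would show
for name in names:
    print('The name %s occurs first on line %d.' % (name, find_first(WAR_AND_PEACE, name)))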


The name Dokhturov occurs first on line 8686.
The name Vasili Kuragin occurs first on line 816.
The name Langeron occurs first on line 14378.

In [12]:
def count_occurrences(file, name):
    counter = 0
    with open(file, 'r') as f:
        for line in f:
            counter += line.count(name)
    return counter

In [13]:
# note that this counts Bezukhova, because Bezukhov is part of Bezukhova!
count_occurrences('../test-data/find_name_test.txt', 'Bezukhov')


Out[13]:
2

In [14]:
# include whitespace at end of name?
count_occurrences('../test-data/find_name_test.txt', 'Bezukhov ')


Out[14]:
1
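
The two results above follow from how str.count works: it counts plain substring matches and knows nothing about word boundaries. A small illustration on made-up sample lines (the variable names are only for this sketch):

```python
line_a = 'la la Bezukhov la\n'
line_b = 'la la Bezukhova la\n'
line_c = "Count Bezukhov's distress\n"

# 'Bezukhov' is a substring of 'Bezukhova', so both lines match
print(line_a.count('Bezukhov'), line_b.count('Bezukhov'))    # 1 1

# a trailing space excludes 'Bezukhova' ...
print(line_a.count('Bezukhov '), line_b.count('Bezukhov '))  # 1 0

# ... but it also misses the name when punctuation follows it
print(line_c.count('Bezukhov '))                             # 0
```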

In [15]:
with open('../test-data/count_occurrences_test.txt', 'w') as f:
    for i in range(10):
        f.write('la '*3 + 'Bezukhov '*3 + 'la\n')
    f.write('la '*2 + 'Bezukhova ' +'la '*3 + 'la\n')

In [16]:
with open('../test-data/count_occurrences_test.txt', 'r') as f:
    for line in f:
        print(line)


la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la la Bezukhov Bezukhov Bezukhov la

la la Bezukhova la la la la


In [17]:
assert count_occurrences('../test-data/count_occurrences_test.txt', 'Bezukhov ') == 30

In [18]:
# print every line that mentions Bezukhov (substring match, so lines with
# Bezukhova and Bezukhov's show up as well)
with open(WAR_AND_PEACE, 'r') as f:
    for line in f:
        if 'Bezukhov' in line:
            print(line)


young man was an illegitimate son of Count Bezukhov, a well-known

Bezukhov, and about his illegitimate son Pierre, the one who had behaved

Count Bezukhov's distress some fifteen times.

been degraded to the ranks and Bezukhov's son sent back to Moscow.

visitor. "And to think it is Cyril Vladimirovich Bezukhov's son who

state.... My only hope now is in Count Cyril Vladimirovich Bezukhov. If

here lives Count Cyril Vladimirovich Bezukhov so rich, all alone... that

Vladimirovich Bezukhov's house. "My dear Boris," said the mother,

he was afraid of finding in her a rival for Count Bezukhov's fortune,

Vladimirovich Bezukhov, Countess Rostova sat for a long time all alone

When Anna Mikhaylovna returned from Count Bezukhov's the money, all in

cooks were getting the supper, Count Bezukhov had a sixth stroke. The

Bezukhov.

Pierre will not be Pierre but will become Count Bezukhov, and will then

into the court of Count Bezukhov's house. As the wheels rolled softly

father, Count Bezukhov, with that gray mane of hair above his broad

circular room. Around the table all who were at Count Bezukhov's house

Count Bezukhov's death. She said the count had died as she would herself

old Count Bezukhov, and his inheritance. Fancy! The three princesses

recognized as legitimate; so that he is now Count Bezukhov and possessor

to know as plain Monsieur Pierre, has become Count Bezukhov and the

Bezukhova. But you will understand that I have no desire for the post. A

The news of Count Bezukhov's death reached us before your letter and my

Pierre, on unexpectedly becoming Count Bezukhov and a rich man, felt

Bezukhov he did not let go his hold of the lad. He had the air of a man

Moscow after the death of Count Bezukhov, he would call Pierre, or go to

by the death of Count Bezukhov (everyone constantly considered it a duty

Bezukhov, and showed them her own box. Princess Helene asked to see the

Six weeks later he was married, and settled in Count Bezukhov's large,

Bezukhov) "and now she's in love with that singer" (he meant Natasha's

at once, and go to Bezukhov's, and tell him 'Count Ilya has sent you to

"But I'll go to Bezukhov's myself. Pierre has arrived, and now we shall

"Tell Bezukhov to come. I'll put his name down. Is his wife with him?"

Bezukhov! Yes, I pity him from my heart, and shall try to give him what

neighbor on his right quickly turned in alarm to Bezukhov.

Nesvitski, Bezukhov's second. Pierre went home, but Rostov with Dolokhov

father's room, that huge room in which Count Bezukhov had died.

Rostov's share in Dolokhov's duel with Bezukhov was hushed up by the

honorable, of Bezukhov? And Fedya, with his noble spirit, loved him and

there! Bezukhov got off scotfree, while Fedya had to bear the whole

in the duel with Bezukhov, Pierre was right and Dolokhov wrong, and

into fashion, were danced. Iogel had taken a ballroom in Bezukhov's

its broad temples and close-cropped hair, and looked at Bezukhov. The

"I have the pleasure of addressing Count Bezukhov, if I am not

fastening his coat. When he had finished, he turned to Bezukhov, and

Despite Count Bezukhov's enormous wealth, since he had come into an

from correspondence with those abroad that Bezukhov had obtained the

reproved Bezukhov for his vehemence and said it was not love of virtue

Bezukhova's presence. To be received in the Countess Bezukhova's salon

Bezukhova's reputation as a lovely and clever woman became so firmly

the most intimate friend of the Bezukhov household since Helene's return

maid of honor, Pierre Bezukhov, and the son of their district postmaster

position in society thanks to his intimacy with Countess Bezukhova, a

understand.... Bezukhov, now, is blue, dark-blue and red, and he is

"Ah, here she is, the Queen of Petersburg, Countess Bezukhova," said

"That is Bezukhova's brother, Anatole Kuragin," she said, indicating a

Bezukhova and asked her to dance. She smilingly raised her hand and laid

"Especially such a capital fellow as Bezukhov!" In Natasha's eyes all

Just then Count Bezukhov was announced. Husband and wife glanced at one

"You have known Bezukhov a long time?" he asked. "Do you like him?"

getting married! Then Bezukhov, eh? He is here too, with his wife. He

She was the Countess Bezukhova, Pierre's wife, and the count, who knew

Shinshin. Countess Bezukhova turned smiling to the newcomer, and

with Pierre, Natasha heard a man's voice in Countess Bezukhova's box and

When the second act was over Countess Bezukhova rose, turned to the

Countess Bezukhova quite deserved her reputation of being a fascinating

opened and Countess Bezukhova, dressed in a purple velvet gown with a

she were blossoming under the praise of this dear Countess Bezukhova who

Bezukhova's visit and the invitation for that evening, Marya Dmitrievna

"I don't care to have anything to do with Bezukhova and don't advise you

Count Rostov took the girls to Countess Bezukhova's. There were a good

Bezukhova asked her visitors into the ballroom.

Countess Bezukhova was present among other Russian ladies who had

same time more seriously than did Count Bezukhov. Natasha unconsciously

furnishing a regiment, Bezukhov at once informed Rostopchin that he

regiment would cost him eight hundred thousand rubles, and that Bezukhov

had spent even more on his, but that the best thing about Bezukhov's

"Bezukhov est ridicule, but he is so kind and good-natured. What

not cause the least embarrassment to Countess Bezukhova, who evidently

but probably realizing that he was shouting at Bezukhov who so far was

know whether Count Bezukhov had left or was leaving the town.

Bezukhov's household, despite all the search they made, saw Pierre again

got Petya transferred from Obolenski's regiment to Bezukhov's, which was

"Look! Yes, on my word, it's Bezukhov!" said Natasha, putting her head

"Yes, it really is Bezukhov in a coachman's coat, with a queer-looking

The news of the day in Petersburg was the illness of Countess Bezukhova.

added. Countess Helene Bezukhova had suddenly died of that terrible

gatherings, everyone said that Countess Bezukhova had died of a terrible

"Bezukhov."

their prisoner's name was Count Bezukhov?

Bezukhov.

Hearing that Bezukhov was in Orel, Willarski, though they had never been

Natasha's wedding to Bezukhov, which took place in 1813, was the last

law Bezukhov to pay off debts he regarded as genuinely due for value

Besides the Bezukhov family, Nicholas' old friend the retired General

say, in Nicholas' house. The young Countess Bezukhova was not often seen


In [20]:
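# still not sufficient - what about punctuation (Bezukhov's, etc.)? Better: a regex
import re  # regular expressions

def count_occurrences(file, name):
    # \b matches a word boundary, so Bezukhov's counts but Bezukhova does not;
    # re.escape guards against regex metacharacters in name
    pattern = re.compile(r'\b' + re.escape(name) + r'\b')  # raw strings, or use two backslashes
    count = 0
    with open(file) as f:
        for line in f:
            count += len(pattern.findall(line))
    return count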
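
The remark about two backslashes in the function above comes down to how Python string literals treat the backslash. In an ordinary literal, '\b' is a single backspace character, not the regex word-boundary token. A raw string or a doubled backslash keeps the backslash intact. A short sketch (variable names are illustrative only):

```python
raw = r'\bBezukhov\b'        # raw string: backslashes kept literally
doubled = '\\bBezukhov\\b'   # doubled backslashes: same result
plain = '\bBezukhov\b'       # ordinary literal: \b becomes a backspace

print(raw == doubled)        # True
print(len(raw), len(plain))  # 12 10
print(repr(plain))           # '\x08Bezukhov\x08'
```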

In [21]:
print('The name Bezukhov occurs in War and Peace %d times.' % 
      count_occurrences(WAR_AND_PEACE, 'Bezukhov'))


The name Bezukhov occurs in War and Peace 72 times.
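
To see what the word-boundary pattern does and does not match, here is a small check on sample strings, some taken from the lines printed earlier. The re.escape call is an added precaution for names containing regex metacharacters. It makes no difference for a plain name like Bezukhov.

```python
import re

pattern = re.compile(r'\b' + re.escape('Bezukhov') + r'\b')

samples = [
    "Count Bezukhov's distress some fifteen times.",  # apostrophe ends the word
    "Countess Bezukhova rose",                        # 'a' continues the word
    "Bezukhov.",                                      # full stop ends the word
    "to Bezukhov, and Bezukhov again",                # two separate matches
]

counts = [len(pattern.findall(s)) for s in samples]
print(counts)  # [1, 0, 1, 2]
```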

In [19]:
'this is a\' string'


Out[19]:
"this is a' string"
