Python Text Basics Assessment

Welcome to your assessment! Complete each numbered task below by typing the relevant code in the cell that follows it.
You can compare your answers to the Solutions notebook provided in this folder.

f-Strings

1. Print an f-string that displays "NLP stands for Natural Language Processing" using the variables provided.


In [1]:
abbr = 'NLP'
full_text = 'Natural Language Processing'

# Enter your code here:


NLP stands for Natural Language Processing
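
One possible approach to task 1, shown as a minimal sketch rather than the official solution. The name message is an assumption added for illustration; printing the f-string directly works just as well.

```python
abbr = 'NLP'
full_text = 'Natural Language Processing'

# Expressions inside the braces are evaluated and inserted into the string.
# "message" is a hypothetical helper name, not part of the assessment.
message = f'{abbr} stands for {full_text}'
print(message)
```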

Files

2. Create a file in the current working directory called contacts.txt by running the cell below:


In [6]:
%%writefile contacts.txt
First_Name Last_Name, Title, Extension, Email


Overwriting contacts.txt

3. Open the file and use .read() to save the contents of the file to a string called fields. Make sure the file is closed at the end.


In [3]:
# Write your code here:



    
# Run fields to see the contents of contacts.txt:
fields


Out[3]:
'First_Name Last_Name, Title, Extension, Email'
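
A sketch of one way to approach task 3, assuming a with block is an acceptable way to guarantee the file is closed. To stay self-contained it recreates the header line in a temporary directory; workdir, path, and c are illustrative names, and in the notebook you would open 'contacts.txt' directly.

```python
import os
import tempfile

# Hypothetical scratch location so the sketch does not touch your real contacts.txt.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'contacts.txt')

# Recreate the header line that the %%writefile cell stores in contacts.txt.
with open(path, 'w') as f:
    f.write('First_Name Last_Name, Title, Extension, Email')

# The with block closes the file automatically, even if .read() raises.
with open(path) as c:
    fields = c.read()

print(fields)
print(c.closed)  # True once the with block has exited
```

The explicit alternative is to call open(), then .read(), then .close(); the with form simply makes it impossible to forget the last step.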

Working with PDF Files

4. Use PyPDF2 to open the file Business_Proposal.pdf. Extract the text of page 2.


In [4]:
# Perform import


# Open the file as a binary object


# Use PyPDF2 to read the text of the file



# Get the text from page 2 (CHALLENGE: Do this in one step!)
page_two_text = 



# Close the file


# Print the contents of page_two_text
print(page_two_text)


AUTHORS:
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com
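
A hedged sketch of the steps in task 4, assuming a recent PyPDF2 release that provides PdfReader, .pages, and extract_text(). Older releases expose the same steps as PdfFileReader, getPage(), and extractText(). The helper extract_page_text and its 1-based page_number parameter are assumptions introduced for illustration, not part of the assessment.

```python
def extract_page_text(pdf_path, page_number):
    """Return the text of one page, where page_number counts from 1."""
    if page_number < 1:
        raise ValueError('page_number counts from 1')

    # Imported here so the helper can be defined even before PyPDF2 is installed.
    import PyPDF2

    # PDFs must be opened in binary mode; the with block closes the file.
    with open(pdf_path, 'rb') as f:
        reader = PyPDF2.PdfReader(f)
        # reader.pages is zero-indexed, so page 2 lives at index 1.
        return reader.pages[page_number - 1].extract_text()


# Usage in the notebook (needs Business_Proposal.pdf in the working directory):
# page_two_text = extract_page_text('Business_Proposal.pdf', 2)
# print(page_two_text)
```

Chaining the page lookup and the text extraction in a single expression is one way to meet the "one step" challenge.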
 

5. Open the file contacts.txt in append mode. Add the text of page 2 from above to contacts.txt.

CHALLENGE: See if you can remove the word "AUTHORS:"


In [5]:
# Simple Solution:


First_Name Last_Name, Title, Extension, EmailAUTHORS:
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com
 


In [7]:
# CHALLENGE Solution (re-run the %%writefile cell above to obtain an unmodified contacts.txt file):


First_Name Last_Name, Title, Extension, Email
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com
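
A possible sketch of task 5, including the challenge. It assumes page_two_text looks like the printed output of task 4, so the string below is sample data copied from that output, and it works in a temporary directory (workdir and path are illustrative names) instead of modifying your real contacts.txt.

```python
import os
import tempfile

# Stand-in for the text extracted from page 2 of the PDF.
page_two_text = (
    'AUTHORS:\n \n'
    'Amy Baker, Finance Chair, x345, abaker@ourcompany.com\n \n'
    'Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com\n \n'
    'Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com\n \n'
)

# Hypothetical scratch copy of contacts.txt containing only the header line.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'contacts.txt')
with open(path, 'w') as f:
    f.write('First_Name Last_Name, Title, Extension, Email')

# Mode 'a' appends to the end of the file instead of overwriting it.
# replace(..., 1) drops only the first occurrence of the heading.
with open(path, 'a') as f:
    f.write(page_two_text.replace('AUTHORS:', '', 1))

with open(path) as f:
    contents = f.read()
print(contents)
```

Writing page_two_text unchanged gives the simple solution, in which "AUTHORS:" lands directly after "Email" because the header line has no trailing newline.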
 

Regular Expressions

6. Using the page_two_text variable created above, extract any email addresses that were contained in the file Business_Proposal.pdf.


In [8]:
import re

# Enter your regex pattern here. This may take several tries!
pattern = 

re.findall(pattern, page_two_text)


Out[8]:
['abaker@ourcompany.com',
 'cdonaldson@ourcompany.com',
 'efreeman@ourcompany.com']
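
One pattern that works for task 6, offered as a sketch rather than the reference answer. It assumes addresses of the form name@domain.tld, and the sample string stands in for page_two_text. Real-world email syntax is broader than this pattern covers.

```python
import re

# Sample data standing in for the text extracted from page 2.
page_two_text = (
    'AUTHORS:\n \n'
    'Amy Baker, Finance Chair, x345, abaker@ourcompany.com\n \n'
    'Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com\n \n'
    'Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com\n \n'
)

# Local part: word characters, dots, plus signs, hyphens.
# Domain: one label followed by one or more ".label" groups.
# The group is non-capturing so findall returns whole matches.
pattern = r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+'

emails = re.findall(pattern, page_two_text)
print(emails)
```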

Great job!