Python Text Basics Assessment

Welcome to your assessment! Complete the numbered tasks below by typing the relevant code in the cells.
You can compare your answers to the Solutions notebook provided in this folder.

f-Strings

1. Print an f-string that displays `NLP stands for Natural Language Processing` using the variables provided.



In [1]:

    
abbr = 'NLP'
full_text = 'Natural Language Processing'

# Enter your code here:

NLP stands for Natural Language Processing
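As a reminder of the syntax, here is a minimal f-string sketch. The variables `lang` and `version` are hypothetical and deliberately differ from the ones in the task, so the exercise itself is left to you.

```python
# Hypothetical variables, not the ones provided in the task.
lang = 'Python'
version = 3

# Each expression inside {} is evaluated and inserted into the string.
message = f'{lang} {version} supports f-strings'
print(message)
```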

Files

2. Create a file in the current working directory called `contacts.txt` by running the cell below:



In [6]:

    
%%writefile contacts.txt
First_Name Last_Name, Title, Extension, Email

Overwriting contacts.txt

3. Open the file and use `.read()` to save the contents of the file to a string called `fields`. Make sure the file is closed at the end.



In [3]:

    
# Write your code here:



    
# Run fields to see the contents of contacts.txt:
fields

Out[3]:

'First_Name Last_Name, Title, Extension, Email'
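One common way to guarantee that a file is closed is a `with` block. The sketch below assumes a hypothetical `demo.txt` in a temporary directory, so it does not touch `contacts.txt`.

```python
from pathlib import Path
import tempfile

# Hypothetical file in a temporary directory; contacts.txt is left untouched.
demo_path = Path(tempfile.mkdtemp()) / 'demo.txt'
demo_path.write_text('alpha, beta, gamma', encoding='utf-8')

# The with block closes the file automatically, even if an error occurs.
with open(demo_path, encoding='utf-8') as f:
    contents = f.read()

print(repr(contents))
print(f.closed)  # True once the with block has exited
```

Calling `open()` and `.close()` explicitly also works; the `with` form simply makes it harder to forget the close.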

Working with PDF Files

4. Use PyPDF2 to open the file `Business_Proposal.pdf`. Extract the text of page 2.



In [4]:

    
# Perform import


# Open the file as a binary object


# Use PyPDF2 to read the text of the file



# Get the text from page 2 (CHALLENGE: Do this in one step!)
page_two_text = 



# Close the file


# Print the contents of page_two_text
print(page_two_text)

AUTHORS:
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com
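The steps in the cell above can be wrapped in a small helper. This is only a sketch: `page_index` and `extract_page_text` are hypothetical names, and the calls assume the PyPDF2 2.x/3.x API (`PdfReader`, `reader.pages`, `extract_text()`). Note that pages are indexed from 0, so page 2 is index 1.

```python
def page_index(page_number):
    """Convert a 1-based page number to the 0-based index used by reader.pages."""
    if page_number < 1:
        raise ValueError('page numbers start at 1')
    return page_number - 1


def extract_page_text(pdf_path, page_number):
    """Return the text of one page (hypothetical helper, PyPDF2 2.x/3.x API)."""
    # Imported here so this sketch loads even where PyPDF2 is not installed.
    from PyPDF2 import PdfReader

    # PDF files must be opened in binary mode.
    with open(pdf_path, 'rb') as f:
        reader = PdfReader(f)
        # Extract the text while the file is still open.
        return reader.pages[page_index(page_number)].extract_text()
```

With the file from this task, the call would look like `extract_page_text('Business_Proposal.pdf', 2)`. Older PyPDF2 1.x releases spell the same steps `PdfFileReader`, `getPage()` and `extractText()`.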

5. Open the file `contacts.txt` in append mode. Add the text of page 2 from above to `contacts.txt`.

CHALLENGE: See if you can remove the word "AUTHORS:" before appending the text.



In [5]:

    
# Simple Solution:

First_Name Last_Name, Title, Extension, EmailAUTHORS:
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com



In [7]:

    
# CHALLENGE Solution (re-run the %%writefile cell above to obtain an unmodified contacts.txt file):

First_Name Last_Name, Title, Extension, Email
 
Amy Baker, Finance Chair, x345, abaker@ourcompany.com
 
Chris Donaldson, Accounting Dir., x621, cdonaldson@ourcompany.com
 
Erin Freeman, Sr. VP, x879, efreeman@ourcompany.com
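Appending and trimming a leading label can be sketched on stand-in data. Here `demo_contacts.txt` and `sample_text` are hypothetical substitutes for `contacts.txt` and `page_two_text`, and slicing is just one of several ways to drop the label.

```python
from pathlib import Path
import tempfile

# Hypothetical stand-ins for contacts.txt and page_two_text.
demo_path = Path(tempfile.mkdtemp()) / 'demo_contacts.txt'
demo_path.write_text('Name, Email', encoding='utf-8')
sample_text = 'AUTHORS:\nAmy, amy@example.com\n'

# Drop the leading label only if it is actually there.
label = 'AUTHORS:'
body = sample_text[len(label):] if sample_text.startswith(label) else sample_text

# Mode 'a' adds to the end of the file instead of overwriting it.
with open(demo_path, 'a', encoding='utf-8') as f:
    f.write(body)

result = demo_path.read_text(encoding='utf-8')
print(result)
```

Because the original file has no trailing newline, whatever is appended starts immediately after the last character, which is why the simple solution's output shows `EmailAUTHORS:` run together.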

Regular Expressions

6. Using the `page_two_text` variable created above, extract any email addresses that were contained in the file `Business_Proposal.pdf`.



In [8]:

    
import re

# Enter your regex pattern here. This may take several tries!
pattern = 

re.findall(pattern, page_two_text)

Out[8]:

['abaker@ourcompany.com',
 'cdonaldson@ourcompany.com',
 'efreeman@ourcompany.com']
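For comparison, here is one loose pattern tried on hypothetical sample text rather than on `page_two_text`. It is a sketch for simple addresses of the form `name@domain.suffix`, not a full email validator.

```python
import re

# Hypothetical sample text standing in for page_two_text.
sample_text = (
    'Ann Lee, Dir., x101, alee@example.com\n'
    'Bo Kim, Sr. VP, x202, bkim@example.org'
)

# Name characters, an @, one domain label, a dot, and a suffix.
email_pattern = r'[\w.-]+@[\w-]+\.\w+'

matches = re.findall(email_pattern, sample_text)
print(matches)
```

Abbreviations such as `Dir.` and `Sr.` are skipped because the pattern requires an `@`. Addresses with more than one dot in the domain would be cut short by this pattern, so it would need extending for such data.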

Great job!

1. Print an f-string that displays `NLP stands for Natural Language Processing` using the variables provided.

2. Create a file in the current working directory called `contacts.txt` by running the cell below:

3. Open the file and use .read() to save the contents of the file to a string called `fields`. Make sure the file is closed at the end.

4. Use PyPDF2 to open the file `Business_Proposal.pdf`. Extract the text of page 2.

5. Open the file `contacts.txt` in append mode. Add the text of page 2 from above to `contacts.txt`.

6. Using the `page_two_text` variable created above, extract any email addresses that were contained in the file `Business_Proposal.pdf`.