Lesson 30:

Filenames and Absolute/Relative File Paths

To make sure a program's output persists, scripts have to save it to files.

Filenames and File Paths

  • Files are held in Folders.
  • A Folder is just a directory on the disk.
  • A File Path is the path to that file through those folders.
  • Every absolute file path starts at the root folder (/ on Unix systems and C:\ on Windows).
  • Unix systems separate folders with /, while Windows systems use \.
  • Files also have a file extension, the .type suffix at the end of the filename; it tells the OS which application handles the file.
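
To make the extension bullet concrete, here is a small sketch using os.path.splitext(), a standard library function this lesson does not otherwise cover. The filenames are hypothetical and exist only for illustration:

```python
import os

# Hypothetical filenames, used only for illustration
names = ['file.png', 'report.final.pdf', 'README']

# splitext() returns (everything before the last dot, the extension including the dot)
pairs = [os.path.splitext(name) for name in names]

for name, (stem, extension) in zip(names, pairs):
    print(name, '->', repr(stem), repr(extension))
```

Only the last dot counts, and a name without a dot has an empty extension.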

One way to construct file paths is the .join string method:


In [159]:
'\\'.join(['folder1','folder2','folder3','file.png']) # join all elements using the escaped (literal) '\' string


Out[159]:
'folder1\\folder2\\folder3\\file.png'

But this string only works on Windows; to create an OS-independent path, use the os module:


In [160]:
import os # contains many file path related functions

print(os.path.join('folder1','folder2','folder3','file.png')) # takes string arguments and returns OS-appropriate path
print(os.sep) # show the separator currently being used


folder1/folder2/folder3/file.png
/
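
To compare what os.path.join() would return on each platform without switching machines, the standard library exposes both implementations directly: ntpath follows Windows rules and posixpath follows Unix rules. A minimal sketch; the variable names are illustrative:

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # Unix path rules, importable on any OS

parts = ['folder1', 'folder2', 'folder3', 'file.png']

windows_style = ntpath.join(*parts)    # joined with backslashes
unix_style = posixpath.join(*parts)    # joined with forward slashes

print(windows_style)
print(unix_style)
```

In everyday scripts, plain os.path is the right choice, since it picks the implementation that matches the machine the script runs on.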

If no explicit path is specified, Python will look for files in the current working directory.

You can find out what the current working directory is with os.getcwd(), and change it with os.chdir().


In [161]:
# Start at current directory
defaultpath = os.path.expanduser('~/Dropbox/learn/books/Python/AutomateTheBoringStuffWithPython/')
os.chdir(defaultpath)
print(os.getcwd())

# Change into the 'files' subfolder
os.chdir('files') # changes the current working directory to the 'files' subfolder
print(os.getcwd()) # prints the current working directory (should now end in /files)

# Reset back to notebook directory
os.chdir(defaultpath)


/Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython
/Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files
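
Because os.chdir() affects every relative path used afterwards, it helps to always switch back when done. The sketch below shows one way to do that with try/finally. It uses a throwaway folder from tempfile as an assumption, so it runs without the Dropbox path above:

```python
import os
import tempfile

original_cwd = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as scratch:
    try:
        os.chdir(scratch)
        # samefile() compares real locations, so symlinked temp folders still match
        moved = os.path.samefile(os.getcwd(), scratch)
    finally:
        os.chdir(original_cwd)  # always go back, even if something above fails

print(moved)
print(os.getcwd() == original_cwd)
```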

There are two kinds of file paths, relative and absolute.

Absolute file paths are the full address, starting at the root folder, while relative paths are relative to the current working directory; they start at the cwd, not the root folder.

There are also two special folder names, . and ..:

  • .\ refers to the cwd, so .\path looks for path inside the cwd.
  • ..\ refers to the parent folder of the cwd, so ..\path looks for path inside that parent folder.
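
One way to see how . and .. resolve is os.path.normpath(), which collapses them textually. The sketch below uses posixpath (the Unix flavour of os.path) so the results look the same on any machine; the paths are illustrative and do not need to exist:

```python
import posixpath  # Unix-rules version of os.path, importable on any OS

same_folder = posixpath.normpath('./files/26645.pdf')         # '.' adds nothing
round_trip = posixpath.normpath('files/../files/./26645.pdf') # 'files/..' cancels out
parent = posixpath.normpath('../sibling')                     # leading '..' is kept

print(same_folder)
print(round_trip)
print(parent)
```

Note that normpath() only rewrites the string; it never touches the disk, so on systems with symbolic links the collapsed path can point somewhere different from the original.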

os.path.abspath() returns the absolute path of whatever relative path is passed to it.

os.path.relpath() returns the relative path to its first argument, starting from its second argument (or from the cwd if no second argument is given).

os.path.isabs() returns True if the path is absolute, and False otherwise.


In [162]:
print(os.path.abspath('files')) # print absolute path of the files subdirectory
print(os.path.isabs(os.path.abspath('files'))) # Is the absolute path to files an absolute path (True)
print(os.path.relpath('../..', 'files')) # relative path from 'files' to the folder two levels above the cwd (three levels up: ../../..)


/Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files
True
../../..
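
The argument order of relpath() is easy to mix up, so here is a sketch with two absolute paths where the result does not depend on the cwd. The /home/al folders are hypothetical, and posixpath is used so the output looks the same on any OS:

```python
import posixpath  # Unix-rules version of os.path, importable on any OS

target = '/home/al/docs/report.txt'   # hypothetical destination
start = '/home/al/music'              # hypothetical starting folder

# relpath(target, start): how to reach target when standing in start
hop = posixpath.relpath(target, start)
print(hop)

# Joining start and the relative hop, then normalising, leads back to target
rebuilt = posixpath.normpath(posixpath.join(start, hop))
print(rebuilt)

print(posixpath.isabs(target), posixpath.isabs(hop))
```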

os.path.dirname() pulls out just the directory part of a path: everything before the last slash.

os.path.basename() pulls out what's past the last slash.


In [163]:
print(os.path.dirname(os.path.abspath('files'))) # outputs absolute path above 'files'
print(os.path.basename('files/26645.pdf')) # outputs just '26645.pdf'


/Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython
26645.pdf
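
The two functions are the halves of os.path.split(), which returns both pieces at once. The sketch below also shows an edge case worth knowing: a trailing slash leaves nothing past the last slash. It uses posixpath so the forward-slash results hold on any OS:

```python
import posixpath  # Unix-rules version of os.path, importable on any OS

path = 'files/26645.pdf'

folder_part = posixpath.dirname(path)
file_part = posixpath.basename(path)
both = posixpath.split(path)          # (dirname, basename) in one call

# With a trailing slash there is nothing after the last slash
trailing = posixpath.split('files/')

print(folder_part, file_part)
print(both)
print(trailing)
```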

os.path.exists() can check if a path exists.

os.path.isfile() checks if the path ends at a file.

os.path.isdir() checks if the path ends at a folder.


In [164]:
# Reset back to notebook directory
os.chdir(defaultpath)

print(os.path.exists(os.path.abspath('files'))) # checks if 'files' exists (True)
print(os.path.isfile('files')) # checks if 'files' is a file (False)
print(os.path.isdir('files')) # checks if 'files' is a folder (True)


True
False
True
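
To see how the three checks differ, the sketch below runs them against a folder, a file, and a path that does not exist. It builds its own throwaway folder with tempfile, and notes.txt and missing.txt are hypothetical names:

```python
import os
import tempfile

def describe(path):
    """Return (exists, isfile, isdir) for a path."""
    return (os.path.exists(path), os.path.isfile(path), os.path.isdir(path))

with tempfile.TemporaryDirectory() as folder:
    file_path = os.path.join(folder, 'notes.txt')       # hypothetical file
    with open(file_path, 'w') as f:
        f.write('hello')
    missing_path = os.path.join(folder, 'missing.txt')  # never created

    folder_checks = describe(folder)
    file_checks = describe(file_path)
    missing_checks = describe(missing_path)

print('folder :', folder_checks)
print('file   :', file_checks)
print('missing:', missing_checks)
```

A path that does not exist returns False for all three, so isfile() and isdir() can be used without calling exists() first.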

os.path.getsize() returns the size of a file in bytes.

os.listdir() returns a list of the names of all the files and folders at the path.

File Size Finder


In [168]:
"""
A simple program to loop through a folder and find the size of all files in bytes, and the total size of the folder.
"""

import os

# starting size
totalSize = 0 


# for each fileName in the 'files' directory
for fileName in os.listdir('files'):
    # build the path to the entry
    filePath = os.path.join('files', fileName)
    # skip anything that is not a file (for example, subfolders)
    if not os.path.isfile(filePath):
        continue
    # look up the size once, add it to the total, and report it
    fileSize = os.path.getsize(filePath)
    totalSize += fileSize
    print('%s is %d bytes.' % (filePath, fileSize))

# print the size of the folder at the end
print('\n\nThe \'%s\' folder contains %d bytes in total.' % ('files', totalSize))


files/112065.pdf is 359928 bytes.
files/26645.pdf is 590564 bytes.
files/alarm.wav is 582952 bytes.
files/allMyCats1.py is 461 bytes.
files/allMyCats2.py is 311 bytes.
files/backupToZip.py is 1415 bytes.
files/birthdays.py is 493 bytes.
files/boxPrint.py is 633 bytes.
files/buggyAddingProgram.py is 223 bytes.
files/bulletPointAdder.py is 399 bytes.
files/calcProd.py is 310 bytes.
files/catlogo.png is 16726 bytes.
files/catnapping.py is 121 bytes.
files/census2010.py is 155379 bytes.
files/censuspopdata.xlsx is 2246155 bytes.
files/characterCount.py is 225 bytes.
files/coinFlip.py is 214 bytes.
files/combinedminutes.pdf is 517813 bytes.
files/combinePdfs.py is 819 bytes.
files/countdown.py is 305 bytes.
files/demo.docx is 42624 bytes.
files/dictionary.txt is 454061 bytes.
files/dimensions.xlsx is 5393 bytes.
files/downloadXkcd.py is 1149 bytes.
files/duesRecords.xlsx is 8903 bytes.
files/encrypted.pdf is 316481 bytes.
files/encryptedminutes.pdf is 235840 bytes.
files/errorExample.py is 103 bytes.
files/example.csv is 191 bytes.
files/example.html is 324 bytes.
files/example.xlsx is 9898 bytes.
files/example.zip is 382809 bytes.
files/excelSpreadsheets.zip is 49773 bytes.
files/exitExample.py is 170 bytes.
files/factorialLog.py is 478 bytes.
files/fiveTimes.py is 87 bytes.
files/formFiller.py is 2757 bytes.
files/freezeExample.xlsx is 504006 bytes.
files/getDocxText.py is 214 bytes.
files/guessTheNumber.py is 675 bytes.
files/guests.txt is 60 bytes.
files/headings.docx is 34883 bytes.
files/hello.py is 370 bytes.
files/helloFunc.py is 114 bytes.
files/helloFunc2.py is 78 bytes.
files/helloworld.docx is 34837 bytes.
files/inventory.py is 352 bytes.
files/isPhoneNumber.py is 812 bytes.
files/littleKid.py is 163 bytes.
files/lucky.py is 555 bytes.
files/magic8Ball.py is 693 bytes.
files/magic8Ball2.py is 314 bytes.
files/mapIt.py is 392 bytes.
files/mcb.pyw is 732 bytes.
files/meetingminutes.pdf is 246927 bytes.
files/meetingminutes2.pdf is 301513 bytes.
files/merged.xlsx is 5450 bytes.
files/mouseNow.py is 454 bytes.
files/mouseNow2.py is 704 bytes.
files/multidownloadXkcd.py is 1613 bytes.
files/multipleParagraphs.docx is 34907 bytes.
files/myPets.py is 198 bytes.
files/passingReference.py is 106 bytes.
files/phoneAndEmail.py is 1235 bytes.
files/picnicTable.py is 361 bytes.
files/prettyCharacterCount.py is 248 bytes.
files/printRandom.py is 67 bytes.
files/produceSales.xlsx is 878030 bytes.
files/pw.py is 597 bytes.
files/quickWeather.py is 984 bytes.
files/randomQuizGenerator.py is 3209 bytes.
files/readCensusExcel.py is 1307 bytes.
files/readDocx.py is 210 bytes.
files/removeCsvHeader.py is 958 bytes.
files/removeCsvHeader.zip is 678008 bytes.
files/renameDates.py is 1426 bytes.
files/resizeAndAddLogo.py is 1430 bytes.
files/restyled.docx is 48791 bytes.
files/sameName.py is 270 bytes.
files/sameName2.py is 89 bytes.
files/sameName3.py is 231 bytes.
files/sameName4.py is 91 bytes.
files/sampleChart.xlsx is 7396 bytes.
files/sendDuesReminders.py is 1392 bytes.
files/stopwatch.py is 823 bytes.
files/styled.xlsx is 5376 bytes.
files/styles.xlsx is 5447 bytes.
files/swordfish.py is 261 bytes.
files/textMyself.py is 500 bytes.
files/threadDemo.py is 209 bytes.
files/ticTacToe.py is 707 bytes.
files/torrentStarter.py is 3905 bytes.
files/twoPage.docx is 34868 bytes.
files/updatedProduceSales.xlsx is 505451 bytes.
files/updateProduce.py is 671 bytes.
files/validateInput.py is 379 bytes.
files/vampire.py is 247 bytes.
files/vampire2.py is 247 bytes.
files/watermark.pdf is 91339 bytes.
files/zeroDivide.py is 202 bytes.
files/zophie.png is 1364265 bytes.


The 'files' folder contains 10798836 bytes in total.
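
The program above only counts files directly inside 'files' and skips anything in subfolders. If subfolders should count as well, one possible variation is to walk the tree with os.walk(). This is a sketch rather than part of the lesson; folder_size is a hypothetical helper name, and the demo folder is created with tempfile so the example does not depend on 'files' existing:

```python
import os
import tempfile

def folder_size(folder):
    """Return the total size in bytes of every file under folder, subfolders included."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(folder):
        for filename in filenames:
            total += os.path.getsize(os.path.join(dirpath, filename))
    return total

# Build a small throwaway folder with one file at the top and one in a subfolder
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, 'sub'))
    with open(os.path.join(demo, 'a.bin'), 'wb') as f:
        f.write(b'12345')      # 5 bytes
    with open(os.path.join(demo, 'sub', 'b.bin'), 'wb') as f:
        f.write(b'1234567')    # 7 bytes
    demo_total = folder_size(demo)

print(demo_total)
```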

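os.makedirs() creates a new directory at the given path, along with any intermediate folders that don't exist yet.

os.removedirs() removes the (empty) folder at the end of a path, then removes each parent folder in turn for as long as it is empty. The path does not have to be absolute.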


In [166]:
# clear the folders if they exist already
if os.path.exists(os.path.abspath('files/newfolder/anotherone')):
    os.removedirs(os.path.abspath('files/newfolder/anotherone')) # remove 'anotherone', then any parent folders left empty

# create new folders at an absolute path
os.makedirs(os.path.abspath('files/newfolder/anotherone')) # create 'newfolder' and 'anotherone'

# check if they exist
if os.path.exists(os.path.abspath('files/newfolder/anotherone')):
    print('\'files/newfolder/anotherone\' exists.')


'files/newfolder/anotherone' exists.
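
Two details of these functions are easy to miss: os.makedirs() accepts exist_ok=True so that an existing folder is not an error, and os.removedirs() stops pruning at the first parent that is not empty. The sketch below shows both inside a throwaway tempfile folder; keep.txt is a hypothetical file whose only job is to keep the base folder non-empty:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # A file in the base folder, so removedirs() has a non-empty parent to stop at
    open(os.path.join(base, 'keep.txt'), 'w').close()

    nested = os.path.join(base, 'newfolder', 'anotherone')
    os.makedirs(nested, exist_ok=True)   # creates 'newfolder' and 'anotherone'
    os.makedirs(nested, exist_ok=True)   # second call does nothing instead of raising
    created = os.path.isdir(nested)

    os.removedirs(nested)                # removes 'anotherone', then the now-empty 'newfolder'
    leaf_gone = not os.path.exists(nested)
    parent_gone = not os.path.exists(os.path.join(base, 'newfolder'))
    base_kept = os.path.isdir(base)      # still there because of keep.txt

print(created, leaf_gone, parent_gone, base_kept)
```

With exist_ok=True, the exists() check before creating the folders is no longer needed.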

Recap

  • Files have a name and a path.
  • The root folder is the top-level folder that contains all the others; it's C:\ on Windows and / on Unix systems.
  • In a file path, the folders and filename are separated by \ on Windows and / on Unix systems.
  • os.path.join() combines folders with the correct slash for the OS.
  • The current working directory is the folder that any relative paths are relative to.
  • os.getcwd() will return the current working directory.
  • os.chdir() will change the current working directory.
  • Absolute paths begin with the root folder, relative paths begin at the current working directory.
  • The . symbol is shorthand for the current folder.
  • The .. symbol is shorthand for the parent folder.
  • os.path.abspath() returns the absolute path form of the path given to it.
  • os.path.isabs() checks that a path is absolute.
  • os.path.relpath() returns the relative path between two paths passed to it.
  • os.makedirs() can make folders.
  • os.removedirs() can remove folders.
  • os.path.getsize() returns a file's size.
  • os.listdir() returns a list of strings, one for each file or folder name at a path.
  • os.path.exists() checks if a path exists.
  • os.path.isfile() checks if a path ends in a file.
  • os.path.isdir() checks if a path ends in a directory.