Lesson 34:

Walking a Directory Tree

To write a program that works on a whole directory tree, Python needs a way to visit every folder in that tree, including nested subfolders.

This is done via os.walk().

After moving into the files directory:


In [1]:
import os
# Define base directory
defaultpath = os.path.expanduser('~/Dropbox/learn/books/Python/AutomateTheBoringStuffWithPython')

# Change into the 'files' subdirectory of the base directory
if os.getcwd() == defaultpath:
    os.chdir('files')  # relative path, since we are already in the base directory
else:
    os.chdir(os.path.join(defaultpath, 'files'))

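Import os and call os.walk(). For each folder it visits, it yields three values: folderName (the folder's path), subfolders (a list of subfolder names), and filenames (a list of file names).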


In [14]:
for folderName, subfolders, filenames in os.walk(os.getcwd()):  # one (folderName, subfolders, filenames) tuple per folder
    print('The folder is ' + folderName + '\n')  # Print the folder's path
    # subfolders and filenames are lists, so convert them with str() before concatenating
    print('The subfolders in ' + folderName + ' are \n' + str(subfolders) + ',\n')  # Print the list of subfolders on its own line
    print('The filenames in ' + folderName + ' are \n' + str(filenames) + ',\n')  # Print the list of filenames on its own line
    print('\n\n')  # Blank lines between folders


The folder is /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files

The subfolders in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files are 
['newfiles', 'newfolder'],

The filenames in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files are 
['112065.pdf', '26645.pdf', 'alarm.wav', 'allMyCats1.py', 'allMyCats2.py', 'backupToZip.py', 'bacon.txt', 'birthdays.py', 'boxPrint.py', 'buggyAddingProgram.py', 'bulletPointAdder.py', 'calcProd.py', 'catlogo.png', 'catnapping.py', 'census2010.py', 'censuspopdata.xlsx', 'characterCount.py', 'coinFlip.py', 'combinedminutes.pdf', 'combinePdfs.py', 'countdown.py', 'demo.docx', 'dictionary.txt', 'dimensions.xlsx', 'downloadXkcd.py', 'duesRecords.xlsx', 'encrypted.pdf', 'encryptedminutes.pdf', 'errorExample.py', 'example.csv', 'example.html', 'example.xlsx', 'example.zip', 'excelSpreadsheets.zip', 'exitExample.py', 'factorialLog.py', 'fiveTimes.py', 'formFiller.py', 'freezeExample.xlsx', 'getDocxText.py', 'guessTheNumber.py', 'guests.txt', 'headings.docx', 'hello.py', 'helloFunc.py', 'helloFunc2.py', 'helloworld.docx', 'helloworld.txt', 'inventory.py', 'isPhoneNumber.py', 'littleKid.py', 'lucky.py', 'magic8Ball.py', 'magic8Ball2.py', 'mapIt.py', 'mcb.pyw', 'meetingminutes.pdf', 'meetingminutes2.pdf', 'merged.xlsx', 'mouseNow.py', 'mouseNow2.py', 'multidownloadXkcd.py', 'multipleParagraphs.docx', 'mycatdata', 'myPets.py', 'newbacon.txt', 'passingReference.py', 'phoneAndEmail.py', 'picnicTable.py', 'prettyCharacterCount.py', 'printRandom.py', 'produceSales.xlsx', 'pw.py', 'quickWeather.py', 'randomQuizGenerator.py', 'readCensusExcel.py', 'readDocx.py', 'removeCsvHeader.py', 'removeCsvHeader.zip', 'renameDates.py', 'resizeAndAddLogo.py', 'restyled.docx', 'sameName.py', 'sameName2.py', 'sameName3.py', 'sameName4.py', 'sampleChart.xlsx', 'sendDuesReminders.py', 'stopwatch.py', 'styled.xlsx', 'styles.xlsx', 'swordfish.py', 'textMyself.py', 'threadDemo.py', 'ticTacToe.py', 'torrentStarter.py', 'twoPage.docx', 'updatedProduceSales.xlsx', 'updateProduce.py', 'validateInput.py', 'vampire.py', 'vampire2.py', 'watermark.pdf', 'zeroDivide.py', 'zophie.png'],




The folder is /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfiles

The subfolders in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfiles are 
[],

The filenames in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfiles are 
['bacon.txt', 'bacon3.txt'],




The folder is /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder

The subfolders in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder are 
['anotherone'],

The filenames in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder are 
[],




The folder is /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder/anotherone

The subfolders in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder/anotherone are 
[],

The filenames in /Users/vivek.menon/Dropbox/learn/books/python/AutomateTheBoringStuffWithPython/files/newfolder/anotherone are 
[],
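
The three values are easier to see on a small, throwaway tree. The sketch below is not part of the lesson's files: it builds a temporary folder (the names demo, sub, a.txt, and b.txt, and the helper describe_tree, are made up for illustration) and uses ', '.join(...), the usual way to turn a list of names into a single string.

```python
import os
import tempfile

def describe_tree(root):
    """Return one summary line per folder that os.walk() visits under root."""
    lines = []
    for folderName, subfolders, filenames in os.walk(root):
        # Sort in place so the summary does not depend on the OS listing order
        subfolders.sort()
        filenames.sort()
        relative = os.path.relpath(folderName, root)
        lines.append(relative + ': subfolders=[' + ', '.join(subfolders)
                     + '] files=[' + ', '.join(filenames) + ']')
    return lines

# Build a tiny hypothetical tree: demo/a.txt and demo/sub/b.txt
with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, 'demo')
    os.makedirs(os.path.join(root, 'sub'))
    for relpath in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(root, relpath), 'w') as f:
            f.write('example\n')
    summary = describe_tree(root)
    for line in summary:
        print(line)
```

Note that the separator comes first: ', '.join(names) places ', ' between the items of names.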




os.walk() examines not only the parent folder but every subfolder beneath it, so you can run functions on each folder and file as it is visited:


In [21]:
import shutil

for folderName, subfolders, filenames in os.walk(os.getcwd()):  # visit every folder in the tree
    for subfolder in subfolders:  # for every subfolder of the current folder
        if subfolder == 'fish':  # if the subfolder is named 'fish'
            # delete it; os.rmdir() needs the full path and only removes empty folders
            os.rmdir(os.path.join(folderName, subfolder))

    # for every file in the current folder
    for file in filenames:
        # that ends with '.py'
        if file.endswith('.py'):
            # back up the file; filenames are bare strings, so use os.path.join to build the complete path
            shutil.copy(os.path.join(folderName, file), os.path.join(folderName, file) + '.backup')
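
A related detail when skipping or removing folders during a walk: in the default top-down order, os.walk() decides where to go next from the subfolders list, so removing a name from that list in place stops the walk from descending into that folder. A minimal sketch, assuming a hypothetical helper walk_without and made-up folder names on a temporary tree:

```python
import os
import tempfile

def walk_without(root, skip_name):
    """Return the folders visited under root, never descending into skip_name."""
    visited = []
    for folderName, subfolders, filenames in os.walk(root):
        # Slice assignment edits the same list object that os.walk() uses
        subfolders[:] = sorted(name for name in subfolders if name != skip_name)
        visited.append(os.path.relpath(folderName, root))
    return visited

# Hypothetical tree: fish/deep and keep, inside a temporary folder
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'fish', 'deep'))
    os.makedirs(os.path.join(root, 'keep'))
    visited = walk_without(root, 'fish')
    print(visited)
```

Rebinding the name (subfolders = [...]) would not have this effect, because os.walk() would still hold the original list.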

This batch approach lets us walk an entire directory tree and run functions on every folder and file in it.

Before running code that deletes or copies files, do a dry run first: comment out the destructive calls and print what they would do instead.
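
An alternative to commenting code in and out is a flag that switches between reporting and acting. The sketch below is one possible shape, not the lesson's own code: the function backup_py_files and its dry_run parameter are hypothetical names, and it runs against a temporary folder rather than the real files directory.

```python
import os
import shutil
import tempfile

def backup_py_files(root, dry_run=True):
    """Copy every .py file under root to <name>.backup; only report when dry_run is True."""
    planned = []
    for folderName, subfolders, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith('.py'):
                source = os.path.join(folderName, filename)
                target = source + '.backup'
                planned.append((source, target))
                if dry_run:
                    print('Would copy ' + source + ' to ' + target)
                else:
                    shutil.copy(source, target)
    return planned

with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, 'hello.py'), 'w') as f:
        f.write("print('hello')\n")
    backup_path = os.path.join(root, 'hello.py.backup')
    plan = backup_py_files(root)                  # dry run: nothing is copied
    exists_after_dry_run = os.path.exists(backup_path)
    backup_py_files(root, dry_run=False)          # real run: the copy is made
    exists_after_real_run = os.path.exists(backup_path)
    print(exists_after_dry_run, exists_after_real_run)
```

Defaulting dry_run to True means a forgotten argument reports instead of changing files.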

Recap

  • os.walk() allows you to 'walk' through a directory tree, and interact with subdirectories and filenames.
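  • For each folder it visits, it yields three values: the folder's path (folderName), a list of subfolder names (subfolders), and a list of file names (filenames).
  • Looping over os.walk() gives you these values one folder at a time, so you can run functions on each folder, subfolder, and file.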
  • Useful for batch file management.
