In [1]:
name = '2017-02-03-file-operations'
title = 'Standard library: file and directory operations'
tags = 'basics'
author = 'Denis Sergeev'

In [2]:
from nb_tools import connect_notebook_to_post
from IPython.core.display import HTML, Image

html = connect_notebook_to_post(name, title, tags, author)

Today we got acquainted with the following parts of Python's standard library:

  • os - Operating system functionality
  • glob - Unix style pathname pattern expansion
  • shutil - high-level file operations
  • pathlib and path.py - system paths as objects (path.py is a third-party package)
  • tempfile - dealing with temporary files

os - Operating system functionality


In [3]:
import os

Environment variables

List the names of all environment variables:

os.environ.keys()

Get a particular variable by its name from the os.environ dictionary:


In [4]:
os.environ['USER']


Out[4]:
'denis'

In [5]:
os.getenv('MODEL')
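
os.getenv returns None when the variable is not set, whereas indexing os.environ raises a KeyError. A minimal sketch of the difference; the variable name PUG_DEMO_UNSET_VARIABLE and the fallback value 'default-model' are made up for illustration:

```python
import os

# Hypothetical variable name, removed first so the demo is reproducible
name = 'PUG_DEMO_UNSET_VARIABLE'
os.environ.pop(name, None)

missing = os.getenv(name)                    # no default given: returns None
fallback = os.getenv(name, 'default-model')  # returns the supplied default

try:
    os.environ[name]                         # dictionary-style access
    raised = False
except KeyError:
    raised = True

print(missing, fallback, raised)
```

Use os.getenv (or os.environ.get) when a missing variable is acceptable, and os.environ[...] when it should be treated as an error.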

Running system commands

You can use os.system, which runs the command in a subshell and returns its exit status (0 if the command succeeded):


In [6]:
os.system('ls')  # by default, the output goes to the console connected to the notebook


Out[6]:
0

A much better solution is the subprocess module, which captures the command's output and raises an exception if the command fails:


In [7]:
import subprocess as sb

In [8]:
out = sb.check_output(['ls', '-la'])

In [9]:
# print(out.decode())
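
On Python 3.5 and later, subprocess.run is a more general alternative to check_output. The sketch below is only an illustration: it launches a child Python interpreter through sys.executable instead of ls, so that it also runs on systems without ls.

```python
import subprocess
import sys

# Run a child process and capture both output streams
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a subprocess")'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # decode the captured bytes to str
    check=True,               # raise CalledProcessError on a non-zero exit status
)

print(result.returncode)
print(result.stdout.strip())
```

Passing the command as a list avoids going through the shell, so arguments containing spaces or special characters need no quoting.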

Directory level operations


In [10]:
os.getcwd()


Out[10]:
'/home/denis/sandbox/ueapy.github.io/content/notebooks'

In [11]:
os.curdir


Out[11]:
'.'

In [12]:
[i for i in os.listdir() if os.path.isfile(i)]


Out[12]:
['2015-11-27-meeting-summary.ipynb',
 '2015-12-11-meeting-summary.ipynb',
 '2015-11-13-meeting-summary.ipynb',
 '2016-03-11-exceptions.ipynb',
 '2016-09-30-scripts-and-modules.ipynb',
 '2016-10-07-creating-netcdf-datasets.ipynb',
 '2016-10-21-data-animations-intro.ipynb',
 '2015-12-18-meeting-summary.ipynb',
 '2017-01-20-function-quirks.ipynb',
 '2016-03-04-meeting-location.ipynb',
 '2016-05-13-custom-colorbar-colormap.ipynb',
 'demo.py',
 '2016-03-04-argument-parsing.ipynb',
 '2016-01-29-matplotlib-styles.ipynb',
 '2017-02-03-file-operations.ipynb',
 '2016-02-12-jupyter-grace-update.ipynb',
 '2016-12-02-oop-meteo-example.ipynb',
 '2016-02-05-ipywidgets-interact.ipynb',
 '2016-06-10-arcgis-intro.ipynb',
 '2016-01-15-iris-trajectory.ipynb',
 '2016-02-19-numpy-arrays-basics.ipynb',
 '2015-12-04-meeting-summary.ipynb',
 'junk_file.txt',
 '2015-11-06-initial-meeting.ipynb',
 '2016-10-14-loading-netcdf-datasets.ipynb',
 'nb_tools.py',
 '2016-05-06-classes.ipynb',
 '2015-11-20-cartopy-example.ipynb',
 '2016-10-28-xarray-intro.ipynb',
 '2016-01-22-string-formatting.ipynb']

Create a directory in the current working directory


In [13]:
os.mkdir('junk_folder')

Test if it's there:


In [14]:
'junk_folder' in os.listdir(os.curdir)


Out[14]:
True

Rename it:


In [15]:
os.rename('junk_folder', 'foodir')  # works for files as well

Delete it:


In [16]:
os.rmdir('foodir')

Create a directory together with any missing intermediate directories:


In [17]:
os.makedirs('./aaa/bbb/ccc', exist_ok=True)

os.path: path manipulations


In [18]:
with open('junk_file.txt', 'w') as f:
    f.write('blah')

In [19]:
a = os.path.abspath('junk_file.txt')

In [20]:
a


Out[20]:
'/home/denis/sandbox/ueapy.github.io/content/notebooks/junk_file.txt'

In [21]:
os.path.dirname(a)


Out[21]:
'/home/denis/sandbox/ueapy.github.io/content/notebooks'

In [22]:
os.path.basename(a)


Out[22]:
'junk_file.txt'

In [23]:
os.path.splitext(os.path.basename(a))


Out[23]:
('junk_file', '.txt')

In [24]:
os.path.exists(os.path.dirname(a))


Out[24]:
True

In [25]:
os.path.isfile('junk_file.txt')


Out[25]:
True

In [26]:
os.path.isdir('junk_file.txt')


Out[26]:
False

In [27]:
os.path.expanduser('~/UEA')


Out[27]:
'/home/denis/UEA'

In [28]:
os.path.join(os.path.expanduser('~'), 'UEA', 'lalala')


Out[28]:
'/home/denis/UEA/lalala'

In [29]:
os.path.getctime(a)


Out[29]:
1486678245.661653
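
The value returned by os.path.getctime is a number of seconds since the epoch. On Unix it refers to the last metadata change, on Windows to the creation time. A small sketch of turning such a timestamp into a readable date, using a throwaway file (the name example.txt is arbitrary):

```python
import datetime
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'example.txt')  # arbitrary file name
    with open(path, 'w') as f:
        f.write('blah')
    ts = os.path.getctime(path)              # float, seconds since the epoch

when = datetime.datetime.fromtimestamp(ts)   # naive datetime in local time
print(when.isoformat())
```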

In [30]:
os.path.getsize(a)


Out[30]:
4

In [31]:
os.path.commonprefix(['~/UEA/temp_data/', '~/UEA/PUG/'])


Out[31]:
'~/UEA/'
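
Note that os.path.commonprefix compares strings character by character, so the result is not always a valid path. os.path.commonpath (Python 3.5+) compares whole path components instead. The sketch below uses posixpath, the POSIX flavour of os.path, so that the result does not depend on the operating system; the example paths are arbitrary.

```python
import posixpath  # same functions as os.path on Linux and macOS

paths = ['/usr/lib', '/usr/local/bin']

by_chars = posixpath.commonprefix(paths)  # character-wise comparison
by_parts = posixpath.commonpath(paths)    # component-wise comparison

print(by_chars)  # '/usr/l' - not an existing directory
print(by_parts)  # '/usr'
```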

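os.walk - directory tree generation

Note that os.path.walk was a separate, callback-based function in Python 2; it was removed in Python 3, where os.walk should be used instead.

Uncomment to list the current directory tree: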


In [32]:
# for i in os.walk('.'):
    # print(i)
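
Each item produced by os.walk is a tuple (dirpath, dirnames, filenames). A minimal sketch that collects the relative paths of all files in a tree; the tree is built in a temporary directory, and the names aaa, bbb and notes.txt are arbitrary:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small tree: root/aaa/bbb/ and root/aaa/notes.txt
    os.makedirs(os.path.join(root, 'aaa', 'bbb'))
    with open(os.path.join(root, 'aaa', 'notes.txt'), 'w') as f:
        f.write('blah')

    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, root))

print(found)
```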

glob - Unix style pathname pattern expansion

The glob module provides convenient file pattern matching.

For example, you can find all files ending in '.ipynb':


In [33]:
import glob

In [34]:
glob.glob('*.ipynb')


Out[34]:
['2015-11-27-meeting-summary.ipynb',
 '2015-12-11-meeting-summary.ipynb',
 '2015-11-13-meeting-summary.ipynb',
 '2016-03-11-exceptions.ipynb',
 '2016-09-30-scripts-and-modules.ipynb',
 '2016-10-07-creating-netcdf-datasets.ipynb',
 '2016-10-21-data-animations-intro.ipynb',
 '2015-12-18-meeting-summary.ipynb',
 '2017-01-20-function-quirks.ipynb',
 '2016-03-04-meeting-location.ipynb',
 '2016-05-13-custom-colorbar-colormap.ipynb',
 '2016-03-04-argument-parsing.ipynb',
 '2016-01-29-matplotlib-styles.ipynb',
 '2017-02-03-file-operations.ipynb',
 '2016-02-12-jupyter-grace-update.ipynb',
 '2016-12-02-oop-meteo-example.ipynb',
 '2016-02-05-ipywidgets-interact.ipynb',
 '2016-06-10-arcgis-intro.ipynb',
 '2016-01-15-iris-trajectory.ipynb',
 '2016-02-19-numpy-arrays-basics.ipynb',
 '2015-12-04-meeting-summary.ipynb',
 '2015-11-06-initial-meeting.ipynb',
 '2016-10-14-loading-netcdf-datasets.ipynb',
 '2016-05-06-classes.ipynb',
 '2015-11-20-cartopy-example.ipynb',
 '2016-10-28-xarray-intro.ipynb',
 '2016-01-22-string-formatting.ipynb']
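
By default a pattern such as '*.ipynb' only matches files in one directory. Since Python 3.5, the '**' pattern together with recursive=True also descends into subdirectories. A sketch on a temporary directory, with made-up file names:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create root/top.ipynb, root/sub/nested.ipynb and root/demo.py
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('top.ipynb', os.path.join('sub', 'nested.ipynb'), 'demo.py'):
        open(os.path.join(root, rel), 'w').close()

    flat = glob.glob(os.path.join(root, '*.ipynb'))
    deep = glob.glob(os.path.join(root, '**', '*.ipynb'), recursive=True)

    flat_names = sorted(os.path.basename(p) for p in flat)
    deep_names = sorted(os.path.basename(p) for p in deep)

print(flat_names)  # ['top.ipynb']
print(deep_names)  # ['nested.ipynb', 'top.ipynb']
```

The results are sorted here because glob does not guarantee any particular order.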

shutil - high-level file operations

The shutil module provides useful high-level file operations:

  • shutil.rmtree: Recursively delete a directory tree.
  • shutil.move: Recursively move a file or directory to another location.
  • shutil.copy: Copy a single file (use shutil.copytree to copy a whole directory).
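
A short sketch of these functions working together. Everything happens inside a scratch directory created with tempfile.mkdtemp, and the file and folder names (copy.txt, archive, mirror) are arbitrary:

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()  # scratch directory, deleted at the end
src = os.path.join(root, 'junk_file.txt')
with open(src, 'w') as f:
    f.write('blah')

# Copy a single file; the destination path is returned
copied = shutil.copy(src, os.path.join(root, 'copy.txt'))

# Move the copy into an existing directory
os.mkdir(os.path.join(root, 'archive'))
moved = shutil.move(copied, os.path.join(root, 'archive'))

# Copy a whole directory tree (the destination must not exist yet)
shutil.copytree(os.path.join(root, 'archive'), os.path.join(root, 'mirror'))

names = sorted(os.listdir(root))
shutil.rmtree(root)  # delete the scratch directory and everything in it
print(names, os.path.exists(root))
```

Be careful with shutil.rmtree: it deletes the whole tree without asking for confirmation.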

path.py - Object-oriented filesystem paths

path.py is not part of the standard library!

However, the standard library's pathlib module offers almost the same functionality.


In [35]:
try:
    from path import Path
except ImportError:
    from pathlib import Path

Examples:


In [36]:
p = Path(os.getenv('HOME'))

In [37]:
p


Out[37]:
PosixPath('/home/denis')

In [38]:
path_to_file = p / 'UEA' / 'PUG' / 'blabla' / 'lalala'

In [39]:
path_to_file.exists()


Out[39]:
False

In [40]:
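try:
    path_to_file.makedirs_p()  # path.py method
except AttributeError:
    # pathlib has no makedirs_p(); mkdir with parents=True is the equivalent
    path_to_file.mkdir(parents=True, exist_ok=True)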

In [41]:
path_to_file.parent.parent.glob('*.txt')


Out[41]:
<generator object Path.glob at 0x7fdacd013830>
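
With pathlib, glob returns a lazy generator, so it has to be iterated over (or passed to list or sorted) to see the matches. A minimal sketch using only pathlib and a temporary directory; the names blabla, lalala and notes.txt are placeholders:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    target = base / 'blabla' / 'lalala'

    # Create the directory and any missing parents
    target.mkdir(parents=True, exist_ok=True)
    (base / 'blabla' / 'notes.txt').write_text('blah')

    created = target.is_dir()
    # Consume the generator while the files still exist
    matches = sorted(p.name for p in base.glob('*/*.txt'))

print(created, matches)
```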

tempfile - Generate temporary files and directories

Useful if your program produces intermediate output that is not needed afterwards: temporary files and directories created in a with block are removed automatically when the block exits.


In [42]:
import tempfile

Example:


In [43]:
with tempfile.TemporaryDirectory() as tmpdirname:
    print('created temporary directory', tmpdirname)


created temporary directory /tmp/tmp21hojth4
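
To check that the clean-up really happens, the sketch below writes a file inside a temporary directory and tests for its existence inside and after the with block. The file name intermediate.txt is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    path = os.path.join(tmpdirname, 'intermediate.txt')
    with open(path, 'w') as f:
        f.write('blah')
    existed_inside = os.path.exists(path)    # the file exists inside the block

exists_after = os.path.exists(tmpdirname)    # the directory is gone afterwards
print(existed_inside, exists_after)
```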

References


In [44]:
HTML(html)


Out[44]:

This post was written as an IPython (Jupyter) notebook. You can view or download it using nbviewer.