Python can be used to control the keyboard and mouse, which allows us to automate any program that uses these as inputs. However, it currently does so blindly, depending on cartesian coordinates; we need a way to show the program the screen.
pyautogui
also contains the pillow
imaging model, to help create quick screenshots and show the computer the current screen state. Working with pillow
for image recongition is covered in more detail in Chapter 17 of Mr. Sweigart's blog.
In [9]:
import pyautogui
We can use the screenshot()
function to take a screenshot of the current screen.
In [7]:
# let us first change directory to the `files` subdirectory, to store these values
import os
os.chdir('files')
os.getcwd()
Out[7]:
In [11]:
pyautogui.screenshot()
Out[11]:
This creates an immediate screenshot file, stored in memory. To save it to the file, we can pass a file path to it.
In [12]:
pyautogui.screenshot('screenshot_example.png')
Out[12]:
Now, the module can 'see' the screen, but to do proper image recognition, we must use the the locateOnScreen()
function. We will use the default OSX calculator for this example:
We will be searching for the '7' key on this calculator, using this cropped version:
We can now pass this file to the locateOnScreen()
function, and see if the program can find it in on the screen currently.
In [14]:
pyautogui.locateOnScreen('calc7key.png')
Out[14]:
It has returned a tuple of 4 integers:
A more useful version of this is the locateCenterOnScreen()
, which centers in that object, allowing for better interaction.
In [18]:
pyautogui.locateCenterOnScreen('calc7key.png')
Out[18]:
We can now move to and click the calculator.
In [19]:
pyautogui.moveTo((1309, 595), duration=1)
pyautogui.click((1309, 595), clicks = 7)
There are a few caveats with GUI automation that may affect its performance:
locateOnScreen()
function is computationally intensive, and may effect runtimes significantly; it cannot perform effectively in a dynamic environment. While this isn't ideal for some environments, it can have particularly useful applications, like winning at Sushi Go.
pyautogui.screenshot()
will return an Image object of the screen, or you can pass it a filename to save it to a file.pyautogui.locateOnScreen()
is passed a sample image file, and returns the coordinates of where it is on the screen.pyautogui.locateCenterOnScreen()
returns the (x, y) tuple of the center of the image found on the scren.