This example will show how to interface to an overlay or hardware library from Python.
In this example, we will assume a new overlay has been created with an accelerator that receives data from Python, processes it, and returns the results.
A command and data will be sent to the accelerator from Python. The accelerator will process the data, return the results to memory, and acknowledge that the transaction has completed.
Rather than going through the process of creating a new overlay, this example uses the `base` overlay to illustrate the process. The IOP1 memory will act as the accelerator memory, although no processing will be carried out on the data.
For this example, we will define the following addresses in the overlay, which are in the IOP1 memory space, and are accessible from Python:
Description | Name | Value |
---|---|---|
Accelerator base address | BASE_ADDRESS | 0x40000000 |
Accelerator address range | ADDRESS_LENGTH | 0x1000 |
Command Address offset | CMD_OFFSET | 0x800 |
Acknowledge Address offset | ACK_OFFSET | 0x804 |
Input Data Address offset | INPUT_DATA_OFFSET | 0x0 |
Output Data Address offset | OUTPUT_DATA_OFFSET | 0x400 |
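
The offsets in the table are relative to BASE_ADDRESS. The sketch below only does the arithmetic the table implies: it derives absolute addresses and the number of words that fit in each data region. It assumes 32-bit (4-byte) words, as the later code does, and assumes the regions are laid out back to back as the offsets suggest. The helper name `absolute` is illustrative and is not part of pynq.

```python
BASE_ADDRESS = 0x40000000
ADDRESS_LENGTH = 0x1000
CMD_OFFSET = 0x800
ACK_OFFSET = 0x804
INPUT_DATA_OFFSET = 0x0
OUTPUT_DATA_OFFSET = 0x400
WORD_BYTES = 4  # assumption: 32-bit words


def absolute(offset):
    """Return the absolute address for an offset inside the mapped range."""
    if not 0 <= offset < ADDRESS_LENGTH:
        raise ValueError(f"offset {offset:#x} is outside the address range")
    return BASE_ADDRESS + offset


# The input region ends where the output region starts,
# and the output region ends where the command word starts.
max_input_words = (OUTPUT_DATA_OFFSET - INPUT_DATA_OFFSET) // WORD_BYTES
max_output_words = (CMD_OFFSET - OUTPUT_DATA_OFFSET) // WORD_BYTES

print(hex(absolute(CMD_OFFSET)))           # 0x40000800
print(max_input_words, max_output_words)   # 256 256
```

Under these assumptions, writing more than 256 input words would spill into the output region.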
Assume we only have the following commands for this simple accelerator:
Command | Value |
---|---|
IDLE | 0x0 |
PROCESS | 0x1 |
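
One way to keep these command values readable in Python is an `IntEnum`. This is only a sketch; the class name `Command` is hypothetical and does not appear in the original code, which writes the raw values 0x0 and 0x1 directly.

```python
from enum import IntEnum


class Command(IntEnum):
    """Command values from the table above (hypothetical helper)."""
    IDLE = 0x0
    PROCESS = 0x1


# IntEnum members behave like plain integers, so they can be passed
# anywhere the raw value would be written.
print(int(Command.IDLE), int(Command.PROCESS))  # 0 1
```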
The `MMIO` module will be used to read from and write to memory, or memory-mapped peripherals, in the overlay. As shown in the following example, the steps to instantiate the new class include:
1. `MMIO` is imported.
2. The new class for the accelerator is defined.
3. `MMIO` will be instantiated inside the new module.
Note that a variable, `array_length`, will also be declared for this module. You will see how this is used later.
Assume that the accelerator will check the command address when it starts.
The Python module must first initialize the command location (`BASE_ADDRESS + CMD_OFFSET`) to 0x0 (IDLE).
from pynq import MMIO

class my_new_accelerator:
    """Python class for the PL accelerator.

    Attributes
    ----------
    mmio : MMIO
        MMIO object used to read and write between PS and PL.
    array_length : int
        Length of the array to be processed.

    """
    def __init__(self):
        # Map the accelerator address range
        self.mmio = MMIO(BASE_ADDRESS, ADDRESS_LENGTH)
        self.array_length = 0
        # Initialize the command location to IDLE
        self.mmio.write(CMD_OFFSET, 0x0)
For this example, we will define two functions: `load_data()` and `process()`.
`load_data()` will write data to the accelerator memory.
`process()` will send the start command to the accelerator, wait for an acknowledge, and read back the processed data.
Note how the `array_length` variable is used.
    def load_data(self, input_data):
        self.array_length = len(input_data)
        # Write one 32-bit word per element to the input data area
        for i in range(self.array_length):
            self.mmio.write(INPUT_DATA_OFFSET + i * 4, input_data[i])

    def process(self):
        # Send start command to accelerator
        self.mmio.write(CMD_OFFSET, 0x1)
        output_data = [0] * self.array_length
        # Poll the ACK offset until the accelerator writes 0x1
        while self.mmio.read(ACK_OFFSET) != 0x1:
            pass
        # Ack has been received
        for i in range(self.array_length):
            output_data[i] = self.mmio.read(OUTPUT_DATA_OFFSET + i * 4)
        # Reset Ack
        self.mmio.write(ACK_OFFSET, 0x0)
        return output_data
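
The polling loop in `process()` spins forever if the acknowledge never arrives. A minimal sketch of a bounded wait is shown below. The function `wait_for_ack` and the class `StubMMIO` are hypothetical and not part of pynq; the stub only mimics the `read(offset)` and `write(offset, value)` calls used in this example so the code can run without a board.

```python
import time

ACK_OFFSET = 0x804


def wait_for_ack(mmio, timeout_s=1.0, poll_interval_s=0.001):
    """Poll ACK_OFFSET until it reads 0x1, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while mmio.read(ACK_OFFSET) != 0x1:
        if time.monotonic() >= deadline:
            raise TimeoutError("no acknowledge from the accelerator")
        time.sleep(poll_interval_s)


class StubMMIO:
    """Hypothetical stand-in for MMIO, backed by a dict (off-board use only)."""

    def __init__(self):
        self.mem = {}

    def read(self, offset):
        return self.mem.get(offset, 0)

    def write(self, offset, value):
        self.mem[offset] = value


stub = StubMMIO()
stub.write(ACK_OFFSET, 0x1)
wait_for_ack(stub)  # acknowledge already set, returns at once

stub.write(ACK_OFFSET, 0x0)
try:
    wait_for_ack(stub, timeout_s=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
print("timed out:", timed_out)
```

In the class above, the `while` loop in `process()` could call such a helper instead, at the cost of having to decide on a sensible timeout for the accelerator.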
In [1]:
BASE_ADDRESS = 0x40000000
ADDRESS_LENGTH = 0x1000
CMD_OFFSET = 0x800
ACK_OFFSET = 0x804
INPUT_DATA_OFFSET = 0x0
OUTPUT_DATA_OFFSET = 0x400

from pynq import MMIO

class my_new_accelerator:
    """Python class for the PL accelerator.

    Attributes
    ----------
    mmio : MMIO
        MMIO object used to read and write between PS and PL.
    array_length : int
        Length of the array to be processed.

    """
    def __init__(self):
        # Map the accelerator address range
        self.mmio = MMIO(BASE_ADDRESS, ADDRESS_LENGTH)
        self.array_length = 0
        # Initialize the command location to IDLE
        self.mmio.write(CMD_OFFSET, 0x0)

    def load_data(self, input_data):
        self.array_length = len(input_data)
        # Write one 32-bit word per element to the input data area
        for i in range(self.array_length):
            self.mmio.write(INPUT_DATA_OFFSET + i * 4, input_data[i])

    def process(self):
        # Send start command to accelerator
        self.mmio.write(CMD_OFFSET, 0x1)
        output_data = [0] * self.array_length
        # Poll the ACK offset until the accelerator writes 0x1
        while self.mmio.read(ACK_OFFSET) != 0x1:
            pass
        # Ack has been received
        for i in range(self.array_length):
            output_data[i] = self.mmio.read(OUTPUT_DATA_OFFSET + i * 4)
        # Reset Ack
        self.mmio.write(ACK_OFFSET, 0x0)
        return output_data
Executing the cell above loads the module into this notebook. This is the equivalent of importing the module (`import my_new_accelerator`) if it were included as part of the pynq package.
As explained previously, this notebook does not show how to create a custom accelerator. However, the Python code can be tested with the `base` overlay. In the `base` overlay, the IOP memory (starting at 0x40000000) will be used to simulate writing to, and reading back from, an accelerator. Notice how the code writes to one area of memory (BASE_ADDRESS + INPUT_DATA_OFFSET), and expects to read back results from another area in memory (BASE_ADDRESS + OUTPUT_DATA_OFFSET).
Execute the cell below to load the overlay, instantiate the accelerator, and send some data to the accelerator.
In [2]:
from pynq import Overlay
Overlay("base.bit").download()
# Instantiate the accelerator and prepare an input array of length 10
acc = my_new_accelerator()
input_data = list(range(10))
print("Data to be sent to the accelerator:", input_data)
acc.load_data(input_data)
As the accelerator doesn't exist, any data loaded to memory won't be processed, and the acknowledge will not be written.
Execute the cell below to use `MMIO` to manually write some data to the results area of the memory, simulating processed data, and to write 0x1 to the acknowledge address.
`MMIO` can be very useful to peek and poke memory and memory-mapped peripherals in the overlay when debugging Python code.
In [3]:
from pynq import MMIO
mmio = MMIO(BASE_ADDRESS, ADDRESS_LENGTH)
for i in range(len(input_data)):
    mmio.write(OUTPUT_DATA_OFFSET + i * 4, input_data[i] + 1)
mmio.write(ACK_OFFSET, 1)
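
When peeking at memory for debugging, a small helper that reads back a run of words can be handy. The sketch below is one possible shape for it. `dump_words` and `StubMMIO` are hypothetical names, not part of pynq; the stub stands in for an `MMIO` object so the example runs off-board, and on a board the `mmio` object from the cell above could be passed instead.

```python
def dump_words(mmio, start_offset, count, word_bytes=4):
    """Read `count` words from `start_offset`; return (offset, value) pairs."""
    words = []
    for i in range(count):
        offset = start_offset + i * word_bytes
        words.append((offset, mmio.read(offset)))
    return words


class StubMMIO:
    """Hypothetical stand-in for MMIO, backed by a dict (off-board use only)."""

    def __init__(self):
        self.mem = {}

    def read(self, offset):
        return self.mem.get(offset, 0)

    def write(self, offset, value):
        self.mem[offset] = value


stub = StubMMIO()
for i in range(3):
    stub.write(0x400 + i * 4, i + 1)

for offset, value in dump_words(stub, 0x400, 3):
    print(f"0x{offset:03x}: 0x{value:08x}")
```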
The `process()` function can now send a start command, read the acknowledge (which has already been set manually in the cell above), and read back data from the processed data area. You can change the code above to write different data to the processed data area, or to set the acknowledge to 0 (which will cause the code below to hang).
In [4]:
output_data = acc.process()
print("Input Data : ", input_data)
print("Output Data : ", output_data)
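
The same command and acknowledge handshake can also be exercised without a board by modelling the accelerator in plain Python. The sketch below is an illustration under stated assumptions: `FakeAccelerator` and `run` are hypothetical names, and the "processing" simply adds 1 to each input word, mirroring the values written by hand in the earlier cell.

```python
CMD_OFFSET = 0x800
ACK_OFFSET = 0x804
INPUT_DATA_OFFSET = 0x0
OUTPUT_DATA_OFFSET = 0x400


class FakeAccelerator:
    """Hypothetical in-memory model of the accelerator (not part of pynq)."""

    def __init__(self, n_words):
        self.mem = {}
        self.n_words = n_words

    def read(self, offset):
        return self.mem.get(offset, 0)

    def write(self, offset, value):
        self.mem[offset] = value
        # A PROCESS command triggers the "processing" and the acknowledge.
        if offset == CMD_OFFSET and value == 0x1:
            for i in range(self.n_words):
                word = self.mem.get(INPUT_DATA_OFFSET + i * 4, 0)
                self.mem[OUTPUT_DATA_OFFSET + i * 4] = word + 1
            self.mem[ACK_OFFSET] = 0x1
            self.mem[CMD_OFFSET] = 0x0  # back to IDLE


def run(mmio, input_data):
    """Load data, send PROCESS, wait for the acknowledge, read results."""
    for i, word in enumerate(input_data):
        mmio.write(INPUT_DATA_OFFSET + i * 4, word)
    mmio.write(CMD_OFFSET, 0x1)
    while mmio.read(ACK_OFFSET) != 0x1:
        pass
    output = [mmio.read(OUTPUT_DATA_OFFSET + i * 4)
              for i in range(len(input_data))]
    mmio.write(ACK_OFFSET, 0x0)  # reset the acknowledge
    return output


result = run(FakeAccelerator(4), [0, 1, 2, 3])
print(result)  # [1, 2, 3, 4]
```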