Welcome!

Let's start by assuming you have downloaded the code and run setup.py. This demonstration will show how to predict the time constant of trEFM data using statistical learning methods. We begin by importing the data simulation module from the trEFMlearn package, which contains methods for numerically simulating experimental data.


In [1]:
import numpy as np
import matplotlib.pyplot as plt
from trEFMlearn import data_sim
%matplotlib inline

Simulation

You can create an array of time constants for which you would like to simulate data. This array is then passed to the simulation function, which simulates the data and fits it using support vector regression (SVR). The function can take a few minutes depending on the number of time constants you provide. Run this cell and wait for it to complete. A few warnings may appear; don't fret, as they have no effect on the results.


In [2]:
tau_array = np.logspace(-8, -5, 100)

fit_object, fit_tau = data_sim.sim_fit(tau_array)


Simulation is 0 % Complete.
C:\Users\jarrison\Miniconda2\lib\site-packages\trefm_learn-0.1-py2.7.egg\trEFMlearn\simulate.py:82: RuntimeWarning: divide by zero encountered in divide
C:\Users\jarrison\Miniconda2\lib\site-packages\trefm_learn-0.1-py2.7.egg\trEFMlearn\simulate.py:263: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
C:\Users\jarrison\Miniconda2\lib\site-packages\trefm_learn-0.1-py2.7.egg\trEFMlearn\simulate.py:268: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
Simulation is 10 % Complete.
Simulation is 20 % Complete.
Simulation is 30 % Complete.
Simulation is 40 % Complete.
Simulation is 50 % Complete.
Simulation is 60 % Complete.
Simulation is 70 % Complete.
Simulation is 80 % Complete.
Simulation is 90 % Complete.
Simulation is Complete.

Neato!

Looks like that function is all done. We now have an SVR object called "fit_object" as well as the result of the fit, called "fit_tau". Let's take a look at the result of the fit by comparing it to the actual input tau.


In [3]:
plt.figure()
plt.title('Fit Time Constant vs. Actual')
plt.plot(fit_tau, 'bo')
plt.plot(tau_array,'g')
plt.ylabel('Tau (s)')
plt.yscale('log')
plt.show()

# Calculate the percent error at each time constant.
error = 100 * (tau_array - fit_tau) / tau_array

plt.figure()
plt.title('Error Signal')
plt.plot(tau_array, error)
plt.ylabel('Error (%)')
plt.xlabel('Time Constant (s)')
plt.xscale('log')
plt.show()
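
It can also help to put numbers on the agreement. Since tau_array and fit_tau are already in memory, a quick NumPy summary (an added sanity check, not part of the original demo) looks like this:

abs_err = np.abs(tau_array - fit_tau) / tau_array
print('Median absolute error: %.1f %%' % (100 * np.median(abs_err)))

# Restrict to time constants above 100 ns, where the fit behaves well.
mask = tau_array > 100e-9
print('Median absolute error above 100 ns: %.1f %%' % (100 * np.median(abs_err[mask])))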


Clearly the SVR method is quite capable of reproducing the time constants of the simulated data using features that are very simple to calculate. We do observe a lower limit to the model's ability to recover time constants, which is quite interesting. However, this lower limit appears to be below 100 nanoseconds, a time scale that is seldom seen in real measurements. This could be quite useful for extracting time constant data!
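
For readers curious what this entails under the hood, here is a minimal sketch of this kind of pipeline using scikit-learn's SVR. The features and kernel settings below are illustrative assumptions, not the actual internals of data_sim.sim_fit:

import numpy as np
from sklearn.svm import SVR

# Toy stand-ins: each row of X holds a few easy-to-compute scalar features
# of one simulated signal; y holds the known time constants. The real
# features are computed inside trEFMlearn and are only mimicked here.
rng = np.random.RandomState(0)
tau_train = np.logspace(-8, -5, 100)
X = np.column_stack([np.log10(tau_train) + 0.05 * rng.randn(100),
                     np.sqrt(tau_train * 1e6) + 0.05 * rng.randn(100)])
y = np.log10(tau_train)  # regress on log(tau) so the target is well-scaled

model = SVR(kernel='rbf', C=100.0)
model.fit(X, y)
tau_predicted = 10 ** model.predict(X)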

Analyzing a Real Image

The Data

To assess how well the model transfers to real images, I have taken a trEFM image of an MDMO photovoltaic material. It contains large aggregates of acceptor material that should show nice contrast in the way they generate and hold charge. Each pixel of this image was pre-averaged before being saved with this demo program, and each pixel is a measurement of the AFM cantilever position as a function of time.
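
To give a feel for what one of these pixel records looks like, here is a toy model of a single pixel's deflection signal: a steady oscillation whose frequency relaxes exponentially after a trigger event. Every parameter value here is made up for illustration and is not taken from the actual data:

fs = 10e6      # sample rate (Hz), illustrative
f0 = 300e3     # cantilever frequency before the trigger (Hz), illustrative
df = -2e3      # total frequency shift caused by the event (Hz), illustrative
tau = 1e-6     # time constant of that shift (s), illustrative
t_trig = 5e-5  # trigger time (s)

t = np.arange(int(2e-4 * fs)) / fs

# Instantaneous frequency relaxes from f0 toward f0 + df with time constant tau.
dt = np.clip(t - t_trig, 0, None)
freq = f0 + df * (1 - np.exp(-dt / tau))

# Integrate the frequency to get phase, then form the deflection signal.
phase = 2 * np.pi * np.cumsum(freq) / fs
deflection = np.sin(phase)

plt.plot(t * 1e6, deflection)
plt.xlabel('Time (us)')
plt.ylabel('Deflection (arb.)')
plt.show()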

The Process

Our mission is to extract the time constant from this signal using the SVR fit of our simulated data. We accomplish this by importing the "process_image" module and calling its "analyze_image" function.


In [4]:
from trEFMlearn import process_image

The image processing function needs two inputs. First, we give it the path to the provided image data; then we pass in the SVR object that was generated from the simulated cantilever data. Processing this image should only take 15 to 30 seconds.


In [5]:
tau_img, real_sum_img, fft_sum_img, amp_diff_img = process_image.analyze_image('.\\image data\\', fit_object)


Line: 0 of 16
Line: 1 of 16
Line: 2 of 16
Line: 3 of 16
Line: 4 of 16
Line: 5 of 16
Line: 6 of 16
Line: 7 of 16
Line: 8 of 16
Line: 9 of 16
Line: 10 of 16
Line: 11 of 16
Line: 12 of 16
Line: 13 of 16
Line: 14 of 16
Line: 15 of 16

Awesome. That was pretty quick, huh? Without this machine learning method, the exact same image we just analyzed takes over 8 minutes to process. Now let's take a look at what we get.
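
If you want to verify that speedup yourself, wrapping the call with a timer from the standard library is enough:

import time

start = time.time()
results = process_image.analyze_image('.\\image data\\', fit_object)
print('Analysis took %.1f seconds' % (time.time() - start))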


In [6]:
# Something went wrong in the data on the first line. Let's skip it.
tau_img = tau_img[1:]
real_sum_img = real_sum_img[1:]
fft_sum_img = fft_sum_img[1:]
amp_diff_img = amp_diff_img[1:]

plt.figure()
# Clip the color scale to the mean +/- 2 standard deviations so a few
# outlier pixels don't wash out the contrast.
upper_lim = tau_img.mean() + 2 * tau_img.std()
lower_lim = tau_img.mean() - 2 * tau_img.std()
plt.imshow(tau_img, vmin=lower_lim, vmax=upper_lim, cmap='cubehelix')
plt.show()


You can definitely begin to make out some of the structure that is occurring in the photovoltaic performance of this device. The image looks great, but there are still many areas for improvement. For example, I will need to show conclusively that this image is not purely a result of topographical cross-talk. If this image holds up, it represents a significant improvement on our current imaging technique.
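
As a first-pass check on the cross-talk question, one could correlate the time-constant image with a registered height image of the same region. No topography channel ships with this demo, so the topo_img below is a random stand-in used only to make the sketch run end-to-end:

# Stand-in for a real AFM height image cropped to match tau_img.
topo_img = np.random.rand(*tau_img.shape)

r = np.corrcoef(tau_img.ravel(), topo_img.ravel())[0, 1]
print('Pixelwise correlation with topography: %.2f' % r)
# |r| near 1 would suggest the contrast is mostly topographic cross-talk;
# |r| near 0 is consistent with a genuine photoresponse signal.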

The Features

In the next cell we show images of the various features that were calculated from the raw deflection signal. Some features clearly matter more than others, which suggests that a search for better, more representative features would be worthwhile. However, I think this is a great start to a project I hope to continue developing in the future.


In [8]:
fig, axs = plt.subplots(nrows=3)
axs[0].imshow(real_sum_img, cmap='hot')
axs[0].set_title('Total Signal Sum')

axs[1].imshow(fft_sum_img, cmap='hot')
axs[1].set_title('Sum of the FFT Power Spectrum')

axs[2].imshow(amp_diff_img, cmap='hot')
axs[2].set_title('Difference in Amplitude After Trigger')
plt.tight_layout()
plt.show()
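
One way to make "which features matter" quantitative is a permutation test: shuffle one feature column of the training set and measure how much the fit degrades. This assumes access to the simulation's feature matrix (called X_sim below, with tau_array as targets) and that fit_object follows the scikit-learn predict API; neither is exposed directly by this demo, so treat it purely as a sketch:

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    # Increase in mean squared error when each feature column is shuffled.
    rng = np.random.RandomState(seed)
    base = np.mean((model.predict(X) - y) ** 2)
    scores = []
    for col in range(X.shape[1]):
        errs = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, col])
            errs.append(np.mean((model.predict(X_perm) - y) ** 2))
        scores.append(np.mean(errs) - base)  # larger = more important
    return scores

# Hypothetical usage, if the simulated features were available:
# importances = permutation_importance(fit_object, X_sim, tau_array)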


