The practical lab is based on the following tools:
`%magic` commands
This is a very important part: you'll find yourself spending more time reading documentation than writing code!
Do not hesitate to go through the different help systems, available from here (look at the Help menu of this page; you will recognize the list).
You've already seen it: this is a notebook.
Within a notebook, you can freely mix
example of code :
sp = numpy.fft.rfft(fid)
equations, using the $\LaTeX$ syntax: $$ \ell_p(\mathbf{x}) = \left( \sum_{i=0}^N |x_i|^p \right)^{\frac{1}{p}} $$ but also inline: $ \ell_p(\mathbf{x}) = \left( \sum_{i=0}^N |x_i|^p \right)^{\frac{1}{p}} $
etc...
please double-click on this cell to see the internal magic !
### convenient user interface
- `object?` for help
- `object??` for code
- `obje` + tab-key for code completion
- `object.` + tab-key for attribute list
- `function(` + shift-tab-key for interface description
- `_` for last results
- `%history`, `%timeit`, `%debug`
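Outside the notebook, roughly the same information can be reached from plain Python. The sketch below is an illustration and not part of the original lab; it relies on the standard `inspect` module and picks `json.dumps` as an arbitrary example object.

```python
import inspect
import json

# roughly what `json.dumps?` displays: the docstring
doc = inspect.getdoc(json.dumps)
# roughly what `json.dumps??` displays: the source code
source = inspect.getsource(json.dumps)
# roughly what `json.` + tab-key lists: the public attributes
attributes = [name for name in dir(json) if not name.startswith("_")]

print(doc.splitlines()[0])
print(attributes)
```

The built-in `help(json.dumps)` prints a similar summary directly in a terminal session.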
In [1]:
pwd
Out[1]:
In [2]:
ls
Python is a scripting language, meant to tie things together. Over time, many capabilities have been added; the one we are going to use is the scientific stack, which makes it possible to program efficient computational tasks very rapidly and at a very high level.
One confusion to be cleared up at the very beginning: there are basically two flavors of the Python language, 2.7 and 3.x.
The differences are minute, and most things which work in 2.7 will work in 3.x as long as you check the following differences:
- use `print(something)` (both 2.7 and 3.5) rather than `print something` (2.7 only)
This repository is meant to run under Python 2.7. Most of the features should also work under 3.x, but you might need some tuning.
In [3]:
# this simple line allows the code to be version independent (kind of)
from __future__ import division, print_function
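As a small illustration of what this import changes (a sketch added here, not part of the original notebook), the division operator behaves identically in 2.7 and 3.x once the import is in place. The variable names are arbitrary.

```python
from __future__ import division, print_function

# with the __future__ import, "/" is true division in both 2.7 and 3.x
half = 7 / 2
# "//" is floor division in both versions
floor = 7 // 2

print("7 / 2  =", half)
print("7 // 2 =", floor)
```

Without the import, Python 2.7 would evaluate `7 / 2` as an integer division.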
In [4]:
a = 1 # integer
b = 3.14 # floats
c = 1.1 + 2j # complex
# but also unlimited precision integers :
l = 123456789012345678901234567890L
print ("l^2 = ", l*l)
In [5]:
d = "Mary had a little cat " #strings - strings are immutable, d[3] = "g" will fail
dd = 'George had one too ' # ' and " are just the same
ddd = """ triple quotes indicate multi line string
very convenient for large texts
where you can easily use " and '
"""
m = None # some predefined constants
n = True
o = False
e = (1.1, a, (b,c), d, a) # tuples
f = [1.1, a, (b,c), d, a] # lists - lists and tuples are ordered, tuples are immutable
empty = [] # initialize an empty list
empty2 = () # even empty tuple
# index in tuples and lists start at 0.
print ("e[1] ",e[1])
# there are tools for creating lists and string
line = "*"*30 # this is 30 "*" in a row
line_extended = "#" + line + "#" # is '#******************************#'
r = range(10) # this is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
r.append("end") # now r is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'end']
# and MANY other tools
g = {"key1": 1.0, "key2": "horsemen", 1:"keys can be anything"} # dictionaries
empty3 = {} # initialize an empty dictionary
empty3["here"] = "there" # set values in dictionaries
print ('g["key2"]: ', g["key2"])
print ('g.keys(): ', g.keys(), 'g.values(): ', g.values())
h = set((1.1, a, (b,c), d, a)) # sets do not have duplicated values, so there is only one a here
# dictionaries and sets are unordered
# MANY MANY other things (see the standard library for types and associated functions)
In [6]:
if (1==2):
    do(this)
for i in e:
    print(i)
s = 0
for j in range(10):
    s = s+j**2
while abs(c)<100:
    print(c)
    c = c**2
In [7]:
# range
m2 = range(10) # 10 values from 0 to 9
m3 = range(2,15,3) # 2 to 14, by steps of 3
print('m2 :', m2)
print('m3 :', m3)
In [8]:
# indexing
print(m2[3:]) # m2 from 3 to the end
print(m2[:5]) # m2 from beginning to 4
print(m2[::2]) # m2 by step of 2
print(m2[:7:2]) # m2 from beginning to 6 by step of 2
print(m2[:-3]) # m2 with all but last 3
print(m2[::-1]) # m2 reversed
In [9]:
def func1(arg1):
    "example of function"
    do(arg1)
    return value
# arguments may have default values, in which case they are optional (but come last in the argument list)
def func2(arg1, arg2="default", arg3=3, arg4=None):
    "example of default arguments in function definition"
    if arg4 is None: # preferred to == None
        # NEVER EVER use a mutable ([] for instance) as default value
        arg4 = []
    return (arg1, arg2, arg3, arg4)
print ( func2(5) )
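The warning about mutable defaults can be made concrete with a short sketch. The two helper functions below are hypothetical and only serve the illustration: a default list is created once, when the function is defined, and is then shared by every call.

```python
def append_bad(item, bucket=[]):
    "hypothetical helper: the default list is created once and shared by all calls"
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    "same helper, using the None idiom"
    if bucket is None:
        bucket = []          # a fresh list on every call
    bucket.append(item)
    return bucket

first_bad = list(append_bad(1))     # copy of the result after the first call
second_bad = list(append_bad(2))    # the shared default still holds the 1
first_good = append_good(1)
second_good = append_good(2)

print("bad:  {} then {}".format(first_bad, second_bad))
print("good: {} then {}".format(first_good, second_good))
```

The second call to `append_bad` returns a list with two elements even though only one item was passed, which is the behavior the `None` idiom avoids.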
In [10]:
func2(6)
Out[10]:
Unlike in some scripting languages, values are typed. However, a variable can successively hold values of different types, and a function can accept any type as long as the operations it uses are defined for that type.
In [11]:
def combine(x,y):
    " combines two vars, using + and *"
    return x + 2*y
In [12]:
# this works
print ( combine(1, 2))
print ( combine("a", "b"))
print ( combine(d, dd))
# this doesn't
print ( combine("a", 2))
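To see the failing case without stopping the program, the error can be caught. This is a minimal sketch that repeats `combine` so it is self-contained; the variable names are arbitrary.

```python
def combine(x, y):
    "combines two vars, using + and *"
    return x + 2*y

ok_numbers = combine(1, 2)        # 1 + 2*2
ok_strings = combine("a", "b")    # "a" + "bb"

try:
    result = combine("a", 2)      # "a" + 4 : str + int is not defined
except TypeError as err:
    result = None
    print("combine('a', 2) failed: {}".format(err))
```

The types are checked only when the operation is actually executed, which is why the definition of `combine` itself raises no error.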
In [13]:
class MyClass(object): # here we inherit from basic object - may inherit from any other class
    "a minimal class"
    def __init__(self, arg):
        "this is the 'creator' of the object"
        self.arg1 = arg # here you create an object attribute
        self.arg2 = "initial" # here another
    def method1(self):
        "here we define a method for this object"
        if self.arg1:
            v = self.arg2.upper()
        else:
            v = self.arg2.lower()
        return v
# then we can create
ob1 = MyClass(True)
ob1.arg2 = "ExAmPlE"
print ( ob1.method1() )
ob1.arg1 = False
print ( ob1.method1() )
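Inheriting from another class follows the same pattern. The sketch below is an illustration with a hypothetical subclass, `MyReversedClass`; the parent class is repeated so the example runs on its own.

```python
class MyClass(object):
    "same minimal class, repeated so the sketch is self-contained"
    def __init__(self, arg):
        self.arg1 = arg
        self.arg2 = "initial"
    def method1(self):
        if self.arg1:
            return self.arg2.upper()
        return self.arg2.lower()

class MyReversedClass(MyClass):     # hypothetical subclass
    "inherits __init__ unchanged and overrides method1"
    def method1(self):
        v = MyClass.method1(self)   # call the parent implementation
        return v[::-1]              # then reverse the string

ob2 = MyReversedClass(True)
ob2.arg2 = "ExAmPlE"
reversed_upper = ob2.method1()
print(reversed_upper)
```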
Standard Python has a complete library of packages, which cover about everything you want to do with a computer (regular expressions, sockets, web sites, interface with the OS, cryptography, threads, multiprocessing, etc.).
To load and use a library in a program, simply do one of these:
import library
#then use
library.tool()
import library as lib # just an alias
#then use
lib.tool()
from library import tool
#then use
tool()
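Applied to a real module, the three forms look as follows. The standard `math` module is chosen here only as an example.

```python
import math                 # form 1: full module name
import math as m            # form 2: alias
from math import sqrt       # form 3: import a single name

r1 = math.sqrt(16.0)
r2 = m.sqrt(16.0)
r3 = sqrt(16.0)
print("{} {} {}".format(r1, r2, r3))
```

All three calls reach the same function; the forms differ only in how the name is spelled in your code.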
You should definitely check the documentation
Python is a real, full-fledged language, created to be simple yet not limited. You should go through the online tutorial to get ideas about the possibilities of the language.
This is in contrast with most scripting languages, which are usually limited and/or started as a quick hack, and contain some initial defects which are hard to get rid of.
It is also in contrast with specific languages (R, Matlab, PHP), which are optimized for a given task but have a hard time doing something else (try doing big stats in PHP, or a web site in Matlab!)
In [14]:
ls 'FTICR/Files/bruker ubiquitin file/ESI_pos_Ubiquitin_000006.d/'
In [15]:
cat 'FTICR/Files/bruker ubiquitin file/readme.txt'
and ! can be used to call more specific Unix commands
In [16]:
!find . -name '*.method'
In [17]:
import numpy # this is how you load an external library
import numpy as np # this is the standard way of loading numpy
x = np.linspace(0,5,1000) # create a series of 1000 points ranging from 0.0 to 5.0
y = 1 + np.sin(x)**2 # do some arithmetic with x
print('y_100: ',y[100]) # then elements appear like simple lists
In [18]:
# multidimensional
mat = np.array([[0,1,2],[3,4,5],[6,7,8]])
print(mat)
print(mat[1,2])
print(mat[1,:])
print(mat[:,2])
print(mat.T)
In [19]:
# creators
print(np.zeros(10))
print(np.zeros(5, dtype=complex))
print(np.ones(10))
print(np.arange(10)) # note the int
print(np.arange(10.0)) # note the float
In [20]:
print("initialize to 0 or 1")
print(np.zeros( (2,3) ) ) # note the tuple
print(np.ones( (3,2) ) )
print("a diagonal matrix")
print(np.eye(5))
print( np.eye(5).shape )
print ("a random array")
print(np.random.randn( 5,3 ) ) # note the 2 arguments
print ("you can have more than 2 dimensions")
print(np.random.randn( 4,3,2 ) ) #
In [21]:
A = np.eye(5)
B = np.random.randn(5,5)
print("you can do arithmetic with arrays")
D = A -2*B # arithmetic
x = np.arange(5.0)
y = x*x # this is an element-wise multiplication
print (np.dot(y,y)) # this is the scalar product
print (np.dot(D,y)) # this is the matrix product
In [22]:
A
Out[22]:
In [23]:
x = np.linspace(0,10,100000) # 1E5 points
%timeit y = np.sin(2*x + 3*x**3)
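Outside IPython, the standard `timeit` module gives a similar measurement. The sketch below is an illustration added here: it compares the vectorized expression with an explicit Python loop on a smaller array. The two function names are hypothetical, and the timings depend on the machine.

```python
import math
import timeit
import numpy as np

x = np.linspace(0, 10, 10000)

def loop_version(values):
    "pure Python loop: one math.sin call per element"
    return [math.sin(2*v + 3*v**3) for v in values]

def numpy_version(values):
    "the same computation applied to the whole array at once"
    return np.sin(2*values + 3*values**3)

t_loop = timeit.timeit(lambda: loop_version(x), number=3)
t_numpy = timeit.timeit(lambda: numpy_version(x), number=3)
print("loop:  {:.4f} s".format(t_loop))
print("numpy: {:.4f} s".format(t_numpy))

# both versions compute the same values
same = np.allclose(loop_version(x), numpy_version(x))
```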
Difference with Matlab
This is very close to the Matlab approach. However, there are some differences (simplified here):
- `*` in numpy is equivalent to `.*` in Matlab
- `numpy.dot()` in numpy is equivalent to `*` in Matlab
Additionally, memory management is somewhat better than in Matlab.
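The distinction between the element-wise product and the matrix product can be checked on a small case. The matrices below are arbitrary illustrative values.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])
N = np.array([[0, 1], [1, 0]])

elementwise = M * N           # element by element (Matlab: M .* N)
matrix_prod = np.dot(M, N)    # rows times columns (Matlab: M * N)

print(elementwise)
print(matrix_prod)
```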
In [24]:
D = A - 2*B # this creates a new matrix in memory
A -= 2*B # this does not
print(A)
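One way to observe the difference is to keep a second name pointing at the array before modifying it. This is a minimal sketch with small arbitrary matrices; `alias` is a name introduced for the illustration.

```python
import numpy as np

A = np.eye(3)
B = np.ones((3, 3))
alias = A                     # a second name for the same array

D = A - 2*B                   # allocates a new array for the result
separate = not np.shares_memory(D, A)

A -= 2*B                      # modifies the existing array in place
still_same = alias is A       # A is still the very same object

print("D is a separate array: {}".format(separate))
print("A kept its identity:   {}".format(still_same))
print(alias[0, 0])            # the change is visible through alias
```

The in-place form matters for large arrays, where an extra temporary copy can be costly; it also means that every other name bound to the array sees the change.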
In [25]:
import matplotlib.pyplot as plt # traditional import
# this magic command embeds graphics into the page
%matplotlib inline
x = np.linspace(0, 4*np.pi, 100)
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x), 'r') # 'r' means red
Out[25]:
check
- `semilogx()` `semilogy()` `loglog()` for log-plots
- `scatter()` `stem()` `bar()` for different formats
- `contour()` `contourf()` for 2D and `imshow()` for images
Documentation is a bit complex and confusing!
Useful references:
In [26]:
plt.figure(figsize=(8,6)) # forces size (x,y)
for beta in range(11):
    plt.plot(np.kaiser(100, beta), label=r"$\beta=%.1f$"%beta)
    # create a label, using LaTeX syntax and % operator for string formatting
plt.legend(loc=0) # show the legend, loc=0 means "optimal" zone
Out[26]:
Using the scatter function to encode 4 values: x, y, size, color
In [27]:
N = 50
x = np.random.rand(N) # generates random values
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radiuses
plt.scatter(x, y, s=area, c=colors, alpha=0.5) # alpha is transparency
Out[27]:
Contour plots are possible also, as well as multi-images
In [28]:
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
xx, yy = np.meshgrid(x, y)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
plt.figure(figsize=(10,4)) #
plt.subplot(121) # 1 line 2 columns
h = plt.contourf(x,y,z) # 'filled' contours
plt.subplot(122)
h = plt.contour(x,y,z) # empty ones
This whole series (python, ipython, jupyter, numpy, scipy, matplotlib) makes a very nice environment for scientists. It is free, fast, quite complete, and very efficient.
There is in ipython a magic command that imports everything into the current space:
%pylab inline
We are not going to use it, as it is a quick hack, good for tiny projects, and considered harmful by many, and we are here in a school!
In [29]:
print("compute a difference")
x = numpy.linspace(0,10,1000)
y = numpy.sin(x)
yp = (y[1:] - y[:-1])/(x[1] - x[0]) # finite difference divided by the actual step of x
plt.plot(x, y, label='y' )
plt.plot(x[1:], yp, label="yp")
plt.legend()
Out[29]:
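The finite-difference derivative of sin(x) can be checked against the analytical one, cos(x). The sketch below is an illustration added here; dividing by the actual grid step `x[1] - x[0]` and comparing at the interval midpoints are choices of this sketch, not of the original cell.

```python
import numpy as np

x = np.linspace(0, 10, 1000)
y = np.sin(x)
dx = x[1] - x[0]                       # actual grid step, 10/999

yp = (y[1:] - y[:-1]) / dx             # difference of consecutive points
yp_diff = np.diff(y) / dx              # same thing with numpy.diff

# the difference between two points estimates the slope at their midpoint
x_mid = 0.5 * (x[1:] + x[:-1])
max_err = np.max(np.abs(yp - np.cos(x_mid)))
print("largest deviation from cos(x): {:.2e}".format(max_err))
```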
In [30]:
print("accessing pictures")
s = plt.imread("clown.jpg")
print(s.shape)
plt.imshow(s)
Out[30]:
In [31]:
plt.plot(s[70,:,1])
Out[31]:
In [32]:
print("compute histogram")
sg = s.sum(axis=2)/3.0
print (sg.shape)
h = plt.hist(sg.ravel(), bins=255)
In [33]:
print("accessing pictures")
s = plt.imread("embryos.tif")
print(s.shape)
plt.imshow(s)
Out[33]:
In [34]:
for i in range(3):
    plt.figure()
    plt.imshow(s[:,:,i], cmap='gray')
In [35]:
c1 = 1.0*s[:,:,0]
h = plt.hist(c1.ravel(), bins=255)
In [36]:
plt.plot(h[0])
Out[36]:
In [37]:
print("thresholding")
mask = numpy.where(c1<162,1,0)
plt.imshow(mask, cmap="gray_r")
Out[37]:
In [38]:
cleaned = c1*mask
plt.imshow(cleaned, cmap="gray")
Out[38]:
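The thresholding and masking steps are easier to follow on a tiny array. The values below are made up for the illustration and stand in for the pixel intensities of `c1`.

```python
import numpy as np

# made-up intensities standing in for one color channel
c1 = np.array([[ 10.0, 200.0,  50.0],
               [180.0, 161.0, 162.0]])

mask = np.where(c1 < 162, 1, 0)   # 1 where the pixel is below the threshold
cleaned = c1 * mask               # pixels at or above the threshold become 0

print(mask)
print(cleaned)
```

Note that the comparison is strict, so a pixel exactly equal to 162 is masked out.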
In [39]:
import matplotlib.cbook as cbook
lena = plt.imread(cbook.get_sample_data("lena.png"))
plt.imshow(lena)
print("Lena tells you good bye!")
This is obvious: your research is useless to the community unless it is accessible to others.
That is why we write publications and present at conferences.
A program is a way of presenting your ideas
Whether it is
It actually presents what you did (or want to do).
So a program is as valuable as the text of the publication. It expresses science.
For this reason it should be
There are tools and methods to help manage programs:
- `git`
- `hg` (mercurial)
One of the purposes of this school was to create some awareness among scientists that
is an organization dedicated to teaching computing skills to scientists, with support from the Alfred P. Sloan Foundation and the Mozilla Foundation.
Activities: