A brief tutorial on basic Python

From Wikipedia: "Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java. The language provides constructs intended to enable clear programs on both a small and large scale."

Through this tutorial, students will learn some basic characteristics of the Python programming language that will be useful for working with corpora of text data.

1. Introduction to Strings

Among the different native Python types, we will focus on strings, since they are the core type that we will use to represent text. Essentially, a string is just a sequence of characters.



In [1]:

    
str1 = '"Hola" is how we say "hello" in Spanish.'
str2 = "Strings can also be defined with double quotes; try to be systematic."

It is easy to check the type of a variable with the type() function:



In [2]:

    
print str1
print type(str1)
print type(3)
print type(3.)

The following commands implement some common operations with strings in Python. Have a look at them, and try to deduce what the result of each operation will be. Then, execute the commands and check the actual results.



In [3]:

    
print str1[0:5]



In [4]:

    
print str1+str2



In [5]:

    
print str1.lower()



In [6]:

    
print str1.upper()



In [7]:

    
print len(str1)



In [8]:

    
print str1.replace('h','H')



In [2]:

    
str3 = 'This is a question'    # named str3 to avoid hiding the built-in type str
str3 = str3.replace('i', 'o')  # replace() returns a new string, so keep the result
str3 = str3.lower()
print str3[0:4]

It is worth noticing the difference between how 'lower' and 'len' are used. Python is an object-oriented language, and str1 is an instance of the Python class 'str'. Thus, str1.lower() invokes the method lower() of the class to which the object str1 belongs, whereas len(str1) and type(str1) call built-in functions that are not methods of that class and receive the object as an argument. In any case, we will not pay (much) attention to these issues during the session.
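The distinction between methods and built-in functions can be illustrated with a minimal sketch. The variable names below are hypothetical, and print is written with parentheses (an assumption that differs from the rest of the tutorial) so that the fragment runs on both Python 2 and Python 3.

```python
# Hypothetical example string, not taken from the cells above.
greeting = 'Hello World'

# lower() is a method: it is called on the string object itself.
lowered = greeting.lower()

# len() is a built-in function: the string is passed as an argument.
length = len(greeting)

# Strings are immutable: a method returns a new string and leaves
# the original one untouched.
print(greeting)   # Hello World
print(lowered)    # hello world
print(length)     # 11
```

Because the original string never changes, the result of a method such as lower() or replace() has to be stored in a variable if it is needed later.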

Finally, note that some characters require special consideration. Apart from language-specific characters or special symbols (e.g., €), the following characters are commonly used to denote a carriage return and the start of a new line:



In [9]:

    
print 'This is just a carriage return symbol.\r Second line will start on top of the first line.'



In [10]:

    
print 'If you wish to start a new line,\r\nthe line feed character should also be used.'



In [11]:

    
print 'But note that most applications are tolerant\nto the use of \'line feed\' only.'
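When working with text from different sources, both line-ending conventions may appear. The following sketch, which uses hypothetical sample strings and parenthesized print so that it runs on Python 2 and Python 3, shows one possible way to inspect and normalize them with the string methods repr() and splitlines().

```python
# Hypothetical sample texts using the two common line-ending conventions.
windows_text = 'first line\r\nsecond line'
unix_text = 'first line\nsecond line'

# repr() shows the special characters explicitly instead of interpreting them.
print(repr(windows_text))

# splitlines() accepts both conventions and drops the line-ending characters.
windows_lines = windows_text.splitlines()
unix_lines = unix_text.splitlines()

# Both texts yield the same list of lines.
print(windows_lines == unix_lines)   # True
```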

2. Working with Python lists

Python lists are containers that hold a number of other objects in a given order. To create a list, just put comma-separated values between square brackets:



In [1]:

    
list1 = ['student', 'teacher', 1997, 2000]
print list1
list2 = [1, 2, 3, 4, 5 ]
print list2
list3 = ["a", "b", "c", "d"]
print list3

To access a list element, write its index between square brackets after the list name. A range of indices (a slice) returns several elements at once.

Run the code fragment below, and try to guess what the output of each command will be.

Note: Python indexing starts from 0!



In [7]:

    
print list1[0]
print list2[2:4]
print list3[-1]
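The indexing rules can be summarized in a short sketch. The list below is a hypothetical copy of list3, and print is written with parentheses (an assumption) so that the fragment runs on both Python 2 and Python 3.

```python
letters = ['a', 'b', 'c', 'd']   # hypothetical list, same content as list3

first = letters[0]        # 'a': the first element has index 0
middle = letters[1:3]     # ['b', 'c']: index 1 is included, index 3 is excluded
last = letters[-1]        # 'd': negative indices count from the end
head = letters[:2]        # ['a', 'b']: an omitted start means "from the beginning"

print(first)
print(middle)
print(last)
print(head)
```

Note that a single index returns an element, while a slice always returns a new list, even when it contains only one element.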

To add an element at the end of a list, use the method append(); to remove the first occurrence of a given value, use the method remove():



In [13]:

    
list1 = ['student', 'teacher', 1997, 2000]
list1.append(3)
print list1
list1.remove('teacher')
print list1
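Two details of remove() are easy to overlook: it deletes only the first matching element, and it raises a ValueError when the value is not in the list. The sketch below uses a hypothetical list to show both, with parenthesized print (an assumption) so that it runs on Python 2 and Python 3.

```python
values = ['a', 'b', 'a', 'c']   # hypothetical list with a repeated element

values.remove('a')   # removes only the first 'a'
print(values)        # ['b', 'a', 'c']

# remove() raises ValueError when the value is absent,
# so the 'in' operator is used to check first.
if 'z' in values:
    values.remove('z')
else:
    print('z is not in the list')

values.append('d')   # append() always adds at the end
print(values)        # ['b', 'a', 'c', 'd']
```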

Other useful functions are:

len(list): returns the number of elements in the list.
max(list): returns the item of the list with the maximum value.
min(list): returns the item of the list with the minimum value.



In [7]:

    
list2 = [1, 2, 3, 4, 5 ]
print len(list2)
print max(list2)
print min(list2)

3. Flow control (with 'for' and 'if')

As in other programming languages, Python offers mechanisms to loop through a piece of code several times, or to execute a code fragment only when certain conditions are satisfied.

For conditional execution, you can use the 'if', 'elif' and 'else' statements.

Try to play with the following example:



In [9]:

    
x = int(raw_input("Please enter an integer: "))
if x < 0:
    x = 0
    print 'Negative changed to zero'
elif x == 0:
    print 'Zero'
elif x == 1:
    print 'Single'
else:
    print 'More'

The above fragment also allows us to discuss some important characteristics of the Python syntax:

Unlike some other languages, Python does not use an 'end' keyword to indicate where a code fragment finishes. Instead, Python relies on indentation.
Indentation in Python is mandatory. The usual convention is 4 spaces per indentation level; whatever amount is chosen must be consistent within a block.
The condition lines end with ':' and are followed by the indented blocks that will be executed only when the indicated conditions are satisfied.
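The role of indentation becomes clearer with nested blocks. The following is a minimal sketch with a hypothetical fixed value instead of keyboard input, written with parenthesized print (an assumption) so that it runs on both Python 2 and Python 3.

```python
number = 7   # hypothetical value, used instead of keyboard input

if number > 0:
    # First indentation level: runs only when number is positive.
    if number % 2 == 0:
        # Second indentation level: runs only when number is positive and even.
        parity = 'even'
    else:
        parity = 'odd'
    message = 'positive and ' + parity
else:
    message = 'zero or negative'

# Back at the left margin: this line runs in every case.
print(message)   # positive and odd
```

Each additional level of nesting adds another 4 spaces, and returning to a previous indentation level is what closes a block.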

The 'for' statement lets you iterate over the items of any sequence (such as a list or a string), in the order in which they appear in the sequence:



In [24]:

    
words = ['cat', 'window', 'open-course']
for w in words:
    print w, len(w)

In combination with enumerate(), you can iterate over the elements of the sequence and keep a counter of their positions:



In [26]:

    
words = ['cat', 'window', 'open-course']
for (i, w) in enumerate(words):
    print 'element ' + str(i) + ' is ' + w
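Loops and conditions are often combined. As a sketch, the fragment below keeps only the words longer than 3 characters together with their positions. The threshold and the name long_words are hypothetical, and print is written with parentheses (an assumption) so that the code runs on Python 2 and Python 3.

```python
words = ['cat', 'window', 'open-course']

long_words = []
for (i, w) in enumerate(words):
    # Keep the position and the word only when the word has more than 3 characters.
    if len(w) > 3:
        long_words.append((i, w))

print(long_words)   # [(1, 'window'), (2, 'open-course')]
```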

4. File input and output operations

First of all, you need to open the file with the open() function (when a file that does not exist is opened for writing or appending, it is created).



In [38]:

    
f = open('workfile', 'w')

The first argument is a string containing the filename. The second argument defines the mode in which the file will be used:

'r' : the file is opened for reading only,
'w' : the file is opened for writing only (an existing file with the same name is erased),
'a' : the file is opened for appending; any data written to the file is automatically added at the end,
'r+': the file is opened for both reading and writing.

If the mode argument is not included, 'r' will be assumed.

Use f.write(string) to write the contents of a string to the file. When you are done, do not forget to close the file:



In [39]:

    
f.write('This is a test\n with 2 lines')
f.close()

To read the whole content of a file, use the method f.read():



In [42]:

    
f2 = open('workfile', 'r')
text=f2.read()
f2.close()
print text
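The effect of the different modes can be compared in a short sketch. The file name is hypothetical and the file is placed in a temporary directory (an assumption made so that no existing file is touched); print is written with parentheses so that the code runs on both Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

f = open(path, 'w')      # 'w' creates the file (or erases an existing one)
f.write('first\n')
f.close()

f = open(path, 'a')      # 'a' keeps the content and writes at the end
f.write('second\n')
f.close()

f = open(path)           # no mode given, so 'r' is assumed
content = f.read()
f.close()

print(repr(content))     # repr() makes the line feed characters visible
```

Opening the same file again with 'w' instead of 'a' would have discarded the first line.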

You can also read the file line by line by iterating over the file object:



In [44]:

    
f2 = open('workfile', 'r')
for line in f2:
    print line

f2.close()
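As an alternative that this tutorial does not otherwise cover, the 'with' statement closes the file automatically when its indented block ends. The sketch below assumes a hypothetical file in a temporary directory and also strips the trailing line feed from each line; print is written with parentheses so that the code runs on Python 2 and Python 3.

```python
import os
import tempfile

# Hypothetical file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')

# The file is closed automatically at the end of each 'with' block.
with open(path, 'w') as f:
    f.write('This is a test\n with 2 lines')

lines = []
with open(path, 'r') as f:
    for line in f:
        # Each line keeps its final '\n'; rstrip('\n') removes it.
        lines.append(line.rstrip('\n'))

print(lines)   # ['This is a test', ' with 2 lines']
```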

5. Importing modules

Python lets you define modules, which are files consisting of Python code. A module can define functions, classes and variables.

Most Python distributions already include many popular modules, which make our lives as programmers easier. Some well-known libraries are: time, sys, os, numpy, ... (time, sys and os belong to the standard library, while numpy is a third-party package that may need to be installed separately.)

There are several ways to import a library:

1) Import the whole library: import lib_name

Note: You have to call its functions with the library name as a prefix



In [4]:

    
import time
print time.time()  # returns the current time in seconds since the epoch
time.sleep(2) # suspends execution for the given number of seconds
print time.time() # about 2 seconds later than the first value

2) Define a short name to use the library: import lib_name as lib



In [6]:

    
import time as t
print t.time()

3) Import only some elements of the library

Note: now you have to call the functions directly, without the library name as a prefix



In [2]:

    
from time import time, sleep
print time()
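The three import styles can be compared side by side with another standard library module. This is a sketch using os.path; the alias osp and the file path are hypothetical, nothing is read from disk, and print is written with parentheses (an assumption) so that the code runs on Python 2 and Python 3.

```python
import os                      # style 1: whole library, names need the 'os.' prefix
import os.path as osp          # style 2: short alias (osp is an arbitrary name)
from os.path import basename   # style 3: import one name and call it directly

# Hypothetical path; it is only built as a string, the file is never opened.
filename = os.path.join('data', 'corpus.txt')

print(osp.basename(filename))   # corpus.txt
print(basename(filename))       # corpus.txt
```

The third style is the shortest to type, but the first two make it easier to see which library a function comes from.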