Functions

We've already seen functions like rep, seq or sample in the part about vectors. Functions let us do the same things we could eventually do with long scripts, but in a more logical way. Functions save us time and typing, shorten the code and make it clearer.

This is an example where you can use either the rep function or a for loop to achieve the same thing. But rep is more logical, shorter and nicer :)


In [6]:
vec_old = rep(c("a","b"),5)

vec = c("a", "b")
vec_new = c()
up_to = 5
for (i in 1:up_to){
    vec_new = c(vec_new, vec)
}

vec_new == vec_old


  1. TRUE
  2. TRUE
  3. TRUE
  4. TRUE
  5. TRUE
  6. TRUE
  7. TRUE
  8. TRUE
  9. TRUE
  10. TRUE
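The `==` operator compares the vectors element by element, which is why ten TRUE values come back. If a single answer is wanted, a minimal sketch using base R's `all` and `identical` could look like this (the result names `all_equal` and `same_object` are made up for this example):

```r
vec_old = rep(c("a", "b"), 5)

vec = c("a", "b")
vec_new = c()
for (i in 1:5) {
    vec_new = c(vec_new, vec)
}

# collapse the element-wise comparison into a single TRUE/FALSE
all_equal = all(vec_new == vec_old)

# identical() also checks type and length in one go
same_object = identical(vec_new, vec_old)

print(all_equal)
print(same_object)
```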

Common functions

Here are some of the most commonly used default functions:


In [1]:



25
1
2
1

In [3]:
seq(1, 20, by = 3)


  1. 1
  2. 4
  3. 7
  4. 10
  5. 13
  6. 16
  7. 19

In [2]:
rep(c(1:5), 3)


  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 1
  7. 2
  8. 3
  9. 4
  10. 5
  11. 1
  12. 2
  13. 3
  14. 4
  15. 5

In [4]:
sqrt(16)


4

In [5]:
num_vec = 1:25

In [6]:
length(num_vec)


25

In [7]:
sum(num_vec)


325

In [8]:
mean(num_vec)


13

In [9]:
min(num_vec)


1

In [10]:
which(num_vec > 15)


  1. 16
  2. 17
  3. 18
  4. 19
  5. 20
  6. 21
  7. 22
  8. 23
  9. 24
  10. 25
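Most of these functions also accept named arguments that change how they behave. A small sketch with three arguments that exist in base R (`each`, `length.out` and `na.rm`); the variable names are only illustrative:

```r
# each = repeats every element before moving on to the next one
rep_each = rep(c("a", "b"), each = 2)            # "a" "a" "b" "b"

# length.out = asks for a number of values instead of a step size
seq_len_out = seq(0, 1, length.out = 5)          # 0.00 0.25 0.50 0.75 1.00

# na.rm = TRUE drops missing values before the mean is computed
mean_no_na = mean(c(1, 2, NA, 3), na.rm = TRUE)  # 2
```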

Packaging

Functions come in packages. rep, seq, length etc. are part of the {base} R package. Packages are the most powerful aspect of R.

Packages are installed like this:


In [ ]:
install.packages("ggplot2")
pack = c("dplyr", "ez")  # the ezANOVA function lives in the "ez" package
install.packages(pack)

Once installed, packages need to be loaded in each project/R instance. Loaded packages are available to all scripts in the R session.


In [ ]:
library(dplyr)
library(ggplot2)
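Installing a package again on every run is slow, so a script can first check whether the package is already there. A minimal sketch, assuming base R's `requireNamespace`; the helper name `is_installed` is made up for this example:

```r
# hypothetical helper: TRUE if the package can be loaded, FALSE otherwise
is_installed = function(package_name){
    requireNamespace(package_name, quietly = TRUE)
}

# "stats" ships with R, so this should be TRUE on a standard installation
has_stats = is_installed("stats")
print(has_stats)

# install only when missing (commented out so nothing gets downloaded here)
# if (!is_installed("ggplot2")) install.packages("ggplot2")
```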

Your own functions

As soon as you are comfortable with scripts, you should write your own functions to get rid of repetitive tasks.

Functions are defined as:


In [ ]:
fun = function(argument1, argument2){
    # does something
    return("SUCCESS")
}

For example, if you are working with lots of triangles, you can end up in the following situation:


In [ ]:
sideA1 = 3
sideA2 = 4
sideA3 = sqrt((sideA1 ^ 2) + (sideA2 ^ 2))

sideB1 = 10
sideB2 = 15
sideB3 = sqrt((sideB1 ^ 2) + (sideB2 ^ 2))

It would work, but it looks like crap. We can do several things about it. First, let's put the triangles into vectors to keep things neat:


In [7]:
triangle1 = c(3, 4, NA)
triangle2 = c(10, 15, NA)
triangle1[3] = sqrt((triangle1[1] ^ 2) + (triangle1[2] ^ 2))
triangle2[3] = sqrt((triangle2[1] ^ 2) + (triangle2[2] ^ 2))

Nicer, but it still looks kinda crap. Let's fix it with a function.


In [ ]:
# calculate triangle hypotenuse
hypotenuse = function(sideA, sideB){
  hypot = sqrt((sideA ^ 2) + (sideB ^ 2))
  return(hypot)
}

Now let's do the same thing with the function.


In [ ]:
triangle1 = c(3, 4, NA)
triangle2 = c(10, 15, NA)
triangle1[3] = hypotenuse(triangle1[1], triangle1[2])
triangle2[3] = hypotenuse(triangle2[1], triangle2[2])

It is not more efficient, it is not significantly shorter, but it is more CLEAR TO UNDERSTAND!

Do not just write code that works, write code that is clear to read.
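Because `sqrt` and `^` work on whole vectors, the same `hypotenuse` function can also handle several triangles in one call. A small sketch (the vectors `sides_a` and `sides_b` are made up for illustration):

```r
# calculate triangle hypotenuse (same definition as above)
hypotenuse = function(sideA, sideB){
    hypot = sqrt((sideA ^ 2) + (sideB ^ 2))
    return(hypot)
}

# first and second sides of two triangles
sides_a = c(3, 10)
sides_b = c(4, 15)

# one call computes the hypotenuse of every triangle
hypots = hypotenuse(sides_a, sides_b)
print(hypots)
```

The first value is 5, as for the 3-4-5 triangle computed by hand earlier.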

Sourcing

When you write your own functions, you can save them to a separate file and then use the source function. It loads the R file and everything inside it into your environment.

Let's open a file and write into it.


In [17]:
file.create("functions-test.R")
f = file("functions-test.R", open = "w")
text = 
"kidding = function(){
    print('I am kidding')
}"
write(text, f)
close(f)


TRUE

In [19]:
source('functions-test.R')
kidding()


[1] "I am kidding"
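By default, source puts everything into the environment it is called from. If you would rather keep the sourced functions apart, one option is to pass an environment through the `local` argument. A minimal sketch, assuming a temporary file instead of a file in the project; `helpers` and `message_text` are hypothetical names:

```r
# write the function definition to a temporary file so nothing is left behind
path = tempfile(fileext = ".R")
writeLines("kidding = function(){
    return('I am kidding')
}", path)

# local = helpers loads the definitions into that environment only
helpers = new.env()
source(path, local = helpers)

message_text = helpers$kidding()
print(message_text)

unlink(path)  # remove the temporary file
```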

Function scoping

There is something called scope in programming. It deals with the fact that you will likely have multiple functions/variables of the same name in a project and want them to be different. Example:


In [9]:
add = function(number, number2){
    number = number + number2
    return(number)
}

Now what would happen if we run this:


In [10]:
number = 5
add(10, 10)
print(number)


20
[1] 5

Now here is a different function. It takes no arguments and assigns to a variable that also exists outside of it.


In [11]:
num = 10
change = function(){
    num = 0
}
change()
print(num)


[1] 10

The logic is not fully obvious and can differ from other programming languages. A function first looks for a variable inside its own scope. Assigning with = inside a function creates a local variable and leaves the outer one untouched. If the function reads a variable that it can't find locally, it uses the variable from the general scope of the entire R environment.

Let's consider something else:
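Both behaviours can be put side by side in one sketch: reading a variable that is missing inside the function falls back to the outer environment, while assigning with `=` inside the function only creates a local copy. The function names `read_num` and `set_num` are made up for this example:

```r
num = 10

# no local num exists, so R finds the one defined outside
read_num = function(){
    return(num + 1)
}

# the assignment creates a new local num; the outer one stays untouched
set_num = function(){
    num = 0
    return(num)
}

read_result = read_num()  # 11
set_result = set_num()    # 0
print(num)                # still 10
```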


In [16]:
add = function(number, number2){
    number = number + number2
    return(number)
}
change_add = function(){
    number = 0
    number2 = 5
    print(add(number, number2))
    
}
number = 10
change_add()


[1] 5

In [19]:
add = function(number){
    number = number + number2
    return(number)
}
change_add = function(){
    number = 0
    number2 = 20
    print(add(number))
    
}
number2 = 5
number = 10
change_add()


[1] 5

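R uses so-called lexical scope. It therefore searches for variables inside the function and, if they are not there, in the environment where the function was defined. For functions defined at the top level, as here, that is the general environment, not the function that made the call. If the variable is not found anywhere, R throws an error.

More info: https://darrenjw.wordpress.com/2011/11/23/lexical-scope-and-function-closures-in-r/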
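For functions defined at the top level of a script, the place where they were defined is the general environment. For a function defined inside another function, the lookup goes through the enclosing function first. A minimal sketch (`make_adder`, `add_inner` and `add_five` are hypothetical names for this example):

```r
number2 = 100

# add_inner is defined inside make_adder, so it sees make_adder's number2
make_adder = function(number2){
    add_inner = function(number){
        return(number + number2)
    }
    return(add_inner)
}

add_five = make_adder(5)
result = add_five(10)  # 15: uses number2 = 5, not the outer 100
print(result)
```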