DIC LAB 2 Problem 2 B : Repurpose Data : European Soccer Data


In [25]:
library("XML")
library("methods")
library("dplyr")
library("tidyr")

Read the player data


In [26]:
playersFinalData <- read.csv(file="player.csv", header=TRUE, sep=",")

Read the team data


In [27]:
teamFinalData <- read.csv(file="team.csv", header=TRUE, sep=",")

Display the data


In [28]:
head(playersFinalData)


player_api_idplayer_nameoverall_ratinggk_ratingappearancesleague_appearances
505942 Aaron Appindangoye67 43 8 1
155782 Aaron Cresswell 74 121 75 1
162549 Aaron Doran 71 125 104 1
30572 Aaron Galindo 75 111 10 1
23780 Aaron Hughes 78 129 162 1
27316 Aaron Hunt 79 132 158 1

Display the data


In [29]:
head(teamFinalData)


team_api_idteam_long_namechance_creationbuild_up_playdefenseawayGoalshomeGoalsgoals
9987 KRC Genk 182 148 205 143 205 348
9993 Beerschot AC 185 NA 210 62 112 174
10000 SV Zulte-Waregem 153 147 200 130 176 306
9994 Sporting Lokeren 154 159 200 130 149 279
9984 KSV Cercle Brugge200 145 195 94 119 213
8635 RSC Anderlecht 182 149 190 180 247 427

Qn) Which are the top 10 players by overall rating?


In [30]:
topPlayer = tbl_df(playersFinalData) %>% select(player_name,overall_rating) %>% top_n(10) %>% arrange(desc(overall_rating))


Selecting by overall_rating

In [31]:
head(topPlayer,n=10)


player_nameoverall_rating
Lionel Messi 94
Cristiano Ronaldo93
Gianluigi Buffon 93
Wayne Rooney 93
Gregory Coupet 92
Xavi Hernandez 92
Alessandro Nesta 91
Andres Iniesta 91
Fabio Cannavaro 91
Iker Casillas 91

Qn) Which are the top 10 goalkeepers by sum of gk attributes?


In [32]:
topGoalKeeper = tbl_df(playersFinalData) %>% select(player_name,gk_rating) %>% top_n(10) %>% arrange(desc(gk_rating))


Selecting by gk_rating

In [33]:
head(topGoalKeeper,n=10)


player_namegk_rating
Gianluigi Buffon 449
Gregory Coupet 447
Petr Cech 445
Iker Casillas 442
Manuel Neuer 440
Julio Cesar 438
Sebastian Frey 437
Edwin van der Sar436
David De Gea 429
Pepe Reina 428

Qn) Which are the top 10 players by number of appearances (with any team they played with)?


In [34]:
topAppearances = select(playersFinalData,player_name,appearances) %>% top_n(10) %>% arrange(desc(appearances))


Selecting by appearances

In [35]:
head(topAppearances,n=10)


player_nameappearances
Steve Mandanda 300
Stephane Ruffier 294
Gorka Iraizoz Moreno286
Samir Handanovic 286
Hugo Lloris 282
Tim Howard 282
Joe Hart 275
Cedric Carrasso 268
Daniel Congre 262
Moussa Sissoko 260

Qn) Which are the top 10 players by number of leagues they played in?


In [36]:
topLeagueAppearances = select(playersFinalData,player_name,league_appearances) %>% top_n(10) %>% arrange(desc(league_appearances))


Selecting by league_appearances

In [37]:
head(topLeagueAppearances,n=10)


player_nameleague_appearances
Gelson Fernandes 6
Filip Djuricic 5
Ricky van Wolfswinkel5
Aleksandar Tonev 4
Alfred Finnbogason 4
Almen Abdi 4
Aly Cissokho 4
Angel Di Maria 4
Bojan Krkic 4
Bryan Ruiz 4

Qn) Which are the top 10 teams by sum of build up play attributes?


In [38]:
topBuildUpTeam = tbl_df(teamFinalData) %>% select(team_long_name,build_up_play) %>% top_n(10) %>% arrange(desc(build_up_play))


Selecting by build_up_play

In [39]:
head(topBuildUpTeam,n=10)


team_long_namebuild_up_play
Torino 209
Udinese 205
Fortuna Düsseldorf 198
Podbeskidzie Bielsko-Biała193
Korona Kielce 192
Carpi 190
Widzew Łódź 190
Widzew Łódź 190
FC Nantes 189
Lazio 188

Qn) Which are the top 10 teams by sum of chance creation attributes?


In [40]:
topChanceCreationTeam = tbl_df(teamFinalData) %>% select(team_long_name,chance_creation) %>% top_n(10) %>% arrange(desc(chance_creation))


Selecting by chance_creation

In [41]:
head(topChanceCreationTeam,n=10)


team_long_namechance_creation
Lazio 220
SV Werder Bremen 216
Napoli 215
Manchester City 210
Tottenham Hotspur 210
Birmingham City 210
Wolverhampton Wanderers210
Burnley 210
Swansea City 210
Queens Park Rangers 210

Qn) Which are the top 10 teams by sum of defense attributes?


In [42]:
topDefenseTeam = tbl_df(teamFinalData) %>% select(team_long_name,defense) %>% top_n(10) %>% arrange(desc(defense))


Selecting by defense

In [43]:
head(topDefenseTeam,n=10)


team_long_namedefense
Beerschot AC 210
KVC Westerlo 210
Sporting Charleroi 210
West Bromwich Albion 210
Hull City 210
Birmingham City 210
Wolverhampton Wanderers210
Blackpool 210
Swansea City 210
Queens Park Rangers 210

Qn) Which are the top 10 teams by number of scored goals?


In [44]:
topGoalScoringTeams = teamFinalData %>% select(team_long_name,goals) %>% top_n(10) %>% arrange(desc(goals))


Selecting by goals

In [45]:
head(topGoalScoringTeams,n=10)


team_long_namegoals
FC Barcelona 849
Real Madrid CF 843
Celtic 695
FC Bayern Munich 653
PSV 652
Ajax 647
FC Basel 619
Manchester City 606
Chelsea 583
Manchester United582

References


In [ ]: