As a lifelong Motorcycle Grand Prix (MotoGP) enthusiast, I found Ankur Vishwakarma’s Predicting MotoGP Race Finish Times using Linear Regression a fantastic read, and I immediately realized that MotoGP data is now far more available than it once was.
As part of an undergraduate research project in 2008, I briefly tried to use similar datasets to explore the effects of rider attrition and optimal expected performance timeframes, but unlike baseball data, MotoGP data was very difficult to obtain. Fast forward to today, and there are official repositories and web-scraping resources I could only have dreamt about in 2008.
Vishwakarma’s .csv dataset was scraped from the MotoGP Official Website. I have since forked his repository into my GitHub account. The following is his Data Dictionary:
Summary of Data
This file contains the following data points, for race sessions only, from 2005 through 2017 in all three racing classes.
Year - int value of racing season.
TRK - str abbreviation of track.
Track - str format.
Category - str.
Session - str indicating whether the race was a RAC (race) or RAC2 (race2) session of the day.
Date - datetime (date only) of the race.
Track_Condition - str format racetrack conditions, whether dry, wet, or dry-wet.
Track_Temp - float value of track ground temperature in Celsius (C).
Air_Temp - float value of air temperature in Celsius (C).
Humidity - float value of local humidity between 0 (0%) and 1 (100%).
Points - float value of points earned by rider during this race session.
(column name not given) - str.
Time - str value of the finishing time for the winning rider. Riders finishing 2nd or worse have the added time value. Riders who crash have the number of full completed laps indicated.
Finish_Time - timedelta value obtained from the Time column. This is each rider’s total time to finish the race. Crashed riders are ignored.
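The relationship between the Time and Finish_Time columns described above can be sketched in R. This is only an illustration, not the method used to build the dataset: the helper names `time_to_seconds` and `finish_seconds` are hypothetical, the `"5 laps"` string is a made-up stand-in for however a crashed rider's lap count is actually recorded, and the sketch assumes times are written as minutes'seconds with no hours component.

```r
# Convert a Time string to seconds. Handles "38'59.999" (minutes'seconds)
# and "+0.461" (gap to the winner); anything else becomes NA.
time_to_seconds <- function(x) {
  x <- sub("^\\+", "", x)                       # drop the leading "+" on gaps
  ok <- grepl("^([0-9]+')?[0-9]+(\\.[0-9]+)?$", x)
  has_min <- ok & grepl("'", x, fixed = TRUE)   # TRUE when a minutes part exists
  mins <- ifelse(has_min, sub("'.*$", "", x), "0")
  secs <- sub("^.*'", "", x)
  out <- rep(NA_real_, length(x))
  out[ok] <- as.numeric(mins[ok]) * 60 + as.numeric(secs[ok])
  out
}

# Total finish time in seconds for the rows of ONE race session.
finish_seconds <- function(time_col) {
  secs <- time_to_seconds(time_col)
  is_gap <- startsWith(time_col, "+")
  winner <- secs[!is_gap & !is.na(secs)][1]     # first absolute time = winner
  ifelse(is_gap, winner + secs, secs)           # non-time entries stay NA
}

# "5 laps" is a hypothetical placeholder for a crashed rider's entry.
times <- c("38'59.999", "+0.461", "+1.928", "5 laps")
print(finish_seconds(times))
```

For the 2017 Qatar values in the table further down, this works out to roughly 2339.999, 2340.460 and 2341.927 seconds for the first three riders, which agrees with the Finish_Time column (00:38:59.999, 00:39:00.460, 00:39:01.927).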
Here are the first five rows of the .csv data table:
library(pander)
# Raw CSV from the forked repository
URL <- "https://raw.githubusercontent.com/nbugliar/motogp_regression/master/MotoGP_2005_2017.csv"
dat <- read.csv(URL, stringsAsFactors = FALSE)
# First five rows, columns 2 through 22 (column 1 is skipped)
m <- dat[1:5, 2:22]
pandoc.table(m, style = "rmarkdown")
##
##
## | Year | TRK | Track | Category |
## |:----:|:---:|:--------------------------------------------------:|:--------:|
## | 2017 | QAT | Grand Prix of Qatar - Losail International Circuit | MotoGP |
## | 2017 | QAT | Grand Prix of Qatar - Losail International Circuit | MotoGP |
## | 2017 | QAT | Grand Prix of Qatar - Losail International Circuit | MotoGP |
## | 2017 | QAT | Grand Prix of Qatar - Losail International Circuit | MotoGP |
## | 2017 | QAT | Grand Prix of Qatar - Losail International Circuit | MotoGP |
##
## Table: Table continues below
##
##
##
## | Session | Date | Track_Condition | Track_Temp | Air_Temp |
## |:-------:|:----------:|:---------------:|:----------:|:--------:|
## | RAC | 2017-03-26 | Dry | 22 | 21 |
## | RAC | 2017-03-26 | Dry | 22 | 21 |
## | RAC | 2017-03-26 | Dry | 22 | 21 |
## | RAC | 2017-03-26 | Dry | 22 | 21 |
## | RAC | 2017-03-26 | Dry | 22 | 21 |
##
## Table: Table continues below
##
##
##
## | Humidity | Position | Points | Rider_Number | Rider_Name |
## |:--------:|:--------:|:------:|:------------:|:-----------------:|
## | 0.96 | 1 | 25 | 25 | Maverick VIÑALES |
## | 0.96 | 2 | 20 | 4 | Andrea DOVIZIOSO |
## | 0.96 | 3 | 16 | 46 | Valentino ROSSI |
## | 0.96 | 4 | 13 | 93 | Marc MARQUEZ |
## | 0.96 | 5 | 11 | 26 | Dani PEDROSA |
##
## Table: Table continues below
##
##
##
## | Nationality | Team_Name | Bike | Avg_Speed | Time |
## |:-----------:|:----------------------:|:------:|:---------:|:---------:|
## | SPA | Movistar Yamaha MotoGP | Yamaha | 165.5 | 38'59.999 |
## | ITA | Ducati Team | Ducati | 165.5 | +0.461 |
## | ITA | Movistar Yamaha MotoGP | Yamaha | 165.4 | +1.928 |
## | SPA | Repsol Honda Team | Honda | 165 | +6.745 |
## | SPA | Repsol Honda Team | Honda | 165 | +7.128 |
##
## Table: Table continues below
##
##
##
## | Finish_Time | GP |
## |:-------------------------:|:----------------------------------:|
## | 0 days 00:38:59.999000000 | QAT - Losail International Circuit |
## | 0 days 00:39:00.460000000 | QAT - Losail International Circuit |
## | 0 days 00:39:01.927000000 | QAT - Losail International Circuit |
## | 0 days 00:39:06.744000000 | QAT - Losail International Circuit |
## | 0 days 00:39:07.127000000 | QAT - Losail International Circuit |
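Because the Finish_Time values print as text such as "0 days 00:38:59.999000000", they would most likely need converting to numbers before any modelling in R. The following is a minimal sketch of one way to do that, assuming every value follows that exact "D days HH:MM:SS.fffffffff" layout; the function name `timedelta_to_seconds` is hypothetical, and anything that does not match the layout is returned as NA.

```r
# Convert strings like "0 days 00:38:59.999000000" to seconds.
timedelta_to_seconds <- function(x) {
  pattern <- "^([0-9]+) days ([0-9]+):([0-9]+):([0-9.]+)$"
  matches <- regmatches(x, regexec(pattern, x))
  vapply(matches, function(p) {
    if (length(p) != 5) return(NA_real_)   # no match: leave as NA
    v <- as.numeric(p[2:5])                # days, hours, minutes, seconds
    v[1] * 86400 + v[2] * 3600 + v[3] * 60 + v[4]
  }, numeric(1))
}

finish <- c("0 days 00:38:59.999000000", "0 days 00:39:00.460000000")
secs <- timedelta_to_seconds(finish)
print(secs[2] - secs[1])   # gap between first and second place, in seconds
```

Applied to the whole column, for example `dat$Finish_Seconds <- timedelta_to_seconds(dat$Finish_Time)`, this would give a numeric variable suitable for regression, with crashed riders left as NA.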
There are many different avenues one could pursue with this information. I hope you find it as valuable as I have, and I welcome any and all feedback and recommendations on how best to use this dataset in the future.
Upon an initial inspection, I think that the following should take place:
Thoughts? my email