The purpose of this guide is to provide data scientists and engineers with adequate resources to get started with coding in R using Google Cloud Platform data products such as GCS, Cloud SQL, and BigQuery.
Refer to the googleCloudStorageR package introduction to learn more about the R package used to access GCS resources.
Refer to the bigrquery R package documentation to learn more about accessing BigQuery resources in R. That documentation also provides examples of making authenticated user calls to the BigQuery service and of reading data from and writing data to BigQuery.
Refer to the Using R with Google Cloud SQL for MySQL guide by GCP to get an introduction to accessing MySQL on a Cloud SQL instance.
R packages can be installed using JupyterLab notebooks or on the command line. Below are examples of each using the standard R repo.
R packages can be installed from the command line using the command below: \
R -e "install.packages('abind', dependencies=TRUE, repos='http://cran.rstudio.com/')"
\
To install an R package from within a JupyterLab notebook, use the command below: \
install.packages('abind', dependencies=TRUE, repos='http://cran.rstudio.com/')
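When a notebook is re-run often, it may be convenient to install only the packages that are not already present. The following is a minimal sketch of that idea; the helper names `missing_packages` and `install_if_missing` and the package list in the commented call are illustrative assumptions, not part of any library.

```r
# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only what is missing, from the same repo as above.
install_if_missing <- function(pkgs, repos = "http://cran.rstudio.com/") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, dependencies = TRUE, repos = repos)
  }
  invisible(to_install)
}

# Example call (commented out so that nothing is installed by accident):
# install_if_missing(c("googleCloudStorageR", "bigrquery", "DBI", "RMySQL"))
```

The package names in the commented call are simply the ones this guide refers to; adjust the list to whatever the notebook actually needs.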
The best method for authentication is to use your own Google Cloud Project. You can specify the location of a service account JSON file taken from your Google Project:
In [ ]:
Sys.setenv("GCS_AUTH_FILE" = "/fullpath/to/auth.json")
This file will then be used for authentication via gcs_auth() when you load the library:
In [ ]:
## GCS_AUTH_FILE set so auto-authentication
library(googleCloudStorageR)
gcs_get_bucket("your-bucket")
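Once authentication works, a typical round trip is to upload a local file, list the bucket contents, and download the object again. The sketch below assumes that `GCS_AUTH_FILE` points to a valid service account key and that the service account may write to the bucket; the helper name `gcs_round_trip` and the object name are placeholders chosen for illustration. It calls `gcs_auth()` explicitly rather than relying on auto-authentication, because the package is referenced with `::` instead of being attached.

```r
# Sketch: write a built-in data frame to GCS as CSV and read it back.
# Assumes GCS_AUTH_FILE is set and the bucket exists and is writable.
gcs_round_trip <- function(bucket, object_name = "example/mtcars.csv") {
  # Authenticate explicitly with the service account key file.
  googleCloudStorageR::gcs_auth(json_file = Sys.getenv("GCS_AUTH_FILE"))

  # Write a small sample data set to a temporary CSV file and upload it.
  local_file <- tempfile(fileext = ".csv")
  write.csv(mtcars, local_file, row.names = FALSE)
  googleCloudStorageR::gcs_upload(local_file, bucket = bucket, name = object_name)

  # List the objects in the bucket to confirm the upload.
  print(googleCloudStorageR::gcs_list_objects(bucket))

  # Download the object to a new temporary file and parse it.
  downloaded <- tempfile(fileext = ".csv")
  googleCloudStorageR::gcs_get_object(object_name, bucket = bucket,
                                      saveToDisk = downloaded)
  read.csv(downloaded)
}

# Example call (requires real credentials and a real bucket):
# df <- gcs_round_trip("your-bucket")
```

Depending on how the bucket's access control is configured, `gcs_upload()` may need additional arguments such as `predefinedAcl`; check the package documentation for the installed version.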
When using bigrquery interactively, you’ll be prompted to authorize bigrquery in the browser. By default, your token is cached across sessions in the folder ~/.R/gargle/gargle-oauth/. For non-interactive usage, it is preferable to use a service account token and put it into force via bq_auth(path = "/path/to/your/service-account.json").
More places to learn about auth:
The help for bigrquery::bq_auth().
gargle::token_fetch(), which supports a variety of token flows. The accompanying gargle article provides full details, such as how to take advantage of Application Default Credentials or service accounts on GCE VMs.
For Cloud SQL, refer to the R with Cloud SQL for MySQL documentation.
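For BigQuery specifically, the non-interactive service account flow described above can be sketched as follows. This is an illustration under stated assumptions: the helper name `bq_round_trip`, the target table name `births_by_year`, and the arguments `project`, `dataset`, and `key_path` are hypothetical, the dataset is assumed to exist already, and the query runs against a public sample table and is billed to `project`.

```r
# Sketch: authenticate with a service account, read from BigQuery, write back.
bq_round_trip <- function(project, dataset, key_path) {
  # Non-interactive authentication with a service account key file.
  bigrquery::bq_auth(path = key_path)

  # Run a standard SQL query against a public sample table.
  sql <- "SELECT year, COUNT(*) AS births
          FROM `publicdata.samples.natality`
          GROUP BY year
          ORDER BY year"
  result <- bigrquery::bq_project_query(project, sql)

  # Download the query result into a local data frame.
  births <- bigrquery::bq_table_download(result)

  # Write the data frame to a table, replacing any existing contents.
  target <- bigrquery::bq_table(project, dataset, "births_by_year")
  bigrquery::bq_table_upload(target, births,
                             write_disposition = "WRITE_TRUNCATE")
  births
}

# Example call (requires a real project, dataset, and key file):
# births <- bq_round_trip("my-project", "my_dataset",
#                         "/path/to/your/service-account.json")
```

`WRITE_TRUNCATE` is used here so that the sketch can be re-run; drop it if overwriting an existing table is not acceptable.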
In [ ]:
# Load the DBI library
library(DBI)
# Helper for getting a new connection to Cloud SQL
getSqlConnection <- function() {
  con <- dbConnect(
    RMySQL::MySQL(),
    username = 'username',
    password = 'password',
    host = '127.0.0.1',
    dbname = 'example'
  ) # TODO: use a configuration group (`group = "my-db"`)
  return(con)
}
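A connection helper like the one above is usually paired with code that runs a query and closes the connection afterwards. The sketch below shows one possible pattern; it is not taken from the referenced guide. The environment variable names `CLOUDSQL_USER`, `CLOUDSQL_PASSWORD`, and `CLOUDSQL_DB`, and the helper names `get_sql_connection_from_env` and `with_connection`, are assumptions made for illustration. Reading credentials from the environment avoids hard-coding them in the notebook.

```r
# Hypothetical variant of the helper above that reads credentials from
# environment variables instead of literals in the notebook.
get_sql_connection_from_env <- function() {
  DBI::dbConnect(
    RMySQL::MySQL(),
    username = Sys.getenv("CLOUDSQL_USER"),
    password = Sys.getenv("CLOUDSQL_PASSWORD"),
    host     = "127.0.0.1",  # same host as in the helper above
    dbname   = Sys.getenv("CLOUDSQL_DB", "example")
  )
}

# Hypothetical wrapper: open a connection, run `f` on it, and always
# disconnect, even if `f` raises an error.
with_connection <- function(connect, f) {
  con <- connect()
  on.exit(DBI::dbDisconnect(con), add = TRUE)
  f(con)
}

# Example call (requires a reachable MySQL instance):
# tables <- with_connection(get_sql_connection_from_env, DBI::dbListTables)
```

Because `with_connection` takes the connection factory as an argument, it works equally well with `getSqlConnection` or any other function that returns a DBI connection.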