Access dashDB (or DB2) using RODBC from R notebook

IBM dashDB is a fully managed cloud data warehouse, purpose-built for analytics. In this article we will explore how to connect to it from a R notebook and perform data access using RODBC.

To get started you will need credentials for a dashDB database. If you don't already have an instance of dashDB you can get one for free by following steps in this article.

Next, create a new Jupyter notebook in Data Scientist Workbench. Set it to use R.



The RODBC package and the dashDB/DB2 CLI (ODBC) driver is already installed and available in your R notebook in Data Scientist Workbench. Create a new cell and load the RODBC package:
library(RODBC)
Enter the other connection details for your instance of dashDB (or DB2 database).
#Enter the values for you database connection
dsn_driver <- "{IBM DB2 ODBC Driver}"
dsn_database <- "BLUDB"           # e.g. "BLUDB"
dsn_hostname <- "<Enter Hostname>" # e.g.: "awh-yp-small03.services.dal.bluemix.net"
dsn_port <- "50000"               # e.g. "50000"
dsn_protocol <- "TCPIP"           # i.e. "TCPIP"
dsn_uid <- "<Enter UserID>"       # e.g. "dash104434"
dsn_pwd <- "<Enter Password>"     # e.g. "7dBZ39xN6$o0JiX!m"
Run the cell above. Create a new cell to create the connection:
conn_path <- paste("DRIVER=",dsn_driver,
                  ";DATABASE=",dsn_database,
                  ";HOSTNAME=",dsn_hostname,
                  ";PORT=",dsn_port,
                  ";PROTOCOL=",dsn_protocol,
                  ";UID=",dsn_uid,
                  ";PWD=",dsn_pwd,sep="")
conn <- odbcDriverConnect(conn_path)
conn

Let's get a list of tables in the samples schema:
table.list <- sqlTables(conn,tableType="TABLE", schema="SAMPLES")
cat("There are", nrow(table.list), "tables in the", schema, "schema.\n")
Let's get a list of columns in a sample table:
table.name <- "SAMPLES.COW"
col.list <- sqlColumns(conn,table.name)
cat("There are", nrow(col.list), "columns defined in", table.name,"\n")
You can use the very versatile sqlQuery() function to issue SQL Data Manipulation Language (DML) statements such as SELECT, INSERT, UPDATE, or DELETE, and SQL Data Definition Language (DDL) statements such as CREATE TABLE. Let's use sqlFetch to fetch a row from a sample table:
cows <- sqlFetch(conn, table.name)
print (cows[1,1:4], row.names=FALSE)
We can use the sqlSave() function to create and populate a new table. In this scenario, the safer parameter allows the script to replace any existing table with the same name.
tab.name <- "CLASSMARKS"
NAMES <- c("Bob","Mary","Fred")
MARKS <- c(78,88,91)

# Create a data frame of test scores and names
CLASSMARKS <- data.frame (NAMES,MARKS,stringsAsFactors=FALSE)

# Create a new table and populate it with the data frame CLASSMARKS
sqlSave(conn, CLASSMARKS, rownames=FALSE,safer=FALSE)

NEWCLASS <- sqlFetch(conn,tab.name)
cat( "Mean mark for the class is", mean(NEWCLASS[,"MARKS"]),"\n")


Cool! You've accessed data in a dashDB data warehouse using RODBC from a R notebook. Finally, as a best practice we should close the database connection once we're done with it.
odbcCloseAll()





Feedback and Knowledge Base