Access dashDB (or DB2) using RJDBC from R notebook

IBM dashDB is a fully managed cloud data warehouse, purpose-built for analytics. In this article we will explore how to connect to it from a R notebook and perform data access using RJDBC.

To get started you will need credentials for a dashDB database. If you don't already have an instance of dashDB you can get one for free by following steps in this article.

Next, create a new Jupyter notebook in Data Scientist Workbench. Set it to use R.



By default, the RJDBC package is pre-installed on your workbench. In case it is not, you can follow the instructions at the bottom.

Create a new cell and load the RJDBC package:
library(RJDBC);
Enter the other connection details for your instance of dashDB (or DB2 database).
#Enter the values for you database connection
dsn_driver = "com.ibm.db2.jcc.DB2Driver"
dsn_database = "BLUDB"           # e.g. "BLUDB"
dsn_hostname = "<Enter Hostname>" # e.g.: "awh-yp-small03.services.dal.bluemix.net"
dsn_port = "50000"               # e.g. "50000"
dsn_protocol = "TCPIP"           # i.e. "TCPIP"
dsn_uid = "<Enter UserID>"       # e.g. "dash104434"
dsn_pwd = "<Enter Password>"     # e.g. "7dBZ39xN6$o0JiX!m"
Run the cell above. Create a new cell to create the connection:
jcc = JDBC("com.ibm.db2.jcc.DB2Driver", "/usr/local/lib/db2jcc4.jar");
jdbc_path = paste("jdbc:db2://", dsn_hostname, ":", dsn_port, "/", dsn_database, sep="");
conn = dbConnect(jcc, jdbc_path, user=dsn_uid, password=dsn_pwd)
Next, execute a query against a sample table to verify everything is working:
query = "SELECT * FROM SAMPLES.COW";
rs = dbSendQuery(conn, query);
df = fetch(rs, -1);
df
Cool! You've accessed data in a dashDB data warehouse using RJDBC connection from a R notebook. Finally, as a best practice we should close the database connection once we're done with it.
dbDisconnect(conn)
Good luck!

---

[Optional Instructions if RJDBC library is not installed]:
The following steps are needed only in case the RJDBC library fails to load because it is not installed. If that is the case, create a new cell, type the following command and execute it (Ctrl+Enter):
install.packages("RJDBC")
Note: its possible the default repo may not have the latest package, in which case you can specify a different repo, e.g.:
install.packages("RJDBC", repo = "http://cran.r-project.org/")
# or:
install.packages("RJDBC", repo = "http://cran.stat.ucla.edu/")



Feedback and Knowledge Base