What is LabKey Server?
LabKey Server is free, open source software available for scientists to integrate, analyze, and share biomedical research data. The platform provides a secure data repository that allows web-based querying, reporting, and collaborating across a range of data sources. Specific scientific applications and workflows can be added on top of the basic platform and leverage a data processing pipeline.
Although it mainly targets biomedical research, I think it can be applied to other fields without much trouble. Some of its key features are listed below.
LabKey Server supports/includes
- popular RDBMS (MS SQL, MySQL, PostgreSQL, Oracle) and SAS Data as well as Excel, text and AWS S3
- built-in web parts for UI development
- data grid and charts can be useful for quick analysis - see this example
- scripting engines including R, Java, Perl as well as SQL queries
- tight integration with R outputs (charts …) as reports and even R Markdown documents
- Note that it may require an additional one-off investment in Ext.js to fully utilize the API
- pipeline server that can handle heavy/long computing/processing
- modules to package certain functionality such as Workflow or analysis
- good set of authentication options
- useful extra features such as Message Board and Issue Tracking
Thanks to these features, I think LabKey Server can be used as an internal collaboration tool as well as a framework to deliver products for external clients, focusing more on the What part. The rest of this post introduces some basic features of LabKey Server by creating a project and generating a report that consists of a data grid, a built-in chart and an R report.
The latest stable version of LabKey Server is 16.2. I installed it on my Windows laptop after downloading the Windows Graphical Installer (LabKey16.2-45209.14-community-Setup.exe) from the product page. The installer sets up an SMTP connection and also adds the following components to C:\Program Files (x86)\LabKey Server: Java Runtime Environment 1.8.0_92, Apache Tomcat 7.0.69 and PostgreSQL 9.5. Note that installation may fail due to a port conflict if Tomcat or PostgreSQL is already installed. In that case, it is necessary to uninstall the existing versions or use the manual installation option.
After installation, the server can be accessed at http://localhost:8080 (8080 is the default port of the Tomcat server). A user has to be set up on first access, and then the server is ready to play with.
I set up a Collaboration project named LabkeyIntro through the following steps.
The start page shows Wiki and Messages web parts in the main panel and a Pages web part in the right sidebar. It is also possible to add another web part by selecting one and clicking a button.
Add a list
I removed the existing web parts and added a Lists web part in the sidebar. A new list can be added by clicking the MANAGE LISTS link. Note that lists are the simplest data structure: they are tabular and have primary keys but don’t require participant IDs or time/visit information. Check this to see other data structures.
Clicking the name of the list shows the imported data in a grid view. By default a grid view provides the following features, which are quite useful for investigating and managing data. A customized view can also be shown as a report.
- sort/filter in Customize Grid
- insert or delete a row
- export to Excel, text or script
- print and paging
- add or import fields in Design
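The "export to script" option points at reading the same grid from outside the web UI. A rough sketch of what that could look like in R, assuming the Rlabkey package is installed and credentials are already configured (for example in a netrc file). The helper name fetch_list and the list name ExampleData are made up for this illustration, and the base URL may need a context path depending on the installation.

```r
# Hypothetical helper: read a LabKey list into an R data frame via Rlabkey.
# Rlabkey is referenced with `::` so that defining the function does not
# require the package; calling it does.
fetch_list <- function(base_url, folder_path, list_name) {
  Rlabkey::labkey.selectRows(
    baseUrl    = base_url,     # server address, including any context path
    folderPath = folder_path,  # project/folder that holds the list
    schemaName = "lists",      # lists are exposed under the "lists" schema
    queryName  = list_name     # name of the list to read
  )
}

# Example call (not run here; needs a running server and valid credentials):
# df <- fetch_list("http://localhost:8080", "/LabkeyIntro", "ExampleData")
# head(df)
```

The call is left commented out because it only works against a running server; the list name has to match whatever the list was called when it was created.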
In addition to the above features, the grid view provides 3 built-in chart types: Box plot, Scatter plot and Time series plot. I made a scatter plot of age versus credit score, grouped by a boolean variable of email status.
What’s more interesting and useful is its integration with R. While a more sophisticated report can be generated with R Markdown, I just added a scatter plot matrix as an R report in this trial. By default, the R script engine is not enabled, so it is necessary to turn it on as follows.
Note that I only changed the program path, and pandoc/rmarkdown are not set up. For further details, see this article.
Also note that packages are loaded from where R is installed (e.g. C:\Program Files\R\R-3.3.1\library), so a package that was not installed by an administrator is not loaded. For example, I needed the car package but, as it was installed in my user account’s site directory (i.e. C:\Users\jaehyeon\Documents\R\win-library\3.3), it was not loaded. To resolve this, I opened the R terminal (R.exe) as administrator and installed the package as follows.
install.packages("car", lib="C:\\Program Files\\R\\R-3.3.1\\library", dependencies = TRUE)
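Before reinstalling, it may help to check which library directories R searches and whether a package actually sits in the installation's own library. A minimal sketch using only base R; is_installed_in is a helper name made up for this illustration.

```r
# TRUE if `pkg` is installed in the library directory `lib`.
is_installed_in <- function(pkg, lib) {
  pkg %in% rownames(installed.packages(lib.loc = lib))
}

# All library directories this R session searches, in order.
print(.libPaths())

# `.Library` is the library that ships with the R installation
# (e.g. C:\Program Files\R\R-3.3.1\library on the setup described here).
print(is_installed_in("stats", .Library))  # a base package, for comparison
print(is_installed_in("car", .Library))    # the package the report needs
```

If the last call returns FALSE while the package loads fine in an interactive session, the package is most likely sitting in a per-user library that the server's R process does not see.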
Once it is ready, it is relatively straightforward to add an R output as a report - see the screen shots below.
Note that a graphics device (png()) is explicitly set up.
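To give an idea of what such a report script involves, here is a minimal sketch and not the exact script used in this trial. It assumes, to my understanding of LabKey R reports, that the grid data arrives as a data frame named labkey.data and that the output image is named through a ${imgout:...} placeholder. The column names are hypothetical, and base R's pairs() stands in for the car package so that the sketch has no dependencies.

```r
# Write a scatter plot matrix of the numeric columns of `data` to a PNG file.
write_scatter_matrix <- function(data, file) {
  numeric_cols <- data[sapply(data, is.numeric)]
  png(filename = file, width = 800, height = 800)  # open the device explicitly
  on.exit(dev.off())                               # always close it again
  pairs(numeric_cols, main = "Scatter plot matrix")
  invisible(file)
}

# Stand-in data with hypothetical columns so the sketch runs outside LabKey.
set.seed(1)
example_data <- data.frame(
  age = sample(20:70, 50, replace = TRUE),
  credit_score = round(rnorm(50, mean = 650, sd = 60)),
  income = round(rnorm(50, mean = 50000, sd = 12000))
)
out_file <- write_scatter_matrix(example_data, tempfile(fileext = ".png"))

# Inside a LabKey R report the call would instead look like:
# write_scatter_matrix(labkey.data, "${imgout:scatter_matrix.png}")
```

Opening the device with png() and closing it with dev.off() is what turns the plot into a file the server can pick up, which is why the device has to be set up explicitly.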
Organize project page
Now there are three sections to deliver - data, built-in chart and R report. In order to organize them, I created 3 tabs: Example Data, Built-in Chart and R Intro as shown below.
A List - Single web part is added to Example Data while Report web parts are included in the remaining two tabs.
This is all I’ve done within several hours. Although only a few basic features are implemented, I think it provides a good amount of information for internal collaboration.
It’s too early to tell, but do you think LabKey Server could serve as an effective tool for the How part? Please share your thoughts.