Pre-Workshop Instructions

Please perform the following steps before the workshop. They take only a few minutes, but you may run into issues specific to your machine, so it is best to complete them in advance.

STEP 1: Install the latest Cytoscape (3.9.0) in your desktop environment

  • If your installed Cytoscape is older than 3.9.0, please update it.
  • Download Cytoscape.
  • Open the installer and follow the installation steps.
    • If you don’t have Java in your environment, Cytoscape will ask whether you want to download it. Please accept and download Java.

Mac users need to be careful about image

  • Launch Cytoscape 3.9.0 from the Start menu or the desktop shortcut.

image

  • A message like the one in the image above will appear; press the OK button to restart Cytoscape.
  • Keep Cytoscape up and running.

STEP 2: Install Google Chrome and the RCy3 package in Google Colab

  • You will need to get a Google Account to use Colab.
  • You will need to install Google Chrome. Our Notebook does not work with Safari. Firefox may work, but I haven’t confirmed it.
  • Open this Google Colab link in Chrome while logged in to Google.
  • Run the following code cell
devtools::install_github("cytoscape/RCy3")
  • It takes a few minutes for install_github to finish.
    • Let’s leave it running and move on to the main workshop.

Main workshop (primer)

Self-introduction

Kozo Nishida, RIKEN, Japan

  • A member of the Bioconductor Community Advisory Board (CAB)
  • Author of a Bioconductor package based on RCy3 (transomics2cytoscape)
  • Cytoscape community contributor (Google Summer of Code, Google Season of Docs)
  • Author of the KEGGscape Cytoscape App

What is Cytoscape?

image

  • An open-source, cross-platform Java desktop GUI application for network visualization.

Core concepts

Networks and Tables: Network nodes and edges have annotation tables.

image

image

Values in the annotation tables can be mapped to visual properties of nodes and edges, such as color, shape, size, and so on.
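To make the mapping idea concrete, here is a small conceptual sketch in plain R. It does not call Cytoscape and is not how Cytoscape implements mappings; the table values, colors, and size range are made up for illustration. It shows a discrete mapping (category to color) and a continuous mapping (number to size).

```r
# Toy node annotation table (hypothetical values).
nodes <- data.frame(
  id    = c("node 0", "node 1", "node 2", "node 3"),
  group = c("A", "A", "B", "B"),
  score = c(20L, 10L, 15L, 5L),
  stringsAsFactors = FALSE
)

# Discrete mapping: each category gets one fixed color (colors are arbitrary).
group_colors <- c(A = "#FF9900", B = "#3366CC")
nodes$fill <- unname(group_colors[nodes$group])

# Continuous mapping: rescale score linearly onto a size range of 20 to 60.
size_range  <- c(20, 60)
score_range <- range(nodes$score)
nodes$size  <- size_range[1] +
  (nodes$score - score_range[1]) / diff(score_range) * diff(size_range)

print(nodes)
```

In RCy3, functions such as setNodeColorMapping and setNodeSizeMapping set up this kind of mapping in Cytoscape itself.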

Why do we need to automate?

Why automate Cytoscape when I could just use the GUI directly?

  • For things you want to do multiple times, e.g., loops
  • For things you want to repeat in the future
  • For things you want to share with colleagues or publish
  • For things you are already working on in R or Python, etc.
    • To prepare data for collaborators

In short: for “reproducibility”, “data sharing”, and “the use of R or Python”.

How can Cytoscape GUI operations be automated?

image

  • Cytoscape makes this possible through its REST API.

  • Today Cytoscape is not only a desktop application but also a REST server.

  • You can check whether Cytoscape is currently running as a server with the command below.

    curl localhost:1234
  • Cytoscape now has a REST API for almost every GUI operation.

    • RCy3 and py4cytoscape are the R and Python wrappers of the REST API, respectively.
    • py4cytoscape is a Python clone of RCy3 and has the same function specifications as RCy3.
  • Since table operations are essential in bioinformatics, it is convenient to be able to perform them with R (dplyr) or Python (pandas).
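The same check as the curl command can be sketched in base R. The helper name cyrest_is_up is hypothetical, and the default address is the one from the curl example. Note that localhost means the machine where R runs, so this is meant for a local R session rather than a Colab notebook.

```r
# Returns TRUE if something answers an HTTP request at base_url, FALSE otherwise.
# cyrest_is_up is a hypothetical helper name, not part of RCy3.
cyrest_is_up <- function(base_url = "http://localhost:1234") {
  con <- url(base_url)   # connection is created here but not yet opened
  on.exit(close(con))    # always release the connection when the function exits
  tryCatch(
    # readLines opens the connection; a refused connection raises an error
    length(suppressWarnings(readLines(con, warn = FALSE))) > 0,
    error = function(e) FALSE
  )
}

if (cyrest_is_up()) {
  message("Cytoscape REST server is reachable.")
} else {
  message("No response; is Cytoscape running?")
}
```

Inside RCy3, cytoscapePing() serves the same purpose and is the more convenient choice once the package is loaded.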

CyREST: Turbocharging Cytoscape Access for External Tools via a RESTful API. F1000Research 2015.

Cytoscape Automation: empowering workflow-based network analysis. Genome Biology 2019.

Translating R data into a Cytoscape network using RCy3

Networks offer us a useful way to represent our biological data. But how do we seamlessly translate our data from R into Cytoscape?

image

From here on, the workshop is hands-on using Google Colab. Leaving the details aside for now, let’s connect Google Colab to your local Cytoscape.

Make sure your local Cytoscape is fully up and running before running the code below. Cytoscape and its REST server take some time to start up completely. (Please wait for about 10 seconds.)

library(RCy3)
browserClientJs <- getBrowserClientJs()
IRdisplay::display_javascript(browserClientJs)
cytoscapePing()

Why was the remote Google Colab able to communicate with the local Cytoscape REST service?

To answer that, we need a closer look at what happened in

browserClientJs <- getBrowserClientJs()
IRdisplay::display_javascript(browserClientJs)

We used a technology called Jupyter Bridge in the above code. Jupyter Bridge is a JavaScript implementation that makes HTTP requests from a remote REST client look like local requests.

image

Since it is difficult to access Cytoscape in the desktop environment from a remote environment, we use Jupyter Bridge.

Because I couldn’t get Jupyter Bridge to work in the Orchestra environment, this workshop exceptionally uses Google Colab instead of Orchestra.

If RCy3 is installed locally, rather than in a remote environment like Google Colab, you don’t need Jupyter Bridge.

(Then) Why use Jupyter Bridge?

  • Users do not need to worry about dependencies and environment.
  • Easily share notebook-based workflows and data sets
  • Workflows can reside in the cloud, access cloud resources, and yet still use Cytoscape features.

Let’s go back to how to translate R data into a Cytoscape network…

Create a Cytoscape network from some basic R objects

nodes <- data.frame(id=c("node 0","node 1","node 2","node 3"),
    group=c("A","A","B","B"), # categorical strings
    score=as.integer(c(20,10,15,5)), # integers
    stringsAsFactors=FALSE)
nodes
edges <- data.frame(source=c("node 0","node 0","node 0","node 2"),
    target=c("node 1","node 2","node 3","node 3"),
    interaction=c("inhibits","interacts","activates","interacts"),  # optional
    weight=c(5.1,3.0,5.2,9.9), # numeric
    stringsAsFactors=FALSE)
edges

Data frame used to create Network

image
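Before sending the data frames to Cytoscape, it can help to confirm that every edge endpoint refers to an existing node id. The sketch below is plain R; the helper name missing_endpoints is hypothetical, and the data frames are reduced copies of the ones above.

```r
# Reduced copies of the workshop data frames (only the columns needed here).
nodes <- data.frame(id = c("node 0", "node 1", "node 2", "node 3"),
                    stringsAsFactors = FALSE)
edges <- data.frame(source = c("node 0", "node 0", "node 0", "node 2"),
                    target = c("node 1", "node 2", "node 3", "node 3"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: returns edge endpoints that have no row in nodes$id.
missing_endpoints <- function(nodes, edges) {
  endpoints <- unique(c(edges$source, edges$target))
  setdiff(endpoints, nodes$id)
}

missing_endpoints(nodes, edges)  # character(0) means all endpoints are known
```

An empty result means the node and edge tables are consistent; any returned names point to typos or to nodes missing from the node table.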

Create Network

createNetworkFromDataFrames(nodes, edges, title="my first network", collection="DataFrame Example")

Export an image of the network

Remember: all networks we make are created in Cytoscape, so export an image of the resulting network and include it in your current analysis if desired.

exportImage("my_first_network", type = "png")

Initial simple network

image

Main workshop (more practical)

Example Use Case

Omics data - I have a ____ (fill in the blank: microarray, RNASeq, Proteomics, ATACseq, MicroRNA, GWAS …) dataset. I have normalized and scored my data. How do I overlay my data on existing interaction data?

The example data set

We downloaded gene expression data from the Ovarian Serous Cystadenocarcinoma project of The Cancer Genome Atlas (TCGA) (International Genome et al.), http://cancergenome.nih.gov, via the Genomic Data Commons (GDC) portal (Grossman et al.) on 2017-06-14 using the TCGAbiolinks Bioconductor package (Colaprico et al.).

  • 300 samples available as RNA-seq data
  • 79 classified as Immunoreactive, 72 classified as Mesenchymal, 69 classified as Differentiated, and 80 classified as Proliferative samples
  • RNA-seq read counts were converted to CPM values, and genes with CPM > 1 in at least 50 of the samples were retained for further study
  • The data were normalized, and differential expression was calculated for each cancer class relative to the rest of the samples.
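The CPM filter described above can be sketched in base R as follows. This is an illustration of the rule, not the code used to prepare the workshop data; the function name filter_by_cpm and the toy counts are hypothetical.

```r
# Keep genes whose counts-per-million exceed min_cpm in at least min_samples samples.
# counts: numeric matrix with genes in rows and samples in columns.
filter_by_cpm <- function(counts, min_cpm = 1, min_samples = 50) {
  lib_sizes <- colSums(counts)                # total reads per sample
  cpm <- t(t(counts) / lib_sizes) * 1e6       # scale each column to counts per million
  keep <- rowSums(cpm > min_cpm) >= min_samples
  counts[keep, , drop = FALSE]
}

# Toy matrix: 3 genes x 4 samples, each sample totalling one million reads.
counts <- matrix(c(500000, 500000, 0,
                   400000, 600000, 0,
                   999999,      0, 1,
                   300000, 700000, 0),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 paste0("s", 1:4)))

# With only 4 toy samples, require 3 instead of the 50 used for the real data.
filter_by_cpm(counts, min_samples = 3)
```

With these toy counts, geneC never exceeds a CPM of 1 and is dropped, while geneA and geneB pass in at least three samples.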

We will use the following table from the analysis results and integrate it into an interaction network:

  • Gene ranks - containing the p-values, FDR, and fold-change values for the 4 comparisons (mesenchymal vs rest, differentiated vs rest, proliferative vs rest, and immunoreactive vs rest)
library(RCurl)
# Download the tab-separated edgeR score table as a single string
scores_txt <- getURL("https://raw.githubusercontent.com/cytoscape/cytoscape-tutorials/gh-pages/presentations/modules/RCy3_ExampleData/data/TCGA_OV_RNAseq_All_edgeR_scores.txt")
# Parse the text into a data frame
RNASeq_gene_scores <- read.table(text=scores_txt, header = TRUE, sep = "\t", quote="\"", stringsAsFactors = FALSE)
RNASeq_gene_scores
# Keep genes strongly up-regulated in the mesenchymal class (FDR < 0.05 and logFC > 2)
top_mesenchymal_genes <- RNASeq_gene_scores[which(RNASeq_gene_scores$FDR.mesen < 0.05 & RNASeq_gene_scores$logFC.mesen > 2),]
head(top_mesenchymal_genes)
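The next section assumes that a STRING network for these genes is already loaded in Cytoscape; that step is not shown here. One possible way to create it, assuming the stringApp Cytoscape app is installed, is to send a STRING protein query through RCy3's commandsRun. The helper names and the parameter values (human taxon ID 9606, 150 nodes, 0.9 confidence cutoff) are illustrative assumptions.

```r
# Hypothetical helper: builds a stringApp command string from a vector of gene names.
build_string_query <- function(genes, taxon_id = 9606, limit = 150, cutoff = 0.9) {
  paste0('string protein query taxonID=', taxon_id,
         ' limit=', limit,
         ' cutoff=', cutoff,
         ' query="', paste(genes, collapse = ","), '"')
}

# Hypothetical wrapper: sends the command to Cytoscape.
# Calling it requires RCy3, a running Cytoscape, and the stringApp app.
query_string_network <- function(genes, ...) {
  RCy3::commandsRun(build_string_query(genes, ...))
}

# The command string can be inspected without Cytoscape:
build_string_query(c("COL1A1", "FN1"))
```

In the workshop session this would be called as query_string_network(top_mesenchymal_genes$Name), assuming the gene symbols are in the Name column.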

Overlay our expression analysis data on the STRING network

To do this we will use the loadTableData function from RCy3. It is important to make sure that your identifier types match up. You can check which identifiers STRING uses by pulling in the column names of the node attribute table.

Overlay our expression data on the STRING network - cont’d

If you are unsure of what each column is and want to further verify which column to use, you can also pull in the entire node attribute table.

node_attribute_table_topmesen <- getTableColumns(table="node")
head(node_attribute_table_topmesen[,3:7])

image

The column “display name” contains HGNC gene names which are also found in our Ovarian Cancer dataset.
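A quick way to verify that claim is to count how many node identifiers also occur in the expression table. The sketch below is plain R with toy identifiers; the helper name id_overlap is hypothetical.

```r
# Hypothetical helper: reports how many table identifiers are found in the data.
id_overlap <- function(table_ids, data_ids) {
  table_ids <- unique(table_ids[!is.na(table_ids)])
  matched <- table_ids %in% data_ids
  list(matched   = sum(matched),
       total     = length(table_ids),
       unmatched = table_ids[!matched])
}

# Toy identifiers (made up for illustration).
node_ids <- c("COL1A1", "FN1", "ENSP0001")
data_ids <- c("COL1A1", "FN1", "TP53")
id_overlap(node_ids, data_ids)
```

In the workshop session the call would look like id_overlap(node_attribute_table_topmesen[["display name"]], RNASeq_gene_scores$Name). A low match count suggests that a different column, or an identifier conversion, is needed before loading the data.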

To import our expression data we will match our dataset to the “display name” node attribute.

loadTableData(RNASeq_gene_scores, table.key.column = "display name", data.key.column = "Name")  #default data.frame key is row.names

Visual Style

Modify the visual style: create your own visual style to visualize your expression data on the STRING network. Start with a default style.
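As a rough sketch of what such a style could look like, the code below defines default node properties, labels nodes by “display name”, and colors them by mesenchymal log fold change. The style name, default values, colors, and the choice of symmetric breakpoints are assumptions for illustration; the helper functions are hypothetical, while the RCy3 calls inside them (mapVisualProperty, createVisualStyle, setVisualStyle, setNodeColorMapping) are existing package functions.

```r
# Symmetric breakpoints around zero, so that logFC = 0 maps to the middle color.
logfc_breakpoints <- function(logfc) {
  max_abs <- max(abs(logfc), na.rm = TRUE)
  c(-max_abs, 0, max_abs)
}

# Hypothetical wrapper; calling it requires RCy3 and a running Cytoscape
# with the network loaded and the expression columns already imported.
apply_logfc_style <- function(scores, column = "logFC.mesen",
                              style.name = "MesenchymalStyle") {
  defaults <- list(NODE_SHAPE = "ellipse", NODE_SIZE = 60,
                   NODE_FILL_COLOR = "#AAAAAA", EDGE_TRANSPARENCY = 120)
  # Passthrough mapping: use the "display name" column as the node label.
  label.map <- RCy3::mapVisualProperty("node label", "display name", "p")
  RCy3::createVisualStyle(style.name, defaults, list(label.map))
  RCy3::setVisualStyle(style.name)
  # Continuous mapping: blue for negative, white for zero, red for positive logFC.
  RCy3::setNodeColorMapping(column, logfc_breakpoints(scores[[column]]),
                            c("#2166AC", "#FFFFFF", "#B2182B"),
                            style.name = style.name)
}

# The breakpoints can be checked without Cytoscape:
logfc_breakpoints(c(-1.5, 0.2, 3, NA))
```

In the workshop session this would be called as apply_logfc_style(RNASeq_gene_scores). Symmetric breakpoints keep unchanged genes white; using the observed minimum and maximum instead is an equally reasonable choice.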

Formatted STRING network

mesen_string_network