I have a big data frame that looks like this:

GeneSymbol   Sample1     Sample2    Sample3    Sample4

 A           value11     value12    value13    value14
 A           value21     value22    value23    value24
 B           etc.        etc.
 B
 B
 B
 C
 C
 C

I would like to plot density functions by group and by lines (rows). For example: for group A, two density plots because there are two elements belonging to group A; for group B, 4 density plots because there are 4 elements belonging to group B, etc.

Artem
Elb

Anything similar to this? http://learnr.wordpress.com/2009/08/20/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-13-2/ – Roman Luštrik Mar 22 '12 at 10:29

Or maybe plotmatrix... http://learnr.wordpress.com/2009/07/20/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-6/ – Roman Luštrik Mar 22 '12 at 10:32

Or of course http://learnr.wordpress.com/2009/07/02/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-3/ – Roman Luštrik Mar 22 '12 at 10:33

1 Answer

You can combine the dplyr, tidyr and ggplot2 packages to draw the density plots for this case.

First, sort the rows by gene symbol and number them, so that each row gets a unique label. Then reshape the data frame from wide to long format with the gather function from tidyr, so that the sample names end up in one column and the values in another.

For plotting, use ggplot with geom_density, and facet_wrap to lay the panels out on a 3x3 grid, one panel per row of the original data frame.

The simulation, data preparation and plotting code is below.

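# Simulation
# Data frame: 9 rows (genes drawn at random from A, B, C) and 100 samples
set.seed(123)
m <- matrix(rnorm(9 * 100), nrow = 9)
df <- data.frame(
  sample(LETTERS[1:3], 9, replace = TRUE),
  m
)
names(df) <- c("GeneSymbol", paste0("Sample", 1:100))

# Plotting
library(ggplot2)
library(dplyr)
library(tidyr)

# Sort by gene, give every row a unique label (e.g. A1, A2, B3, ...),
# then reshape from wide to long: one row per (row label, sample) pair.
# gather() still works; pivot_longer() is its newer replacement in tidyr >= 1.0.0.
df <- df %>%
  arrange(GeneSymbol) %>%
  mutate(GeneSymbol = paste0(GeneSymbol, 1:n())) %>%
  gather(sample_no, value, -GeneSymbol)

# One density panel per original row, laid out in 3 columns
ggplot(df, aes(value)) +
  geom_density() +
  facet_wrap(~GeneSymbol, ncol = 3)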

Output:

[Figure: 3x3 grid of density plots, one panel per row of the data frame]
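
If you would rather have one panel per gene with one curve per row inside it (two curves for A, four for B, and so on), a variant along the following lines might work. This is only a sketch under assumptions: the group sizes are fixed to match the question, the column names `row_id` and `sample_no` are made up for illustration, and `pivot_longer` (tidyr 1.0.0 or later) is used in place of `gather`.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Simulated data (assumption): 2 rows for A, 4 for B, 3 for C, 100 samples each
set.seed(123)
wide <- data.frame(
  GeneSymbol = rep(c("A", "B", "C"), times = c(2, 4, 3)),
  matrix(rnorm(9 * 100), nrow = 9)
)
names(wide) <- c("GeneSymbol", paste0("Sample", 1:100))

# Number the rows within each gene (A1, A2, B1, ...), then reshape to long format
long <- wide %>%
  group_by(GeneSymbol) %>%
  mutate(row_id = paste0(GeneSymbol, row_number())) %>%
  ungroup() %>%
  pivot_longer(starts_with("Sample"),
               names_to = "sample_no", values_to = "value")

# One panel per gene, one density curve per original row
p <- ggplot(long, aes(value, colour = row_id)) +
  geom_density() +
  facet_wrap(~GeneSymbol, ncol = 3)
print(p)
```

Mapping `row_id` to `colour` is what splits each panel into separate curves; mapping it to `group` instead would give the same curves without a legend.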

Artem