
I am trying to plot the weight of a fetus over time.
The y-axis is fetal weight in grams.
The x-axis needs to be formatted as follows:
27 weeks 3 days == 27.3
29 weeks 6 days == 29.6
etc.

My data (df) looks something like this

weight age
2013   22.4
2302   25.6
2804   27.2
3011   29.1

I have tried something like this, but I am not sure how to adjust the scale:

ggplot(df, aes(x = age, y = weight)) + 
geom_point() +
scale_x_continuous()

If I compute the actual numeric value for the age (i.e. 22.4 == 22 weeks + 4 days == 22 + 4/7 weeks == 22.57),
is it possible to label each age value with the label I want?
For example:

weight age.label age.value
2013   22.4      22.57
2302   25.6      25.86
2804   27.2      27.29
3011   29.1      29.14

When I call this:

df <- df %>% mutate(age.label = as.character(age.label))

ggplot(df, aes(x = age.value, y = weight)) + 
geom_point() +
scale_x_continuous(label = "age.label")

I get the following...

Error in f(..., self = self) : Breaks and labels are different lengths

Any help much appreciated

mdb_ftl

2 Answers

I borrowed from this answer and this one to create tick labels whose formatting separates the days from the weeks.

I have supplied three different methods.

  1. Marks every day with a minor grid line but does not number the days.
  2. Numbers both the days and the weeks and distinguishes between them by making weeks bold and days light grey.
  3. Same as 2 but distinguishes by font size. This method doesn't work very well, as it creates a large gap between the labels and the plot. It is included for completeness, and in the hope that somebody can say how to fix it.

The plot below is the second method.

[image: plot produced by the second method]

I think the vertical grid lines could also be coloured so that some of them disappear, if you want.

library(ggplot2)
library(tidyverse)
df<-read.table(header=TRUE, text="weight age.label age.value

2013   22.4      22.57
2302   25.6      25.86
2804   27.2      27.29
3011   29.1      29.14")


#draw a minor grid line for every day (1/7 of a week) and a labelled tick for every week
ggplot(df, aes(x = age.value, y = weight)) + 
  geom_point() +
  scale_x_continuous(limits=c(22, 30), 
                     minor_breaks = seq(from = 1, to = 33, by = 1/7),
                     breaks = 1:30) 


#create a df of tick labels: the week number on whole weeks, the day number (1-6) in between
breaks_labels_df  <- data.frame(breaks = seq(from = 1, to = 33, by = 1/7)) %>%
  mutate(minors = rep(0:6, length.out = nrow(.)),
         break_label = ifelse(minors == 0, round(breaks), minors))  #round() strips floating-point error from seq()

#plot both day number and week number, distinguishing between them by the label formatting
#remove the minor grid lines to reduce the busyness of the plot
ggplot(df, aes(x = age.value, y = weight)) + 
  geom_point() +
  scale_x_continuous(limits=c(22, 30), 
                     breaks = seq(from = 1, to = 33, by = 1/7),
                     labels  = breaks_labels_df$break_label)  +
  theme(axis.text.x = element_text(color = c("grey60","grey60","black",rep("grey60",4)), 
                                   size = 8, angle = 0, 
                                   hjust = .5, vjust = .5,
                                   face = c("plain","plain","bold",rep("plain",4))),
        panel.grid.minor.x = element_blank()) +
  labs(title = "Baby weight in relation to age", x = "Age in weeks and days", y = "weight in grams")



#changing the font size places a large gap between the tick labels and the axis
ggplot(df, aes(x = age.value, y = weight)) + 
  geom_point() +
  scale_x_continuous(limits = c(22, 30), 
                     breaks = seq(from = 1, to = 33, by = 1/7),
                     labels = breaks_labels_df$break_label) +
  theme(axis.text.x = element_text(vjust = 0, size = c(8, 8, 12, rep(8, 4)), 
                                   margin = margin(t = 0), lineheight = 0)) 
Jonno Bourne
  • This worked very well and will be hugely helpful for all the people working with pregnancy data. I will be sure to link this to my colleagues struggling with the same issues – mdb_ftl Oct 13 '19 at 12:09
  • Pleased to be able to help such a worthwhile cause! If the date range changes you may need to adjust the lines 'color = c("grey60","grey60","black"....' and 'face = c("plain","plain","bold"...' so that the bold numbers and black numbers are correct. – Jonno Bourne Oct 14 '19 at 09:36
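
The manual adjustment mentioned in the comment above can probably be avoided by deriving the labels and the styling vectors from the breaks themselves. The following is only a sketch under stated assumptions: `week_day_breaks` is a hypothetical helper (not part of ggplot2), and the axis is assumed to start and end on whole weeks. Passing vectors to `element_text()` is not officially supported by ggplot2 and recent versions warn about it, so behaviour may differ between versions.

```r
library(ggplot2)

df <- read.table(header = TRUE, text = "weight age.label age.value
2013   22.4      22.57
2302   25.6      25.86
2804   27.2      27.29
3011   29.1      29.14")

# Hypothetical helper (an assumption, not a ggplot2 function):
# builds one break per day between two whole weeks and records, for each
# break, its label and whether it falls on a whole week.
week_day_breaks <- function(from_week, to_week) {
  day_idx <- 0:((to_week - from_week) * 7)  # integer day offsets avoid floating-point drift
  day     <- day_idx %% 7                   # 0 = whole week, 1..6 = day within the week
  week    <- from_week + day_idx %/% 7
  data.frame(
    breaks  = from_week + day_idx / 7,
    label   = as.character(ifelse(day == 0, week, day)),
    is_week = day == 0
  )
}

b <- week_day_breaks(22, 30)

# Styling follows is_week, so it stays aligned when the limits change.
ggplot(df, aes(x = age.value, y = weight)) +
  geom_point() +
  scale_x_continuous(limits = c(22, 30), breaks = b$breaks, labels = b$label) +
  theme(axis.text.x = element_text(colour = ifelse(b$is_week, "black", "grey60"),
                                   face   = ifelse(b$is_week, "bold", "plain"),
                                   size   = 8),
        panel.grid.minor.x = element_blank())
```

Because the breaks are generated only inside the limits, the first element of each styling vector always corresponds to the first visible tick, which is the alignment the hard-coded `c("grey60","grey60","black", ...)` vectors rely on.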

To add labels to the plot, use the geom_text() function from the ggplot2 package. The hjust and vjust arguments fine-tune the placement of each label.

df<-read.table(header=TRUE, text="weight age
2013   22.4
2302   25.6
2804   27.2
3011   29.1")

library(dplyr)
library(ggplot2)

#convert weeks.days notation to decimal weeks for the axis (e.g. 22.4 -> 22 + 4/7 = 22.57)
df <- df %>%
  mutate(age.value = floor(age) + (age - floor(age)) * 10 / 7) %>%
  round(2)

ggplot(df, aes(x = age.value, y = weight)) + 
  geom_point() +
  scale_x_continuous(limits=c(20, 30)) +
  geom_text(aes(label = age), hjust = -.2, vjust=.1)
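
If the weeks.days labels should sit on the x-axis itself rather than next to the points, one possible approach is to pass matching `breaks` and `labels` vectors to `scale_x_continuous()`. The error in the question arises because `labels` was given a single string while the scale had computed several breaks. The sketch below assumes the digit after the decimal point is always a day count from 0 to 6; `weeks_days_to_decimal` is a hypothetical helper name introduced for illustration.

```r
library(ggplot2)

df <- read.table(header = TRUE, text = "weight age
2013   22.4
2302   25.6
2804   27.2
3011   29.1")

# Hypothetical helper (an assumption for illustration): converts
# weeks.days notation (22.4 = 22 weeks 4 days) to decimal weeks.
weeks_days_to_decimal <- function(age) {
  weeks <- floor(age)
  days  <- round((age - weeks) * 10)  # digit after the decimal point is the day count
  weeks + days / 7
}

df$age.value <- weeks_days_to_decimal(df$age)

# One break per observation, labelled with the original weeks.days value.
# breaks and labels have the same length, so the scale accepts them.
ggplot(df, aes(x = age.value, y = weight)) +
  geom_point() +
  scale_x_continuous(limits = c(20, 30),
                     breaks = df$age.value,
                     labels = sprintf("%.1f", df$age))
```

This places ticks only where there are observations, so the labels can overlap when two ages are close together.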


Dave2e