Wearables: Heart Rate Data from FitBit - Part 2
In Part 2 we will use a Personal application API key to access Intraday Time Series Data, giving us heart rate data with a greater temporal resolution e.g. detail-level = 1sec OR 1min.
Intra-day Time Series
We will use the
get_heart_rate_intraday_time_series() function which has the following arguments:
date: The date, in the format
detail_level: Number of data points to include. Either
start_time: The start of the period, in the format
end_time: The end of the period, in the format
Let’s look at the structure of the data this function returns:
library(fitbitr) str(get_heart_rate_intraday_time_series(token, date = date, detail_level = "1min"))
## 'data.frame': 1395 obs. of 2 variables: ## $ time : chr "00:00:00" "00:01:00" "00:02:00" "00:03:00" ... ## $ value: int 63 65 65 66 68 68 68 68 68 68 ...
Pretty straightforward, we can see a
data.frame consisting of 2 variables; a character vector
end_time, defaulting to a 24-hour period from “00:00:00” on the
date specified) and an integer vector
value corresponding to the Heart Rate for that time.
Let’s import the data into a data.frame and rename
value to the more descriptive
HR. We must also convert
time from a character to something more correct. Otherwise when we call
ggplot it won’t produce any result. I’ve used the
as.difftime() function in base R, rather than the more common
strptime(), as it gives me a time since midnight in units of my choice, rather than appending the time to todays date which would introduce confusion when analysing data from multiple days.
df <- get_heart_rate_intraday_time_series(token, date = date, detail_level = "1min") # Rename col 'value' to 'HR names(df) <- "HR" # Convert time from class = character to difftime, i.e. mins since 00:00 df$time <- as.difftime(df$time, units = "mins")
Plotting Intra-day Time Series
We can plot this data as a simple scatter plot in
geom_point(). I’ve coloured it red, given the cardiac nature of the data, and changed it’s transparency
alpha = 0.4 to improve legibility given the close distribution of the points along the x axis.
Finally I’ve added a smoother (? correct term) using
geom_smooth() to aid the eye in seeing patterns given the ‘overplotting’ described above. This defaults to
method = "gam" with
formula = y ~ s(x, bs = "cs"). I don’t have the depth of mathematical knowledge to say if this is a correct method, but visually ‘eyeballing’ the resultant line it looks pretty appropriate to me!
I’ve elected to show a 95% (default) Confidence Interval using the argument
se = TRUE .
labs() will add a title and label the x-axis, helping to clarify that time is in minutes as
difftime is not the most obvious.
library(ggplot2) library(ggthemes) p <- ggplot(df, aes(x = time, y = HR)) + geom_point(col = "red", alpha = 0.4) + geom_smooth(se = TRUE) + labs(title = "Heart Rate over Time", x = "Time (mins)") + theme_few() p
Exploring different temporal resolutions
As described above the
get_heart_rate_intraday_time_series() function has an argument
detail_level which specifies the ‘Number of data points to include’, i.e. the temporal resolution. This can be specified as either
1sec (and seemingly not any other number of minutes) or
If we could reduce the temporal resolution even from 1 minute to 2 minutes this would halve the amount of data to process, particularly important when we start analysing larger time periods and more than one subject! Clearly this will come at the expense of a loss of detail so I thought it would be interesting to plot the same heart rate data at different detail levels (1 min, 5 min & 15 min) and compare the results. Particularly the smoothed line.
get_heart_rate_intraday_time_series() won’t accept a
detail_level so we have to resort to separate function calls, storing the outputs in their own data frames. We will then
merge() these (again unfortunately only can merge two at a time, so have to repeat twice) into a single dataframe, using the arguments
all = TRUE as we don’t want to select only those time points that intersect (this would lose the greater resolution, defeating the point of the exercise!). I had specified
no.dups = FALSE although in retrospect this is unecessary as they’re all pulling data from the same source so the temporal resolution shouldn’t change the heart rate.
We then need to do a bit of tidying, we will rename the columns to their
"HR_1min", "HR_5min", "HR_15min" respectively and convert the
difftime to allow us to perform the
geom_smooth() regression line.
df.1m <- get_heart_rate_intraday_time_series(token, date = date, detail_level = "1min") df.5m <- get_heart_rate_intraday_time_series(token, date = date, detail_level = "5min") df.15m <- get_heart_rate_intraday_time_series(token, date = date, detail_level = "15min") df.detail.levels <- merge(df.1m, df.5m, by = "time", all = TRUE, no.dups = FALSE) df.detail.levels <- merge(df.detail.levels, df.15m, by = "time", all = TRUE, no.dups = FALSE) names(df.detail.levels)[2:4] <- c("HR_1min", "HR_5min", "HR_15min") df.detail.levels$time <- as.difftime(df.detail.levels$time, units = "mins")
There would be various ways of representing the resultant information. I could produce a 1x3 panel plot. Instead I decided to follow the principle of showing as much useful info in a single figure as possible and instead plot a line for each
detail_level on the same axes.
The data is currently in ‘Wide Format’:
## time HR_1min HR_5min HR_15min ## 1 0 mins 63 66 66 ## 2 1 mins 65 NA NA ## 3 2 mins 65 NA NA
It is easier for us to plot if we convert it into ‘Tall Format’ using the
melt() function in the
reshape2 package with the argument
id.vars = "time" to show that we are preserving time for our x-axis.
library(reshape2) melted <- melt(df.detail.levels, id.vars = "time") head(melted, 3)
## time variable value ## 1 0 mins HR_1min 63 ## 2 1 mins HR_1min 65 ## 3 2 mins HR_1min 65
Now we can create the plot in
ggplot using the aesthetic
col = variable to colour the points by the melted column
variable which consists of the
ggplot(melted, aes(x = time, y = value, col = variable)) + geom_point(alpha = 0.5) + geom_smooth(aes(col = variable), se = FALSE) + labs(title = "Heart Rate over Time", subtitle = "By varying detail level", x = "Time (mins)", y = "HR", colour = "Detail Level") + theme_few()
We can see the following observations:
- A greater detail level gives a greater range of values. This is most significant for the higher heart rates, presumably because they are more transient, in contrast to the low heart rates which were probably sustained during sleep.
- Overall there is little difference in the averaged data between the various detail levels.
This is all fairly intuitive and we can also be seen using the extremely useful
## time HR_1min HR_5min HR_15min ## Length:1397 Min. : 44.00 Min. : 45.0 Min. : 46.00 ## Class :difftime 1st Qu.: 55.00 1st Qu.: 56.0 1st Qu.: 56.00 ## Mode :numeric Median : 66.00 Median : 66.0 Median : 66.00 ## Mean : 69.61 Mean : 69.7 Mean : 69.98 ## 3rd Qu.: 76.00 3rd Qu.: 75.0 3rd Qu.: 74.00 ## Max. :167.00 Max. :154.0 Max. :144.00 ## NA's :2 NA's :1115 NA's :1303
I would thus conclude that capturing heart rate data at 15-minute intervals does not result in a significant loss of detail, provided we are not particularly interested in looking at the peak heart rates.