--- title: "Getting started with read.gt3x" author: "Tuomo Nieminen" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting started with read.gt3x} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` This document describes how the **read.gt3x** package can be used to read binary activity data into R. To access the read.gt3x package, use: ```{r} library(read.gt3x) ``` For source code and installation instructions, see the [GitHub page](https://github.com/THLfi/read.gt3x). ## Reading .gt3x data into R The read.gt3x package includes two sample .gt3x files which I'll use to demonstrate reading the data. First we need the path to a single gt3x file. We will use data embedded in the package: ```{r} gt3xfile <- system.file( "extdata", "TAS1H30182785_2019-09-17.gt3x", package = "read.gt3x") ``` but longer and more extensive data can be downloaded via `gt3x_datapath`: ```{r download, eval = FALSE} gt3xfile <- gt3x_datapath(1) ``` ### Method 1: Temporary unzip and read The `read.gt3x()` function can take as input a path to a single .gt3x file and will then read activity samples as an R matrix. ```{r} X <- read.gt3x(gt3xfile) ``` ```{r} head(X) ``` ```{r, echo = FALSE} rm(X); gc(); gc() ``` ### Method 2: permanent unzip, then read .gt3x files are actually zip archives which contain two files: log.bin and info.txt. log.bin is a binary file that contains the actual samples. It might make sense to store the data as *unzipped* folders containing these two files, because otherwise the read.gt3x() function will have to unzip each .gt3x archive to a temporary location, every time you need to access the data. `read.gt3x()` also accepts paths to unzipped gt3x folders. To demonstrate the usage, we'll unzip the sample .gt3x files in the package, and then read them. The `unzip.gt3x()` helper function unzips all .gt3x files in a given directory. By default, the contents of a .gt3x file named "subject001.gt3x" are extracted to a folder named "subject001". `unzip.gt3x()` returns a vector of paths to the unzipped gt3x folders. The location argument can be used to choose where to locate those folders. ```{r} datadir <- dirname(gt3xfile) # location of .gt3x files gt3xfolders <- unzip.gt3x(datadir, location = tempdir()) ``` The `read.gt3x()` function accepts a path to an unzipped gt3x folder. It is a bit faster if the unzip step has already been performed. ```{r} gt3xfolder <- gt3xfolders[1] X <- read.gt3x(gt3xfolder) ``` ```{r} head(X) ``` ## Activity data matrix Internally, the data matrix returned by read.gt3x() is a bit smarter than it looks, as it knows all the (relative) timestamps of the observations. ```{r} str(X) ``` - time_index: Time passed since start - start_time: timestamp of first sample - sample_rate: Samples per second. ## Converting to a data.frame the read.gt3x package has an as.data.frame method for the activity matrix, which converts the matrix to a dataframe and adds a "time" column, which gives the timestamp of each sample. The timestamps are stored in R with the GMT timezone but note that this is misleading: in reality the timestamps correspond to the local time of the device! ```{r} X <- as.data.frame(X) head(X) ``` ```{r, echo = FALSE} rm(X); gc(); gc() ```