This document describes how the read.gt3x package can be used to read binary activity data into R. To access the read.gt3x package, load it with library(read.gt3x).
For source code and installation instructions, see the GitHub page.
The read.gt3x package includes two sample .gt3x files, which I’ll use to demonstrate reading the data. First we need the path to a single .gt3x file. We will use data embedded in the package, but longer and more extensive data can be downloaded via gt3x_datapath().
The read.gt3x() function can take as input a path to a single .gt3x file and will then read activity samples as an R matrix.
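gt3xfile <- system.file("extdata", "TAS1H30182785_2019-09-17.gt3x", package = "read.gt3x") # sample file shipped with the package
X <- read.gt3x(gt3xfile) # read the activity samples into a matrix
head(X)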
#> Sampling Rate: 100Hz
#> Firmware Version: 1.7.2
#> Serial Number Prefix: TAS
#> X Y Z
#> [1,] 0.000 0.008 0.996
#> [2,] 0.016 0.000 1.008
#> [3,] 0.020 -0.008 1.004
#> [4,] 0.016 -0.012 1.012
#> [5,] 0.016 -0.008 1.008
#> [6,] 0.008 -0.008 1.008
.gt3x files are actually zip archives that contain two files: log.bin and info.txt. log.bin is a binary file that contains the actual samples. It can make sense to store the data as unzipped folders containing these two files, because otherwise read.gt3x() has to unzip each .gt3x archive to a temporary location every time you access the data.
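As a quick check of the archive layout described above, the contents of a .gt3x file can be listed without extracting anything. This is a minimal sketch that assumes the read.gt3x package is installed and uses the sample file shipped with it; `unzip()` comes from R's built-in utils package.

```r
# Path to the sample .gt3x file shipped with the read.gt3x package.
gt3xfile <- system.file(
  "extdata", "TAS1H30182785_2019-09-17.gt3x",
  package = "read.gt3x"
)

# list = TRUE only lists the archive members; nothing is written to disk.
contents <- utils::unzip(gt3xfile, list = TRUE)

# Show the file names and their uncompressed sizes in bytes.
contents[, c("Name", "Length")]
```

The listing should include log.bin and info.txt, the two files named above.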
read.gt3x() also accepts paths to unzipped gt3x folders.
To demonstrate the usage, we’ll unzip the sample .gt3x files in the package, and then read them. The unzip.gt3x() helper function unzips all .gt3x files in a given directory. By default, the contents of a .gt3x file named “subject001.gt3x” are extracted to a folder named “subject001”. unzip.gt3x() returns a vector of paths to the unzipped gt3x folders. The location argument can be used to choose where to locate those folders.
datadir <- dirname(gt3xfile) # location of .gt3x files
gt3xfolders <- unzip.gt3x(datadir, location = tempdir())
#> Unzipping gt3x data to /tmp/RtmpRJa5xu
#> 1/1
#> Unzipping /tmp/RtmpS5XUBC/Rinstc12410b31aa/read.gt3x/extdata/TAS1H30182785_2019-09-17.gt3x
#> === info.txt, log.bin extracted to /tmp/RtmpRJa5xu/TAS1H30182785_2019-09-17
The read.gt3x() function accepts a path to an unzipped gt3x folder. It is a bit faster if the unzip step has already been performed.
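Reading from an unzipped folder could look like the sketch below. It repeats the unzip step so that it runs on its own, and it assumes the read.gt3x package is installed and that the first element of the returned vector is the folder for the sample file.

```r
library(read.gt3x)

# Path to the sample file shipped with the package.
gt3xfile <- system.file(
  "extdata", "TAS1H30182785_2019-09-17.gt3x",
  package = "read.gt3x"
)

# Unzip every .gt3x file in that directory into a temporary location.
gt3xfolders <- unzip.gt3x(dirname(gt3xfile), location = tempdir())

# Read the activity samples from the first unzipped folder.
X <- read.gt3x(gt3xfolders[1])
dim(X)
```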
Internally, the data matrix returned by read.gt3x() is a bit smarter than it looks, as it knows all the (relative) timestamps of the observations.
str(X)
#> 'activity' num [1:33000, 1:3] 0 0.016 0.02 0.016 0.016 0.008 0.016 0.02 0.016 0.012 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : NULL
#> ..$ : chr [1:3] "X" "Y" "Z"
#> - attr(*, "time_index")= num [1:33000] 0 1 2 3 4 5 6 7 8 9 ...
#> - attr(*, "missingness")='data.frame': 10 obs. of 2 variables:
#> ..$ time : POSIXct[1:10], format: "2019-09-17 18:40:10" "2019-09-17 18:44:21" ...
#> ..$ n_missing: int [1:10] 400 10500 55400 112600 3300 100 100 500 100 24500
#> - attr(*, "total_records")= int 33000
#> - attr(*, "start_time_param")= num 1.57e+09
#> - attr(*, "features")= chr "sleep mode"
#> - attr(*, "start_time_info")= num 1.57e+09
#> - attr(*, "sample_rate")= int 100
#> - attr(*, "impute_zeroes")= logi FALSE
#> - attr(*, "add_light")= logi FALSE
#> - attr(*, "start_time")= POSIXct[1:1], format: "2019-09-17 18:40:00"
#> - attr(*, "stop_time")= POSIXct[1:1], format: "2019-09-18 19:00:00"
#> - attr(*, "last_sample_time")= POSIXct[1:1], format: "2019-09-17 19:20:05"
#> - attr(*, "subject_name")= chr "suffix_85"
#> - attr(*, "time_zone")= chr "-04:00:00"
#> - attr(*, "firmware")= chr "1.7.2"
#> - attr(*, "serial_prefix")= chr "TAS"
#> - attr(*, "acceleration_min")= chr "-8.0"
#> - attr(*, "acceleration_max")= chr "8.0"
#> - attr(*, "bad_samples")= logi FALSE
#> - attr(*, "old_version")= logi FALSE
#> - attr(*, "header")=List of 17
#> ..$ Serial Number : chr "TAS1H30182785"
#> ..$ Device Type : chr "Link"
#> ..$ Firmware : chr "1.7.2"
#> ..$ Battery Voltage : chr "4.18"
#> ..$ Sample Rate : num 100
#> ..$ Start Date : POSIXct[1:1], format: "2019-09-17 18:40:00"
#> ..$ Stop Date : POSIXct[1:1], format: "2019-09-18 19:00:00"
#> ..$ Last Sample Time : POSIXct[1:1], format: "2019-09-17 19:20:05"
#> ..$ TimeZone : chr "-04:00:00"
#> ..$ Download Date : POSIXct[1:1], format: "2019-09-17 19:20:05"
#> ..$ Board Revision : chr "8"
#> ..$ Unexpected Resets : chr "0"
#> ..$ Acceleration Scale: int 256
#> ..$ Acceleration Min : chr "-8.0"
#> ..$ Acceleration Max : chr "8.0"
#> ..$ Subject Name : chr "suffix_85"
#> ..$ Serial Prefix : chr "TAS"
#> ..- attr(*, "class")= chr [1:2] "gt3x_info" "list"
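The metadata shown by str() can be pulled out individually with base R's `attr()`. The sketch below assumes the read.gt3x package is installed and uses only attribute names that appear in the output above.

```r
library(read.gt3x)

# Read the sample file shipped with the package.
gt3xfile <- system.file(
  "extdata", "TAS1H30182785_2019-09-17.gt3x",
  package = "read.gt3x"
)
X <- read.gt3x(gt3xfile)

# Scalar metadata stored directly on the activity matrix.
attr(X, "sample_rate")   # sampling rate in Hz
attr(X, "start_time")    # start of the recording

# Periods with missing samples, stored as a data frame.
head(attr(X, "missingness"))

# Fields parsed from the device header, stored as a named list.
attr(X, "header")[["Serial Number"]]
```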
The read.gt3x package has an as.data.frame method for the activity matrix, which converts the matrix to a data frame and adds a “time” column giving the timestamp of each sample. The timestamps are stored in R with the GMT timezone, but note that this is misleading: in reality the timestamps correspond to the local time of the device!
X <- as.data.frame(X)
head(X)
#> Sampling Rate: 100Hz
#> Firmware Version: 1.7.2
#> Serial Number Prefix: TAS
#> time X Y Z
#> 1 2019-09-17 18:40:00 0.000 0.008 0.996
#> 2 2019-09-17 18:40:00 0.016 0.000 1.008
#> 3 2019-09-17 18:40:00 0.020 -0.008 1.004
#> 4 2019-09-17 18:40:00 0.016 -0.012 1.012
#> 5 2019-09-17 18:40:00 0.016 -0.008 1.008
#> 6 2019-09-17 18:40:00 0.008 -0.008 1.008
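If true UTC timestamps are needed, one possible approach is to shift the GMT-labelled local times by the device's UTC offset. The sketch below uses only base R and values copied from the output above (the first timestamp and the "time_zone" attribute). The helper `offset_to_seconds()` is hypothetical, not part of read.gt3x, and the sketch assumes the attribute holds the device's offset from UTC in the form "±HH:MM:SS".

```r
# Values copied from the output above: the first timestamp (labelled GMT)
# and the UTC offset string from the "time_zone" attribute.
local_as_gmt <- as.POSIXct("2019-09-17 18:40:00", tz = "GMT")
time_zone <- "-04:00:00"

# Hypothetical helper: convert an offset such as "-04:00:00" to seconds.
offset_to_seconds <- function(offset) {
  sign <- if (startsWith(offset, "-")) -1 else 1
  parts <- as.numeric(strsplit(sub("^[+-]", "", offset), ":", fixed = TRUE)[[1]])
  sign * (parts[1] * 3600 + parts[2] * 60 + parts[3])
}

# Local time = UTC + offset, so the true UTC instant is local time - offset.
true_utc <- local_as_gmt - offset_to_seconds(time_zone)
format(true_utc, "%Y-%m-%d %H:%M:%S", tz = "GMT")
#> [1] "2019-09-17 22:40:00"
```

The same subtraction can be applied to a whole "time" column, since arithmetic on POSIXct values is vectorised.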