Initial commit

author: Gertjan van den Burg <gertjanvandenburg@gmail.com> 2020-03-10 12:27:53 +0000
committer: Gertjan van den Burg <gertjanvandenburg@gmail.com> 2020-03-10 12:27:53 +0000
commit: 7c6c2e09e3ad1d41f26869cb7b9f9882175c8a6e (patch)
tree: 10aa6710599230c889ec44407a065ee303a79348 /examples/R
download: TCPD-7c6c2e09e3ad1d41f26869cb7b9f9882175c8a6e.tar.gz TCPD-7c6c2e09e3ad1d41f26869cb7b9f9882175c8a6e.zip
2 files changed, 75 insertions, 0 deletions
diff --git a/examples/R/README.md b/examples/R/README.md
new file mode 100644
index 0000000..14ce4bf
--- /dev/null
+++ b/examples/R/README.md
@@ -0,0 +1,34 @@
+# Loading a TCPD dataset into R
+
+The file ``load_dataset.R`` contains the function ``load.dataset`` that reads
+a TCPD dataset (a JSON file) into an R data frame. The
+[RJSONIO](https://cran.r-project.org/web/packages/RJSONIO/index.html) package
+is required:
+
+```R
+> install.packages('RJSONIO')
+```
+
+Simply run:
+
+```R
+> source('./load_dataset.R')
+> df <- load.dataset('../../datasets/ozone/ozone.json')
+> df
+    t Total Emissions
+1   0          380000
+2   1          400000
+3   2          440000
+4   3          480000
+5   4          510000
+6   5          540000
+7   6          580000
+8   7          630000
+```
+
+Notice that the time axis in TCPD is always 0-based. This needs to be taken 
+into account when comparing detection results to the human annotations. (This 
+is an unfortunate consequence of the differences between indexing in R and 
+Python.)
+
+Missing observations in time series are represented with an ``NA`` value.
diff --git a/examples/R/load_dataset.R b/examples/R/load_dataset.R
new file mode 100644
index 0000000..8ef0e22
--- /dev/null
+++ b/examples/R/load_dataset.R
@@ -0,0 +1,41 @@
+#' ---
+#' title: Example code to load a TCPD time series
+#' author: G.J.J. van den Burg
+#' date: 2020-01-06
+#' license: See the LICENSE file.
+#' copyright: 2019, The Alan Turing Institute
+#' ---
+
+library(RJSONIO)
+
+load.dataset <- function(filename)
+{
+    data <- fromJSON(filename)
+
+    # reformat the data into a data frame with a time index and the data values
+    tidx <- data$time$index
+
+    cols <- c()
+
+    mat <- NULL
+    for (j in seq_len(data$n_dim)) {
+        s <- data$series[[j]]
+        v <- NULL
+        for (i in seq_len(data$n_obs)) {
+            val <- s$raw[[i]]
+            if (is.null(val)) {
+                v <- c(v, NA)
+            } else {
+                v <- c(v, val)
+            }
+        }
+        cols <- c(cols, s$label)
+        mat <- cbind(mat, v)
+    }
+
+    mat <- cbind(tidx, mat)
+    colnames(mat) <- c('t', cols)
+
+    df <- as.data.frame(mat)
+    return(df)
+}
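
The README added in this commit notes that the TCPD time axis is 0-based, while R indexes rows from 1. The following is a minimal sketch of one way to handle that offset, not part of the commit itself. The toy data frame copies the `t` values and observations from the README output; the column name `value`, the `annotations` vector, and the `detected_rows` vector are hypothetical and chosen only for illustration.

```R
# Toy data frame shaped like the output of load.dataset() shown in the README.
# The column name 'value' is a placeholder, not the label used by the dataset.
df <- data.frame(
    t = 0:7,
    value = c(380000, 400000, 440000, 480000,
              510000, 540000, 580000, 630000)
)

# Hypothetical change point annotations, given as 0-based time indices.
annotations <- c(3, 5)

# R rows are 1-based, so look the rows up by matching on the 0-based 't'
# column instead of indexing with the annotation values directly.
rows <- match(annotations, df$t)
annotated <- df[rows, ]

# Going the other way: 1-based row numbers from a (hypothetical) detector are
# mapped back to 0-based time indices through the 't' column.
detected_rows <- c(4, 6)
detected_t <- df$t[detected_rows]
```

Matching on the `t` column rather than adding or subtracting one by hand keeps the conversion in a single place and still works if a data frame has been subsetted.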