Bring back updated readme

author: Gertjan van den Burg <gertjanvandenburg@gmail.com> 2021-01-14 17:30:56 +0000
committer: Gertjan van den Burg <gertjanvandenburg@gmail.com> 2021-01-14 17:30:56 +0000
commit: c2058f5e5256f87ec1e79a2f3dbb358fd268a454 (patch)
tree: f3fb7a16fbaec90937eec84e4ee440939822abb2 /README.md
parent: Rename directories, remove extra test dir (diff)
download: SyncRNG-c2058f5e5256f87ec1e79a2f3dbb358fd268a454.tar.gz
SyncRNG-c2058f5e5256f87ec1e79a2f3dbb358fd268a454.zip
1 files changed, 135 insertions, 45 deletions
diff --git a/README.md b/README.md
index d7aa3bc..1961ace 100644
--- a/README.md
+++ b/README.md
@@ -1,14 +1,13 @@
-SyncRNG
-=======
+# SyncRNG
+
 A synchronized Tausworthe RNG usable in R and Python.
 
-Why?
-====
+## Why?
 
-This program was created because I needed to have the same random numbers in 
-both R and Python. Although both languages implement a Mersenne-Twister RNG, 
-the implementations are so different that it is not possible to get the same 
-random numbers with the same seed.
+This program was created to produce the same random numbers in both R and 
+Python programs. Although both languages implement a Mersenne-Twister RNG, 
+the implementations are so different that it is not possible to get the same 
+random numbers with the same seed.
 
 SyncRNG is a Tausworthe RNG implemented in ``syncrng.c``, and linked to both R 
 and Python. Since both use the same underlying C code, the random numbers will 
@@ -17,58 +16,61 @@ be the same in both languages, provided the same seed is used.
 You can read more about my motivations for creating this 
 [here](https://gertjanvandenburg.com/blog/syncrng/).
 
-How
-===
+## Installation
+
+Installing the R package can be done through CRAN:
+
+```
+> install.packages('SyncRNG')
+```
 
-First install the packages as stated under Installation. Then, in Python you 
-can do::
+The Python package can be installed using pip:
 
-    from SyncRNG import SyncRNG
+```
+$ pip install syncrng
+```
 
-    s = SyncRNG(seed=123456)
-    for i in range(10):
-      print(s.randi())
+## Usage
 
-Similarly, after installing the R library you can do in R::
+After installing the package, you can use the basic ``SyncRNG`` random number 
+generator. In Python you can do:
 
-    library(SyncRNG)
 
-    s <- SyncRNG(seed=123456)
-    for (i in 1:10) {
-       cat(s$randi(), '\n')
-    }
+```python
+>>> from SyncRNG import SyncRNG
+>>> s = SyncRNG(seed=123456)
+>>> for i in range(10):
+...     print(s.randi())
+```
+
+And in R you can use:
+
+```r
+> library(SyncRNG)
+> s <- SyncRNG(seed=123456)
+> for (i in 1:10) {
+>    cat(s$randi(), '\n')
+> }
+```
 
 You'll notice that the random numbers are indeed the same.
 
-R - User defined RNG
---------------------
+### R: User defined RNG
 
 R allows the user to define a custom random number generator, which is then 
 used for the common ``runif`` and ``rnorm`` functions in R. This has also been 
-implemented in SyncRNG as of version 1.3.0. To enable this, run::
-
-    library(SyncRNG)
+implemented in SyncRNG as of version 1.3.0. To enable this, run:
 
-    set.seed(123456, 'user', 'user')
-    runif(10)
+```r
+> library(SyncRNG)
+> set.seed(123456, 'user', 'user')
+> runif(10)
+```
 
 These numbers are between [0, 1) and multiplying by ``2**32 - 1`` gives the 
 same results as above.
 
-Installation
-============
-
-Installing the R package can be done through CRAN::
-
-    install.packages('SyncRNG')
-
-The Python package can be installed using pip::
-
-    pip install syncrng
-
-
-Usage
-=====
+### Functionality
 
 In both R and Python the following methods are available for the ``SyncRNG`` 
 class:
@@ -79,9 +81,97 @@ class:
 3. ``randbelow(n)``: generate a random integer below a given integer ``n``.
 4. ``shuffle(x)``: generate a permutation of a given list of numbers ``x``.
 
-Notes
-=====
+### Creating the same train/test splits
+
+A common use case for this package is to create the same train and test splits 
+in R and Python. Below are some code examples that illustrate how to do this. 
+Both assume you have a matrix ``X`` with `100` rows.
+
+In R:
+
+```r
+library(SyncRNG)
+# This function creates a list with train and test indices for each fold
+k.fold <- function(n, K, shuffle=TRUE, seed=0)
+{
+	idxs <- c(1:n)
+	if (shuffle) {
+		rng <- SyncRNG(seed=seed)
+		idxs <- rng$shuffle(idxs)
+	}
+
+	# Determine fold sizes
+	fsizes <- c(1:K)*0 + floor(n / K)
+	mod <- n %% K
+	if (mod > 0)
+		fsizes[1:mod] <- fsizes[1:mod] + 1
+
+	out <- list(n=n, num.folds=K)
+	current <- 1
+	for (f in 1:K) {
+		fs <- fsizes[f]
+		startidx <- current
+		stopidx <- current + fs - 1
+		test.idx <- idxs[startidx:stopidx]
+		train.idx <- idxs[!(idxs %in% test.idx)]
+		out$testidxs[[f]] <- test.idx
+		out$trainidxs[[f]] <- train.idx
+		current <- stopidx + 1
+	}
+	return(out)
+}
+
+# Which you can use as follows
+folds <- k.fold(nrow(X), K=10, shuffle=T, seed=123)
+for (f in 1:folds$num.folds) {
+        X.train <- X[folds$trainidxs[[f]], ]
+        X.test <- X[folds$testidxs[[f]], ]
+
+        # continue using X.train and X.test here
+}
+```
+
+And in Python:
+
+```python
+def k_fold(n, K, shuffle=True, seed=0):
+    """Generator for train and test indices"""
+    idxs = list(range(n))
+    if shuffle:
+        rng = SyncRNG(seed=seed)
+        idxs = rng.shuffle(idxs)
+
+    fsizes = [n // K]*K
+    mod = n % K
+    if mod > 0:
+        fsizes[:mod] = [x+1 for x in fsizes[:mod]]
+
+    current = 0
+    for fs in fsizes:
+        startidx = current
+        stopidx = current + fs
+        test_idx = idxs[startidx:stopidx]
+        train_idx = [x for x in idxs if x not in test_idx]
+        yield train_idx, test_idx
+        current = stopidx
+
+# Which you can use as follows
+kf = k_fold(X.shape[0], K=10, shuffle=True, seed=123)
+for trainidx, testidx in kf:
+    X_train = X[trainidx, :]
+    X_test = X[testidx, :]
+
+    # continue using X_train and X_test here
+
+```
+
+## Notes
 
 The random numbers are uniformly distributed on ``[0, 2^32 - 1]``.
 
+## Questions and Issues
 
+If you have questions, comments, or suggestions about SyncRNG or you encounter 
+a problem, please open an issue [on 
+GitHub](https://github.com/GjjvdBurg/SyncRNG/). Please don't hesitate to 
+contact me: you're helping to make this project better for everyone!
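
The Python `k_fold` generator in the patch above can be sanity-checked without installing SyncRNG. The following is a minimal sketch, not the author's code: `shuffle_fn`, `stand_in_shuffle`, and `check_partition` are hypothetical names introduced here, and `random.Random` stands in for SyncRNG, so the permutation will not match what SyncRNG produces. It only checks that the test folds are disjoint and together cover every index.

```python
import random


def k_fold(n, K, shuffle_fn=None):
    """Yield (train_idx, test_idx) pairs for K folds over range(n).

    shuffle_fn is a hypothetical hook: any callable that takes a list
    and returns a permuted list (for example SyncRNG(seed=...).shuffle).
    """
    idxs = list(range(n))
    if shuffle_fn is not None:
        idxs = list(shuffle_fn(idxs))

    # The first n % K folds receive one extra element.
    fsizes = [n // K + (1 if f < n % K else 0) for f in range(K)]

    current = 0
    for fs in fsizes:
        test_idx = idxs[current:current + fs]
        test_set = set(test_idx)  # set lookup avoids a quadratic scan
        train_idx = [x for x in idxs if x not in test_set]
        yield train_idx, test_idx
        current += fs


def stand_in_shuffle(seed):
    """Stand-in for SyncRNG's shuffle (assumption: used for testing only)."""
    rng = random.Random(seed)

    def shuffle(x):
        # sample() with k == len(x) returns a new, shuffled list
        return rng.sample(x, len(x))

    return shuffle


def check_partition(n, K, shuffle_fn=None):
    """Return True if the test folds are disjoint and cover range(n)."""
    seen = []
    for train_idx, test_idx in k_fold(n, K, shuffle_fn):
        if set(train_idx) & set(test_idx):
            return False
        if len(train_idx) + len(test_idx) != n:
            return False
        seen.extend(test_idx)
    return sorted(seen) == list(range(n))


if __name__ == "__main__":
    print(check_partition(100, 10, stand_in_shuffle(123)))
```

To mirror the README, pass `SyncRNG(seed=123).shuffle` as `shuffle_fn`. Because the R example shuffles `1:n` while the Python one shuffles `0..n-1`, the R indices should then correspond to the Python indices plus one, assuming `shuffle` permutes by position in both bindings.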