author Gertjan van den Burg <gertjanvandenburg@gmail.com> 2020-05-25 17:24:33 +0100
committer Gertjan van den Burg <gertjanvandenburg@gmail.com> 2020-05-25 17:24:33 +0100
commit 292c9bf4013e3c09ba0b08470a3c974b422d3abe
tree 68ddfc0a9ced0dc7d45ad0cd5ba6a9f14807d250
parent update readme with new R feature
download SyncRNG-292c9bf4013e3c09ba0b08470a3c974b422d3abe.tar.gz
         SyncRNG-292c9bf4013e3c09ba0b08470a3c974b422d3abe.zip
Update README
-rw-r--r-- README.md 177
-rw-r--r-- README.rst 92
2 files changed, 177 insertions, 92 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..1961ace
--- /dev/null
+++ b/README.md
@@ -0,0 +1,177 @@
+# SyncRNG
+
+A synchronized Tausworthe RNG usable in R and Python.
+
+## Why?
+
+SyncRNG was created to generate the same random numbers in both R and Python
+programs. Although both languages implement a Mersenne-Twister RNG, the
+implementations differ so much that the same seed does not produce the same
+random numbers.
+
+SyncRNG is a Tausworthe RNG implemented in ``syncrng.c``, and linked to both R
+and Python. Since both use the same underlying C code, the random numbers will
+be the same in both languages, provided the same seed is used.
+
+You can read more about my motivations for creating this
+[here](https://gertjanvandenburg.com/blog/syncrng/).
+
+## Installation
+
+Installing the R package can be done through CRAN:
+
+```
+> install.packages('SyncRNG')
+```
+
+The Python package can be installed using pip:
+
+```
+$ pip install syncrng
+```
+
+## Usage
+
+After installing the package, you can use the basic ``SyncRNG`` random number
+generator. In Python you can do:
+
+
+```python
+>>> from SyncRNG import SyncRNG
+>>> s = SyncRNG(seed=123456)
+>>> for i in range(10):
+...     print(s.randi())
+```
+
+And in R you can use:
+
+```r
+> library(SyncRNG)
+> s <- SyncRNG(seed=123456)
+> for (i in 1:10) {
+> cat(s$randi(), '\n')
+> }
+```
+
+You'll notice that the random numbers are indeed the same.
+
+### R: User defined RNG
+
+R allows the user to define a custom random number generator, which is then
+used for the common ``runif`` and ``rnorm`` functions in R. This has also been
+implemented in SyncRNG as of version 1.3.0. To enable this, run:
+
+```r
+> library(SyncRNG)
+> set.seed(123456, 'user', 'user')
+> runif(10)
+```
+
+These numbers lie in the interval [0, 1), and multiplying them by
+``2**32 - 1`` gives the same results as above.
+
+### Functionality
+
+In both R and Python the following methods are available for the ``SyncRNG``
+class:
+
+1. ``randi()``: generate a random integer on the interval [0, 2^32).
+2. ``rand()``: generate a random floating point number on the interval [0.0,
+   1.0).
+3. ``randbelow(n)``: generate a random integer below a given integer ``n``.
+4. ``shuffle(x)``: generate a permutation of a given list of numbers ``x``.
+
+### Creating the same train/test splits
+
+A common use case for this package is to create the same train and test splits
+in R and Python. Below are some code examples that illustrate how to do this.
+Both assume you have a matrix ``X`` with `100` rows.
+
+In R:
+
+```r
+library(SyncRNG)
+# This function creates a list with train and test indices for each fold
+k.fold <- function(n, K, shuffle=TRUE, seed=0)
+{
+  idxs <- c(1:n)
+  if (shuffle) {
+    rng <- SyncRNG(seed=seed)
+    idxs <- rng$shuffle(idxs)
+  }
+
+  # Determine fold sizes
+  fsizes <- c(1:K)*0 + floor(n / K)
+  mod <- n %% K
+  if (mod > 0)
+    fsizes[1:mod] <- fsizes[1:mod] + 1
+
+  out <- list(n=n, num.folds=K)
+  current <- 1
+  for (f in 1:K) {
+    fs <- fsizes[f]
+    startidx <- current
+    stopidx <- current + fs - 1
+    test.idx <- idxs[startidx:stopidx]
+    train.idx <- idxs[!(idxs %in% test.idx)]
+    out$testidxs[[f]] <- test.idx
+    out$trainidxs[[f]] <- train.idx
+    current <- stopidx + 1  # the next fold starts after this one
+  }
+  return(out)
+}
+
+# Which you can use as follows
+folds <- k.fold(nrow(X), K=10, shuffle=TRUE, seed=123)
+for (f in 1:folds$num.folds) {
+  X.train <- X[folds$trainidxs[[f]], ]
+  X.test <- X[folds$testidxs[[f]], ]
+
+  # continue using X.train and X.test here
+}
+```
+
+And in Python:
+
+```python
+from SyncRNG import SyncRNG
+
+def k_fold(n, K, shuffle=True, seed=0):
+    """Generator for train and test indices"""
+    idxs = list(range(n))
+    if shuffle:
+        rng = SyncRNG(seed=seed)
+        idxs = rng.shuffle(idxs)
+
+    fsizes = [n // K]*K
+    mod = n % K
+    if mod > 0:
+        fsizes[:mod] = [x+1 for x in fsizes[:mod]]
+
+    current = 0
+    for fs in fsizes:
+        startidx = current
+        stopidx = current + fs
+        test_idx = idxs[startidx:stopidx]
+        train_idx = [x for x in idxs if x not in test_idx]
+        yield train_idx, test_idx
+        current = stopidx
+
+# Which you can use as follows
+kf = k_fold(X.shape[0], K=10, shuffle=True, seed=123)
+for trainidx, testidx in kf:
+    X_train = X[trainidx, :]
+    X_test = X[testidx, :]
+    # continue using X_train and X_test here
+```
+
+## Notes
+
+The random numbers are uniformly distributed on ``[0, 2^32 - 1]``.
+
+## Questions and Issues
+
+If you have questions, comments, or suggestions about SyncRNG or you encounter
+a problem, please open an issue [on
+GitHub](https://github.com/GjjvdBurg/SyncRNG/). Please don't hesitate to
+contact me; you're helping to make this project better for everyone!
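
A quick way to sanity-check the fold logic in the README.md examples above is to confirm that the test folds are disjoint and together cover every index. The sketch below repeats the Python `k_fold` generator but, so that it runs without the package installed, substitutes Python's standard `random.Random` for `SyncRNG`. That substitution is an assumption made purely for illustration, so the shuffled order will not match SyncRNG's output. The helper name `check_partition` is likewise hypothetical and not part of the package.

```python
import random


def k_fold(n, K, shuffle=True, seed=0):
    """Yield (train_idx, test_idx) pairs for K folds over range(n)."""
    idxs = list(range(n))
    if shuffle:
        # Stand-in for SyncRNG(seed=seed).shuffle(idxs); NOT synchronized with R.
        random.Random(seed).shuffle(idxs)

    # The first (n % K) folds get one extra element.
    fsizes = [n // K] * K
    for i in range(n % K):
        fsizes[i] += 1

    current = 0
    for fs in fsizes:
        test_idx = idxs[current:current + fs]
        held_out = set(test_idx)
        train_idx = [x for x in idxs if x not in held_out]
        yield train_idx, test_idx
        current += fs  # the next fold starts right after this one


def check_partition(n, K, seed=0):
    """Return True if the test folds are disjoint and cover range(n)."""
    seen = []
    for train_idx, test_idx in k_fold(n, K, seed=seed):
        # Train and test indices of a single fold must not overlap.
        if set(train_idx) & set(test_idx):
            return False
        if len(train_idx) + len(test_idx) != n:
            return False
        seen.extend(test_idx)
    return sorted(seen) == list(range(n))


if __name__ == "__main__":
    print(check_partition(100, 10, seed=123))
    print([len(test) for _, test in k_fold(10, 3)])
```

A check of this kind fails when consecutive folds share a boundary element, for example when the next fold starts at the previous fold's last index instead of one past it, because an index then appears in two test folds and the final indices are never held out.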
diff --git a/README.rst b/README.rst
deleted file mode 100644
index c5bf114..0000000
--- a/README.rst
+++ /dev/null
@@ -1,92 +0,0 @@
-=======
-SyncRNG
-=======
-A synchronized Tausworthe RNG usable in R and Python.
-
-Why?
-====
-
-This program was created because it was desired to have the same random
-numbers in both R and Python programs. Although both languages implement a
-Mersenne-Twister RNG, the implementations are so different that it is not
-possible to get the same random numbers with the same seed.
-
-SyncRNG is a Tausworthe RNG implemented in ``syncrng.c``, and linked to both R
-and Python. Since both use the same underlying C code, the random numbers will
-be the same in both languages, provided the same seed is used.
-
-You can read more about my motivations for creating this `here
-<https://gertjanvandenburg.com/blog/syncrng/>`_.
-
-How
-===
-
-First install the packages as stated under Installation. Then, in Python you
-can do::
-
- from SyncRNG import SyncRNG
-
- s = SyncRNG(seed=123456)
- for i in range(10):
- print(s.randi())
-
-Similarly, after installing the R library you can do in R::
-
- library(SyncRNG)
-
- s <- SyncRNG(seed=123456)
- for (i in 1:10) {
- cat(s$randi(), '\n')
- }
-
-You'll notice that the random numbers are indeed the same.
-
-R - User defined RNG
---------------------
-
-R allows the user to define a custom random number generator, which is then
-used for the common ``runif`` and ``rnorm`` functions in R. This has also been
-implemented in SyncRNG as of version 1.3.0. To enable this, run::
-
- library(SyncRNG)
-
- set.seed(123456, 'user', 'user')
- runif(10)
-
-These numbers are between [0, 1) and multiplying by ``2**32 - 1`` gives the
-same results as above.
-
-Installation
-============
-
-Installing the R package can be done through CRAN::
-
- install.packages('SyncRNG')
-
-The Python package can be installed using pip::
-
- pip install syncrng
-
-
-Usage
-=====
-
-In both R and Python the following methods are available for the ``SyncRNG``
-class:
-
-1. ``randi()``: generate a random integer on the interval [0, 2^32).
-2. ``rand()``: generate a random floating point number on the interval [0.0,
- 1.0)
-3. ``randbelow(n)``: generate a random integer below a given integer ``n``.
-4. ``shuffle(x)``: generate a permutation of a given list of numbers ``x``.
-
-Notes
-=====
-
-The random numbers are uniformly distributed on ``[0, 2^32 - 1]``.
-
-Questions and Issues
-====================
-
-If you have questions about SyncRNG or you encounter a problem, please open an
-`issue on GitHub <https://github.com/GjjvdBurg/SyncRNG/>`_.