Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 180 |
1 files changed, 45 insertions, 135 deletions
@@ -1,13 +1,14 @@
-# SyncRNG
-
+SyncRNG
+=======
 A synchronized Tausworthe RNG usable in R and Python.
 
-## Why?
+Why?
+====
 
-This program was created because it was desired to have the same random
-numbers in both R and Python programs. Although both languages implement a
-Mersenne-Twister RNG, the implementations are so different that it is not
-possible to get the same random numbers with the same seed.
+This program was created because I needed to have the same random numbers in
+both R and Python. Although both languages implement a Mersenne-Twister RNG,
+the implementations are so different that it is not possible to get the same
+random numbers with the same seed.
 
 SyncRNG is a Tausworthe RNG implemented in ``syncrng.c``, and linked to both R
 and Python. Since both use the same underlying C code, the random numbers will
@@ -16,61 +17,58 @@ be the same in both languages, provided the same seed is used.
 You can read more about my motivations for creating this
 [here](https://gertjanvandenburg.com/blog/syncrng/).
 
-## Installation
-
-Installing the R package can be done through CRAN:
-
-```
-> install.packages('SyncRNG')
-```
+How
+===
 
-The Python package can be installed using pip:
+First install the packages as stated under Installation. Then, in Python you
+can do::
 
-```
-$ pip install syncrng
-```
+    from SyncRNG import SyncRNG
 
-## Usage
+    s = SyncRNG(seed=123456)
+    for i in range(10):
+        print(s.randi())
 
-After installing the package, you can use the basic ``SyncRNG`` random number
-generator. In Python you can do:
+Similarly, after installing the R library you can do in R::
 
+    library(SyncRNG)
 
-```python
->>> from SyncRNG import SyncRNG
->>> s = SyncRNG(seed=123456)
->>> for i in range(10):
->>>     print(s.randi())
-```
-
-And in R you can use:
-
-```r
-> library(SyncRNG)
-> s <- SyncRNG(seed=123456)
-> for (i in 1:10) {
->    cat(s$randi(), '\n')
-> }
-```
+    s <- SyncRNG(seed=123456)
+    for (i in 1:10) {
+        cat(s$randi(), '\n')
+    }
 
 You'll notice that the random numbers are indeed the same.
 
-### R: User defined RNG
+R - User defined RNG
+--------------------
 
 R allows the user to define a custom random number generator, which is then
 used for the common ``runif`` and ``rnorm`` functions in R. This has also been
-implemented in SyncRNG as of version 1.3.0. To enable this, run:
+implemented in SyncRNG as of version 1.3.0. To enable this, run::
+
+    library(SyncRNG)
 
-```r
-> library(SyncRNG)
-> set.seed(123456, 'user', 'user')
-> runif(10)
-```
+    set.seed(123456, 'user', 'user')
+    runif(10)
 
 These numbers are between [0, 1) and multiplying by ``2**32 - 1`` gives the
 same results as above.
 
-### Functionality
+Installation
+============
+
+Installing the R package can be done through CRAN::
+
+    install.packages('SyncRNG')
+
+The Python package can be installed using pip::
+
+    pip install syncrng
+
+
+Usage
+=====
 
 In both R and Python the following methods are available for the ``SyncRNG``
 class:
@@ -81,97 +79,9 @@ class:
 3. ``randbelow(n)``: generate a random integer below a given integer ``n``.
 4. ``shuffle(x)``: generate a permutation of a given list of numbers ``x``.
 
-### Creating the same train/test splits
-
-A common use case for this package is to create the same train and test splits
-in R and Python. Below are some code examples that illustrate how to do this.
-Both assume you have a matrix ``X`` with `100` rows.
-
-In R:
-
-```r
-
-# This function creates a list with train and test indices for each fold
-k.fold <- function(n, K, shuffle=TRUE, seed=0)
-{
-    idxs <- c(1:n)
-    if (shuffle) {
-        rng <- SyncRNG(seed=seed)
-        idxs <- rng$shuffle(idxs)
-    }
-
-    # Determine fold sizes
-    fsizes <- c(1:K)*0 + floor(n / K)
-    mod <- n %% K
-    if (mod > 0)
-        fsizes[1:mod] <- fsizes[1:mod] + 1
-
-    out <- list(n=n, num.folds=K)
-    current <- 1
-    for (f in 1:K) {
-        fs <- fsizes[f]
-        startidx <- current
-        stopidx <- current + fs - 1
-        test.idx <- idxs[startidx:stopidx]
-        train.idx <- idxs[!(idxs %in% test.idx)]
-        out$testidxs[[f]] <- test.idx
-        out$trainidxs[[f]] <- train.idx
-        current <- stopidx
-    }
-    return(out)
-}
-
-# Which you can use as follows
-folds <- k.fold(nrow(X), K=10, shuffle=T, seed=123)
-for (f in 1:folds$num.folds) {
-    X.train <- X[folds$trainidx[[f]], ]
-    X.test <- X[folds$testidx[[f]], ]
-
-    # continue using X.train and X.test here
-}
-```
-
-And in Python:
-
-```python
-def k_fold(n, K, shuffle=True, seed=0):
-    """Generator for train and test indices"""
-    idxs = list(range(n))
-    if shuffle:
-        rng = SyncRNG(seed=seed)
-        idxs = rng.shuffle(idxs)
-
-    fsizes = [n // K]*K
-    mod = n % K
-    if mod > 0:
-        fsizes[:mod] = [x+1 for x in fsizes[:mod]]
-
-    current = 0
-    for fs in fsizes:
-        startidx = current
-        stopidx = current + fs
-        test_idx = idxs[startidx:stopidx]
-        train_idx = [x for x in idxs if not x in test_idx]
-        yield train_idx, test_idx
-        current = stopidx
-
-# Which you can use as follows
-kf = k_fold(X.shape[0], K=3, shuffle=True, seed=123)
-for trainidx, testidx in kf:
-    X_train = X[trainidx, :]
-    X_test = X[testidx, :]
-
-    # continue using X_train and X_test here
-
-```
-
-## Notes
+Notes
+=====
 
 The random numbers are uniformly distributed on ``[0, 2^32 - 1]``.
 
-## Questions and Issues
 
-If you have questions, comments, or suggestions about SyncRNG or you encounter
-a problem, please open an issue [on
-GitHub](https://github.com/GjjvdBurg/SyncRNG/). Please don't hesitate to
-contact me, you're helping to make this project better for everyone!
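
The train/test split example removed by this change depends only on ``shuffle(x)``, so its fold logic can be sketched without the package installed. The following is a minimal sketch, not the README's code: the names `fold_sizes`, `k_fold`, `shuffle_fn`, and `stand_in_shuffle` are hypothetical, and the stand-in uses Python's own `random` module, so it does not reproduce SyncRNG's numbers or match R.

```python
import random
from typing import Callable, Iterator, List, Optional, Tuple


def fold_sizes(n: int, K: int) -> List[int]:
    """Split n items into K folds; the first n % K folds get one extra item."""
    sizes = [n // K] * K
    for i in range(n % K):
        sizes[i] += 1
    return sizes


def k_fold(
    n: int,
    K: int,
    shuffle_fn: Optional[Callable[[List[int]], List[int]]] = None,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for each of the K folds.

    shuffle_fn is a hypothetical hook: any callable that takes a list and
    returns a permuted list. If None, the indices stay in order.
    """
    idxs = list(range(n))
    if shuffle_fn is not None:
        idxs = shuffle_fn(idxs)

    current = 0
    for fs in fold_sizes(n, K):
        test_idx = idxs[current:current + fs]
        test_set = set(test_idx)  # set lookup avoids a quadratic scan
        train_idx = [i for i in idxs if i not in test_set]
        yield train_idx, test_idx
        current += fs


def stand_in_shuffle(seed: int) -> Callable[[List[int]], List[int]]:
    """Stand-in for a seeded shuffle; NOT synchronized with R or SyncRNG."""
    rng = random.Random(seed)

    def _shuffle(xs: List[int]) -> List[int]:
        out = list(xs)
        rng.shuffle(out)
        return out

    return _shuffle


folds = list(k_fold(10, 3, shuffle_fn=stand_in_shuffle(123)))
```

Assuming the package is installed and ``shuffle`` returns the permuted list as in the removed example, passing `SyncRNG(seed=123).shuffle` as `shuffle_fn` should give splits driven by the shared generator instead of the stand-in.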
