Title: | Generate High-Dimensional Binary Data with Correlation Structures |
---|---|
Description: | We design algorithms with linear time complexity with respect to the dimension for three commonly studied correlation structures, including exchangeable, decaying-product and K-dependent correlation structures, and extend the algorithms to generate binary data of general non-negative correlation matrices with quadratic time complexity. Jiang, W., Song, S., Hou, L. and Zhao, H. "A set of efficient methods to generate high-dimensional binary data with specified correlation structures." The American Statistician. See <doi:10.1080/00031305.2020.1816213> for a detailed presentation of the method. |
Authors: | Wei Jiang [aut], Shuang Song [aut, cre], Lin Hou [aut] and Hongyu Zhao [aut] |
Maintainer: | Shuang Song <[email protected]> |
License: | GPL-3 |
Version: | 1.0.0 |
Built: | 2024-10-31 22:08:54 UTC |
Source: | https://github.com/cran/CorBin |
The main function of the package. It simulates correlated binary data under different correlation structures.
cBern(n, p, rho, type, k = NULL)
n |
number of observations |
p |
the vector of marginal probabilities with dimension m |
rho |
For the first three types, rho is either a non-negative value indicating the shared correlation coefficient or an (m-1)-dimensional vector indicating the correlation coefficients between adjacent variables. For the general case, rho should be a list whose i-th element specifies the coefficients on the i-th minor diagonal. |
type |
the correlation structure, one of four types: type="exchange" (exchangeable), type="DCP" (decaying-product), type="1-dependent", or type="General" |
k |
(used only when type="General") the number of layers in the k-dependent structure. Set k=m-1 for the general case. |
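To make the four values of type concrete, the sketch below builds the target correlation matrix that each structure implies, following the usual definitions: one shared coefficient (exchangeable), products of adjacent coefficients (decaying-product), adjacent-only correlation (1-dependent), and a list of minor diagonals (general). The helper `corr_target` is hypothetical and written in base R; it is not part of CorBin and only illustrates how rho is interpreted.

```r
# Hypothetical helper (not a CorBin function): build the m x m target
# correlation matrix implied by `rho` under each structure.
corr_target <- function(m, rho,
                        type = c("exchange", "DCP", "1-dependent", "General")) {
  type <- match.arg(type)
  R <- diag(m)
  if (type == "exchange") {
    # One shared coefficient for every pair of variables
    R[row(R) != col(R)] <- rho
  } else if (type == "General") {
    # rho[[d]] holds the coefficients on the d-th minor diagonal
    for (d in seq_along(rho)) {
      idx <- cbind(seq_len(m - d), seq_len(m - d) + d)
      R[idx] <- rho[[d]]
      R[idx[, 2:1, drop = FALSE]] <- rho[[d]]
    }
  } else {
    # A single value is treated as a shared adjacent coefficient
    if (length(rho) == 1) rho <- rep(rho, m - 1)
    for (i in seq_len(m - 1)) {
      for (j in (i + 1):m) {
        val <- if (type == "DCP") {
          prod(rho[i:(j - 1)])              # product of the adjacent coefficients
        } else if (j == i + 1) {
          rho[i]                            # 1-dependent: adjacent pairs only
        } else {
          0
        }
        R[i, j] <- R[j, i] <- val
      }
    }
  }
  R
}

R_dcp  <- corr_target(3, c(0.2, 0.3), "DCP")
R_1dep <- corr_target(3, c(0.2, 0.3), "1-dependent")
R_gen  <- corr_target(3, list(c(0.2, 0.3), 0.1), "General")
```

Comparing `R_dcp` and `R_1dep` shows the difference between the two adjacent-coefficient structures: the (1,3) entry is the product 0.2 * 0.3 under DCP and zero under 1-dependence.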
an n-by-m matrix of binary data, with one row per observation and one column per entry of p
Jiang, W., Song, S., Hou, L. and Zhao, H. A set of efficient methods to generate high-dimensional binary data with specified correlation structures. The American Statistician. DOI:10.1080/00031305.2020.1816213
X <- cBern(10, rep(0.5,3), 0.5, type="exchange")
X <- cBern(10, rep(0.5,3), c(0.2,0.2), type="DCP")
X <- cBern(5, c(0.4,0.5,0.6), c(0.2,0.3), type="1-dependent")
rho <- list()
rho[[1]] <- c(0.2,0.3)
rho[[2]] <- 0.1
X <- cBern(2, c(0.7,0.8,0.9), rho=rho, type="General", k=2)
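With only a handful of observations the examples above cannot show whether the requested structure was achieved. A minimal sketch of a sanity check follows: simulate a larger sample and compare the column means and sample correlations with the requested p and rho. The helper `check_binary` is hypothetical (not part of CorBin), and the fallback branch exists only so the sketch runs when CorBin is not installed; it draws independent data, so its correlations will be near zero.

```r
# Hypothetical helper: summarise a simulated 0/1 matrix.
check_binary <- function(X) {
  stopifnot(is.matrix(X), all(X %in% c(0, 1)))
  list(p_hat = colMeans(X),   # empirical marginal probabilities
       R_hat = cor(X))        # empirical correlation matrix
}

set.seed(1)
if (requireNamespace("CorBin", quietly = TRUE)) {
  X <- CorBin::cBern(5000, rep(0.5, 3), 0.5, type = "exchange")
} else {
  # Fallback (assumption: CorBin unavailable): independent Bernoulli draws
  X <- matrix(rbinom(5000 * 3, 1, 0.5), ncol = 3)
}

s <- check_binary(X)
round(s$p_hat, 2)
round(s$R_hat, 2)
```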
Equivalent to cBern(n, p, rho, type="1-dependent")
cBern1dep(n, p, rho)
n |
number of observations |
p |
the vector of marginal probabilities with dimension m |
rho |
either a non-negative value indicating the shared correlation coefficient or an (m-1)-dimensional vector indicating the correlation coefficients between adjacent variables. |
an n-by-m matrix of binary data, with one row per observation and one column per entry of p
X <- cBern1dep(5, c(0.4,0.5,0.6), c(0.2,0.3))
Equivalent to cBern(n, p, rho, type="DCP")
cBernDCP(n, p, rho)
n |
number of observations |
p |
the vector of marginal probabilities with dimension m |
rho |
either a non-negative value indicating the shared correlation coefficient or an (m-1)-dimensional vector indicating the correlation coefficients between adjacent variables. |
an n-by-m matrix of binary data, with one row per observation and one column per entry of p
X <- cBernDCP(10, rep(0.5,3), c(0.2,0.2))
Equivalent to cBern(n, p, rho, type="exchange")
cBernEx(n, p, rho)
n |
number of observations |
p |
the vector of marginal probabilities with dimension m |
rho |
a non-negative value indicating the shared correlation coefficient |
an n-by-m matrix of binary data, with one row per observation and one column per entry of p
X <- cBernEx(10, rep(0.5,3), 0.5)
Calculates the maximal allowed correlations when using cBern1dep to generate binary data with a 1-dependent structure.
rhoMax1dep(p)
p |
the vector of marginal probabilities with dimension m |
an (m-1)-dimensional vector rho giving the maximum correlation between each pair of adjacent variables
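One plausible use of this function is to validate a requested rho before simulating. The sketch below assumes that workflow; `within_bound` is a hypothetical helper (not part of CorBin) that checks element-wise that rho is non-negative and does not exceed the returned maxima. The CorBin calls are guarded so the block still runs when the package is not installed.

```r
# Hypothetical helper: TRUE if every coefficient lies in [0, rho_max].
# A scalar rho is recycled against the vector of maxima.
within_bound <- function(rho, rho_max) {
  all(rho >= 0 & rho <= rho_max)
}

p   <- c(0.4, 0.5, 0.6)
rho <- c(0.2, 0.3)

if (requireNamespace("CorBin", quietly = TRUE)) {
  rho_max <- CorBin::rhoMax1dep(p)       # maxima for adjacent pairs
  if (within_bound(rho, rho_max)) {
    X <- CorBin::cBern1dep(5, p, rho)    # simulate only when rho is admissible
  } else {
    message("rho exceeds the maximal allowed correlation for these marginals")
  }
}
```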
Calculates the maximal allowed correlations for binary data with a decaying-product structure.
rhoMaxDCP(p)
p |
the vector of marginal probabilities with dimension m |
an (m-1)-dimensional vector rho giving the maximum correlation between each pair of adjacent variables
For calculating the maximal allowed correlation coefficient for binary data with exchangeable structure.
rhoMaxEx(p)
p |
the vector of marginal probabilities with dimension m |
the maximal allowed correlation coefficient
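For intuition about why the marginal probabilities limit the correlation at all, the base-R sketch below computes the general upper bound on the correlation of two Bernoulli variables, attained when the probability that both equal 1 is the smaller of the two marginals. This is a necessary condition for any pair. The values returned by rhoMaxEx, rhoMaxDCP and rhoMax1dep are specific to the package's algorithms and may be tighter, so `pair_rho_max` (a hypothetical helper) should not be read as a reimplementation of them.

```r
# Hypothetical helper: largest correlation achievable by two Bernoulli
# variables with success probabilities a and b, where P(both = 1) = min(a, b).
pair_rho_max <- function(a, b) {
  (pmin(a, b) - a * b) / sqrt(a * (1 - a) * b * (1 - b))
}

p <- c(0.4, 0.5, 0.6)

# Bound for each adjacent pair (p[1], p[2]), (p[2], p[3]), ...
adjacent_bound <- pair_rho_max(p[-length(p)], p[-1])

# Smallest bound over all pairs: no shared coefficient can exceed it
B <- outer(p, p, pair_rho_max)
all_pairs_bound <- min(B[upper.tri(B)])
```

The further apart two marginal probabilities are, the smaller the bound, which is why the pair (0.4, 0.6) is the binding one here.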