Title: | Semiparametric Regression Analysis of Interval-Censored Data using Penalized Splines |
---|---|
Description: | Currently implements the generalized odds-rate model (a type of linear transformation model) for interval-censored data, based on penalized monotone B-splines. Methods for other semiparametric models, such as cure models or additive models, will be included in future versions. For more details, see Lu, M., Liu, Y., Li, C. and Sun, J. (2019) <arXiv:1912.11703>. |
Authors: | Yan Liu [aut, cre], Minggen Lu [aut] |
Maintainer: | Yan Liu <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.0 |
Built: | 2025-02-15 04:09:16 UTC |
Source: | https://github.com/cran/PenIC |
This package performs semiparametric regression analysis of interval-censored data under the generalized odds-rate model. The unknown nondecreasing cumulative baseline hazard function is estimated with monotone B-splines, and the model is fitted with an expectation-maximization (EM) algorithm.
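As a rough illustration of the model family, the sketch below evaluates the survival function under a generalized odds-rate model. It assumes the common parametrization in which S(t | Z) = (1 + alpha * Lambda0(t) * exp(Z'beta))^(-1/alpha) for alpha > 0, with the proportional hazards form exp(-Lambda0(t) * exp(Z'beta)) as the alpha = 0 limit. The function name `gor_survival` and the numeric inputs are hypothetical and are not part of PenIC.

```r
# Survival function S(t | Z) under a generalized odds-rate model
# (assumed parametrization; not taken from the PenIC source code).
# Lambda0: cumulative baseline hazard at t; eta: linear predictor Z %*% beta.
gor_survival <- function(Lambda0, eta, alpha) {
  stopifnot(alpha >= 0, all(Lambda0 >= 0))
  if (alpha == 0) {
    exp(-Lambda0 * exp(eta))                       # proportional hazards (PH)
  } else {
    (1 + alpha * Lambda0 * exp(eta))^(-1 / alpha)  # alpha = 1: proportional odds (PO)
  }
}

Lambda0 <- c(0.1, 0.5, 1, 2)   # hypothetical cumulative baseline hazard values
s_ph <- gor_survival(Lambda0, eta = 0.3, alpha = 0)
s_po <- gor_survival(Lambda0, eta = 0.3, alpha = 1)

# Under PO, the odds of failure (1 - S) / S equal Lambda0 * exp(eta).
odds_po <- (1 - s_po) / s_po
```

Under this parametrization, alpha = 1 makes the failure odds proportional across covariate values, which is why that case is called the PO model.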
Package: | PenIC |
Type: | Package |
Version: | 1.0.0 |
Date: | 2019-12-11 |
Yan Liu and Minggen Lu
Generates interval-censored data under the generalized odds-rate model, with different combinations of right-censoring rate and cumulative baseline hazard function.
dataPA(N, case, alpha)
N | size of the dataset. |
case | data generation configuration; takes the value 1, 2 or 3. |
alpha | parameter of the link function; alpha=0 for the PH model and alpha=1 for the PO model. |
This function generates interval-censored data from the generalized odds-rate model under different simulation configurations. For further details, see Lu et al. (2019+).
d1 | vector indicating whether an observation is left-censored (1) or not (0). |
d2 | vector indicating whether an observation is interval-censored (1) or not (0). |
d3 | vector indicating whether an observation is right-censored (1) or not (0). |
Li | the left endpoint of the observed interval; 0 if the observation is left-censored. |
Ri | the right endpoint of the observed interval; Inf if the observation is right-censored. |
Z | design matrix of predictor variables (in columns), without an intercept term. |
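The coding conventions above imply that the three indicators can be recovered from the interval endpoints. The sketch below shows this for a small set of hypothetical intervals; the data are made up for illustration and are not output of dataPA().

```r
# Hypothetical observed interval endpoints (not produced by dataPA()).
Li <- c(0.0, 0.8, 1.5, 0.0, 2.1)
Ri <- c(1.2, 1.9, Inf, 0.7, Inf)

d1 <- as.integer(Li == 0 & is.finite(Ri))  # left-censored: left endpoint is 0
d3 <- as.integer(is.infinite(Ri))          # right-censored: right endpoint is Inf
d2 <- 1L - d1 - d3                         # interval-censored: everything else

# Each observation falls into exactly one censoring category.
stopifnot(all(d1 + d2 + d3 == 1L))

# Censoring proportions, in the same order as in the package examples.
rp <- c(left = mean(d1), interval = mean(d2), right = mean(d3))
rp
```

A row with Li equal to 0 and Ri equal to Inf carries no information about the event time; this sketch would label it right-censored.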
Lu, M., Liu, Y., Li, C. and Sun, J. (2019+). An efficient penalized estimation approach for a semi-parametric linear transformation model with interval-censored data. arXiv:1912.11703.
    case <- 3
    nsub <- 100
    # Generate interval-censored data under PH model
    dat <- dataPA(nsub, case, alpha = 0)
    rp <- c(mean(dat$d1), mean(dat$d2), mean(dat$d3))
    rp
    # [1] 0.63 0.22 0.15
Fits the generalized odds-rate model to interval-censored data via an EM algorithm, using penalized B-splines.
EM_fit(g0,b0,d1,d2,d3,Li,Ri,Z,nsub,alpha,qn,order,t.seq,tol=1e-5,itmax=500,lamu=1e5)
g0 | initial estimate of the spline coefficients; should be of length qn+order+1. |
b0 | initial estimate of the regression coefficients; should be of length dim(Z)[2]. |
d1 | vector indicating whether an observation is left-censored (1) or not (0). |
d2 | vector indicating whether an observation is interval-censored (1) or not (0). |
d3 | vector indicating whether an observation is right-censored (1) or not (0). |
Li | the left endpoint of the observed interval; if an observation is left-censored, its corresponding entry should be 0. |
Ri | the right endpoint of the observed interval; if an observation is right-censored, its corresponding entry should be Inf. |
Z | design matrix of predictor variables (in columns); should be specified without an intercept term. |
nsub | size of the observed dataset. |
alpha | parameter of the link function; alpha=0 for the PH model and alpha=1 for the PO model. |
qn | the number of interior knots to be used; should not exceed the square root of the sample size. |
order | the order of the basis functions; order=3 for cubic splines. |
tol | the convergence criterion of the EM algorithm. |
t.seq | an increasing sequence of points at which the cumulative baseline hazard function is evaluated. |
itmax | maximum number of iterations of the EM procedure. |
lamu | upper limit of the smoothing parameter. |
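Several of the arguments above carry length or range requirements. The sketch below collects them into a pre-flight check that could be run before calling EM_fit(). The helper `check_em_inputs` is hypothetical and not part of PenIC, the design matrix is simulated only to have the right shape, and whether EM_fit() performs similar checks itself is not documented here.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
check_em_inputs <- function(g0, b0, Z, nsub, qn, order, t.seq) {
  Z <- as.matrix(Z)
  problems <- character(0)
  if (length(g0) != qn + order + 1)
    problems <- c(problems, "g0 must have length qn + order + 1")
  if (length(b0) != ncol(Z))
    problems <- c(problems, "b0 must have one entry per column of Z")
  if (nrow(Z) != nsub)
    problems <- c(problems, "Z must have nsub rows")
  if (qn > sqrt(nsub))
    problems <- c(problems, "qn should not exceed sqrt(nsub)")
  if (is.unsorted(t.seq, strictly = TRUE))
    problems <- c(problems, "t.seq must be increasing")
  problems
}

set.seed(1)
nsub  <- 35
qn    <- ceiling(nsub^(1/3)) - 2            # 2 interior knots for nsub = 35
order <- 3
Z     <- matrix(rnorm(nsub * 2), nsub, 2)   # stand-in design matrix (assumption)
g0    <- sort(runif(qn + order + 1, -1, 1))
b0    <- rep(0, ncol(Z))
t.seq <- seq(0.01, 4, 0.01)

ok  <- check_em_inputs(g0, b0, Z, nsub, qn, order, t.seq)
bad <- check_em_inputs(g0[-1], b0, Z, nsub, qn, order, t.seq)
```

Here `ok` is empty, while `bad` reports the wrong length of the initial spline coefficients.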
This function fits the generalized odds-rate model (with a specified value of alpha) to interval-censored data via an EM algorithm, using penalized monotone B-splines.
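To illustrate why sorted spline coefficients are natural starting values, the sketch below uses the general property that a B-spline with nondecreasing coefficients is itself nondecreasing. It relies on `bs()` from the base `splines` package. The knot placement and the reading of `order` as the polynomial degree are assumptions, and this is not a description of how PenIC constructs its basis internally.

```r
library(splines)

set.seed(1)
qn    <- 2                       # number of interior knots
order <- 3                       # treated here as the polynomial degree (assumption)
t.seq <- seq(0.01, 4, 0.01)

# Interior knots at equally spaced quantiles of the grid (an assumption).
knots <- quantile(t.seq, probs = seq_len(qn) / (qn + 1))

# B-spline basis with intercept: qn + order + 1 columns.
B <- bs(t.seq, knots = knots, degree = order, intercept = TRUE)

# Nondecreasing coefficients, generated as in the package example.
g <- sort(runif(ncol(B), -1, 1))

# The resulting spline is nondecreasing over the grid.
curve_vals <- as.vector(B %*% g)
```

The number of basis columns matches the required length of g0, qn+order+1, which is why the example draws that many sorted uniform values.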
b | estimates of the regression coefficients. |
g | estimates of the spline coefficients. |
se | estimated standard errors of b. |
base | estimated cumulative baseline hazard function evaluated at the points t.seq. |
lambda | final value of the smoothing parameter. |
flag | convergence indicator; 0 if the procedure converged. |
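The returned components b and se can be combined into Wald-type summaries. The sketch below assumes approximate normality of the estimates and works on a mock list rather than a real fit: the values of b and se are copied from the example output in this manual, and flag = 0 is assumed for illustration.

```r
# Mock of the list returned by EM_fit(); b and se are copied from the
# example output in this manual, flag = 0 is an assumption.
fit <- list(b    = c(-1.0655212, 0.7649178),
            se   = c(0.5021835, 0.3185045),
            flag = 0)

# Check the convergence indicator before interpreting the estimates.
if (fit$flag != 0) warning("EM algorithm did not converge")

z    <- fit$b / fit$se     # Wald statistics
crit <- qnorm(0.975)       # two-sided 95% critical value

wald <- cbind(estimate = fit$b,
              se       = fit$se,
              lower    = fit$b - crit * fit$se,
              upper    = fit$b + crit * fit$se,
              z        = z,
              p.value  = 2 * pnorm(-abs(z)))
round(wald, 4)
```

With a real fit, the same code applies after replacing the mock list with the object returned by EM_fit().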
Lu, M., Liu, Y., Li, C. and Sun, J. (2019+). An efficient penalized estimation approach for a semi-parametric linear transformation model with interval-censored data. arXiv:1912.11703.
    set.seed(1)
    case <- 2
    nsub <- 35
    # Generate interval-censored data under PH model
    dat <- dataPA(nsub, case, alpha = 0)
    rp <- c(mean(dat$d1), mean(dat$d2), mean(dat$d3))
    rp
    # [1] 0.2571429 0.3428571 0.4000000
    t.seq <- seq(0.01, 4, 0.01)
    # number of interior knots to be used
    qn <- ceiling(nsub^(1/3)) - 2
    order <- 3
    d1 <- dat$d1
    d2 <- dat$d2
    d3 <- dat$d3
    Ri <- dat$Ri
    Li <- dat$Li
    Z <- dat$Z
    p <- ncol(Z)
    b0 <- rep(0, p)
    g0 <- sort(runif(qn + order + 1, -1, 1))
    # Fit data under PH model
    fit <- EM_fit(g0, b0, d1, d2, d3, Li, Ri, Z, nsub, alpha = 0, qn, order, t.seq,
                  tol = 1e-2, itmax = 100, lamu = 1e5)
    cbind(fit$b, fit$se)
    #            [,1]      [,2]
    # [1,] -1.0655212 0.5021835
    # [2,]  0.7649178 0.3185045