Title: | Fitting Regression Models & Regression with Arma Errors, Subject to Constraints and Restrictions to the Coefficients |
---|---|
Description: | Fits or generalized linear models either a regression with arma errors (Time series) where the parameters could be subject to constraints or restrictions. The model is specified by an objective function (Gaussian, Binomial or Poisson) or an Arma order (p,q), a vector of bound constraints for the coefficients and the possibility to incorporate restrictions among coefficients (i.e b1 > b2). |
Authors: | Josep Puig Sallés |
Maintainer: | Josep Puig <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-11-06 03:57:06 UTC |
Source: | https://github.com/puigjos/consreg |
ConsReg is a function that allows to estimate a regression model: linear regression (gaussian), logistic regression (binomial) or poisson regression. It allows the introduction of restrictions (both lower and upper limits) and restrictions between the coefficients (in the form, for example, of a>b).
ConsReg(...) ## Default S3 method: ConsReg(x, y, family, optimizer, ini.pars.coef = NULL, constraints = NULL, LOWER = NULL, UPPER = NULL, penalty = 1000, ...) ## S3 method for class 'formula' ConsReg(formula, data = list(), optimizer = "solnp", family = c("gaussian", "binomial"), constraints = NULL, LOWER = NULL, UPPER = NULL, penalty = 1000, na.action = "na.omit", ini.pars.coef = NULL, ...)
ConsReg(...) ## Default S3 method: ConsReg(x, y, family, optimizer, ini.pars.coef = NULL, constraints = NULL, LOWER = NULL, UPPER = NULL, penalty = 1000, ...) ## S3 method for class 'formula' ConsReg(formula, data = list(), optimizer = "solnp", family = c("gaussian", "binomial"), constraints = NULL, LOWER = NULL, UPPER = NULL, penalty = 1000, na.action = "na.omit", ini.pars.coef = NULL, ...)
... |
additional parameters passed in the optimizer (number of iterations, ...) |
x |
matrix of predictive variables |
y |
vector of outcome variable |
family |
a description of the error distribution and link function to be used in the model. Possible values are: "gaussian" (linear regression) or "binomial" (logistic regression) and "poisson" |
optimizer |
Optimizer package used for fit the model (include bayesian and genetic algorithm optimization). Possible values are: "solnp" (default) (Rsolnp), "gosonlp" (Rsolnp), "optim" (stats::optim), "nloptr" (nloptr), DEoptim ("DEoptim"), "dfoptim" (dfoptim), "mcmc" (FME::modMCMC), "MCMCmetrop" (MCMCpack::MCMCmetrop1R),'adaptMCMC'(adaptMCMC::MCMC), "GA" (GA package), "GenSA" (GenSA package) |
ini.pars.coef |
vector of initial parameters. In case there is some constraint, then the ini.pars.coef should fulfill the constraints |
constraints |
vector of constraints (see details) |
LOWER |
(default NULL) vector of lower bounds for the coefficients. If the lenght of LOWER is not equal with the length of the coeefficients, then, the rest will be equal to -Inf |
UPPER |
(default NULL) vector of lower bounds for the coefficients. If the lenght of UPPER is not equal with the length of the coeefficients, then, the rest will be equal to +Inf |
penalty |
(default 1000) penalty to the objective function if some constraints do not fullfill |
formula |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lm is called. |
na.action |
na.action to the data |
Several optimizers of various R packages are implemented, including methods typically used in Bayesian regressions like Markov Chain Monte Carlo simulation.
Constraints will be a string: For example, if x1 and x2 are two coefficient names, then a constraint could be: "x1 > x2" or "x1+x2 > 2". For some constraints, one can write: "x1+x2>2, x1 > 1". Each constraint will be separate by commas.
Important: if there are some constraints that do not fulfill in a model without those constraints,
it is recommended to use ini.pars.coef
parameter to set initial values that fulfill constraints.
See the example
An object of class "ConsReg
".
coefficients |
Coefficients of the regression |
hessian |
hessian matrix if the optimizer can return it |
family |
Model family function |
optimizer |
optimizer object return (see details of each optimization package) |
optimizer.name |
name of the optimizer |
df |
nrow(data) - number of coefficients |
rank |
number of coefficients |
residuals |
residuals of the model |
fitted |
fitted values of the model |
metrics |
Accuracy metrics of the model |
call |
the matched call |
y |
objective variable |
x |
regressors |
formula |
formula term |
Josep Puig Sallés
data('fake_data') fit1 = ConsReg(formula = y~x1+x2+x3+ I(x3^2) + x4, family = 'gaussian', optimizer = 'mcmc', data = fake_data) summary(fit1) # We impose constraints to x3 and x3^2 and x4 fit2 = ConsReg(formula = y~x1+x2+x3+ I(x3^2) + x4, data = fake_data, family = 'gaussian', constraints = '(x3 + `I(x3^2)`) > .01, x4 < .2', optimizer = 'mcmc', ini.pars.coef = c(-1.65, .12, -.004, 0.1, 0.1, .15)) fit1$coefficients fit2$coefficients
data('fake_data') fit1 = ConsReg(formula = y~x1+x2+x3+ I(x3^2) + x4, family = 'gaussian', optimizer = 'mcmc', data = fake_data) summary(fit1) # We impose constraints to x3 and x3^2 and x4 fit2 = ConsReg(formula = y~x1+x2+x3+ I(x3^2) + x4, data = fake_data, family = 'gaussian', constraints = '(x3 + `I(x3^2)`) > .01, x4 < .2', optimizer = 'mcmc', ini.pars.coef = c(-1.65, .12, -.004, 0.1, 0.1, .15)) fit1$coefficients fit2$coefficients
ConsRegArima is a function that allows to estimate a regression model with errors following an ARMA process (p,q). It allows the introduction of restrictions (both lower and upper limits) and restrictions between the coefficients (in the form, for example, of a>b).
ConsRegArima(...) ## Default S3 method: ConsRegArima(x, y, order, seasonal, optimizer, LOWER = NULL, UPPER = NULL, penalty = 1000, constraints = NULL, ini.pars.coef, model_fit = NULL, ...) ## S3 method for class 'formula' ConsRegArima(formula, data = list(), optimizer = c("solnp"), order = c(0, 0), seasonal = list(order = c(0, 0), period = NA), LOWER = NULL, UPPER = NULL, penalty = 1000, constraints = NULL, ini.pars.coef = NULL, na.action = "na.omit", ...)
ConsRegArima(...) ## Default S3 method: ConsRegArima(x, y, order, seasonal, optimizer, LOWER = NULL, UPPER = NULL, penalty = 1000, constraints = NULL, ini.pars.coef, model_fit = NULL, ...) ## S3 method for class 'formula' ConsRegArima(formula, data = list(), optimizer = c("solnp"), order = c(0, 0), seasonal = list(order = c(0, 0), period = NA), LOWER = NULL, UPPER = NULL, penalty = 1000, constraints = NULL, ini.pars.coef = NULL, na.action = "na.omit", ...)
... |
additional parameters passed in the optimizer (number of iterations, ...) |
x |
matrix of predictive variables |
y |
vector of outcome variable |
order |
Arma component (p, q) |
seasonal |
A specification of the seasonal part of the ARMA model (P,Q), plus the period (which defaults to 1). |
optimizer |
Optimizer package used for fit the model (include bayesian and genetic algorithm optimization). Possible values are: "solnp" (default) (Rsolnp), "gosonlp" (Rsolnp), "optim" (stats::optim), "nloptr" (nloptr), DEoptim ("DEoptim"), "dfoptim" (dfoptim), "mcmc" (FME::modMCMC), "MCMCmetrop" (MCMCpack::MCMCmetrop1R), 'adaptMCMC'(adaptMCMC::MCMC), "GA" (GA package), "GenSA" (GenSA package) |
LOWER |
(default NULL) vector of lower bounds for the coefficients. If the lenght of LOWER is not equal with the length of the coeefficients, then, the rest will be equal to -Inf |
UPPER |
(default NULL) vector of lower bounds for the coefficients. If the lenght of UPPER is not equal with the length of the coeefficients, then, the rest will be equal to +Inf |
penalty |
(default 1000) penalty to the objective function if some constraints do not fullfill |
constraints |
vector of constraints (see details) |
ini.pars.coef |
vector of initial parameters. In case there is some constraint, then the ini.pars.coef should fulfill the constraints. This vector is only for regression component. |
model_fit |
object of class |
formula |
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lm is called. |
na.action |
na.action to the data |
Several optimizers of various R packages are implemented, including methods typically used in Bayesian regressions like Markov Chain Monte Carlo simulation.
Constraints will be a string: For example, if x1 and x2 are two coefficient names, then a constraint could be: "x1 > x2" or "x1+x2 > 2". For some constraints, one can write: "x1+x2>2, x1 > 1". Each constraint will be separate by commas.
Important: if there are some constraints that do not fulfill in a model without those constraints,
it is recommended to use ini.pars.coef
parameter to set initial values that fulfill constraints.
See the example
On the other hand, aic value is computed as auto.arima function computes the AIC when method == 'CSS':
Where npa
r I set the number of coefficients.
An object of class "ConsRegArima
".
coefficients |
Coefficients (regression + arma errors) |
hessian |
hessian matrix if the optimizer can return it |
optimizer |
optimizer object return (see details of each optimization package) |
optimizer.name |
name of the optimizer |
df |
nrow(data) - number of coefficients |
rank |
number of coefficients |
objective_function |
objective_function used |
model |
A list representing the Kalman Filter used in the fitting |
sigma2 |
the MLE of the innovations variance |
residuals |
residuals of the model |
fitted |
fitted values of the model |
fitted_regression |
fitted values only of the regression part |
fitted_arima |
fitted values only of the arma part |
metrics |
Accuracy metrics of the model (accuracy function of the forecast package) |
call |
the matched call |
y |
objective series |
x |
regressors |
formula |
formula term |
aic |
the AIC value (see details) |
bic |
the BIC value |
aicc |
the AICc value |
Josep Puig Salles
Peiris, M. & Perera, B. (1988), On prediction with fractionally Hyndman RJ, Khandakar Y (2008). “Automatic time series forecasting: the forecast package for R.”
data('series') fit1 = ConsRegArima(formula = y ~ x1+x2 +x3+x4, order = c(2, 1), data = series) summary(fit1) fit2 = ConsRegArima(formula = y ~ x1+x2 +x3+x4, order = c(2, 1), data = series, constraints = '(x3 +.1) > x1', ini.pars.coef = c(.96, .2, -.8, .4), UPPER = 1, LOWER = -1) fit1$coefficients fit2$coefficients
data('series') fit1 = ConsRegArima(formula = y ~ x1+x2 +x3+x4, order = c(2, 1), data = series) summary(fit1) fit2 = ConsRegArima(formula = y ~ x1+x2 +x3+x4, order = c(2, 1), data = series, constraints = '(x3 +.1) > x1', ini.pars.coef = c(.96, .2, -.8, .4), UPPER = 1, LOWER = -1) fit1$coefficients fit2$coefficients
Fake data to show gaussian model example
fake_data
fake_data
data
data('fake_data')
data('fake_data')
Plot an roll object plot an roll.ConsRegArima object
## S3 method for class 'roll.ConsRegArima' plot(x, ...)
## S3 method for class 'roll.ConsRegArima' plot(x, ...)
x |
roll.ConsRegArima object |
... |
Additional params passed to ggplot2::labs function |
ConsReg
Predict or fitted values of object ConsReg
## S3 method for class 'ConsReg' predict(object, newdata = NULL, components = F, ...)
## S3 method for class 'ConsReg' predict(object, newdata = NULL, components = F, ...)
object |
object of class |
newdata |
New data to predict the objective function. If is NULL (default), then the fitted values will be returned |
components |
if its |
... |
Additional argument passed to family. In particular, at this moment, if type = 'link', then for binomial family, it will return the link values |
predictions
data('fake_data') data = fake_data data$y = 1/(1+exp(-data$y)) data$y = ifelse(data$y > .5, 1, 0) table(data$y) fit5 = ConsReg(y~x1+x2+x3+x4, data = data, family = 'binomial', penalty = 10000, LOWER = -.5, UPPER = .2, optimizer = 'gosolnp') pr = predict(fit5, newdata = data[1:3,], type = 'probability') pr
data('fake_data') data = fake_data data$y = 1/(1+exp(-data$y)) data$y = ifelse(data$y > .5, 1, 0) table(data$y) fit5 = ConsReg(y~x1+x2+x3+x4, data = data, family = 'binomial', penalty = 10000, LOWER = -.5, UPPER = .2, optimizer = 'gosolnp') pr = predict(fit5, newdata = data[1:3,], type = 'probability') pr
Obtains predictions of ConsRegArima object
## S3 method for class 'ConsRegArima' predict(object, h = ifelse(is.null(newdata), 1, nrow(newdata)), newdata = NULL, intervals = 90, origdata = NULL, ...) ## S3 method for class 'predict.ConsRegArima' print(x, ...) ## S3 method for class 'predict.ConsRegArima' plot(x, ...)
## S3 method for class 'ConsRegArima' predict(object, h = ifelse(is.null(newdata), 1, nrow(newdata)), newdata = NULL, intervals = 90, origdata = NULL, ...) ## S3 method for class 'predict.ConsRegArima' print(x, ...) ## S3 method for class 'predict.ConsRegArima' plot(x, ...)
object |
ConsRegArima object |
h |
horizont to predict |
newdata |
data frame in which to look for variables with which to predict. In case there is no regression part, this parameter could be set NULL |
intervals |
Confidence level for prediction intervals (default 90) |
origdata |
Original data (default NULL). Useful if lagged predictive variables are used in the formula |
... |
Additional params passed to the function ggplot2::labs |
x |
object of class predict.ConsRegArima |
Returns an object of class predict.ConsRegArima
predict |
dataframe with the predictions |
table |
dataframe with the predictions as well as the fitted values |
intervals |
Interval level |
object |
original object |
Function for creating rolling density forecast from ConsRegArima
models with
option for refitting every n periods.
rolling(object, used.sample, refit, h = 1, orig.data, ...)
rolling(object, used.sample, refit, h = 1, orig.data, ...)
object |
ConsRegArima object |
used.sample |
The starting point in the dataset from which to initialize the rolling forecast. |
refit |
Determines every how many periods the model is re-estimated.
If |
h |
The number of periods to forecast |
orig.data |
data original which was used to estimate the ConsRegArima |
... |
Additional params for predict function |
results |
data.frame with Real, Prediction, Prediction_High,
Prediction_Low and fitted values of the |
refitT |
how many periods the model is re-estimated |
metrics |
Main metrics of the predictions |
data('series') fit1 = ConsRegArima(formula = y ~ x1+x2 +x3, order = c(2, 1), optimizer = 'solnp', data = series) roll = rolling(fit1, used.sample = 40, refit = 5, orig.data = series, h=3) roll plot(roll)
data('series') fit1 = ConsRegArima(formula = y ~ x1+x2 +x3, order = c(2, 1), optimizer = 'solnp', data = series) roll = rolling(fit1, used.sample = 40, refit = 5, orig.data = series, h=3) roll plot(roll)
Sales data set for time series models
sales
sales
data
data('series')
data('series')
Fake data to show Arima example
series
series
data