% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/check_outliers.R
\name{check_outliers}
\alias{check_outliers}
\alias{check_outliers.default}
\title{Check for influential observations}
\usage{
check_outliers(x, ...)

\method{check_outliers}{default}(x, method = c("cook", "mahalanobis",
  "ics"), threshold = NULL, ...)
}
\arguments{
\item{x}{A model object.}

\item{...}{When \code{method = "ics"}, further arguments in \code{...} are
passed down to \code{ICSOutlier::ics.outlier()}.}

\item{method}{The method used to calculate the distance. Possible values are
\code{"cook"} (Cook's Distance), \code{"mahalanobis"} (Mahalanobis Distance)
or \code{"ics"} (Invariant Coordinate Selection,
\cite{Archimbaud et al. 2018}). May be abbreviated.}

\item{threshold}{The threshold indicating at which distance an observation is
considered an outlier. If \code{NULL}, a default that depends on \code{method}
is used. See 'Details'.}
}
\value{
A check (message) on whether outliers were detected, as well as a data frame
(with the original data that was used to fit the model) that includes
information on the distance measure and whether or not each observation is
considered an outlier.
}
\description{
Checks for and locates influential observations (i.e., "outliers") via several distance methods.
}
\details{
Performs a distance test to check for influential observations. Observations
whose distance is greater than a certain threshold are considered outliers. This
relatively conservative threshold is useful only for detection, not as
justification for automatically deleting observations.
\subsection{Choosing Default Thresholds}{
\describe{
\item{\strong{Cook's Distance}}{
When \code{method = "cook"}, \code{threshold} defaults to 4 divided by the number of observations.
}
\item{\strong{Mahalanobis Distance}}{
When \code{method = "mahalanobis"}, the default for \code{threshold} is based on
a heuristic formula (\code{floor(3 * sqrt(sum(cov(predictors)^2)) / nobs(x))}),
which is limited to values between 3 and 10, to account for different variation
in the data depending on the number of observations. There is no "rule of thumb"
for the Mahalanobis Distance threshold; most studies use a value between 3 and
10. It is most likely better to define your own, sensible threshold.
}
\item{\strong{Invariant Coordinate Selection}}{
When \code{method = "ics"}, the threshold is determined by \code{ICSOutlier::ics.outlier()}.
Refer to the help file of that function for more details about this procedure.
Note that \code{method = "ics"} requires both \pkg{ICS} and \pkg{ICSOutlier}
to be installed, and that it takes longer to compute the results than the other methods.
}
}
}
}
\examples{
# select only mpg, disp and hp (continuous)
mt1 <- mtcars[, c(1, 3, 4)]
# create some fake outliers and attach outliers to main df
mt2 <- rbind(mt1, data.frame(mpg = c(37, 40), disp = c(300, 400), hp = c(110, 120)))
# fit model with outliers
model <- lm(disp ~ mpg + hp, data = mt2)

check_outliers(model)
plot(check_outliers(model))
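
# A minimal sketch of the default Cook's Distance rule described in
# 'Details': observations whose distance exceeds 4 / n are flagged.
# This is an illustration under that assumption, not necessarily how
# check_outliers() is implemented internally; `cooks_d` and
# `cook_threshold` are illustrative names.
cooks_d <- stats::cooks.distance(model)
cook_threshold <- 4 / stats::nobs(model)
# row names of the observations exceeding the default threshold
names(cooks_d)[cooks_d > cook_threshold]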

check_outliers(model, method = "m")
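
# A rough sketch of the default Mahalanobis threshold given in 'Details'.
# Assumption: `predictors` refers to the model's predictor columns (here
# mpg and hp); the internal implementation may differ. `raw_threshold`
# and `maha_threshold` are illustrative names.
predictors <- mt2[, c("mpg", "hp")]
raw_threshold <- floor(
  3 * sqrt(sum(stats::cov(predictors)^2)) / stats::nobs(model)
)
# limit the value to the range from 3 to 10
maha_threshold <- min(max(raw_threshold, 3), 10)
maha_threshold
# Note: stats::mahalanobis() returns squared distances. Whether
# check_outliers() compares squared or unsquared distances against the
# threshold is not stated here, so the values are shown for inspection only.
maha_d2 <- stats::mahalanobis(
  predictors, colMeans(predictors), stats::cov(predictors)
)
head(sort(maha_d2, decreasing = TRUE))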

\dontrun{
# This one takes some seconds to finish...
check_outliers(model, method = "ics")}

}
\references{
\itemize{
\item Cook, R. D. (1977). Detection of influential observation in linear regression. Technometrics, 19(1), 15-18.
\item Archimbaud, A., Nordhausen, K., & Ruiz-Gazen, A. (2018). ICS for multivariate outlier detection with application to quality control. Computational Statistics & Data Analysis, 128, 184–199. \doi{10.1016/j.csda.2018.06.011}
}
}
