java.lang.Object
  org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression
    org.apache.commons.math.stat.regression.OLSMultipleLinearRegression
public class OLSMultipleLinearRegression
Implements ordinary least squares (OLS) to estimate the parameters of a multiple linear regression model.
OLS assumes the covariance matrix of the error to be diagonal and with equal variance.
$u \sim N(0, \sigma^2 I)$
The regression coefficients, b, satisfy the normal equations:
$X^T X b = X^T y$
To solve the normal equations, this implementation uses QR decomposition of the X matrix. (See QRDecompositionImpl for details on the decomposition algorithm.)
$X^T X b = X^T y$
$(QR)^T (QR) b = (QR)^T y$
$R^T (Q^T Q) R b = R^T Q^T y$
$R^T R b = R^T Q^T y$
$(R^T)^{-1} R^T R b = (R^T)^{-1} R^T Q^T y$
$R b = Q^T y$
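The last line of the derivation, R b = Q^T y, can be sketched in plain Java without any library. This is a minimal illustration rather than the library's code: it builds a thin QR factorization with modified Gram-Schmidt (chosen here for brevity; QRDecompositionImpl may use a different algorithm), and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: least squares via a thin QR factorization. */
final class QrLeastSquaresSketch {

    /** Returns b solving R b = Q^T y, where X = QR (modified Gram-Schmidt). */
    static double[] solve(double[][] x, double[] y) {
        int n = x.length;
        int p = x[0].length;
        double[][] q = new double[n][p]; // becomes the orthonormal columns of Q
        double[][] r = new double[p][p]; // upper-triangular factor R

        // Start from a copy of X; its columns are orthogonalized in place.
        for (int i = 0; i < n; i++) {
            q[i] = Arrays.copyOf(x[i], p);
        }
        for (int j = 0; j < p; j++) {
            double norm = 0.0;
            for (int i = 0; i < n; i++) {
                norm += q[i][j] * q[i][j];
            }
            norm = Math.sqrt(norm);
            if (norm == 0.0) {
                // Simplistic exact-zero test; real code would use a tolerance.
                throw new IllegalArgumentException("X is rank deficient");
            }
            r[j][j] = norm;
            for (int i = 0; i < n; i++) {
                q[i][j] /= norm;
            }
            // Remove the component along column j from the remaining columns.
            for (int k = j + 1; k < p; k++) {
                double dot = 0.0;
                for (int i = 0; i < n; i++) {
                    dot += q[i][j] * q[i][k];
                }
                r[j][k] = dot;
                for (int i = 0; i < n; i++) {
                    q[i][k] -= dot * q[i][j];
                }
            }
        }

        // Right-hand side Q^T y.
        double[] qty = new double[p];
        for (int j = 0; j < p; j++) {
            for (int i = 0; i < n; i++) {
                qty[j] += q[i][j] * y[i];
            }
        }

        // Back substitution on the triangular system R b = Q^T y.
        double[] b = new double[p];
        for (int j = p - 1; j >= 0; j--) {
            double sum = qty[j];
            for (int k = j + 1; k < p; k++) {
                sum -= r[j][k] * b[k];
            }
            b[j] = sum / r[j][j];
        }
        return b;
    }
}
```

Calling `QrLeastSquaresSketch.solve` with a design matrix whose first column is all ones and data lying exactly on a line recovers that line's intercept and slope, up to rounding.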
Field Summary | |
---|---|
private QRDecomposition | qr - Cached QR decomposition of X matrix |
Fields inherited from class org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression |
---|
X, Y |
Constructor Summary | |
---|---|
OLSMultipleLinearRegression() | |
Method Summary | |
---|---|
protected RealVector | calculateBeta() - Calculates regression coefficients using OLS. |
protected RealMatrix | calculateBetaVariance() - Calculates the variance on the beta by OLS. |
RealMatrix | calculateHat() - Compute the "hat" matrix. |
protected double | calculateYVariance() - Calculates the variance on the Y by OLS. |
private static void | checkUpperTriangular(RealMatrix m, double epsilon) - Check if a matrix is upper-triangular. |
void | newSampleData(double[] y, double[][] x) - Loads model x and y sample data, overriding any previous sample. |
void | newSampleData(double[] data, int nobs, int nvars) - Loads model x and y sample data from a flat array of data, overriding any previous sample. |
protected void | newXSampleData(double[][] x) - Loads new x sample data, overriding any previous sample. |
private static RealVector | solveUpperTriangular(RealMatrix coefficients, RealVector constants) - Uses back substitution to solve the system coefficients X = constants. |
Methods inherited from class org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression |
---|
calculateResiduals, estimateRegressandVariance, estimateRegressionParameters, estimateRegressionParametersStandardErrors, estimateRegressionParametersVariance, estimateResiduals, newYSampleData, validateCovarianceData, validateSampleData |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private QRDecomposition qr
Constructor Detail |
---|
public OLSMultipleLinearRegression()
Method Detail |
---|
public void newSampleData(double[] y, double[][] x)
Parameters:
y - the [n,1] array representing the y sample
x - the [n,k] array representing the x sample
Throws:
java.lang.IllegalArgumentException - if the x and y array data are not compatible for the regression
public void newSampleData(double[] data, int nobs, int nvars)
Overrides:
newSampleData in class AbstractMultipleLinearRegression
Parameters:
data - input data array
nobs - number of observations (rows)
nvars - number of independent variables (columns, not counting y)
public RealMatrix calculateHat()
Compute the "hat" matrix.
The hat matrix is defined in terms of the design matrix X by $X (X^T X)^{-1} X^T$.
The implementation here uses the QR decomposition to compute the hat matrix as $Q I_p Q^T$, where $I_p$ is the p-dimensional identity matrix augmented by 0's. This computational formula is from "The Hat Matrix in Regression and ANOVA", David C. Hoaglin and Roy E. Welsch, The American Statistician, Vol. 32, No. 1 (Feb., 1978), pp. 17-22.
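To make the definition concrete, the sketch below builds the hat matrix for the special case of a design matrix with an intercept column and one regressor. In that case the definition reduces to the closed form h_ij = 1/n + (x_i - mean)(x_j - mean) / Sxx, where Sxx is the sum of squared deviations of x. This is not the QR-based computation described above, only an illustration of what the matrix contains; the class and method names are hypothetical.

```java
/** Illustrative only: hat matrix for a design matrix with columns [1, x]. */
final class SimpleHatMatrixSketch {

    /** Builds H with h_ij = 1/n + (x_i - mean)(x_j - mean) / Sxx. */
    static double[][] hat(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;
        double sxx = 0.0;
        for (double v : x) {
            sxx += (v - mean) * (v - mean);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double[][] h = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                h[i][j] = 1.0 / n + (x[i] - mean) * (x[j] - mean) / sxx;
            }
        }
        return h;
    }

    /** Applies the hat matrix to y, giving the fitted values H y. */
    static double[] fitted(double[][] h, double[] y) {
        double[] out = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            for (int j = 0; j < y.length; j++) {
                out[i] += h[i][j] * y[j];
            }
        }
        return out;
    }
}
```

Two properties are easy to check on the result: the diagonal sums to the number of columns of X (2 in this case), and applying H to a y that already lies on a line returns y unchanged.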
protected void newXSampleData(double[][] x)
Overrides:
newXSampleData in class AbstractMultipleLinearRegression
Parameters:
x - the [n,k] array representing the x sample
protected RealVector calculateBeta()
Specified by:
calculateBeta in class AbstractMultipleLinearRegression
protected RealMatrix calculateBetaVariance()
Calculates the variance on the beta by OLS.
$\mathrm{Var}(b) = (X^T X)^{-1}$
Uses QR decomposition to reduce $(X^T X)^{-1}$ to $(R^T R)^{-1}$, with only the top p rows of R included, where p is the length of the beta vector.
Specified by:
calculateBetaVariance in class AbstractMultipleLinearRegression
protected double calculateYVariance()
Calculates the variance on the Y by OLS.
$\mathrm{Var}(y) = \mathrm{Tr}(u^T u) / (n - k)$
Specified by:
calculateYVariance in class AbstractMultipleLinearRegression
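Because u is a vector, u^T u is a scalar, so the trace in the formula for Var(y) is simply the sum of squared residuals. The sketch below shows that arithmetic in plain Java. It assumes n is the number of observations and k the number of columns of X (following the [n,k] notation used for the x sample); the class and method names are hypothetical.

```java
/** Illustrative only: residual variance estimate u^T u / (n - k). */
final class ResidualVarianceSketch {

    /**
     * @param residuals the residual vector u, one entry per observation
     * @param k number of estimated coefficients (assumed: columns of X)
     */
    static double variance(double[] residuals, int k) {
        int n = residuals.length;
        if (n <= k) {
            throw new IllegalArgumentException("need more observations than coefficients");
        }
        double sumSq = 0.0;
        for (double u : residuals) {
            sumSq += u * u; // accumulates u^T u
        }
        return sumSq / (n - k);
    }
}
```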
private static RealVector solveUpperTriangular(RealMatrix coefficients, RealVector constants)
Uses back substitution to solve the system
coefficients X = constants
coefficients must be upper-triangular and constants must be a column matrix. The solution is returned as a column matrix.
The number of columns in coefficients determines the length of the returned solution vector (column matrix). If constants has more rows than coefficients has columns, excess rows are ignored. Similarly, extra (zero) rows in coefficients are ignored.
Parameters:
coefficients - upper-triangular coefficients matrix
constants - column RHS constants vector
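Back substitution as described for solveUpperTriangular can be sketched on plain arrays as follows. This is not the private library method itself: the class name is hypothetical, arrays stand in for RealMatrix and RealVector, and the zero-diagonal check is an added assumption. As in the description, the solution length equals the number of columns, and any extra rows in either argument are ignored.

```java
/** Illustrative only: solves an upper-triangular system by back substitution. */
final class BackSubstitutionSketch {

    static double[] solve(double[][] coefficients, double[] constants) {
        int p = coefficients[0].length; // solution length = number of columns
        if (coefficients.length < p || constants.length < p) {
            throw new IllegalArgumentException("too few rows for " + p + " unknowns");
        }
        double[] x = new double[p];
        // Work upward from the last unknown; rows at index p and beyond are never read.
        for (int row = p - 1; row >= 0; row--) {
            double diag = coefficients[row][row];
            if (diag == 0.0) {
                throw new IllegalArgumentException("zero on the diagonal at row " + row);
            }
            double sum = constants[row];
            for (int col = row + 1; col < p; col++) {
                sum -= coefficients[row][col] * x[col];
            }
            x[row] = sum / diag;
        }
        return x;
    }
}
```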
private static void checkUpperTriangular(RealMatrix m, double epsilon)
Check if a matrix is upper-triangular.
Makes sure all below-diagonal elements are within epsilon of 0.
Parameters:
m - matrix to check
epsilon - maximum allowable absolute value for elements below the main diagonal
Throws:
java.lang.IllegalArgumentException - if m is not upper-triangular
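The check described for checkUpperTriangular amounts to scanning the entries below the main diagonal. A minimal sketch on a plain array, with a hypothetical class name and a `double[][]` standing in for RealMatrix:

```java
/** Illustrative only: verifies that a matrix is upper-triangular within a tolerance. */
final class UpperTriangularCheckSketch {

    static void check(double[][] m, double epsilon) {
        for (int i = 0; i < m.length; i++) {
            // Entries with column index < row index lie below the main diagonal.
            int bound = Math.min(i, m[i].length);
            for (int j = 0; j < bound; j++) {
                if (Math.abs(m[i][j]) > epsilon) {
                    throw new IllegalArgumentException(
                        "element (" + i + "," + j + ") is not within epsilon of 0");
                }
            }
        }
    }
}
```

A tolerance is used instead of an exact zero test because the below-diagonal entries of a computed R factor may contain rounding residue.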