Michael Thomas Flanagan's Java Scientific Library

BoxCox Class:     Box-Cox transformation

     

Last update: 7 December 2011

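This class contains methods for performing the Box-Cox transformation

$$
y_i^{(\lambda)} =
\begin{cases}
\dfrac{(y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \lambda_1 \neq 0 \\
\ln(y_i + \lambda_2), & \lambda_1 = 0
\end{cases}
$$

on an array of data, yi. The Box-Cox transformation attempts to transform an array of data, yi, into one, yi(λ), that conforms to a sample taken from a Gaussian (normal) distribution.

In this class, the data, yi, is first standardized:

$$
y_i \rightarrow \frac{y_i - \bar{y}}{s}
$$

where ȳ and s are the mean and standard deviation of the yi.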
The Box-Cox parameter, λ2, is set at:

An optimum value of the Box-Cox parameter, λ1, is calculated as the value of λ1 maximizing the correlation coefficient of a Gaussian (normal) probability plot of the transformed data. This maximization is performed using the Maximisation class which employs a Nelder and Mead Simplex optimization.
Probability plot, skewness and kurtosis methods are present to facilitate checking the efficacy of the transformation.
An inverse transform method and a fixed value (λ1 and λ2) method are also available.
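The steps described above (standardize, shift by λ2, transform with λ1) can be sketched in plain Java as follows. This is an illustrative sketch and not the library's source code: the class name BoxCoxSketch and its method names are hypothetical, the standard deviation is assumed to use the n−1 denominator, and λ2 is passed in by the caller because the value the class computes for it is not reproduced here.

```java
/** Illustrative sketch of the standardize, shift and transform steps; not the library's code. */
final class BoxCoxSketch {

    /** Standardize to zero mean and unit standard deviation (n-1 denominator assumed). */
    static double[] standardize(double[] y) {
        int n = y.length;
        double mean = 0.0;
        for (double v : y) mean += v;
        mean /= n;
        double ss = 0.0;
        for (double v : y) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1));
        double[] out = new double[n];
        for (int i = 0; i < n; i++) out[i] = (y[i] - mean) / sd;
        return out;
    }

    /** Two-parameter Box-Cox transform of one value; requires y + lambdaTwo > 0. */
    static double boxCox(double y, double lambdaOne, double lambdaTwo) {
        double shifted = y + lambdaTwo;
        if (shifted <= 0.0) {
            throw new IllegalArgumentException("y + lambdaTwo must be positive");
        }
        // lambdaOne = 0 is the logarithmic limit of the power form
        if (Math.abs(lambdaOne) < 1e-12) return Math.log(shifted);
        return (Math.pow(shifted, lambdaOne) - 1.0) / lambdaOne;
    }

    /** Standardize, shift by lambdaTwo, then transform every element. */
    static double[] transform(double[] y, double lambdaOne, double lambdaTwo) {
        double[] z = standardize(y);
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) out[i] = boxCox(z[i], lambdaOne, lambdaTwo);
        return out;
    }

    public static void main(String[] args) {
        double[] y = {1.2, 3.4, 2.2, 8.9, 4.1};
        // lambdaTwo = 3.0 is an arbitrary illustrative shift, large enough
        // to make every standardized value in this sample positive.
        double[] t = transform(y, 0.5, 3.0);
        System.out.println(java.util.Arrays.toString(t));
    }
}
```

The shift must make every standardized value positive, otherwise the power or logarithm is not real; the sketch rejects such input rather than adjusting the shift.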

See Stat class for the general statistical methods and functions.

import directive: import flanagan.analysis.BoxCox;

SUMMARY OF METHODS

Constructors
    public BoxCox(double[] ydata)
    public BoxCox(float[] ydata)
    public BoxCox(int[] ydata)
    public BoxCox(long[] ydata)
    public BoxCox(short[] ydata)
    public BoxCox(byte[] ydata)
    public BoxCox(BigDecimal[] ydata)
    public BoxCox(BigInteger[] ydata)
    public BoxCox(ArrayMaths ydata)
    public BoxCox(ArrayList<Object> ydata)
    public BoxCox(Vector<Object> ydata)
    public BoxCox(Stat ydata)

Analysis
    Box-Cox transform and full analysis: public void analysis()
                                         public void analysis(String fileTitle)

Transformed Data
    Transform data with best estimates of λ1 and λ2: public double[] transform()
    Scaled transformed data: public double[] scaledTransformedData()
    Ordered scaled transformed data: public double[] orderedScaledTransformedData()
    Standardized transformed data: public double[] standardizedTransformedData()
    Transformed data: public double[] transformedData()
    λ1: public double lambdaOne()
    λ2: public double lambdaTwo()
    Gaussian Probability Plot
        Display plot: public void transformedProbabiltyPlot()
        Correlation coefficient: public double transformedCorrelationCoefficient()
        Gradient: public double transformedGradient()
                  public double transformedGradientError()
        Intercept: public double transformedIntercept()
                   public double transformedInterceptError()
    Mean: public double transformedMean()
    Standard deviation: public double transformedStandardDeviation()
    Standard error of the mean: public double transformedStandardError()
    Moment skewness: public double transformedMomentSkewness()
    Median skewness: public double transformedMedianSkewness()
    Quartile skewness: public double transformedQuartileSkewness()
    Excess kurtosis: public double transformedExcessKurtosis()
    Median: public double transformedMedian()
    Minimum: public double transformedMinimum()
    Maximum: public double transformedMaximum()
    Range: public double transformedRange()

Fixed Value Transform
    Transform data with user supplied values of λ1 and λ2: public double[] fixedValueTransform(double lambdaOne, double lambdaTwo)
                                                           public double[] fixedValueTransform(double lambdaOne)

Inverse Transform
    Perform inverse transform: public double[] inverseTransform(double lambdaOne, double lambdaTwo)
                               public double[] inverseTransform(double lambdaOne)
    Shift factor: public double lambdaThree()

Original Data
    Original data: public double[] originalData()
    Standardized original data: public double[] standardizedOriginalData()
    Sorted original data: public double[] sortedOriginalData()
    Shifted standardized original data: public double[] shiftedStandardizedOriginalData()
    Gaussian Probability Plot
        Display plot: public void originalProbabiltyPlot()
        Correlation coefficient: public double originalCorrelationCoefficient()
        Gradient: public double originalGradient()
                  public double originalGradientError()
        Intercept: public double originalIntercept()
                   public double originalInterceptError()
    Mean: public double originalMean()
    Standard deviation: public double originalStandardDeviation()
    Standard error of the mean: public double originalStandardError()
    Moment skewness: public double originalMomentSkewness()
    Median skewness: public double originalMedianSkewness()
    Quartile skewness: public double originalQuartileSkewness()
    Excess kurtosis: public double originalExcessKurtosis()
    Median: public double originalMedian()
    Minimum: public double originalMinimum()
    Maximum: public double originalMaximum()
    Range: public double originalRange()

Set the variance denominator
    Set denominator to n: public void setDenominatorToN()




CONSTRUCTORS

public BoxCox(double[] ydata)
public BoxCox(float[] ydata)
public BoxCox(int[] ydata)
public BoxCox(long[] ydata)
public BoxCox(short[] ydata)
public BoxCox(byte[] ydata)
public BoxCox(BigDecimal[] ydata)
public BoxCox(BigInteger[] ydata)
public BoxCox(ArrayMaths ydata)
public BoxCox(ArrayList<Object> ydata)
public BoxCox(Vector<Object> ydata)
public BoxCox(Stat ydata)
Usage:                      BoxCox bc = new BoxCox(ydata);
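Creates an instance of BoxCox for transformation of the data entered as the argument ydata. The data are stored as an array of double but may be entered as any of the argument types listed in the constructors above: an array of double, float, int, long, short, byte, BigDecimal or BigInteger, or an instance of ArrayMaths, ArrayList<Object>, Vector<Object> or Stat.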



PERFORM TRANSFORM AND TRANSFORMED DATA

Perform Box-Cox transform
public double[] transform()
Usage:                      data = bc.transform();
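This method performs the Box-Cox transformation

$$
y_i^{(\lambda)} =
\begin{cases}
\dfrac{(y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \lambda_1 \neq 0 \\
\ln(y_i + \lambda_2), & \lambda_1 = 0
\end{cases}
$$

on an array of data, yi, entered via the constructor argument. In this method, the data, yi, is first standardized:

$$
y_i \rightarrow \frac{y_i - \bar{y}}{s}
$$

where ȳ and s are the mean and standard deviation of the yi.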
The Box-Cox parameter, λ2, is set at:

An optimum value of the Box-Cox parameter, λ1, used in this transformation, is calculated as the value of λ1 maximizing the correlation coefficient of a Gaussian (normal) probability plot of the transformed data. This maximization is performed using the Maximisation class which employs a Nelder and Mead Simplex optimization.
This method returns the transformed data scaled to the mean and standard deviation of the original data.
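The idea of choosing λ1 by maximizing the probability plot correlation coefficient can be sketched as below. This is not the library's implementation: it substitutes a coarse grid search for the Nelder and Mead simplex used by the Maximisation class, assumes plotting positions (i + 0.5)/n, uses the Abramowitz and Stegun 7.1.26 approximation for the error function, and expects data that have already been standardized and shifted so that every value is positive. The class and method names are hypothetical.

```java
/** Illustrative grid search for lambdaOne; not the library's simplex optimization. */
final class LambdaSearchSketch {

    /** Error function, Abramowitz and Stegun 7.1.26 (absolute error about 1.5e-7). */
    static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        double ax = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * ax);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        return sign * (1.0 - poly * Math.exp(-ax * ax));
    }

    static double normalCdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    /** Standard normal quantile found by bisection on normalCdf. */
    static double normalQuantile(double p) {
        double lo = -10.0, hi = 10.0;
        for (int i = 0; i < 100; i++) {
            double mid = 0.5 * (lo + hi);
            if (normalCdf(mid) < p) lo = mid; else hi = mid;
        }
        return 0.5 * (lo + hi);
    }

    /** Pearson correlation coefficient of two equal-length arrays. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double ma = 0.0, mb = 0.0;
        for (int i = 0; i < n; i++) { ma += a[i]; mb += b[i]; }
        ma /= n;
        mb /= n;
        double sab = 0.0, saa = 0.0, sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - ma, db = b[i] - mb;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Correlation between sorted transformed data and normal quantiles. */
    static double plotCorrelation(double[] shifted, double lambdaOne) {
        int n = shifted.length;
        double[] t = new double[n];
        double[] q = new double[n];
        for (int i = 0; i < n; i++) {
            t[i] = Math.abs(lambdaOne) < 1e-12
                    ? Math.log(shifted[i])
                    : (Math.pow(shifted[i], lambdaOne) - 1.0) / lambdaOne;
            q[i] = normalQuantile((i + 0.5) / n);   // assumed plotting positions
        }
        java.util.Arrays.sort(t);
        return pearson(t, q);
    }

    /** Return the grid value of lambdaOne with the largest plot correlation. */
    static double bestLambdaOne(double[] shifted, double lo, double hi, double step) {
        double best = lo, bestR = Double.NEGATIVE_INFINITY;
        int steps = (int) Math.round((hi - lo) / step);
        for (int k = 0; k <= steps; k++) {
            double lambda = lo + k * step;
            double r = plotCorrelation(shifted, lambda);
            if (r > bestR) { bestR = r; best = lambda; }
        }
        return best;
    }
}
```

A grid search is easy to read but only resolves λ1 to the grid spacing; a simplex or other continuous optimizer refines the estimate beyond that.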

Scaled transformed data
public double[] scaledTransformedData()
Usage:                      data = bc.scaledTransformedData();
This method returns the transformed data scaled to the mean and standard deviation of the original data.
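Scaling transformed data to the mean and standard deviation of the original data amounts to standardizing the transformed values and then applying the original mean and standard deviation. A minimal sketch, assuming the n−1 denominator for the standard deviation; the class RescaleSketch and its methods are hypothetical names, not part of the library:

```java
/** Illustrative rescaling of an array to a target mean and standard deviation. */
final class RescaleSketch {

    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** Sample standard deviation (n-1 denominator assumed). */
    static double standardDeviation(double[] x) {
        double m = mean(x), ss = 0.0;
        for (double v : x) ss += (v - m) * (v - m);
        return Math.sqrt(ss / (x.length - 1));
    }

    /** Rescale t so that it has the given mean and standard deviation. */
    static double[] scaleTo(double[] t, double targetMean, double targetSd) {
        double m = mean(t);
        double s = standardDeviation(t);
        double[] out = new double[t.length];
        for (int i = 0; i < t.length; i++) {
            out[i] = targetMean + targetSd * (t[i] - m) / s;
        }
        return out;
    }
}
```

Because the rescaling is linear, it changes location and spread only; skewness, kurtosis and the probability plot correlation coefficient are unaffected.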

Ordered scaled transformed data
public double[] orderedScaledTransformedData()
Usage:                      data = bc.orderedScaledTransformedData();
This method returns the transformed data scaled to the mean and standard deviation of the original data and ordered into ascending values.

Standardized transformed data
public double[] standardizedTransformedData()
Usage:                      data = bc.standardizedTransformedData();
This method returns the standardized transformed data.

Transformed data
public double[] transformedData()
Usage:                      transformedData = bc.transformedData();
This method returns the transformed data obtained on transforming the shifted standardized original data using the values of λ1 and λ2 that may be obtained using the methods below.

Box-Cox parameter, λ1
public double lambdaOne()
Usage:                      lambda1 = bc.lambdaOne();
This method returns the Box-Cox parameter, λ1, giving a maximum value of the probability plot correlation coefficient. See above for more details.
If the value of λ1 has been entered, i.e. via a fixed value or inverse transform, this entered value is returned by this method.

Box-Cox parameter, λ2
public double lambdaTwo()
Usage:                      lambda2 = bc.lambdaTwo();
This method returns the Box-Cox parameter, λ2, set as

See above for more details.
If the value of λ2 has been entered, i.e. via a fixed value or inverse transform, this entered value is returned by this method.

Gaussian Probability Plot
Mean
public double transformedMean()
Usage:                      mean = bc.transformedMean();
This method returns the mean of the scaled transformed data.

Standard deviation
public double transformedStandardDeviation()
Usage:                      sd = bc.transformedStandardDeviation();
This method returns the standard deviation of the scaled transformed data.

Standard error of the mean
public double transformedStandardError()
Usage:                      se = bc.transformedStandardError();
This method returns the standard error of the mean of the scaled transformed data.

Moment skewness
public double transformedMomentSkewness()
Usage:                      skewness = bc.transformedMomentSkewness();
This method returns the Moment skewness of the scaled transformed data.

Median skewness
public double transformedMedianSkewness()
Usage:                      skewness = bc.transformedMedianSkewness();
This method returns the Median skewness of the scaled transformed data.

Quartile skewness
public double transformedQuartileSkewness()
Usage:                      skewness = bc.transformedQuartileSkewness();
This method returns the Quartile skewness of the scaled transformed data.

Excess kurtosis
public double transformedExcessKurtosis()
Usage:                      kurtosis = bc.transformedExcessKurtosis();
This method returns the Excess kurtosis of the scaled transformed data.
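For orientation, commonly used definitions of moment skewness, median skewness and excess kurtosis are sketched below. These are assumptions for illustration: the sketch uses the n denominator throughout, whereas this class defaults to n−1 (see setDenominatorToN), so the library's values may differ; the Stat class documentation gives the definitions actually used. The class name ShapeStatsSketch is hypothetical.

```java
/** Illustrative shape statistics using moments with denominator n. */
final class ShapeStatsSketch {

    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    /** k-th central moment with denominator n. */
    static double centralMoment(double[] x, int k) {
        double m = mean(x), sum = 0.0;
        for (double v : x) sum += Math.pow(v - m, k);
        return sum / x.length;
    }

    static double median(double[] x) {
        double[] s = x.clone();
        java.util.Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : 0.5 * (s[n / 2 - 1] + s[n / 2]);
    }

    /** Third central moment divided by the cube of the standard deviation. */
    static double momentSkewness(double[] x) {
        return centralMoment(x, 3) / Math.pow(centralMoment(x, 2), 1.5);
    }

    /** Three times (mean - median) divided by the standard deviation. */
    static double medianSkewness(double[] x) {
        return 3.0 * (mean(x) - median(x)) / Math.sqrt(centralMoment(x, 2));
    }

    /** Fourth central moment over squared variance, minus 3 (zero for a Gaussian). */
    static double excessKurtosis(double[] x) {
        double m2 = centralMoment(x, 2);
        return centralMoment(x, 4) / (m2 * m2) - 3.0;
    }
}
```

Values of skewness and excess kurtosis close to zero after the transformation are consistent with a successful move towards a Gaussian shape.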

Median
public double transformedMedian()
Usage:                      median = bc.transformedMedian();
This method returns the median value of the scaled transformed data.

Minimum
public double transformedMinimum()
Usage:                      minimum = bc.transformedMinimum();
This method returns the minimum value of the scaled transformed data.

Maximum
public double transformedMaximum()
Usage:                      maximum = bc.transformedMaximum();
This method returns the maximum value of the scaled transformed data.

Range
public double transformedRange()
Usage:                      range = bc.transformedRange();
This method returns the range of the scaled transformed data.




PERFORM FIXED VALUE TRANSFORM

Perform Box-Cox transform with user supplied values of λ1 and λ2
public double[] fixedValueTransform(double lambdaOne, double lambdaTwo)
public double[] fixedValueTransform(double lambdaOne)
Usage:                      data = bc.fixedValueTransform(lambdaOne, lambdaTwo);
This method performs the Box-Cox transformation

$$
y_i^{(\lambda)} =
\begin{cases}
\dfrac{(y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \lambda_1 \neq 0 \\
\ln(y_i + \lambda_2), & \lambda_1 = 0
\end{cases}
$$

on an array of data, yi, entered via the constructor argument. In this method, the data, yi, is first standardized:

$$
y_i \rightarrow \frac{y_i - \bar{y}}{s}
$$

where ȳ and s are the mean and standard deviation of the yi.
The Box-Cox parameters λ1 and λ2 used are those supplied via the arguments of this method, lambdaOne and lambdaTwo. Because these values are not optimized, the returned data set will not necessarily conform to a sample taken from a Gaussian (normal) distribution. See transform() above for a method that attempts to achieve a full Box-Cox transformation.
This method returns the transformed data scaled to the mean and standard deviation of the original data.

Usage:                      data = bc.fixedValueTransform(lambdaOne);
This method performs the Box-Cox transformation

$$
y_i^{(\lambda)} =
\begin{cases}
\dfrac{(y_i + \lambda_2)^{\lambda_1} - 1}{\lambda_1}, & \lambda_1 \neq 0 \\
\ln(y_i + \lambda_2), & \lambda_1 = 0
\end{cases}
$$

on an array of data, yi, entered via the constructor argument. In this method, the data, yi, is first standardized:

$$
y_i \rightarrow \frac{y_i - \bar{y}}{s}
$$

where ȳ and s are the mean and standard deviation of the yi.
The Box-Cox parameter λ1 used is that supplied via the argument of this method, lambdaOne; λ2 (lambdaTwo) is set to zero. Because λ1 is not optimized, the returned data set will not necessarily conform to a sample taken from a Gaussian (normal) distribution. See transform() above for a method that attempts to achieve a full Box-Cox transformation.
This method returns the transformed data scaled to the mean and standard deviation of the original data.
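When choosing fixed values it can help to see what a few values of λ1 do. The special cases below follow from the standard two-parameter Box-Cox form; x is a label introduced here for the shifted value yi + λ2 and is not notation used elsewhere on this page.

```latex
\begin{aligned}
\lambda_1 = 2:\quad   & \frac{x^{2} - 1}{2} \\
\lambda_1 = 1:\quad   & x - 1 \quad \text{(a pure shift; the shape of the distribution is unchanged)} \\
\lambda_1 = \tfrac{1}{2}:\quad & 2\left(\sqrt{x} - 1\right) \\
\lambda_1 = 0:\quad   & \ln x = \lim_{\lambda_1 \to 0} \frac{x^{\lambda_1} - 1}{\lambda_1} \\
\lambda_1 = -1:\quad  & 1 - \frac{1}{x}
\end{aligned}
```

Values of λ1 below 1 compress the upper tail and so reduce positive skew, while values above 1 stretch it.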




INVERSE TRANSFORM

Perform Inverse Transform
public double[] inverseTransform(double lambdaOne, double lambdaTwo)
public double[] inverseTransform(double lambdaOne)
Usage:                      inverse = bc.inverseTransform(lambdaOne, lambdaTwo);
This method returns the inverse of the Box Cox transform:

The array entered via a constructor, y, is inverse transformed to the array z, which is returned as inverse in the above usage. The parameter, λ3, is returned as zero unless the data lead to imaginary transformed data. If this occurs, the original data are shifted so that all values lead to real transformed values. The method for getting the shift factor, λ3, is described below.
The parameters λ1 and λ2 have the same meaning as described above for the Box-Cox transform.

Usage:                      inverse = bc.inverseTransform(lambdaOne);
This method returns the inverse of the Box Cox transform:

The array entered via a constructor, y, is inverse transformed to the array z, which is returned as inverse in the above usage. The parameter, λ3, is returned as zero unless the data lead to imaginary transformed data. If this occurs, the original data are shifted so that all values lead to real transformed values. The method for getting the shift factor, λ3, is described below.
The parameter λ1 has the same meaning as described above for the Box-Cox transform, and λ2 is set to zero.

Return the value of the inverse transform shift factor
public double lambdaThree()
Usage:                      lambda3 = bc.lambdaThree();
Returns the shift parameter, λ3, as described in the Inverse Transform section immediately above.
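Inverting the standard two-parameter Box-Cox form shows where imaginary values can arise: the quantity λ1·y + 1 must be positive before it is raised to the power 1/λ1. The sketch below is an assumption-laden illustration, not the library's code. It rejects values that would give a non-real result instead of applying a λ3 shift, because the way the class computes and applies λ3 is not reproduced here. The class and method names are hypothetical.

```java
/** Illustrative forward and inverse Box-Cox pair; no lambdaThree shift is applied. */
final class InverseBoxCoxSketch {

    /** Forward transform of one value; requires y + lambdaTwo > 0. */
    static double forward(double y, double lambdaOne, double lambdaTwo) {
        double shifted = y + lambdaTwo;
        if (Math.abs(lambdaOne) < 1e-12) return Math.log(shifted);
        return (Math.pow(shifted, lambdaOne) - 1.0) / lambdaOne;
    }

    /** Inverse transform of one value; throws if the result would not be real. */
    static double inverse(double transformed, double lambdaOne, double lambdaTwo) {
        if (Math.abs(lambdaOne) < 1e-12) {
            return Math.exp(transformed) - lambdaTwo;
        }
        double base = lambdaOne * transformed + 1.0;
        if (base <= 0.0) {
            // A non-positive base raised to 1/lambdaOne is in general not real
            throw new IllegalArgumentException(
                    "lambdaOne * value + 1 must be positive for a real inverse");
        }
        return Math.pow(base, 1.0 / lambdaOne) - lambdaTwo;
    }

    public static void main(String[] args) {
        double t = forward(2.5, 0.3, 1.0);
        System.out.println(inverse(t, 0.3, 1.0));   // recovers 2.5 up to rounding error
    }
}
```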



ORIGINAL DATA

Original data
public double[] originalData()
Usage:                      data = bc.originalData();
This method returns the original data as an array of doubles.

Standardized original data
public double[] standardizedOriginalData()
Usage:                      data = bc.standardizedOriginalData();
This method returns the original data standardized to a mean of zero and a standard deviation of unity.

Sorted original data
public double[] sortedOriginalData()
Usage:                      data = bc.sortedOriginalData();
This method returns the original data ordered into ascending values.

Shifted standardized original data
public double[] shiftedStandardizedOriginalData()
Usage:                      data = bc.shiftedStandardizedOriginalData();
This method returns the original data after standardization and a shift of λ2. This is the data array that is transformed.

Gaussian Probability Plot
Mean
public double originalMean()
Usage:                      mean = bc.originalMean();
This method returns the mean of the scaled original data.

Standard deviation
public double originalStandardDeviation()
Usage:                      sd = bc.originalStandardDeviation();
This method returns the standard deviation of the scaled original data.

Standard error of the mean
public double originalStandardError()
Usage:                      se = bc.originalStandardError();
This method returns the standard error of the mean of the scaled original data.

Moment skewness
public double originalMomentSkewness()
Usage:                      skewness = bc.originalMomentSkewness();
This method returns the Moment skewness of the scaled original data.

Median skewness
public double originalMedianSkewness()
Usage:                      skewness = bc.originalMedianSkewness();
This method returns the Median skewness of the scaled original data.

Quartile skewness
public double originalQuartileSkewness()
Usage:                      skewness = bc.originalQuartileSkewness();
This method returns the Quartile skewness of the scaled original data.

Excess kurtosis
public double originalExcessKurtosis()
Usage:                      kurtosis = bc.originalExcessKurtosis();
This method returns the Excess kurtosis of the scaled original data.

Median
public double originalMedian()
Usage:                      median = bc.originalMedian();
This method returns the median value of the scaled original data.

Minimum
public double originalMinimum()
Usage:                      minimum = bc.originalMinimum();
This method returns the minimum value of the scaled original data.

Maximum
public double originalMaximum()
Usage:                      maximum = bc.originalMaximum();
This method returns the maximum value of the scaled original data.

Range
public double originalRange()
Usage:                      range = bc.originalRange();
This method returns the range of the scaled original data.



ANALYSIS

public void analysis(String fileTitle)
public void analysis()
Usage:                      bc.analysis(fileTitle);
This method displays:
and writes to a text file, fileTitle: and for both the transformed data, scaled to the original mean and standard deviation, and for the original data:
Usage:                      bc.analysis();
This method is identical to analysis(fileTitle) above, except that in the absence of a supplied file title the output file is named BoxCoxAnalysis.txt.



SET VARIANCE DENOMINATOR TO N

public void setDenominatorToN()
Usage:                      bc.setDenominatorToN();
This method sets the denominator in calculations of variance, standard deviation, skewness and kurtosis to the number of data points, n. The default value is n−1. See Stat class for details.
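The effect of the two denominators can be seen in a small variance calculation. The sketch below is illustrative only; the class DenominatorSketch and the boolean flag are hypothetical and do not mirror the library's internals.

```java
/** Illustrative variance with a selectable denominator (n or n-1). */
final class DenominatorSketch {

    /** Variance of x; denominator is n when denominatorN is true, else n-1. */
    static double variance(double[] x, boolean denominatorN) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) mean += v;
        mean /= n;
        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        return ss / (denominatorN ? n : n - 1);
    }

    public static void main(String[] args) {
        double[] x = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println(variance(x, true));    // sum of squares 32 divided by 8
        System.out.println(variance(x, false));   // sum of squares 32 divided by 7
    }
}
```

The n−1 form gives the unbiased sample variance, while the n form treats the data as the whole population; the difference shrinks as n grows.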




OTHER CLASSES USED BY THIS CLASS

This class uses the following classes in this library:


This page was prepared by Dr Michael Thomas Flanagan