Title: | Tools for Biometry and Applied Statistics in Agricultural Science |
---|---|
Description: | Tools designed to perform and evaluate cluster analysis (including Tocher's algorithm), discriminant analysis and path analysis (standard and under collinearity), as well as some useful miscellaneous tools for dealing with sample size and optimum plot size calculations. A test for seed sample heterogeneity is now available. Mantel's permutation test can be found in this package. A new approach for calculating its power is implemented. biotools also contains tests for genetic covariance components. Heuristic approaches for performing non-parametric spatial predictions of generic response variables and spatial gene diversity are implemented. |
Authors: | Anderson Rodrigo da Silva [aut, cre] |
Maintainer: | Anderson Rodrigo da Silva <[email protected]> |
License: | GPL (>= 2) |
Version: | 4.2 |
Built: | 2024-11-03 03:20:34 UTC |
Source: | https://github.com/arsilva87/biotools |
Package: | biotools |
Type: | Package |
Version: | 4.1 |
Date: | 2021-04-07 |
License: | GPL (>= 2) |
biotools is an ongoing project. Any and all criticism, comments and suggestions are welcomed.
Anderson Rodrigo da Silva
Maintainer: Anderson Rodrigo da Silva <[email protected]>
Rao, C.R. (1952) Advanced statistical methods in biometric research. New York: John Wiley & Sons.
Sharma, J.R. (2006) Statistical and biometrical techniques in plant breeding. Delhi: New Age International.
Da Silva, A.R.; Malafaia, G.; Menezes, I.P.P. (2017) biotools: an R function to predict spatial gene diversity via an individual-based approach. Genetics and Molecular Research, 16: gmr16029655.
Da Silva, A.R.; Silva, A.P.A.; Tiago-Neto, L.J. (2020) A new local stochastic method for predicting data with spatial heterogeneity. Acta Scientiarum. Agronomy, 43: e49947.
Silva, A.R. & Dias, C.T.S. (2013) A cophenetic correlation coefficient for Tocher's method. Pesquisa Agropecuaria Brasileira, 48: 589-596.
Da Silva, A.R. (2020). On testing for seed sample heterogeneity with the exact probability distribution of the germination count range. Seed Science Research, 30(1): 59-63.
A function to calculate the apparent error rate of two classification vectors, i.e., the proportion of observed cases incorrectly predicted. It can be useful for evaluating discriminant analysis or other classification systems.
aer(obs, predict)
obs |
a vector containing the observed classes. |
predict |
a vector with the same length as obs, containing the predicted classes. |
The apparent error rate, a number between 0 (complete agreement between observed and predicted classes) and 1 (no agreement).
Anderson Rodrigo da Silva <[email protected]>
data(iris)
da <- lda(Species ~ ., data = iris)
pred <- predict(da, dimen = 1)
aer(iris$Species, pred$class)
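The definition above (the proportion of observed cases incorrectly predicted) can be reproduced with base R alone. The following is a minimal sketch, not the package's implementation; the helper name `apparent_error_rate` is hypothetical, and both vectors are assumed to use the same class labels.

```r
# Hypothetical helper: proportion of cases whose predicted class
# differs from the observed class.
apparent_error_rate <- function(obs, predict) {
  stopifnot(length(obs) == length(predict))
  mean(as.character(obs) != as.character(predict))
}

obs  <- c("a", "a", "b", "b", "c")
pred <- c("a", "b", "b", "b", "a")
apparent_error_rate(obs, pred)  # 2 of 5 cases are misclassified: 0.4
```

A value of 0 therefore means that every case was predicted correctly.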
It performs the Box's M-test for homogeneity of covariance matrices obtained from multivariate normal data according to one classification factor. The test is based on the chi-square approximation.
boxM(data, grouping)
data |
a numeric data.frame or matrix containing n observations of p variables; it is expected that n > p. |
grouping |
a vector of length n containing the class of each observation; it is usually a factor. |
A list with class "htest" containing the following components:
statistic |
the approximate chi-square value of the test statistic. |
parameter |
the degrees of freedom of the test statistic, which approximately follows a chi-square distribution. |
p.value |
the p-value of the test. |
cov |
a list containing the within covariance matrix for each level of grouping. |
pooled |
the pooled covariance matrix. |
logDet |
a vector containing the natural logarithm of the determinant of each matrix in cov. |
data.name |
a character string giving the names of the data. |
method |
the character string "Box's M-test for Homogeneity of Covariance Matrices". |
Anderson Rodrigo da Silva <[email protected]>
Morrison, D.F. (1976) Multivariate Statistical Methods.
data(iris)
boxM(iris[, -5], iris[, 5])
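To make the chi-square approximation concrete, the sketch below computes Box's M statistic from its textbook form using base R only. It is an illustration under the assumption that the standard scaling constant is used; the function name `box_m_sketch` is hypothetical and the code is not taken from the package.

```r
# Sketch of Box's M-test with the usual chi-square approximation.
box_m_sketch <- function(data, grouping) {
  data <- as.matrix(data)
  grouping <- as.factor(grouping)
  p <- ncol(data)                       # number of variables
  g <- nlevels(grouping)                # number of groups
  n_i <- as.vector(table(grouping))     # group sizes, in level order
  N <- sum(n_i)
  covs <- lapply(split(as.data.frame(data), grouping), cov)
  # Pooled covariance: each group weighted by its degrees of freedom
  pooled <- Reduce(`+`, Map(function(S, n) (n - 1) * S, covs, n_i)) / (N - g)
  log_det <- sapply(covs, function(S) log(det(S)))
  M <- (N - g) * log(det(pooled)) - sum((n_i - 1) * log_det)
  # Box's scaling constant for the chi-square approximation
  C <- (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (g - 1)) *
    (sum(1 / (n_i - 1)) - 1 / (N - g))
  X2 <- M * (1 - C)
  df <- p * (p + 1) * (g - 1) / 2
  list(statistic = X2, parameter = df,
       p.value = pchisq(X2, df, lower.tail = FALSE))
}

res <- box_m_sketch(iris[, -5], iris[, 5])
res$parameter  # p(p + 1)(g - 1)/2 = 4 * 5 * 2 / 2 = 20
```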
Lat/Long coordinates within Brazil's limits.
data("brazil")
A data frame with 17141 observations on the following 2 variables.
x
a numeric vector (longitude)
y
a numeric vector (latitude)
data(brazil)
plot(brazil, cex = 0.1, col = "gray")
A function to compute the confusion matrix of two classification vectors. It can be useful for evaluating discriminant analysis or other classification systems.
confusionmatrix(obs, predict)
obs |
a vector containing the observed classes. |
predict |
a vector with the same length as obs, containing the predicted classes. |
A square matrix containing the number of objects in each class, observed (rows) and predicted (columns). Diagonal elements refer to agreement between obs and predict.
Anderson Rodrigo da Silva <[email protected]>
data(iris)
da <- lda(Species ~ ., data = iris)
pred <- predict(da, dimen = 1)
confusionmatrix(iris$Species, pred$class)
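The layout described above (observed classes in rows, predicted classes in columns) matches a plain cross-tabulation. The sketch below uses base R's `table()` on toy vectors; it illustrates the structure only and is not the package's code.

```r
# Toy data: both factors share the same levels so the table is square.
obs  <- factor(c("a", "a", "b", "b", "c"), levels = c("a", "b", "c"))
pred <- factor(c("a", "b", "b", "b", "a"), levels = c("a", "b", "c"))

cm <- table(observed = obs, predicted = pred)
cm
sum(diag(cm))                # 3 cases lie on the diagonal (correct predictions)
1 - sum(diag(cm)) / sum(cm)  # share of off-diagonal cases
```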
Compute a matrix of partial (co)variances for a group of variables with respect to another.
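Take $\Sigma$ as the covariance matrix of dimension p. Now consider dividing $\Sigma$ into two groups of variables, so that $\Sigma_{11}$ and $\Sigma_{22}$ are the covariance blocks of the first and second groups and $\Sigma_{12} = \Sigma_{21}^\top$ is the cross-covariance block. The partial covariance matrices are calculated by:
$$\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}, \qquad \Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$$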
cov2pcov(m, vars1, vars2 = seq(1, ncol(m))[-vars1])
m |
a square numeric matrix. |
vars1 |
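a numeric vector indicating the positions (rows or columns in m) of the first group of variables. |
vars2 |
a numeric vector indicating the positions (rows or columns in m) of the second group of variables; by default, all variables not in vars1. |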
A square numeric matrix.
Anderson Rodrigo da Silva <anderson.agro at hotmail.com>
(Cl <- cov(longley))
cov2pcov(Cl, 1:2)
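As an illustration of partial covariances, the sketch below computes the covariance of a first group of variables adjusted for a second group as a Schur complement in base R. The function name `partial_cov` is hypothetical, and whether cov2pcov returns exactly this block is an assumption.

```r
# Partial covariance of the variables in vars1 given those in vars2:
# S11 - S12 %*% solve(S22) %*% S21
partial_cov <- function(m, vars1, vars2 = seq_len(ncol(m))[-vars1]) {
  S11 <- m[vars1, vars1, drop = FALSE]
  S12 <- m[vars1, vars2, drop = FALSE]
  S22 <- m[vars2, vars2, drop = FALSE]
  S11 - S12 %*% solve(S22, t(S12))
}

S <- cov(iris[, 1:4])
pc <- partial_cov(S, 1:2)
pc
```

A useful check is that this matrix equals the inverse of the corresponding block of the precision matrix, `solve(solve(S)[1:2, 1:2])`.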
A function to create homogeneous groups of named objects according to an objective function evaluated at a covariate. It can be useful to design experiments which contain a fixed covariate factor.
creategroups(x, ngroups, sizes, fun = mean, tol = 0.01, maxit = 200)
x |
a numeric vector of a covariate at which to evaluate the objective function. |
ngroups |
the number of groups to create. |
sizes |
a numeric vector of length equal to ngroups containing the size of each group. |
fun |
the objective function, i.e., to create groups with similar values of fun; default is mean. |
tol |
the tolerance level to define the groups as homogeneous; see details. |
maxit |
the maximum number of iterations; default is 200. |
creategroups uses the tol value to evaluate a statistic, h, that measures how much the value of fun varies among the groups. If h does not exceed tol, the groups are considered homogeneous.
A list of
covar |
a character indicating the name of the covariate. |
func |
a character indicating the name of the objective function. |
val.func |
a numeric vector containing the values evaluated by |
niter |
the number of iterations required to achieve convergence. |
labels |
a list containing the labels of the objects in each group. |
groups |
a list of named vectors containing the values for the groups |
Anderson Rodrigo da Silva <[email protected]>
x <- rnorm(10, 1, 0.5)
names(x) <- letters[1:10]
creategroups(x, ngroups = 2, sizes = c(5, 5))
creategroups(x, ngroups = 3, sizes = c(3, 4, 3), tol = 0.05)
A function to perform discriminant analysis based on the squared generalized Mahalanobis distance (D2) of the observations to the center of the groups.
## Default S3 method:
D2.disc(data, grouping, pooled.cov = NULL)

## S3 method for class 'D2.disc'
print(x, ...)

## S3 method for class 'D2.disc'
predict(object, newdata = NULL, ...)
data |
a numeric |
grouping |
a vector of length n containing the class of each observation (row) in |
pooled.cov |
a |
x, object |
an object of class "D2.disc". |
newdata |
numeric |
... |
further arguments. |
A list of
call |
the call which produced the result. |
data |
numeric matrix; the input data. |
D2 |
a matrix containing the Mahalanobis distances between each row of |
means |
a matrix containing the vector of means of each class in |
pooled |
the pooled covariance matrix. |
confusion.matrix |
an object of class |
Anderson Rodrigo da Silva <[email protected]>
Manly, B.F.J. (2004) Multivariate statistical methods: a primer. CRC Press. (p. 105-106).
Mahalanobis, P.C. (1936) On the generalized distance in statistics. Proceedings of The National Institute of Sciences of India, 2:49-55.
data(iris)
(disc <- D2.disc(iris[, -5], iris[, 5]))
first10 <- iris[1:10, -5]
predict(disc, first10)
predict(disc, iris[, -5])$class
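Function to calculate the squared generalized Mahalanobis distance between all pairs of rows in a data frame with respect to a covariance matrix. The element of the i-th row and j-th column of the distance matrix is defined as
$$D^2_{ij} = (x_i - x_j)^\top \Sigma^{-1} (x_i - x_j)$$
where $x_i$ and $x_j$ are the i-th and j-th rows of data and $\Sigma$ is the covariance matrix.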
D2.dist(data, cov, inverted = FALSE)
data |
a data frame or matrix of data (n x p). |
cov |
a variance-covariance matrix (p x p). |
inverted |
logical; whether cov has already been inverted. Default is FALSE. |
An object of class "dist".
Anderson Rodrigo da Silva <[email protected]>
Mahalanobis, P.C. (1936) On the generalized distance in statistics. Proceedings of The National Institute of Sciences of India, 2:49-55.
# Manly (2004, p.65-66)
x1 <- c(131.37, 132.37, 134.47, 135.50, 136.17)
x2 <- c(133.60, 132.70, 133.80, 132.30, 130.33)
x3 <- c(99.17, 99.07, 96.03, 94.53, 93.50)
x4 <- c(50.53, 50.23, 50.57, 51.97, 51.37)
x <- cbind(x1, x2, x3, x4)
Cov <- matrix(c(21.112, 0.038, 0.078, 2.01,
                0.038, 23.486, 5.2, 2.844,
                0.078, 5.2, 24.18, 1.134,
                2.01, 2.844, 1.134, 10.154), 4, 4)
D2.dist(x, Cov)
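The pairwise squared Mahalanobis distance can also be written out in a few lines of base R, which may help when checking results by hand. This is a sketch only; the function name `pairwise_d2` is hypothetical and the code is not the package's implementation.

```r
# Squared Mahalanobis distance between every pair of rows of x,
# with respect to the covariance matrix S.
pairwise_d2 <- function(x, S) {
  x <- as.matrix(x)
  n <- nrow(x)
  Sinv <- solve(S)
  D <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    diffs <- sweep(x, 2, x[i, ])                  # rows hold x_j - x_i
    D[i, ] <- rowSums((diffs %*% Sinv) * diffs)   # quadratic form per row
  }
  as.dist(D)
}

x <- as.matrix(iris[1:5, 1:4])
S <- cov(iris[, 1:4])
pairwise_d2(x, S)
```

With an identity covariance matrix the result reduces to the squared Euclidean distance, `dist(x)^2`.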
Function to compute a matrix of average distances within and between clusters.
distClust(d, nobj.cluster, id.cluster)
d |
an object of class "dist" containing the distances between objects. |
nobj.cluster |
a numeric vector containing the numbers of objects per cluster. |
id.cluster |
a numeric vector for identification of the objects per cluster. |
A square matrix containing average distances within (diagonal) and between (off-diagonal) clusters.
Anderson Rodrigo da Silva <[email protected]>
It allows one to find an optimized (minimized or maximized) numeric subsample according to a statistic of interest. For example, it might be of interest to determine a subsample whose standard deviation is the lowest among all of those obtained from all possible subsamples of the same size.
findSubsample(x, size, fun = sd, minimize = TRUE, niter = 10000)
x |
a numeric vector. |
size |
an integer; the size of the subsample. |
fun |
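an object of class "function"; the statistic to be optimized over the subsamples. Default is sd. |
minimize |
logical; if TRUE (default), fun is minimized; otherwise it is maximized. |
niter |
an integer indicating the number of iterations, i.e., the number of subsamples to be selected (without replacement) from the original sample, x. |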
A list of
dataname |
a |
niter |
the number of iterations. |
fun |
the objective function. |
stat |
the achieved statistic for the optimized subsample. |
criterion |
a |
subsample |
a numeric vector; the optimized subsample. |
labels |
a string containing the labels of the subsample values. |
Anderson Rodrigo da Silva <[email protected]>
# Example 1
y <- rnorm(40, 5, 2)
findSubsample(x = y, size = 6)

# Example 2
f <- function(x) diff(range(x)) # max(x) - min(x)
findSubsample(x = y, size = 6, fun = f, minimize = FALSE, niter = 20000)
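The search described above can be pictured as a simple random search: draw many subsamples without replacement, evaluate the statistic on each, and keep the best one. The sketch below follows that idea in base R. The function name `random_subsample_search` is hypothetical, and the package may organize the search differently.

```r
# Random search for the subsample of a given size that optimizes fun.
random_subsample_search <- function(x, size, fun = sd, minimize = TRUE,
                                    niter = 10000) {
  best <- NULL
  best_stat <- if (minimize) Inf else -Inf
  for (i in seq_len(niter)) {
    idx <- sample(length(x), size)   # positions drawn without replacement
    stat <- fun(x[idx])
    better <- if (minimize) stat < best_stat else stat > best_stat
    if (better) {
      best_stat <- stat
      best <- idx
    }
  }
  list(stat = best_stat, subsample = x[best], labels = best)
}

set.seed(123)
y <- rnorm(40, 5, 2)
res <- random_subsample_search(y, size = 6, niter = 2000)
res$stat  # smallest standard deviation found among the sampled subsets
```

Because only niter random subsamples are visited, the result is an approximation to the optimum over all possible subsamples; larger niter values tend to improve it.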
Function to estimate the parameters of the nonlinear Lessman & Atkins (1963) model for determining the optimum plot size as a function of the experimental coefficient of variation (CV) or as a function of the residual standard error.
It creates initial estimates of the parameters a and b by log-linearization and uses them to provide their least-squares estimates via nls.
fitplotsize(plotsize, CV)
plotsize |
a numeric vector containing estimates of plot size. |
CV |
a numeric vector of experimental coefficient of variation or residual standard error. |
An nls output.
Anderson Rodrigo da Silva <[email protected]>
Lessman, K. J. & Atkins, R. E. (1963) Optimum plot size and relative efficiency of lattice designs for grain sorghum yield tests. Crop Sci., 3:477-481.
ps <- c(1, 2, 3, 4, 6, 8, 12)
cv <- c(35.6, 29, 27.1, 25.6, 24.4, 23.3, 21.6)
out <- fitplotsize(plotsize = ps, CV = cv)
predict(out) # fitted.values
plot(cv ~ ps)
curve(coef(out)[1] * x^(-coef(out)[2]), add = TRUE)
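The two-step procedure (log-linear starting values, then nonlinear least squares) can be sketched directly with `lm()` and `nls()`. The sketch assumes the model form CV = a * plotsize^(-b), which is the curve drawn in the example above; object names such as `init` and `fit` are illustrative only.

```r
ps <- c(1, 2, 3, 4, 6, 8, 12)
cv <- c(35.6, 29, 27.1, 25.6, 24.4, 23.3, 21.6)

# Step 1: log-linearize, log(CV) = log(a) - b * log(plotsize),
# and fit by ordinary least squares to get starting values.
init <- lm(log(cv) ~ log(ps))
start <- list(a = exp(coef(init)[[1]]), b = -coef(init)[[2]])

# Step 2: refine the estimates on the original scale with nls().
fit <- nls(cv ~ a * ps^(-b), start = start)
coef(fit)
```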
The data give the squared generalized Mahalanobis distances between 17 garlic cultivars. The data are taken from the article published by Silva & Dias (2013).
data(garlicdist)
An object of class "dist" based on 17 objects.
Silva, A.R. & Dias, C.T.S. (2013) A cophenetic correlation coefficient for Tocher's method. Pesquisa Agropecuaria Brasileira, 48:589-596.
data(garlicdist)
tocher(garlicdist)
gencovtest() tests genetic covariance components from a MANOVA model. Two different approaches can
be used: (I) a test statistic that takes into account the genetic and environmental effects and (II) a test
statistic that only considers the genetic information. The first type refers to tests based on the mean
cross-products ratio, whose distribution is obtained via Monte Carlo simulation of Wishart matrices. The
second way of testing genetic covariance refers to tests based upon an adaptation of Wilks' and Pillai's
statistics for evaluating independence of two sets of variables. All these tests are described by Silva (2015).
## S3 method for class 'manova'
gencovtest(obj, geneticFactor, gcov = NULL, residualFactor = NULL,
           adjNrep = 1, test = c("MCPR", "Wilks", "Pillai"), nsim = 9999,
           alternative = c("two.sided", "less", "greater"))

## S3 method for class 'gencovtest'
print(x, digits = 4, ...)

## S3 method for class 'gencovtest'
plot(x, var1, var2, ...)
obj |
an object of class "manova". |
geneticFactor |
a character indicating the genetic factor from which to test covariance components. It must be declared as a factor in the manova object. |
gcov |
optional; a matrix containing estimates of genetic covariances to be tested. If
|
residualFactor |
optional; a character indicating a source in the manova model to be used as
error term. If |
adjNrep |
a correction index for dealing with unbalanced data. See details. |
test |
a character indicating the test. It must be one of the following: "MCPR", "Wilks" or "Pillai". |
nsim |
the number of Monte Carlo simulations. Used only if test = "MCPR". |
alternative |
the type of alternative hypothesis. Used only if |
x |
an object of class "gencovtest". |
digits |
the number of digits to be displayed by the print method. |
var1 |
a character or integer indicating one of the two response variables or its position. |
var2 |
a character or integer indicating the other response variable or its position. |
... |
further arguments. |
The genetic covariance matrix is currently estimated via method of moments, following the equation:
$$G = \frac{M_g - M_e}{nrep \times adjNrep}$$
where $M_g$ and $M_e$ are the matrices of mean cross-products associated with the genetic factor and the residuals, respectively; $nrep$ is the number of replications, calculated as the ratio between the total number of observations and the number of levels of the genetic factor; $adjNrep$ is supposed to adjust nrep, especially when estimating $G$ from unbalanced data.
An object of class gencovtest, a list of
gcov |
a p-dimensional square matrix containing estimates of the genetic covariances. |
gcor |
a p-dimensional square matrix containing estimates of the genetic correlations. |
test |
the test (as input). |
statistics |
a p-dimensional square matrix containing the |
p.values |
a p-dimensional square matrix containing the associated p-values. |
alternative |
the type of alternative hypothesis (as input). |
X2 |
a p-dimensional square matrix containing the Chi-square (D.f. = 1) approximation for Wilks's and Pillai's statistics. Stored only if one of these two tests is chosen. |
simRatio |
an array consisting of |
dfg |
the number of degrees of freedom associated with the genetic factor. |
dfe |
the number of degrees of freedom associated with the residual term. |
When using the MCPR test, be aware that dfg should be equal to or greater than the number of variables (p). Otherwise the simulation of Wishart matrices may not be done.
A collinearity diagnosis is carried out using the condition number (CN), for the inferences may be affected by the quality of $G$. Thus, if CN > 100, a warning message is displayed.
Anderson Rodrigo da Silva <[email protected]>
Silva, A.R. (2015) On Testing Genetic Covariance. LAP Lambert Academic Publishing. ISBN 3659716553
# MANOVA
data(maize)
M <- manova(cbind(NKPR, ED, CD, PH) ~ family + env, data = maize)
summary(M)

# Example 1 - MCPR
t1 <- gencovtest(obj = M, geneticFactor = "family")
print(t1)
plot(t1, "ED", "PH")

# Example 2 - Pillai
t2 <- gencovtest(obj = M, geneticFactor = "family", test = "Pillai")
print(t2)
plot(t2, "ED", "PH")
A test based on the exact probability distribution of the germination count range, i.e., the difference between the germination counts of seed samples.
germinationcount.test(r, nsamples, n, N, K)
r |
an integer representing the germination count difference between seed samples. |
nsamples |
an integer representing the number of seed samples. |
n |
an integer representing the number of seeds per sample. |
N |
an integer representing the size (number of seeds) of the seed lot. |
K |
an integer representing the number of germinating seeds in the seed lot. |
A list of
R.value |
integer; the input R-value ( |
p.value |
numeric; the exact p-value. |
germination.rate |
numeric; the germination rate of the seed lot, calculated as the ratio of K to N. |
Anderson Rodrigo da Silva <[email protected]>
Da Silva, A.R. (2020). On testing for seed sample heterogeneity with the exact probability distribution of the germination count range. Seed Science Research, 30(1): 59–63. doi:10.1017/S0960258520000112
germinationcount.test(r = 6, nsamples = 4, n = 50, N = 2000, K = 1700)
Data from an experiment with five maize families carried out in a randomized block design, with four replications (environments).
data("maize")
A data frame with 20 observations on the following 6 variables.
NKPR
a numeric vector containing values of Number of Kernels Per cob Row.
ED
a numeric vector containing values of Ear Diameter (in cm).
CD
a numeric vector containing values of Cob Diameter (in cm).
PH
a numeric vector containing values of Plant Height (in m).
family
a factor with levels 1 2 3 4 5
env
a factor with levels 1 2 3 4
data(maize)
str(maize)
summary(maize)
Power calculation of Mantel's permutation test.
mantelPower(obj, effect.size = seq(0, 1, length.out = 50), alpha = 0.05)
obj |
an object of class "mantelTest". See mantelTest. |
effect.size |
numeric; the effect size specifying the alternative hypothesis. |
alpha |
numeric; the significance level at which to compute the power level. |
A data frame containing the effect size and its respective power level.
Anderson Rodrigo da Silva <[email protected]>
Silva, A.R.; Dias, C.T.S.; Cecon, P.R.; Rego, E.R. (2015). An alternative procedure for performing a power analysis of Mantel's test. Journal of Applied Statistics, 42(9): 1984-1992.
# Mantel test
data(garlicdist)
garlic <- tocher(garlicdist)
coph <- cophenetic(garlic)
mt1 <- mantelTest(garlicdist, coph, xlim = c(-1, 1))

# Power calculation, H1: rho = 0.3
mantelPower(mt1, effect.size = 0.3)

# Power calculation, multiple H1s and different alphas
p01 <- mantelPower(mt1, alpha = 0.01)
p05 <- mantelPower(mt1, alpha = 0.05)
p10 <- mantelPower(mt1, alpha = 0.10)
plot(p01, type = "l", col = 4)
lines(p05, lty = 2, col = 4)
lines(p10, lty = 3, col = 4)
legend("bottomright", c("0.10", "0.05", "0.01"),
       title = expression(alpha), col = 4, lty = 3:1, cex = 0.8)
Mantel's permutation test based on Pearson's correlation coefficient to evaluate the association between two square distance matrices.
mantelTest(m1, m2, nperm = 999, alternative = "greater", graph = TRUE, main = "Mantel's test", xlab = "Correlation", ...)
m1 |
an object of class "matrix" or "dist", containing distances among n individuals. |
m2 |
an object of class "matrix" or "dist", containing distances among n individuals. |
nperm |
the number of matrix permutations. |
alternative |
a character specifying the alternative hypothesis. It must be one of "greater" (default), "two.sided" or "less". |
graph |
logical; if TRUE (default), the empirical distribution is plotted. |
main |
optional; a character describing the title of the graphic. |
xlab |
optional; a character describing the x-axis label. |
... |
further graphical arguments. See |
A list of
correlation |
numeric; the observed Pearson's correlation between m1 and m2. |
p.value |
numeric; the empirical p-value of the permutation test. |
alternative |
character; the alternative hypothesis used to compute p.value. |
nullcor |
numeric vector containing randomized values of correlation, i.e., under the null hypothesis that the true correlation is equal to zero. |
Anderson Rodrigo da Silva <[email protected]>
Mantel, N. (1967). The detection of disease clustering and a generalized regression approach. Cancer Research, 27:209–220.
# Distances between garlic cultivars
data(garlicdist)
garlicdist

# Tocher's clustering
garlic <- tocher(garlicdist)
garlic

# Cophenetic distances
coph <- cophenetic(garlic)
coph

# Mantel's test
mantelTest(garlicdist, coph, xlim = c(-1, 1))
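The permutation logic behind a Mantel test is short enough to sketch in base R: correlate the lower triangles of the two matrices, then rebuild the null distribution by permuting the rows and columns of one matrix together. The function name `mantel_sketch`, the one-sided ("greater") alternative and the `+ 1` correction in the p-value are assumptions of this sketch, not a description of the package's code.

```r
mantel_sketch <- function(m1, m2, nperm = 999) {
  m1 <- as.matrix(m1)
  m2 <- as.matrix(m2)
  lower <- lower.tri(m1)
  obs <- cor(m1[lower], m2[lower])        # observed Pearson correlation
  n <- nrow(m1)
  nullcor <- replicate(nperm, {
    idx <- sample(n)                      # permute objects, not single cells
    cor(m1[lower], m2[idx, idx][lower])
  })
  # One-sided empirical p-value, counting the observed value itself
  p <- (sum(nullcor >= obs) + 1) / (nperm + 1)
  list(correlation = obs, p.value = p, nullcor = nullcor)
}

set.seed(1)
pts <- matrix(rnorm(40), ncol = 2)
d1 <- dist(pts)
d2 <- dist(pts + matrix(rnorm(40, sd = 0.1), ncol = 2))  # noisy copy
res <- mantel_sketch(d1, d2)
res$correlation
res$p.value
```

Permuting rows and columns jointly keeps each permuted matrix a valid distance matrix, which is why individual cells are not shuffled independently.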
Data set of...
data("moco")
A data frame with 206 observations (sampling points) on the following 20 variables (coordinates and markers).
Lon
a numeric vector containing values of longitude
Lat
a numeric vector containing values of latitude
BNL1434.1
a numeric vector (marker)
BNL1434.2
a numeric vector
BNL840.1
a numeric vector
BNL840.2
a numeric vector
BNL2496.1
a numeric vector
BNL2496.2
a numeric vector
BNL1421.1
a numeric vector
BNL1421.2
a numeric vector
BNL1551.1
a numeric vector
BNL1551.2
a numeric vector
CIR249.1
a numeric vector
CIR249.2
a numeric vector
BNL3103.1
a numeric vector
BNL3103.2
a numeric vector
CIR311.1
a numeric vector
CIR311.2
a numeric vector
CIR246.1
a numeric vector
CIR246.2
a numeric vector
...
...
data(moco)
str(moco)
It performs multiple correlation t-tests from a correlation matrix based on the statistic:
$$t = r \sqrt{\frac{Df}{1 - r^2}}$$
where, in general, $Df = n - 2$.
multcor.test(x, n = NULL, Df = NULL, alternative = c("two.sided", "less", "greater"), adjust = "none")
x |
a correlation matrix. |
n |
the number of observations; if |
Df |
the number of degrees of freedom of the t statistic; if |
alternative |
the alternative hypothesis. It must be one of "two.sided", "greater" or "less". You can specify just the initial letter. "greater" corresponds to positive association, "less" to negative association. The default is "two.sided". |
adjust |
the adjustment method for multiple tests. It must be one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none" (default). For more information, see p.adjust. |
A list with class "multcor.test" containing the following components:
t.values |
the t-value calculated for each correlation. |
p.values |
the p.value for each t-test, adjusted for multiple tests. |
p.check |
a matrix containing the |
adjustemnt |
a character indicating the p-value adjustment method. |
df |
the degrees of freedom of the tests. |
alternative |
a character indicating the type of alternative hypothesis. |
data.name |
a character string giving the name of the data. |
Anderson Rodrigo da Silva <[email protected]>
data(peppercorr)
multcor.test(peppercorr, n = 20)
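The t-test for a correlation coefficient, applied to every off-diagonal entry of a correlation matrix, can be sketched with base R as follows. The statistic t = r * sqrt(df / (1 - r^2)) with df = n - 2 is the usual form; the function name `cor_t_tests` and the two-sided alternative are assumptions of this sketch.

```r
# t-tests for all pairwise correlations in the lower triangle of r.
cor_t_tests <- function(r, n, adjust = "none") {
  df <- n - 2
  lower <- lower.tri(r)
  t_val <- r[lower] * sqrt(df / (1 - r[lower]^2))
  p_val <- 2 * pt(abs(t_val), df, lower.tail = FALSE)   # two-sided
  data.frame(t = t_val, p = p.adjust(p_val, method = adjust))
}

R <- cor(iris[, 1:4])
out <- cor_t_tests(R, n = nrow(iris), adjust = "holm")
out
```

The first row corresponds to the pair (Sepal.Width, Sepal.Length), and its t value agrees with `cor.test(iris[, 1], iris[, 2])$statistic`.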
Performs pairwise comparisons of multivariate mean vectors of factor levels, overall or nested. The tests are run in the same spirit as summary.manova(), based on multivariate statistics such as Pillai's trace and Wilks' lambda, which can be applied to test multivariate contrasts.
mvpaircomp(model, factor1, nesting.factor = NULL, test = "Pillai",
           adjust = "none", SSPerror = NULL, DFerror = NULL)

## S3 method for class 'mvpaircomp'
print(x, ...)
model |
a multivariate analysis of variance (MANOVA) model, fitted using lm() or manova(). |
factor1 |
a character string indicating a factor declared in the |
nesting.factor |
optional; a character string indicating a factor also declared in |
test |
a character string indicating the type of multivariate statistics to be calculated to perform the
F-test approximation. Default is "Pillai". |
adjust |
a character string indicating the p-value adjustment method for multiple comparisons. Default is "none". |
SSPerror |
optional; a numeric matrix representing the residual sum of squares and cross-products, to be used to compute the multivariate statistics. |
DFerror |
optional; a numeric value representing the residual degrees of freedom, to be used to compute the multivariate statistics. |
x |
an object of class "mvpaircomp". |
... |
further arguments. |
An object of class mvpaircomp, a list of
st |
an array containing the summary of the multivariate tests. |
SSPcontrast |
an array containing p-dimensional square matrices of sum of squares and cross-products of the contrasts. |
adjust |
a character string indicating the p-value adjustment method used. |
fac1 |
a character string indicating the factor being tested. |
fac2 |
a character string indicating the nesting factor. |
Anderson Rodrigo da Silva <[email protected]>
Krzanowski, W. J. (1988) Principles of Multivariate Analysis. A User's Perspective. Oxford.
# Example 1
data(maize)
M <- lm(cbind(NKPR, ED, CD, PH) ~ family + env, data = maize)
anova(M) # MANOVA table
mvpaircomp(M, factor1 = "family", adjust = "bonferroni")

# Example 2 (with nesting factor)
# Data on producing plastic film from Krzanowski (1998, p. 381)
tear <- c(6.5, 6.2, 5.8, 6.5, 6.5, 6.9, 7.2, 6.9, 6.1, 6.3,
          6.7, 6.6, 7.2, 7.1, 6.8, 7.1, 7.0, 7.2, 7.5, 7.6)
gloss <- c(9.5, 9.9, 9.6, 9.6, 9.2, 9.1, 10.0, 9.9, 9.5, 9.4,
           9.1, 9.3, 8.3, 8.4, 8.5, 9.2, 8.8, 9.7, 10.1, 9.2)
opacity <- c(4.4, 6.4, 3.0, 4.1, 0.8, 5.7, 2.0, 3.9, 1.9, 5.7,
             2.8, 4.1, 3.8, 1.6, 3.4, 8.4, 5.2, 6.9, 2.7, 1.9)
Y <- cbind(tear, gloss, opacity)
rate <- gl(2, 10, labels = c("Low", "High"))
additive <- gl(2, 5, length = 20, labels = c("Low", "High"))
fit <- manova(Y ~ rate * additive)
summary(fit, test = "Wilks") # MANOVA table
mvpaircomp(fit, factor1 = "rate", nesting.factor = "additive", test = "Wilks")
mvpaircomp(fit, factor1 = "additive", nesting.factor = "rate", test = "Wilks")
The Meier & Lessman (1971) method to determine the maximum curvature point for optimum plot size as a function of the experimental coefficient of variation.
optimumplotsize(a, b)
a |
a parameter estimate of the plot size model; see fitplotsize. |
b |
a parameter estimate of the plot size model; see fitplotsize. |
The (approximated) optimum plot size value.
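As a rough illustration of the maximum-curvature idea, the sketch below assumes the plot size model has the form CV = a * X^(-b), as in the example's `curve()` call. For that curve, the point of maximum curvature has the closed form X0 = [a^2 b^2 (2b + 1) / (b + 2)]^(1 / (2b + 2)). The helpers `max_curvature` and `curvature` are hypothetical names, the coefficient values are made up, and the code is not taken from the package source.

```r
# Hypothetical helper (not part of biotools): closed-form point of maximum
# curvature of the curve CV = a * X^(-b).
max_curvature <- function(a, b) {
  (a^2 * b^2 * (2 * b + 1) / (b + 2))^(1 / (2 * b + 2))
}

# Curvature of CV = a * X^(-b); used only to check the closed form numerically.
curvature <- function(x, a, b) {
  d1 <- -a * b * x^(-b - 1)           # first derivative
  d2 <- a * b * (b + 1) * x^(-b - 2)  # second derivative
  abs(d2) / (1 + d1^2)^(3 / 2)
}

a <- 35; b <- 0.2  # illustrative values only (assumption)
x0 <- max_curvature(a, b)
x_num <- optimize(curvature, interval = c(0.1, 50), a = a, b = b,
                  maximum = TRUE)$maximum
c(closed_form = x0, numerical = x_num)
```

The numerical maximisation is only a sanity check on the algebra; in practice `a` and `b` would come from a fitted model, as in the example for this function.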
Anderson Rodrigo da Silva <[email protected]>
Meier, V. D. & Lessman, K. J. (1971) Estimation of optimum field plot shape and size for testing yield in Crambe abyssinica Hochst. Crop Sci., 11:648-650.
ps <- c(1, 2, 3, 4, 6, 8, 12)
cv <- c(35.6, 29, 27.1, 25.6, 24.4, 23.3, 21.6)
out <- fitplotsize(plotsize = ps, CV = cv)
plot(cv ~ ps)
curve(coef(out)[1] * x^(-coef(out)[2]), add = TRUE)
optimumplotsize(a = coef(out)[1], b = coef(out)[2])
Function to perform simple path analysis and path analysis under collinearity (sometimes called ridge path analysis). It computes the direct (diagonal) and indirect (off-diagonal) effects of each explanatory variable on a response variable.
pathanalysis(corMatrix, resp.col, collinearity = FALSE)
corMatrix |
a correlation matrix. |
resp.col |
an integer value indicating the column in |
collinearity |
logical; if |
A list of
coef |
a matrix containing the direct (diagonal) and indirect (off-diagonal) effects of each variable. |
Rsq |
the coefficient of determination. |
ResidualEffect |
the residual effect. |
VIF |
a vector containing the variance inflation factors. |
CN |
the condition number. |
If collinearity = TRUE, an interactive graphic is displayed for dealing with collinearity.
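The quantities listed above can be sketched with the textbook path-analysis equations. This is a minimal illustration under stated assumptions, not the package's code: the correlation matrix is made up, the ridge constant `k` is a hypothetical name, and the layout of `effects` (rows are explanatory variables, columns are the variable through which the effect acts) is an assumption.

```r
# Made-up correlation matrix; the third column plays the role of the response.
R <- matrix(c(1.0, 0.5, 0.6,
              0.5, 1.0, 0.7,
              0.6, 0.7, 1.0), 3, 3,
            dimnames = list(c("x1", "x2", "y"), c("x1", "x2", "y")))
resp <- 3
Rxx <- R[-resp, -resp]  # correlations among explanatory variables
rxy <- R[-resp, resp]   # correlations with the response

k <- 0  # k = 0: standard analysis; k > 0 would give a ridge-type solution
direct <- solve(Rxx + k * diag(nrow(Rxx)), rxy)  # direct effects

# effects[i, j] = r_ij * p_j: direct effects on the diagonal,
# indirect effects of variable i via variable j off the diagonal
effects <- sweep(Rxx, 2, direct, "*")

Rsq <- sum(direct * rxy)     # coefficient of determination
residual <- sqrt(1 - Rsq)    # residual effect
vif <- diag(solve(Rxx))      # variance inflation factors
ev <- eigen(Rxx, symmetric = TRUE)$values
cn <- max(ev) / min(ev)      # condition number (largest/smallest eigenvalue)
```

With `k = 0`, each row of `effects` sums to the correlation between that variable and the response, which is a convenient check on the decomposition.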
Anderson Rodrigo da Silva <[email protected]>
Carvalho, S.P. (1995) Metodos alternativos de estimacao de coeficientes de trilha e indices de selecao, sob multicolinearidade. Ph.D. Thesis, Federal University of Vicosa (UFV), Vicosa, MG, Brazil.
data(peppercorr)
pathanalysis(peppercorr, 6, collinearity = FALSE)
The data give the correlations between 6 pepper variables. The data are taken from the article published by Silva et al. (2013).
data(peppercorr)
An object of class "matrix".
Silva et al. (2013) Path analysis in multicollinearity for fruit traits of pepper. Idesia, 31:55-60.
data(peppercorr)
print(peppercorr)
raise.matrix raises a square matrix to a power by using spectral decomposition.
raise.matrix(x, power = 1)
x |
a square matrix. |
power |
numeric; default is 1. |
An object of class "matrix".
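The spectral-decomposition approach can be sketched as follows: decompose the matrix into eigenvalues and eigenvectors, raise the eigenvalues to the power, and recompose. `matrix_power` is a hypothetical helper written for illustration and is not the package's implementation.

```r
# Hypothetical helper (not the package source): x^power via eigendecomposition,
# x = V diag(lambda) V^{-1}  =>  x^power = V diag(lambda^power) V^{-1}
matrix_power <- function(x, power = 1) {
  stopifnot(is.matrix(x), nrow(x) == ncol(x))
  e <- eigen(x)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% solve(e$vectors)
}

m <- matrix(c(1, -2, -2, 4), 2, 2)
m2 <- matrix_power(m, 2)
all.equal(m2, m %*% m)  # compare with plain matrix multiplication
```

Fractional powers require non-negative eigenvalues; a matrix with negative eigenvalues would yield NaN entries in this sketch.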
Anderson Rodrigo da Silva <[email protected]>
m <- matrix(c(1, -2, -2, 4), 2, 2)
raise.matrix(m)
raise.matrix(m, 2)
Function to determine the minimum sample size for calculating a statistic based on its confidence interval.
samplesize(x, fun, sizes = NULL, lcl = NULL, ucl = NULL, nboot = 200, conf.level = 0.95, nrep = 500, graph = TRUE, ...)
x |
a numeric vector. |
fun |
an objective function at which to evaluate the sample size; see details. |
sizes |
a numeric vector containing sample sizes; if |
lcl |
the lower confidence limit for the statistic defined in |
ucl |
the upper confidence limit for the statistic defined in |
nboot |
the number of bootstrap samples; it is used only if |
conf.level |
the confidence level for calculating the |
nrep |
the resampling (with replacement) number for each sample size in |
graph |
logical; default is |
... |
further graphical arguments. |
If ucl or lcl is NULL, fun must be defined as in boot, i.e., the first argument passed will always be the original data and the second will be a vector of indices, frequencies or weights which define the bootstrap sample. For now, samplesize considers the second argument only as an index.
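The resampling idea described by the arguments can be sketched roughly as below. This is only an illustration under assumptions: the percentile bootstrap interval, the range of `sizes`, and the object names `ci` and `points_out` are choices made here, and the package may compute these quantities differently.

```r
set.seed(1)
x <- rnorm(20, 15, 2)
cv <- function(x, i) sd(x[i]) / mean(x[i])  # statistic defined as in boot

# Reference interval from the full sample (assumed: percentile bootstrap)
nboot <- 200
conf.level <- 0.95
boot_stats <- replicate(nboot, cv(x, sample(length(x), replace = TRUE)))
ci <- quantile(boot_stats, c((1 - conf.level) / 2, (1 + conf.level) / 2))

# For each candidate sample size, resample nrep times with replacement and
# count how many estimates fall outside the reference interval
sizes <- 5:length(x)
nrep <- 500
n_out <- sapply(sizes, function(n) {
  est <- replicate(nrep, cv(x, sample(length(x), n, replace = TRUE)))
  sum(est < ci[1] | est > ci[2])
})
points_out <- data.frame(size = sizes, n.out = n_out)
head(points_out)
```

A sample size could then be chosen as the smallest one for which the count of estimates outside the interval stays below a tolerated fraction of `nrep`.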
A list of
CI |
a vector containing the lower and the upper confidence limit for the statistic evaluated. |
pointsOut |
a data frame containing the sample sizes (in |
If graph = TRUE, a graphic with the dispersion of the estimates for each sample size is displayed, as well as a graphic containing the number of points outside the confidence interval for the reference sample.
Anderson Rodrigo da Silva <[email protected]>
cv <- function(x, i) sd(x[i]) / mean(x[i]) # coefficient of variation
x <- rnorm(20, 15, 2)
cv(x)
samplesize(x, cv)

par(mfrow = c(1, 3), cex = 0.7, las = 1)
samplesize(x, cv, lcl = 0.05, ucl = 0.20)
abline(h = 0.05 * 500, col = "blue") # sample sizes with 5% (or less) out CI
Estimate spatial gene diversity (expected heterozygosity - He) through the individual-centred approach by Manel et al. (2007). sHe() calculates the unbiased estimate of He based on the information of allele frequency obtained from codominant or dominant markers in individuals within a circular moving window of known radius over the sampling area.
sHe(x, coord.cols = 1:2, marker.cols = 3:4, marker.type = c("codominant", "dominant"), grid = NULL, latlong2km = TRUE, radius, nmin = NULL)
x |
a data frame or numeric matrix containing columns with coordinates of individuals and marker genotyping |
coord.cols |
a vector of integers giving the columns of coordinates in |
marker.cols |
a vector of integers giving the columns of markers in |
marker.type |
a character; the type of molecular marker |
grid |
optional; a two-column matrix containing coordinates over which to predict He |
latlong2km |
logical; should coordinates be converted from lat/long format into kilometer-grid based? |
radius |
the radius of the moving window. It must be in the same format as sampling coordinates |
nmin |
optional; a numeric value indicating the minimum number of individuals used to calculate He. If the number of individuals in a certain location is less than |
The unbiased estimate of expected heterozygosity (Nei, 1978) is given by:

$$uHe = \left(\frac{2n}{2n - 1}\right)\left(1 - \sum_{i} p_i^2\right)$$

where $p_i$ is the frequency of the i-th allele per locus considering the $n$ individuals in a certain location.
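For a single locus, Nei's (1978) unbiased estimator, (2n / (2n - 1)) * (1 - sum(p_i^2)), can be computed in a few lines. The helper `uHe` and the genotype data below are hypothetical and serve only as an illustration for diploid individuals scored with a codominant marker; the moving-window logic of sHe() is not reproduced here.

```r
# Hypothetical helper (not the package source): unbiased expected
# heterozygosity for one locus, given allele frequencies p (summing to 1)
# estimated from n diploid individuals.
uHe <- function(p, n) {
  stopifnot(abs(sum(p) - 1) < 1e-8, n >= 1)
  (2 * n / (2 * n - 1)) * (1 - sum(p^2))
}

# Made-up genotypes of 5 individuals at one codominant locus (one row each)
geno <- matrix(c("A", "A",
                 "A", "B",
                 "B", "B",
                 "A", "B",
                 "A", "C"), ncol = 2, byrow = TRUE)
p <- as.numeric(table(geno)) / length(geno)  # allele frequencies
he <- uHe(p, n = nrow(geno))
he
```

In a spatial setting, the same calculation would be repeated for the individuals falling inside each window, then averaged over loci.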
A list of
diversity |
a data frame with the following columns: coord.x - the x-axis coordinates of the prediction grid, coord.y - the y-axis coordinates of the prediction grid, n - the number of individuals at a given point in the grid, MaxDist - the maximum observed distance among these individuals, uHe - the unbiased estimate of gene diversity (as expressed above), and SE - the standard error of uHe. |
mHe |
a matrix containing the estimates of He for every marker, on each point of the |
locations |
a numeric matrix containing the sampling coordinates, as provided as input. |
Depending on the dimension of x and/or grid, sHe() can be time-consuming.
Anderson Rodrigo da Silva <[email protected]>
Ivandilson Pessoa Pinto de Menezes <[email protected]>
da Silva, A.R.; Malafaia, G.; Menezes, I.P.P. (2017) biotools: an R function to predict spatial gene diversity via an individual-based approach. Genetics and Molecular Research, 16: gmr16029655.
Manel, S., Berthoud, F., Bellemain, E., Gaudeul, M., Luikart, G., Swenson, J.E., Waits, L.P., Taberlet, P.; Intrabiodiv Consortium. (2007) A new individual-based spatial approach for identifying genetic discontinuities in natural populations. Molecular Ecology, 16:2031-2043.
Nei, M. (1978) Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics, 89:583-590.
data(moco)
data(brazil)

# check points
plot(brazil, cex = 0.1, col = "gray")
points(Lat ~ Lon, data = moco, col = "blue", pch = 20)

# using a rectangular grid (not passed as input!)
# ex <- sHe(x = moco, coord.cols = 1:2,
#   marker.cols = 3:20, marker.type = "codominant",
#   grid = NULL, radius = 150)
# ex
# plot(ex, xlab = "Lon", ylab = "Lat")

# A FANCIER PLOT...
# using Brazil's coordinates as prediction grid
# ex2 <- sHe(x = moco, coord.cols = 1:2,
#   marker.cols = 3:20, marker.type = "codominant",
#   grid = brazil, radius = 150)
# ex2
#
# library(maps)
# borders <- data.frame(x = map("world", "brazil")$x,
#   y = map("world", "brazil")$y)
#
# library(latticeExtra)
# plot(ex2, xlab = "Lon", ylab = "Lat",
#   xlim = c(-75, -30), ylim = c(-35, 10), aspect = "iso") +
#   latticeExtra::as.layer(xyplot(y ~ x, data = borders, type = "l")) +
#   latticeExtra::as.layer(xyplot(Lat ~ Lon, data = moco))
A function to calculate the Singh (1981) criterion for importance of variables based on the squared generalized Mahalanobis distance.
## Default S3 method:
singh(data, cov, inverted = FALSE)
## S3 method for class 'singh'
plot(x, ...)
data |
a data frame or matrix of data (n x p). |
cov |
a variance-covariance matrix (p x p). |
inverted |
logical. If |
x |
an object of class |
... |
further graphical arguments. |
singh returns a matrix containing the Singh statistic, the importance proportion and the cumulative proportion of each variable (column) in data.
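One common way to write the Singh (1981) criterion is as each variable's share of the squared Mahalanobis distances summed over all pairs of rows. The sketch below follows that reading; `singh_sketch` is a hypothetical helper, the order in which the cumulative proportion is accumulated is an assumption, and the output may not match the package's exactly. The data are those of the example for this function.

```r
# Hypothetical helper (not the package source): per-variable contributions to
# the sum of squared Mahalanobis distances over all pairs of rows.
singh_sketch <- function(data, cov) {
  data <- as.matrix(data)
  W <- solve(cov)                    # inverse covariance matrix
  S <- numeric(ncol(data))
  pairs <- combn(nrow(data), 2)      # all pairs of rows
  for (k in seq_len(ncol(pairs))) {
    d <- data[pairs[1, k], ] - data[pairs[2, k], ]
    S <- S + d * as.vector(W %*% d)  # share of D^2 due to each variable
  }
  prop <- S / sum(S)
  cbind(S = S, proportion = prop, cumulative = cumsum(prop))
}

x1 <- c(131.37, 132.37, 134.47, 135.50, 136.17)
x2 <- c(133.60, 132.70, 133.80, 132.30, 130.33)
x3 <- c(99.17, 99.07, 96.03, 94.53, 93.50)
x4 <- c(50.53, 50.23, 50.57, 51.97, 51.37)
x <- cbind(x1, x2, x3, x4)
Cov <- matrix(c(21.112, 0.038, 0.078, 2.01,
                0.038, 23.486, 5.2, 2.844,
                0.078, 5.2, 24.18, 1.134,
                2.01, 2.844, 1.134, 10.154), 4, 4)
res <- singh_sketch(x, Cov)
res
```

By construction, the contributions in column `S` add up to the total of the pairwise squared distances, so the proportions sum to one.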
Anderson Rodrigo da Silva <[email protected]>
Singh, D. (1981) The relative importance of characters affecting genetic divergence. Indian Journal of Genetics & Plant Breeding, 41:237-245.
# Manly (2004, p.65-66)
x1 <- c(131.37, 132.37, 134.47, 135.50, 136.17)
x2 <- c(133.60, 132.70, 133.80, 132.30, 130.33)
x3 <- c(99.17, 99.07, 96.03, 94.53, 93.50)
x4 <- c(50.53, 50.23, 50.57, 51.97, 51.37)
x <- cbind(x1, x2, x3, x4)
Cov <- matrix(c(21.112, 0.038, 0.078, 2.01,
                0.038, 23.486, 5.2, 2.844,
                0.078, 5.2, 24.18, 1.134,
                2.01, 2.844, 1.134, 10.154), 4, 4)
(s <- singh(x, Cov))
plot(s)
A heuristic method to perform spatial predictions. The method consists of a local interpolator with stochastic features. It allows building detailed maps and estimating the spatial dependence without any assumptions on the spatial process.
spatialpred(coords, data, grid)
coords |
a data frame or numeric matrix containing columns with geographic coordinates |
data |
a numeric vector of compatible dimension with |
grid |
a data frame or numeric matrix containing columns with geographic coordinates where |
If grid receives the same input as coords, spatialpred will calculate the Percentual Absolute Mean Error (PAME) of predictions.
A data.frame containing spatial predictions, standard errors, the radius and the number of observations used in each prediction over the grid.
Depending on the dimension of coords and/or grid, spatialpred() can be time-consuming.
Anderson Rodrigo da Silva <[email protected]>
Da Silva, A.R., Silva, A.P.A., Tiago-Neto, L.J. (2020) A new local stochastic method for predicting data with spatial heterogeneity. Acta Scientiarum. Agronomy, 43:e49947.
# data(moco)
# p <- spatialpred(coords = moco[, 1:2], data = rnorm(206), grid = moco[, 1:2])
# note: using coords as grid to calculate PAME
# head(p)
# lattice::levelplot(pred ~ Lat*Lon, data = p)
tocher performs the Tocher (Rao, 1952) optimization clustering from a distance matrix. The cophenetic distance matrix for a Tocher's clustering can also be computed using the methodology proposed by Silva & Dias (2013).
## S3 method for class 'dist'
tocher(d, algorithm = c("original", "sequential"))
## S3 method for class 'tocher'
print(x, ...)
## S3 method for class 'tocher'
cophenetic(x)
d |
an object of class |
algorithm |
a character indicating the algorithm to be used for clustering objects.
It must be one of the two: |
x |
an object of class |
... |
optional further arguments from |
An object of class tocher. A list of
call |
the call which produced the result. |
algorithm |
character; the algorithm that has been used as input. |
clusters |
a list of length k (the number of clusters),
containing the labels of the objects in |
class |
a numeric vector indicating the class (the cluster) of each object in |
criterion |
a numeric vector containing the clustering criteria - the greatest amongst
the smallest distances involving each object in |
distClust |
a matrix of distances within (diagonal) and between (off-diagonal) clusters. |
d |
the input object. |
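The clustering criterion described above (the greatest among the smallest distances involving each object) can be computed directly from a distance matrix. The sketch below is an illustration with hypothetical object names, not the package's code; it computes a single criterion for the whole set of objects, and how the criterion is updated under the "sequential" algorithm is not shown.

```r
d <- dist(USArrests)  # any object of class "dist" would do
m <- as.matrix(d)
diag(m) <- NA         # ignore the zero distance of each object to itself

# smallest distance involving each object
nearest <- apply(m, 1, min, na.rm = TRUE)

# criterion: the greatest among those smallest distances
theta <- max(nearest)
theta
```

Under the usual description of Tocher's method, this value acts as the upper limit on the increase in average within-cluster distance allowed when adding an object to a cluster.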
Clustering a large number of objects (say 300 or more) can be time-consuming.
Anderson Rodrigo da Silva <[email protected]>
Cruz, C.D.; Ferreira, F.M.; Pessoni, L.A. (2011) Biometria aplicada ao estudo da diversidade genetica. Visconde do Rio Branco: Suprema.
Rao, R.C. (1952) Advanced statistical methods in biometric research. New York: John Wiley & Sons.
Sharma, J.R. (2006) Statistical and biometrical techniques in plant breeding. Delhi: New Age International.
Silva, A.R. & Dias, C.T.S. (2013) A cophenetic correlation coefficient for Tocher's method. Pesquisa Agropecuaria Brasileira, 48:589-596.
Vasconcelos, E.S.; Cruz, C.D.; Bhering, L.L.; Resende Junior, M.F.R. (2007) Alternative methodology for the cluster analysis. Pesquisa Agropecuaria Brasileira, 42:1421-1428.
dist, D2.dist, cophenetic, distClust, hclust
# example 1
data(garlicdist)
(garlic <- tocher(garlicdist))
garlic$distClust # cluster distances

# example 2
data(USArrests)
(usa <- tocher(dist(USArrests)))
usa$distClust

# cophenetic correlation
cophUS <- cophenetic(usa)
cor(cophUS, dist(USArrests))

# using the sequential algorithm
(usa2 <- tocher(dist(USArrests), algorithm = "sequential"))
usa2$criterion

# example 3
data(eurodist)
(euro <- tocher(eurodist))
euro$distClust