Returns a distance matrix giving all pairwise Hamming distances between the rows of its argument meanings, which can be a matrix, data frame or vector. A vector is treated as a matrix with a single column, so the distances returned for a vector can only be 0 or 1.

hammingdists(meanings)

Arguments

meanings

a matrix with the different dimensions encoded along columns, and all combinations of meanings specified along rows. The data type of the cells does not matter, since distance is based simply on equality (with the exception of NA values, see below).

Value

A distance matrix of class dist containing n*(n-1)/2 pairwise distances, where n is the number of rows in meanings. Converted to a full matrix, it has n rows and n columns.
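
As an illustration of how a dist object is sized, the following sketch builds one with base R's dist() (used here only as a stand-in that returns the same class, not as a replacement for hammingdists) and inspects it. The variable names are chosen for this example only.

```r
# Three rows (n = 3), so the dist object holds 3 * (3 - 1) / 2 = 3 distances.
m <- matrix(c(0, 0,
              0, 1,
              1, 1), ncol = 2, byrow = TRUE)
d <- dist(m, method = "manhattan")

n_stored <- length(d)        # number of stored pairwise distances
n_rows   <- attr(d, "Size")  # number of rows the distances were computed from
full     <- as.matrix(d)     # symmetric n x n matrix with a zero diagonal
```

Calling as.matrix() on the result is the usual way to get a square matrix that can be indexed by row pairs.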

Details

This function behaves differently from calling dist(meanings, method="manhattan") in how NA values are treated: setting a meaning component to NA causes that dimension to be ignored in every comparison involving that row, instead of counting a difference between NA and another value as a distance of 1.
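
The NA rule can be sketched in a few lines of base R. This is a hypothetical reference implementation written for illustration, not the package's own code; the name hamming_sketch is an assumption made for this example.

```r
# Hypothetical sketch: pairwise Hamming distances that skip NA components.
hamming_sketch <- function(meanings) {
  m <- as.matrix(meanings)  # a vector becomes a single-column matrix
  n <- nrow(m)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # A comparison involving NA evaluates to NA and is dropped by
      # na.rm = TRUE, so that dimension adds nothing to the distance.
      d[i, j] <- sum(m[i, ] != m[j, ], na.rm = TRUE)
    }
  }
  as.dist(d)
}

# In the third row the second dimension is NA and is therefore ignored.
sketch_result <- as.matrix(hamming_sketch(
  matrix(c(0, 0, 0, 1, 1, NA), ncol = 2, byrow = TRUE)))
```

The double loop is quadratic in the number of rows, which is acceptable for small meaning spaces but would need vectorising for large ones.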

Examples

# a 2x2 design using strings
print(strings <- matrix(c("a1", "b1", "a1", "b2", "a2", "b1", "a2", "b2"), ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,] "a1" "b1"
#> [2,] "a1" "b2"
#> [3,] "a2" "b1"
#> [4,] "a2" "b2"
hammingdists(strings)
#>   1 2 3
#> 2 1
#> 3 1 2
#> 4 2 1 1
# a 2x3 design using integers
print(integers <- matrix(c(0, 0, 0, 1, 0, 2, 1, 0, 1, 1, 1, 2), ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,]    0    0
#> [2,]    0    1
#> [3,]    0    2
#> [4,]    1    0
#> [5,]    1    1
#> [6,]    1    2
hammingdists(integers)
#>   1 2 3 4 5
#> 2 1
#> 3 1 1
#> 4 1 2 2
#> 5 2 1 2 1
#> 6 2 2 1 1 1
# a 3x2 design using factors (ncol is always the number of dimensions)
print(factors <- data.frame(colour=c("red", "red", "green", "blue"), animal=c("dog", "cat", "dog", "cat")))
#>   colour animal
#> 1    red    dog
#> 2    red    cat
#> 3  green    dog
#> 4   blue    cat
hammingdists(factors)
#>   1 2 3
#> 2 1
#> 3 1 2
#> 4 2 1 2
# If a meaning dimension is not relevant for some combinations of meanings
# (e.g. optional arguments), specifying it as NA in the matrix means it is
# not counted towards the Hamming distance. In this example the value of the
# second dimension does not matter (and does not count towards the distance)
# when the first dimension has value '1'.
print(ignoredimension <- matrix(c(0, 0, 0, 1, 1, NA), ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,]    0    0
#> [2,]    0    1
#> [3,]    1   NA
hammingdists(ignoredimension)
#>   1 2
#> 2 1
#> 3 1 1
# trivial case of a vector: the first two and the last two elements are
# identical, every other pair differs by one
hammingdists(c(0, 0, 1, 1))
#>   1 2 3
#> 2 0
#> 3 1 1
#> 4 1 1 0