hammingdists.Rd
Returns a distance matrix giving all pairwise Hamming distances between the
rows of its argument meanings, which can be a matrix, data frame or
vector. Vectors are treated as matrices with a single column, so the
distances in the return value can then only be 0 or 1.
hammingdists(meanings)
meanings: a matrix with the different dimensions encoded along
columns, and all combinations of meanings specified along rows. The data
type of the cells does not matter since distance is simply based on
equality (with the exception of NA values, see below).
A distance matrix of type dist holding the n*(n-1)/2 pairwise distances
between rows, where n is the number of rows in meanings.
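To illustrate the shape of such a return value, the following minimal sketch uses base R's dist() on a made-up 4-row matrix (the name `m` is only for illustration). It shows that a dist object stores one entry per pair of rows and can be expanded to a full square matrix with as.matrix().

```r
# Hypothetical 4-row input, for illustration only
m <- matrix(c(0, 0,
              0, 1,
              1, 0,
              1, 1), ncol = 2, byrow = TRUE)

d <- dist(m, method = "manhattan")

n <- nrow(m)
length(d)            # number of stored pairwise distances: n*(n-1)/2
attr(d, "Size")      # number of rows the distances were computed for: n
dim(as.matrix(d))    # full symmetric n x n matrix
```

Working with the compact dist form is usually enough for clustering or correlation functions; as.matrix() is handy when individual row pairs need to be looked up by index.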
This function behaves differently from calling
dist(meanings, method="manhattan") in how NA values are treated:
specifying a meaning component as NA allows you to ignore that dimension
for the given row/meaning combinations (instead of counting a difference
between NA and another value as a distance of 1).
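The NA rule described above can be sketched in a few lines of base R. This is not the package's implementation, only an assumed, unoptimised equivalent: the helper name `hamming_sketch` is hypothetical, and it simply counts unequal cells per row pair while dropping any comparison that involves NA.

```r
# Hypothetical helper, NOT the package's own code: a naive O(n^2) sketch
# of a Hamming distance in which NA components are ignored.
hamming_sketch <- function(meanings) {
  m <- as.matrix(meanings)  # a plain vector becomes a single-column matrix
  n <- nrow(m)
  d <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # comparisons involving NA evaluate to NA and are dropped by na.rm,
      # so that dimension adds nothing to the distance for this pair
      d[i, j] <- sum(m[i, ] != m[j, ], na.rm = TRUE)
    }
  }
  as.dist(d)  # keep only the lower triangle, as dist objects do
}

with_na <- matrix(c(0, 0,
                    0, 1,
                    1, NA), ncol = 2, byrow = TRUE)
hamming_sketch(with_na)
```

Under this sketch, the third row differs from the first two only in its first dimension, so both of its distances are 1.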
# a 2x2 design using strings
print(strings <- matrix(c("a1", "b1", "a1", "b2", "a2", "b1", "a2", "b2"),
  ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,] "a1" "b1"
#> [2,] "a1" "b2"
#> [3,] "a2" "b1"
#> [4,] "a2" "b2"
hammingdists(strings)
#>   1 2 3
#> 2 1
#> 3 1 2
#> 4 2 1 1

# a 2x3 design using integers
print(integers <- matrix(c(0, 0, 0, 1, 0, 2, 1, 0, 1, 1, 1, 2),
  ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,]    0    0
#> [2,]    0    1
#> [3,]    0    2
#> [4,]    1    0
#> [5,]    1    1
#> [6,]    1    2
hammingdists(integers)
#>   1 2 3 4 5
#> 2 1
#> 3 1 1
#> 4 1 2 2
#> 5 2 1 2 1
#> 6 2 2 1 1 1

# a 3x2 design using factors (ncol is always the number of dimensions)
print(factors <- data.frame(colour=c("red", "red", "green", "blue"),
  animal=c("dog", "cat", "dog", "cat")))
#>   colour animal
#> 1    red    dog
#> 2    red    cat
#> 3  green    dog
#> 4   blue    cat
hammingdists(factors)
#>   1 2 3
#> 2 1
#> 3 1 2
#> 4 2 1 2

# if some meaning dimension is not relevant for some combinations of
# meanings (e.g. optional arguments), specifying them as NA in the matrix
# will make them not be counted towards the Hamming distance! in this
# example the value of the second dimension does not matter (and does not
# count towards the distance) when the first dimension has value '1'
print(ignoredimension <- matrix(c(0, 0, 0, 1, 1, NA), ncol=2, byrow=TRUE))
#>      [,1] [,2]
#> [1,]    0    0
#> [2,]    0    1
#> [3,]    1   NA
hammingdists(ignoredimension)
#>   1 2
#> 2 1
#> 3 1 1

# trivial case of a vector: first and last two elements are identical,
# otherwise a difference of one
hammingdists(c(0, 0, 1, 1))
#>   1 2 3
#> 2 0
#> 3 1 1
#> 4 1 1 0