Calculate the bag-of-characters distance between strings.

orderinsensitivedists(strings = NULL, split = NULL,
  segmentcounts = segment.counts(strings, split))

Arguments

strings

a vector or list of strings

split

boundary sequence at which to segment the strings (the default splits each string into its constituent characters)

segmentcounts

if custom segmentation is required, the pre-segmented strings can be passed as this argument (which is a list of lists)

Value

a distance matrix

Examples

orderinsensitivedists(c("xxxx", "asdf", "asd", "dsa"))
#>      xxxx asdf asd
#> asdf    8
#> asd     7    1
#> dsa     7    1   0
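
To illustrate what an order-insensitive, bag-of-characters distance measures, here is a minimal base-R sketch. It is not the package's implementation: `bag_of_chars_dist` is a hypothetical helper, and the choice of Manhattan distance over per-character counts is an assumption, made because it reproduces the values in the example output above.

```r
# Hypothetical helper, not part of the package: count how often each
# character occurs in every string, then compare the count vectors.
bag_of_chars_dist <- function(strings) {
  chars <- strsplit(strings, "")            # split each string into characters
  alphabet <- sort(unique(unlist(chars)))   # every character seen in any string
  # one row of character counts per string, one column per character
  counts <- do.call(rbind, lapply(chars, function(x) {
    as.vector(table(factor(x, levels = alphabet)))
  }))
  rownames(counts) <- strings
  # Assumption: Manhattan distance, i.e. the sum of absolute count differences
  dist(counts, method = "manhattan")
}

d <- bag_of_chars_dist(c("xxxx", "asdf", "asd", "dsa"))
print(d)
```

Because only the counts are compared, "asd" and "dsa" end up at distance 0, while "asdf" and "asd" differ by the single extra "f".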