Algorithms in Stringomics (I): Pattern-Matching against "Stringomes"
This paper reports an initial design of new data-structures that generalize the idea of pattern-matching in stringology, from its traditional usage on an (unstructured) set of strings to the arena of a well-structured family of strings. In particular, the object of interest is a family of strings composed of blocks/classes of highly similar "stringlets," which thus mimics a population of genomes made by concatenating haplotype-blocks, further constrained by haplotype-phasing. Such a family of strings, which we dub "stringomes," is formalized in terms of a multi-partite directed acyclic graph with a source and a sink. The most interesting property of stringomes is probably the fact that the