Data mining for haplotypes
represent haplotypes as regular expressions with
limited gaps, e.g. (-,-,-,0,0,*,-,-)
look for "overrepresented" haplotypes: those occurring
more than a fixed number of times in the cases (note:
this is not a statistical comparison, just a fixed threshold)
observe that any sub-haplotype of an
overrepresented haplotype (introducing more *'s)
is also overrepresented, since every case matching
the haplotype also matches the sub-haplotype; recurse
to find the maximal overrepresented haplotypes
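
A minimal sketch of the search described above, under simplifying assumptions that are not from the slide: cases are equal-length tuples of alleles, a pattern fixes alleles at some markers and uses a single wildcard '*' everywhere else, and the gap limit is ignored. The names support, maximal_overrepresented and render are hypothetical. Because of the sub-haplotype property, a pattern is only extended if it is already over the threshold.

```python
def support(pattern, cases):
    """Count the cases that match every (position, allele) pair in the pattern."""
    return sum(all(case[pos] == allele for pos, allele in pattern) for case in cases)


def render(pattern, n_markers):
    """Write a pattern as a tuple with '*' at every unconstrained marker."""
    alleles = dict(pattern)
    return tuple(alleles.get(pos, "*") for pos in range(n_markers))


def maximal_overrepresented(cases, threshold):
    """Return patterns matching more than `threshold` cases that cannot be
    extended by fixing another marker without dropping to the threshold or below.
    Assumption: one wildcard symbol, no limit on gap length."""
    n_markers = len(cases[0])
    # Level 1: single-marker patterns that are themselves overrepresented.
    items = sorted({(pos, case[pos]) for case in cases for pos in range(n_markers)})
    frequent_items = [it for it in items if support((it,), cases) > threshold]

    level = [(it,) for it in frequent_items]
    all_frequent = []
    while level:
        all_frequent.extend(level)
        next_level = []
        for pattern in level:
            last_pos = pattern[-1][0]
            # Only extend to the right, so each pattern is generated once.
            for item in frequent_items:
                if item[0] > last_pos:
                    candidate = pattern + (item,)
                    if support(candidate, cases) > threshold:
                        next_level.append(candidate)
        level = next_level

    # Keep only patterns that are not a proper sub-pattern of another one.
    frequent_sets = [frozenset(p) for p in all_frequent]
    maximal = [p for p in frequent_sets if not any(p < q for q in frequent_sets)]
    return [render(p, n_markers) for p in maximal]


cases = [
    (1, 0, 0, 1),
    (1, 0, 0, 0),
    (0, 0, 0, 1),
    (1, 1, 0, 1),
]
result = maximal_overrepresented(cases, threshold=2)
```

With these four toy cases and a threshold of 2, a pattern must match at least three cases; the maximal ones are (1,*,0,*), (*,0,0,*) and (*,*,0,1).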