A Chinese scientist has devised a new empirical rule for indexing DNA sequences. N-gram indexing is a well-known technique in computer science, and Chinese text is commonly indexed using n-grams. Wang Liang, a computer scientist at SOSO.com, one of China's search engines, found that the ideal n-gram length for indexing DNA "words" is 12 nucleotide characters (combinations of A, T, C and G).
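To make the idea concrete, here is a minimal sketch of n-gram indexing applied to a DNA string, assuming a plain in-memory dictionary that maps every overlapping 12-character window to the positions where it starts. The function name `build_ngram_index` and the sample sequence are hypothetical, chosen only for illustration; this is not Wang Liang's implementation, just the general technique the post describes.

```python
from collections import defaultdict


def build_ngram_index(sequence, n=12):
    """Map each overlapping n-gram in `sequence` to its start positions.

    Hypothetical helper for illustration; n=12 follows the length
    reported in the post.
    """
    index = defaultdict(list)
    # Slide a window of width n across the sequence, one base at a time.
    for start in range(len(sequence) - n + 1):
        ngram = sequence[start:start + n]
        index[ngram].append(start)
    return index


# Made-up sample sequence (16 bases), not real genomic data.
sample = "ATCGATCGATCGATCG"
index = build_ngram_index(sample)

# Look up every position where a given 12-mer occurs.
positions = index.get("ATCGATCGATCG", [])
print(positions)
```

With a window of 12 there are 4^12 (about 16.8 million) possible n-grams, so a real index over a genome would likely need a more compact structure than a Python dictionary; the sketch only shows the lookup principle.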