BItsliced Genomic Signature Index [bigsi]

BItsliced Genomic Signature Index [BIGSI] Docs

Welcome to the BIGSI documentation. You'll find comprehensive guides and documentation to help you start working with BIGSI as quickly as possible.

Please cite our paper if you use this tool in your research:


Constructing a BIGSI

1. Extract k-mers from your data

You can use any tool you like to extract unique k-mers from your raw data. We recommend mccortex, because its error-cleaning methods let you extract error-cleaned k-mers. However, you can also use k-mer counting software such as Jellyfish, or a custom script.

mccortex/bin/mccortex31 build -k 31 -s test1 -1 example-data/kmers.txt example-data/test1.ctx
mccortex/bin/mccortex31 build -k 31 -s test2 -1 example-data/kmers.txt example-data/test2.ctx
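
As an illustration of the "custom script" option, here is a minimal Python sketch that collects the distinct k-mers from plain DNA strings. The function name `unique_kmers` and the demo reads are hypothetical. Unlike mccortex, this sketch does no error cleaning and does not merge reverse complements.

```python
def unique_kmers(sequences, k=31):
    """Return the set of distinct k-mers found in an iterable of DNA strings."""
    kmers = set()
    for seq in sequences:
        seq = seq.strip().upper()
        # Slide a window of width k along the sequence.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows that contain ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                kmers.add(kmer)
    return kmers


if __name__ == "__main__":
    # Hypothetical toy reads; real input would come from your sequence files.
    reads = ["ACGTACGT", "CGTACGTT"]
    for kmer in sorted(unique_kmers(reads, k=5)):
        print(kmer)
```

Sequences shorter than k simply contribute no k-mers.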

2. Initialise the BIGSI

Choosing BIGSI parameters

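Decide on the parameters k, m and h before initialising the index: k is the k-mer length, m is the size of each Bloom filter in bits, and h is the number of hash functions.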

bigsi init test-bigsi --k 31 --m 1000 --h 1
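
One way to reason about m and h is the textbook Bloom filter false-positive estimate, (1 - exp(-h*n/m))^h, where n is the number of distinct k-mers stored in one sample's filter. The sketch below evaluates that estimate. The helper name and the k-mer counts in the demo are illustrative assumptions, not recommendations from the BIGSI authors.

```python
import math


def bloom_false_positive_rate(n_kmers, m, h):
    """Approximate per-k-mer false-positive rate of one Bloom filter.

    n_kmers: number of distinct k-mers inserted into the filter
    m:       filter size in bits
    h:       number of hash functions
    """
    return (1.0 - math.exp(-h * n_kmers / m)) ** h


if __name__ == "__main__":
    # m=1000 and h=1 match the toy `bigsi init` command above;
    # the k-mer counts are made up for illustration.
    for n in (10, 100, 1000):
        rate = bloom_false_positive_rate(n, m=1000, h=1)
        print(f"n={n:5d}  m=1000  h=1  estimated FPR={rate:.4f}")
```

The estimate grows quickly once n approaches m, so m should be chosen with the k-mer count of the largest samples in mind.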

3. Construct the Bloom filters

bigsi bloom --db test-bigsi -c example-data/test1.ctx example-data/test1.bloom
bigsi bloom --db test-bigsi -c example-data/test2.ctx example-data/test2.bloom
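
Conceptually, this step turns each sample's k-mer set into a bit vector of length m by setting h hashed positions per k-mer. The sketch below shows the idea only. It uses seeded SHA-256 from the standard library as a stand-in hash, which is an assumption and not necessarily the hash BIGSI uses, and all function names are hypothetical.

```python
import hashlib


def bit_positions(kmer, m, h):
    """Map a k-mer to h positions in [0, m) using seeded SHA-256 (illustrative)."""
    positions = []
    for seed in range(h):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m)
    return positions


def build_bloom(kmers, m, h):
    """Return a Bloom filter as a bytearray with one byte per bit, for readability."""
    bits = bytearray(m)
    for kmer in kmers:
        for pos in bit_positions(kmer, m, h):
            bits[pos] = 1
    return bits


def might_contain(bits, kmer, h):
    """True if every hashed position is set (false positives are possible)."""
    return all(bits[pos] for pos in bit_positions(kmer, len(bits), h))


if __name__ == "__main__":
    bloom = build_bloom({"ACGTA", "CGTAC"}, m=1000, h=1)
    print(might_contain(bloom, "ACGTA", h=1))
```

A k-mer that was inserted always tests positive. A k-mer that was not inserted usually tests negative, but can collide with bits set by other k-mers.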

4. Insert the Bloom filters into the index

bigsi build test-bigsi example-data/test1.bloom example-data/test2.bloom -s s1 -s s2
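
The "bitsliced" part of the name refers to storing the per-sample Bloom filters as columns of one bit matrix, so that each row holds one bit position across all samples. The following sketch transposes a list of filters into such rows. It is a simplified in-memory illustration under that assumption, not BIGSI's storage format, and `build_rows` is a hypothetical name.

```python
def build_rows(filters):
    """Transpose per-sample Bloom filters (columns) into per-bit rows.

    filters: list of equal-length bit sequences, one per sample.
    Returns a list of ints; bit j of rows[i] is set iff sample j has bit i set.
    """
    m = len(filters[0])
    if any(len(f) != m for f in filters):
        raise ValueError("all Bloom filters must have the same length m")
    rows = [0] * m
    for j, bits in enumerate(filters):
        for i, bit in enumerate(bits):
            if bit:
                rows[i] |= 1 << j
    return rows


if __name__ == "__main__":
    s1 = [1, 0, 1, 0]  # toy filter for sample s1 (m = 4)
    s2 = [0, 0, 1, 1]  # toy filter for sample s2
    for i, row in enumerate(build_rows([s1, s2])):
        print(i, format(row, "02b"))
```

Because every filter must contribute to the same rows, all samples need to share the same m and h, which is why these are fixed when the index is initialised.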

5. Query the index

bigsi search -o tsv --db test-bigsi -s CGGCGAGGAAGCGTTAAATCTCTTTCTGACG
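
To illustrate what an exact-match query does against a row-oriented index, the sketch below hashes each k-mer of the query to its rows and ANDs those rows together. Any sample whose bit survives contains, up to Bloom filter false positives, every k-mer of the query. The hash, the in-memory layout and all names are assumptions made for this example, not BIGSI internals.

```python
import hashlib


def bit_positions(kmer, m, h):
    """Map a k-mer to h positions in [0, m) using seeded SHA-256 (illustrative)."""
    positions = []
    for seed in range(h):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m)
    return positions


def index_samples(samples, k, m, h):
    """Build rows from {name: sequence}; bit j of each row belongs to sample j."""
    names = list(samples)
    rows = [0] * m
    for j, name in enumerate(names):
        seq = samples[name]
        for i in range(len(seq) - k + 1):
            for pos in bit_positions(seq[i:i + k], m, h):
                rows[pos] |= 1 << j
    return rows, names


def search(rows, names, query, k, h):
    """Return the samples whose filters contain every k-mer of the query."""
    if len(query) < k:
        raise ValueError("query must be at least k bases long")
    hits = (1 << len(names)) - 1  # start with every sample as a candidate
    for i in range(len(query) - k + 1):
        for pos in bit_positions(query[i:i + k], len(rows), h):
            hits &= rows[pos]  # keep only samples with this bit set
    return [name for j, name in enumerate(names) if (hits >> j) & 1]


if __name__ == "__main__":
    # Hypothetical toy sequences, indexed with small parameters.
    rows, names = index_samples(
        {"s1": "ACGTACGTAC", "s2": "TTTTTTTTTT"}, k=5, m=1000, h=2
    )
    print(search(rows, names, "CGTACG", k=5, h=2))
```

The cost of a query depends on the number of k-mers in the query and on h, since only the rows addressed by the query's hashes are read.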

If you installed BIGSI with Docker, run the same steps inside the container:

docker run -v $PWD/example-data:/data phelimb/bigsi mccortex/bin/mccortex31 build -k 31 -s test1 -1 /data/kmers.txt /data/test1.ctx
docker run -v $PWD/example-data:/data phelimb/bigsi mccortex/bin/mccortex31 build -k 31 -s test2 -1 /data/kmers.txt /data/test2.ctx

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi init /data/test.bigsi --k 31 --m 1000 --h 1

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi bloom --db /data/test.bigsi -c /data/test1.ctx /data/test1.bloom
docker run -v $PWD/example-data:/data phelimb/bigsi bigsi bloom --db /data/test.bigsi -c /data/test2.ctx /data/test2.bloom

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi build /data/test.bigsi /data/test1.bloom /data/test2.bloom -s s1 -s s2
docker run -v $PWD/example-data:/data phelimb/bigsi bigsi search -o tsv --db /data/test.bigsi -s CGGCGAGGAAGCGTTAAATCTCTTTCTGACG