BItsliced Genomic Signature Index [bigsi]

BItsliced Genomic Signature Index [BIGSI] Docs

Welcome to the BIGSI documentation. You'll find comprehensive guides and documentation to help you start working with BIGSI as quickly as possible.

A BIGSI (BItsliced Genomic Signature Index) allows efficient indexing and search in very large collections of whole-genome sequencing (WGS) data, in particular bacterial or viral data sets. BIGSI can index and query raw or assembled data.

A prebuilt index is available for download, and a hosted demo is also available.

Please cite our paper if you use this tool in your research:
'Ultra-fast search of all deposited bacterial and viral genomic data'

Constructing a BIGSI

1. Extract k-mers from your data

You can use any tool you like to extract unique k-mers from your raw data. We recommend mccortex, as its error-cleaning methods let you extract error-cleaned k-mers. However, you can also use k-mer counting software such as Jellyfish, or a custom script.

mccortex/bin/mccortex31 build -k 31 -s test1 -1 example-data/kmers.txt example-data/test1.ctx
mccortex/bin/mccortex31 build -k 31 -s test2 -1 example-data/kmers.txt example-data/test2.ctx
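For the custom-script route mentioned above, the following is a minimal sketch of unique k-mer extraction in Python. It is not part of BIGSI or mccortex: the function name `unique_kmers` is hypothetical, it assumes plain DNA strings as input, and it performs no error cleaning, so it is only a reasonable substitute for already-assembled or otherwise clean sequence.

```python
def unique_kmers(seq: str, k: int = 31) -> set:
    """Return the set of distinct k-mers in seq.

    Windows containing anything other than A, C, G or T (for example N)
    are skipped. No error cleaning is attempted.
    """
    seq = seq.upper()
    valid = set("ACGT")
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            kmers.add(kmer)
    return kmers


if __name__ == "__main__":
    # Write one k-mer per line, a common plain-text layout for k-mer lists.
    for kmer in sorted(unique_kmers("ACGTACGTTAGC", k=5)):
        print(kmer)
```

A set is used so that repeated k-mers collapse automatically; for raw reads, a k-mer counter with an abundance threshold is usually a better fit than this sketch.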

2. Create the BIGSI config files

Choosing BIGSI parameters

Decide on the parameters k (k-mer length), m (Bloom filter size in bits) and h (number of hash functions) before writing a config.
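One way to reason about m and h is the standard Bloom filter approximation for the per-k-mer false positive rate, p = (1 - e^(-h*n/m))^h, where n is the number of distinct k-mers stored per sample. The sketch below simply evaluates that formula; the function name and the example value of n are illustrative assumptions, not values recommended by BIGSI.

```python
import math


def bloom_false_positive_rate(n_kmers: int, m: int, h: int) -> float:
    """Approximate false positive rate of a Bloom filter.

    n_kmers: number of distinct k-mers inserted
    m:       filter size in bits
    h:       number of hash functions
    """
    return (1.0 - math.exp(-h * n_kmers / m)) ** h


if __name__ == "__main__":
    # Hypothetical sample with 5 million distinct k-mers, using the
    # m and h values from the example configs.
    rate = bloom_false_positive_rate(n_kmers=5_000_000, m=28_000_000, h=1)
    print(f"per-k-mer false positive rate: {rate:.3f}")
```

Larger m lowers the rate at the cost of a bigger index, so it helps to evaluate the formula against the largest sample you expect to index.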

Below are three example configs to get you started with your preferred key-value store: Berkeley DB, RocksDB, or Redis.

## Example config using berkeleyDB
h: 1
k: 31
m: 28000000
storage-engine: berkeleydb
storage-config:
  filename: test-berkeleydb
  flag: "c" ## Change to 'r' for read-only access

## Example config using redis
h: 1
k: 31
m: 28000000
storage-engine: redis
storage-config:
  host: localhost
  port: 6379

## Example config using rocksdb
h: 1
k: 31
m: 28000000
nproc: 4
storage-engine: rocksdb
storage-config:
  filename: test-rocksdb
  options:
    create_if_missing: true
    max_open_files: 5000
  read_only: false ## Change to true for read-only access

3. Construct the bloom filters

export BIGSI_CONFIG=example-data/configs/rocks.yaml ## set the config path, or use --config

bigsi bloom example-data/test1.ctx example-data/test1.bloom
bigsi bloom example-data/test2.ctx example-data/test2.bloom
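Conceptually, each `bigsi bloom` call turns one sample's k-mers into a fixed-length bit vector. The sketch below illustrates the idea only: the salted SHA-256 hashing and the function names are assumptions made for this example and are not BIGSI's actual hash functions or file format.

```python
import hashlib


def bloom_positions(kmer: str, m: int, h: int) -> list:
    """Map a k-mer to h bit positions in [0, m).

    Salted SHA-256 is an illustrative choice, not BIGSI's hash.
    """
    positions = []
    for seed in range(h):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % m)
    return positions


def build_bloom(kmers, m: int, h: int) -> bytearray:
    """Set the bits for every k-mer in a filter of m bits."""
    bits = bytearray((m + 7) // 8)
    for kmer in kmers:
        for pos in bloom_positions(kmer, m, h):
            bits[pos // 8] |= 1 << (pos % 8)
    return bits


def bloom_contains(bits: bytearray, kmer: str, m: int, h: int) -> bool:
    """True if all h bits for the k-mer are set (may be a false positive)."""
    return all(bits[pos // 8] & (1 << (pos % 8))
               for pos in bloom_positions(kmer, m, h))


if __name__ == "__main__":
    bloom = build_bloom({"ACGTA", "CGTAC"}, m=1000, h=1)
    print(bloom_contains(bloom, "ACGTA", m=1000, h=1))
```

A Bloom filter never reports an inserted k-mer as absent, but it can report an absent k-mer as present, which is why the choice of m and h matters.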

4. Insert the bloom filters into the index

bigsi build example-data/test1.bloom example-data/test2.bloom -s s1 -s s2
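The "bitsliced" part of the name can be pictured as a matrix with one Bloom filter per column (sample) and one row per bit position, so that a k-mer lookup reads h rows and combines them with a bitwise AND. The toy class below is an in-memory sketch of that idea under stated assumptions: `ToyBitslicedIndex`, its methods, and the SHA-256 hashing are all hypothetical and do not reflect BIGSI's storage layout or API.

```python
import hashlib


def bit_positions(kmer: str, m: int, h: int) -> list:
    """Illustrative hashing: h salted SHA-256 positions in [0, m)."""
    return [
        int.from_bytes(hashlib.sha256(f"{seed}:{kmer}".encode()).digest()[:8], "big") % m
        for seed in range(h)
    ]


def kmers_of(seq: str, k: int) -> list:
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class ToyBitslicedIndex:
    """Rows are Python ints; bit j of row r is set if sample j has Bloom bit r set."""

    def __init__(self, m: int, h: int, k: int):
        self.m, self.h, self.k = m, h, k
        self.samples = []
        self.rows = [0] * m

    def insert(self, sample: str, seq: str) -> None:
        """Add one sample as a new column."""
        column = len(self.samples)
        self.samples.append(sample)
        for kmer in kmers_of(seq, self.k):
            for row in bit_positions(kmer, self.m, self.h):
                self.rows[row] |= 1 << column

    def search(self, query: str) -> list:
        """Return samples whose filters contain every k-mer of the query."""
        if len(query) < self.k:
            raise ValueError("query must be at least k bases long")
        hits = (1 << len(self.samples)) - 1  # start with every sample set
        for kmer in kmers_of(query, self.k):
            for row in bit_positions(kmer, self.m, self.h):
                hits &= self.rows[row]
        return [s for j, s in enumerate(self.samples) if (hits >> j) & 1]


if __name__ == "__main__":
    index = ToyBitslicedIndex(m=10007, h=1, k=5)
    index.insert("s1", "ACGTACGTAC")
    index.insert("s2", "TTTTTGGGGG")
    print(index.search("ACGTACG"))
```

Because each row covers all samples at once, the cost of a lookup grows with the number of k-mers in the query rather than with a scan over every sample's filter. Results can include false positives but never miss a sample that truly contains all the query k-mers.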

5. Query the index

bigsi search CGGCGAGGAAGCGTTAAATCTCTTTCTGACG

If you've installed with Docker, the equivalent commands are:

docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi bloom --config /data/configs/berkeleydb.yaml /data/test1.ctx /data/test1.bloom
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi bloom --config /data/configs/berkeleydb.yaml /data/test2.ctx /data/test2.bloom

docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi build --config /data/configs/berkeleydb.yaml /data/test1.bloom /data/test2.bloom -s s1 -s s2
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi search --config /data/configs/berkeleydb.yaml CGGCGAGGAAGCGTTAAATCTCTTTCTGACG
