Constructing a BIGSI
1. Extract k-mers from your data
You can use any tool you like to extract unique k-mers from your raw data. We recommend mccortex, because its error-cleaning methods let you extract error-cleaned k-mers. Alternatively, you can use a k-mer counter such as Jellyfish, or a custom script.
mccortex/bin/mccortex31 build -k 31 -s test1 -1 example-data/kmers.txt example-data/test1.ctx
mccortex/bin/mccortex31 build -k 31 -s test2 -1 example-data/kmers.txt example-data/test2.ctx
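If you take the custom-script route instead, a minimal Python sketch might look like the following. It assumes FASTA input and collects canonical k-mers (the lexicographically smaller of each k-mer and its reverse complement). It does no error cleaning, and the function names are illustrative rather than part of BIGSI or mccortex.

```python
from typing import Iterable, Iterator, List, Set

# Translation table for complementing uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield each FASTA record's sequence as a single uppercase string."""
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line.upper())
    if chunks:
        yield "".join(chunks)


def unique_kmers(sequences: Iterable[str], k: int = 31) -> Set[str]:
    """Collect unique canonical k-mers, skipping any that contain non-ACGT characters."""
    valid = set("ACGT")
    kmers: Set[str] = set()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:
                kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers
```

To use it, open a FASTA file, pass the file handle to `read_fasta`, and write the set returned by `unique_kmers` one k-mer per line. Check that this output format suits whichever tool consumes it next.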
2. Create the BIGSI config files
Choosing BIGSI parameters
See https://bigsi.readme.io/docs/choosing-bigsi-parameters to decide on parameters k, m and h.
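As a rough sanity check alongside that guide, the textbook Bloom filter approximation puts the false-positive rate of a single k-mer lookup at p ≈ (1 - e^(-hn/m))^h, where n is the number of unique k-mers in one sample. The sketch below applies that general formula. It is not taken from BIGSI's code, and the sample size in the example is made up.

```python
import math


def bloom_false_positive_rate(m: int, h: int, n: int) -> float:
    """Approximate per-k-mer false-positive rate for m bits, h hash functions, n inserted k-mers."""
    return (1.0 - math.exp(-h * n / m)) ** h


if __name__ == "__main__":
    # m and h match the example configs; n = 5 million k-mers is a hypothetical sample size.
    rate = bloom_false_positive_rate(m=28_000_000, h=1, n=5_000_000)
    print(f"approximate per-k-mer false-positive rate: {rate:.4f}")
```

This is the rate for one k-mer in one sample's filter. A query made of many k-mers only matches a sample if all of them are reported present, so the rate per query is usually lower.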
Below are three example configs to get you started with your preferred key-value store: BerkeleyDB, RocksDB, or Redis.
## Example config using berkeleyDB
h: 1
k: 31
m: 28000000
storage-engine: berkeleydb
storage-config:
  filename: test-berkeleydb
  flag: "c" ## Change to 'r' for read-only access
## Example config using redis
h: 1
k: 31
m: 28000000
storage-engine: redis
storage-config:
  host: localhost
  port: 6379
## Example config using rocksdb
h: 1
k: 31
m: 28000000
nproc: 4
storage-engine: rocksdb
storage-config:
  filename: test-rocksdb
  options:
    create_if_missing: true
    max_open_files: 5000
  read_only: false ## Change to true for read-only access
3. Construct the bloom filters
export BIGSI_CONFIG=example-data/configs/rocks.yaml ## set the config path, or use --config
bigsi bloom example-data/test1.ctx example-data/test1.bloom
bigsi bloom example-data/test2.ctx example-data/test2.bloom
4. Insert the bloom filters into the index
bigsi build example-data/test1.bloom example-data/test2.bloom -s s1 -s s2
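With more than a handful of samples it can help to generate the commands from steps 3 and 4 rather than type them. The sketch below only builds argument lists for the `bigsi bloom` and `bigsi build` invocations shown above. The helper name is hypothetical, and it assumes one `<sample>.ctx` file per sample in a single directory, with the file stem reused as the `-s` sample name.

```python
import shlex
from pathlib import Path
from typing import List, Sequence


def bigsi_commands(ctx_dir: str, samples: Sequence[str]) -> List[List[str]]:
    """Return one `bigsi bloom` command per sample, followed by a single `bigsi build` command."""
    base = Path(ctx_dir)
    commands: List[List[str]] = []
    blooms: List[str] = []
    for sample in samples:
        ctx = (base / f"{sample}.ctx").as_posix()
        bloom = (base / f"{sample}.bloom").as_posix()
        commands.append(["bigsi", "bloom", ctx, bloom])
        blooms.append(bloom)
    # The build step lists every bloom filter first, then one -s flag per sample, in the same order.
    build = ["bigsi", "build", *blooms]
    for sample in samples:
        build += ["-s", sample]
    commands.append(build)
    return commands


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for cmd in bigsi_commands("example-data", ["test1", "test2"]):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once `BIGSI_CONFIG` is set as in step 3.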
5. Query the index
bigsi search CGGCGAGGAAGCGTTAAATCTCTTTCTGACG
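Conceptually, the index stores each sample's Bloom filter as one column of a bit matrix, and a query ANDs together the rows selected by hashing each query k-mer. The toy sketch below illustrates only that idea. The hash function, the in-memory layout, and all names are assumptions, and it does not mirror BIGSI's code or storage format.

```python
import hashlib
from typing import Dict, List, Sequence, Set


def bit_positions(kmer: str, m: int, h: int) -> List[int]:
    """Map a k-mer to h positions in [0, m) using salted SHA-256 as a stand-in hash."""
    return [
        int.from_bytes(hashlib.sha256(f"{seed}:{kmer}".encode()).digest()[:8], "big") % m
        for seed in range(h)
    ]


def build_index(samples: Dict[str, Set[str]], m: int, h: int) -> List[int]:
    """Return m rows as integers; bit j of row i is set if sample j's filter has bit i set."""
    rows = [0] * m
    for column, kmers in enumerate(samples.values()):
        for kmer in kmers:
            for pos in bit_positions(kmer, m, h):
                rows[pos] |= 1 << column
    return rows


def search(rows: List[int], names: Sequence[str], kmers: Sequence[str], m: int, h: int) -> List[str]:
    """Return the samples whose filters report every query k-mer as present."""
    hits = (1 << len(names)) - 1  # start with every sample as a candidate
    for kmer in kmers:
        for pos in bit_positions(kmer, m, h):
            hits &= rows[pos]
    return [name for j, name in enumerate(names) if (hits >> j) & 1]


if __name__ == "__main__":
    samples = {"s1": {"ACG", "CGT"}, "s2": {"ACG"}}
    rows = build_index(samples, m=1024, h=2)
    print(search(rows, list(samples), ["ACG"], m=1024, h=2))
```

As with any Bloom filter, a sample that contains the k-mers is always reported, while a sample that lacks them can occasionally be reported as a false positive.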
If you've installed with docker
PWD=`pwd`
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi bloom --config /data/configs/berkeleydb.yaml /data/test1.ctx /data/test1.bloom
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi bloom --config /data/configs/berkeleydb.yaml /data/test2.ctx /data/test2.bloom
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi build --config /data/configs/berkeleydb.yaml /data/test1.bloom /data/test2.bloom -s s1 -s s2
docker run -v $PWD/example-data:/data phelimb/bigsi:63768c2 bigsi search --config /data/configs/berkeleydb.yaml CGGCGAGGAAGCGTTAAATCTCTTTCTGACG