Extracting GenBank files from session files using extract_clusters

A common next step after a cblaster search is to retrieve the identified gene clusters for further analysis. cblaster provides the extract_clusters module for this purpose: it generates GenBank files of specific gene clusters directly from a session file. This works for sessions from both remote and local searches. For remote searches, clusters are downloaded directly from the NCBI; for local searches, they are retrieved from the SQL database generated using the makedb module.

Example usage

Extract all clusters from a session (can take a long time for remote searches with many results):

$ cblaster extract_clusters session.json -o example_directory

Extract clusters 1-10 and cluster 25 (cluster numbers are listed in the summary file produced by the ‘search’ command):

$ cblaster extract_clusters session.json -c 1-10 25 -o example_directory
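
To make the mixed range and single-number syntax concrete, here is a minimal Python sketch of how a selection such as "1-10 25" could be expanded into individual cluster numbers. The function name expand_cluster_numbers is hypothetical and this is not cblaster's own parsing code; it only illustrates the selection that the command above describes.

```python
def expand_cluster_numbers(tokens):
    """Expand tokens such as ["1-10", "25"] into a sorted list of numbers.

    Hypothetical helper for illustration; not part of cblaster.
    """
    numbers = set()
    for token in tokens:
        if "-" in token:
            # A token like "1-10" is treated as an inclusive range.
            start, end = (int(part) for part in token.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {token}")
            numbers.update(range(start, end + 1))
        else:
            # A bare token like "25" selects a single cluster.
            numbers.add(int(token))
    return sorted(numbers)


selected = expand_cluster_numbers(["1-10", "25"])
print(selected)
```

Under these assumptions, the selection covers eleven clusters: numbers 1 through 10, plus 25.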

Extract clusters only from organisms whose names match one or more regular expressions:

$ cblaster extract_clusters session.json -or "Aspergillus.*" "Penicillium.*" -o example_directory
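
Before running an extraction, it can help to check which organism names a set of patterns would select. The sketch below uses Python's re module and assumes the patterns are applied with re.search, so a match may start anywhere in the name; whether cblaster anchors its patterns is not stated here, so treat that as an assumption. The function name filter_organisms and the example organism names are illustrative only.

```python
import re


def filter_organisms(names, patterns):
    """Return the names matched by at least one regular expression.

    Hypothetical helper; assumes unanchored matching via re.search.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    return [name for name in names if any(rx.search(name) for rx in compiled)]


# Illustrative organism names, not taken from a real session file.
names = [
    "Aspergillus niger CBS 513.88",
    "Penicillium rubens Wisconsin 54-1255",
    "Streptomyces coelicolor A3(2)",
]

kept = filter_organisms(names, ["Aspergillus.*", "Penicillium.*"])
print(kept)
```

With these inputs, the two fungal names are kept and the Streptomyces entry is dropped.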

Extract clusters only from a specific range on scaffold_123 and all clusters on scaffold_234 (note: expects unique scaffold names):

$ cblaster extract_clusters session.json -sc scaffold_123:1-80000 scaffold_234 -o example_directory
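
The scaffold syntax combines a name with an optional start-end range. The following Python sketch shows one way such a specification could be interpreted. The names parse_scaffold and cluster_selected are hypothetical, and the rule that a cluster must lie entirely inside the requested range is an assumption made for this sketch; cblaster's actual behaviour (for example, whether partial overlap counts) may differ.

```python
def parse_scaffold(spec):
    """Split "name:start-end" into (name, start, end).

    A bare name yields (name, None, None), meaning the whole scaffold.
    Hypothetical helper for illustration; not part of cblaster.
    """
    name, separator, span = spec.rpartition(":")
    if not separator:
        return spec, None, None
    start, end = (int(part) for part in span.split("-", 1))
    return name, start, end


def cluster_selected(scaffold, start, end, specs):
    """Check a cluster's location against a list of scaffold specifications.

    Assumption: a cluster is selected only if it lies fully inside the range.
    """
    for spec in specs:
        name, range_start, range_end = parse_scaffold(spec)
        if name != scaffold:
            continue
        if range_start is None:
            return True  # no range given: every cluster on this scaffold
        if range_start <= start and end <= range_end:
            return True
    return False


specs = ["scaffold_123:1-80000", "scaffold_234"]
print(parse_scaffold(specs[0]))
print(cluster_selected("scaffold_123", 5000, 42000, specs))
print(cluster_selected("scaffold_123", 90000, 120000, specs))
print(cluster_selected("scaffold_234", 90000, 120000, specs))
```

Because a name is matched exactly in this sketch, two scaffolds sharing the same name would be indistinguishable, which is one reason unique scaffold names matter.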