Type: Article
Published: 2026-04-21
Page range: 457-468

BlasTax—a user-friendly stand-alone tool to leverage the BLAST+ program for molecular taxonomy

Zoologisches Institut, Technische Universität Braunschweig, Mendelssohnstr. 4, 38106 Braunschweig, Germany
School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
Centre for Molecular Biodiversity Research, Leibniz Institute for the Analysis of Biodiversity Change (LIB), Museum of Nature, Martin-Luther-King-Platz 3, 20146 Hamburg, Germany, Departamento de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales (MNCN-CSIC), José Gutiérrez Abascal 2, 28006 Madrid, Spain
Departamento de Biodiversidad y Biología Evolutiva, Museo Nacional de Ciencias Naturales (MNCN-CSIC), José Gutiérrez Abascal 2, 28006 Madrid, Spain
Institut für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, 38106 Braunschweig, Germany
Zoologisches Institut, Technische Universität Braunschweig, Mendelssohnstr. 4, 38106 Braunschweig, Germany
Institut für Biochemie und Biologie, Universität Potsdam, Karl-Liebknecht-Str. 24–25, 14476 Potsdam, Germany
Natural History Museum Denmark, University of Copenhagen, Universitetsparken 15, 2100, Copenhagen Ø, Denmark
Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, 57 rue Cuvier, 75231 Paris cedex, France
Keywords: BLAST, Museomics, Phylogenomics, DNA metabarcoding

Abstract

We introduce BlasTax, a standalone software tool wrapping the BLAST algorithm for finding regions of similarity between nucleotide and amino acid sequences. BlasTax is designed to serve both general users of local BLAST who seek a simple, user-friendly interface and taxonomists engaged in phylogenomics and museomics projects. It is driven by a graphical user interface that makes various BLAST functions accessible without separately installing the BLAST+ executables. It introduces several advanced modes: retrieving matching reads from FASTQ files produced by high-throughput sequencing of archival DNA from recent or historical collection material, appending matching sequences to existing alignments, and removing sequences of non-target taxa from sequence data sets. The program also includes functions for preparing sequence files to be used as reference or query for BLAST, as well as utilities for merging sequences based on species labels, codon trimming, and codon-aware multiple sequence alignment.
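To make the read-retrieval mode more concrete, the following minimal Python sketch shows the general idea of keeping only those FASTQ reads that produced a BLAST hit against a reference. It is not BlasTax's implementation: the function names `matching_read_ids` and `filter_fastq`, the identity and length thresholds, and the sample data are all hypothetical. The sketch assumes the reads have already been searched with a local BLAST+ program and that the hits were written in the standard 12-column tabular format (`-outfmt 6`), where column 1 is the query ID, column 3 the percent identity, and column 4 the alignment length.

```python
import io


def matching_read_ids(blast_tsv, min_identity=90.0, min_length=50):
    """Collect query IDs from BLAST tabular output (-outfmt 6).

    The thresholds are illustrative assumptions, not BlasTax defaults.
    """
    ids = set()
    for line in blast_tsv:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        length = int(fields[3])
        if pident >= min_identity and length >= min_length:
            ids.add(qseqid)
    return ids


def filter_fastq(fastq, keep_ids):
    """Yield (header, sequence, separator, quality) tuples for reads in keep_ids.

    Assumes plain four-line FASTQ records (no wrapped sequence lines).
    """
    records = iter(fastq)
    for header in records:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(records, "")
        plus = next(records, "")
        qual = next(records, "")
        # The read ID is the first whitespace-delimited token after '@'.
        read_id = header[1:].split()[0]
        if read_id in keep_ids:
            yield tuple(part.rstrip("\n") for part in (header, seq, plus, qual))


if __name__ == "__main__":
    # Hypothetical BLAST hits: read1 passes both thresholds, read3 does not.
    hits = io.StringIO(
        "read1\tref_COI\t98.5\t120\t2\t0\t1\t120\t10\t129\t1e-50\t210\n"
        "read3\tref_COI\t82.0\t60\t11\t0\t1\t60\t40\t99\t1e-05\t55\n"
    )
    reads = io.StringIO(
        "@read1 sample=A\nACGT\n+\nIIII\n"
        "@read2 sample=A\nGGCC\n+\nIIII\n"
        "@read3 sample=A\nTTAA\n+\nIIII\n"
    )
    keep = matching_read_ids(hits)
    for record in filter_fastq(reads, keep):
        print("\n".join(record))
```

The same pattern, inverted to discard rather than keep reads with hits, would illustrate the decontamination mode, again only as an assumption about the general approach.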
