BioJava is an open source project dedicated to providing Java tools for processing biological data. It is a set of library functions, written in the Java programming language, for manipulating sequences and protein structures; it also provides file parsers, CORBA interoperability, DAS support, access to AceDB, dynamic programming, and simple statistical routines. BioJava supports a wide range of data, from DNA and protein sequences up to 3D protein structures. The libraries are useful for automating many routine bioinformatics tasks, such as parsing a PDB file or interacting with Jmol. The application programming interface (API) provides file parsers, data models and algorithms that facilitate working with standard data formats and enable rapid application development and analysis. The libraries have also been used in the development of various extended analysis tools [1], for example:
- MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets.
- JEnsembl: a version-aware Java API to Ensembl data systems.
- Expression profiling of signature gene sets with trinucleotide threading.
- Resolving the structural features of genomic islands: a machine learning approach.
- Utility library for structural bioinformatics.
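To illustrate the kind of routine task mentioned above (parsing a PDB file), the following is a minimal plain-Java sketch. It deliberately does not use BioJava's own classes, so it should not be read as the library's API. The names `PdbAtomSketch`, `Atom`, `parseAtomLine`, `parse` and `centroid` are hypothetical, and the two ATOM records in `SAMPLE` are made-up values laid out in the fixed-column PDB format (serial in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, x/y/z in 31-54).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Plain-Java sketch of reading ATOM records; it does NOT use BioJava's API. */
class PdbAtomSketch {

    /** Made-up example data (not taken from a real structure). */
    static final String SAMPLE = String.join("\n",
        "REMARK made-up example records",
        "ATOM      1  N   MET A   1       1.000   2.000   3.000",
        "ATOM      2  CA  MET A   1       3.000   4.000   5.000",
        "END");

    /** Minimal holder for the fields read from one ATOM/HETATM record. */
    static final class Atom {
        final int serial;
        final String name;
        final String residue;
        final char chain;
        final double x, y, z;

        Atom(int serial, String name, String residue, char chain,
             double x, double y, double z) {
            this.serial = serial;
            this.name = name;
            this.residue = residue;
            this.chain = chain;
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Parses one fixed-width ATOM or HETATM line; returns null for any other line. */
    static Atom parseAtomLine(String line) {
        boolean isAtom = line.startsWith("ATOM  ") || line.startsWith("HETATM");
        if (!isAtom || line.length() < 54) {
            return null;
        }
        // substring indices are 0-based and end-exclusive, hence the offsets
        // relative to the 1-based column numbers of the PDB format.
        int serial = Integer.parseInt(line.substring(6, 11).trim());
        String name = line.substring(12, 16).trim();
        String residue = line.substring(17, 20).trim();
        char chain = line.charAt(21);
        double x = Double.parseDouble(line.substring(30, 38).trim());
        double y = Double.parseDouble(line.substring(38, 46).trim());
        double z = Double.parseDouble(line.substring(46, 54).trim());
        return new Atom(serial, name, residue, chain, x, y, z);
    }

    /** Collects every ATOM/HETATM record found in the given PDB text. */
    static List<Atom> parse(String pdbText) {
        List<Atom> atoms = new ArrayList<>();
        for (String line : pdbText.split("\\R")) {
            Atom atom = parseAtomLine(line);
            if (atom != null) {
                atoms.add(atom);
            }
        }
        return atoms;
    }

    /** Returns the unweighted mean position {x, y, z} of the atoms. */
    static double[] centroid(List<Atom> atoms) {
        if (atoms.isEmpty()) {
            throw new IllegalArgumentException("no atoms");
        }
        double sx = 0, sy = 0, sz = 0;
        for (Atom a : atoms) {
            sx += a.x;
            sy += a.y;
            sz += a.z;
        }
        int n = atoms.size();
        return new double[] { sx / n, sy / n, sz / n };
    }

    public static void main(String[] args) {
        List<Atom> atoms = parse(SAMPLE);
        double[] c = centroid(atoms);
        System.out.println(String.format(Locale.ROOT,
            "%d atoms, centroid = (%.3f, %.3f, %.3f)",
            atoms.size(), c[0], c[1], c[2]));
    }
}
```

A real parser has to handle far more than this sketch does (alternate locations, multiple models, insertion codes, other record types), which is the sort of detail a dedicated library is meant to take care of.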
The BioJava project grew out of work by Thomas Down and Matthew Pocock to create an API that would simplify the development of Java-based bioinformatics tools. It is an active open source project that has been developed over more than 12 years by more than 60 developers. BioJava is one of a number of Bio* projects designed to reduce code duplication; others include Biopython, BioPerl, BioRuby and EMBOSS.
The latest version of BioJava (3.0.5) is a major update to the previous versions and is organized into several independent modules. The old code base has been moved to a separate project called biojava-legacy.