About Bioconductor

The mission of the Bioconductor project is to develop, support, and disseminate free open source software that facilitates rigorous and reproducible analysis of data from current and emerging biological assays. We are dedicated to building a diverse, collaborative, and welcoming community of developers and data scientists.

Scientific, Technical and Community Advisory Boards provide project oversight.

Release and Core Development

The Bioconductor release version is updated twice each year, and is appropriate for most users. There is also a development version, to which new features and packages are added prior to incorporation in the release. A large number of meta-data packages provide pathway, organism, microarray and other annotations.

The Bioconductorproject started in 2001 and is overseen by a core team. A Community Advisory Board and a Technical Advisory Board of key participants meets monthly to support the Bioconductor mission by coordinating training and outreach activities, developing strategies to ensure long-term technical suitability of core infrastructure, and to identify and enable funding strategies for long-term viability. A Scientific Advisory Board including external experts provides annual guidance and accountability.

Key citations to the project include Huber et al., 2015 Nature Methods 12:115-121 and Gentleman et al., 2004 Genome Biology 5:R80

Start Using Bioconductor

Join our ever-growing community and discover how Bioconductor can improve your pipeline

Image depicting boxes and lines

Contribute to Bioconductor

Bioconductor is open source! Join our community of developers and develop your package

Image depicting Bioconductor hexagon logos

Bioconductor Packages

Most Bioconductor components are distributed as R packages. The functional scope of Bioconductor packages includes the analysis of DNA microarray, sequence, flow, SNP, and other genetic or genomic data.

Project Goals

The broad goals of the Bioconductor project are:

  • To provide widespread access to a broad range of powerful statistical and graphical methods for the analysis of genomic data.
  • To facilitate the inclusion of biological metadata in the analysis of genomic data, e.g. literature data from PubMed, annotation data from Entrez genes.
  • To provide a common software platform that enables the rapid development and deployment of extensible, scalable, and interoperable software.
  • To further scientific understanding by producing high-quality documentation and reproducible research.
  • To train researchers on computational and statistical methods for the analysis of genomic data.

Main Project Features

The R project for Statistical Computing. Using R provides a broad range of advantages to the Bioconductor project, including:

  • A high-level interpreted language to easily and quickly prototype new computational methods.
  • A well established system for packaging together software with documentation.
  • An object-oriented framework for addressing the diversity and complexity of computational biology and bioinformatics problems.
  • Access to on-line computational biology and bioinformatics data.
  • Support for rich statistical simulation and modeling activities.
  • Cutting edge data and model visualization capabilities.
  • Active development by a dedicated team of researchers with a strong commitment to good documentation and software design.

Documentation and reproducible research

Each Bioconductor package contains one or more vignettes, documents that provide a textual, task-oriented description of the package’s functionality. Vignettes come in several forms. Many are “HowTo”s that demonstrate how a particular task can be accomplished with that package’s software. Others provide a more thorough overview of the package or discuss general issues related to the package.

Statistical and graphical methods

The Bioconductor project provides access to powerful statistical and graphical methods for the analysis of genomic data. Analysis packages address workflows for analysis of oligonucleotide arrays, sequence analysis, flow cytometry. and other high-throughput genomic data. The R package system itself provides implementations for a broad range of state-of-the-art statistical and graphical techniques, including linear and non-linear modeling, cluster analysis, prediction, resampling, survival analysis, and time-series analysis.

Annotation

The Bioconductor project provides software for associating microarray and other genomic data in real time with biological metadata from web databases such as GenBank, Entrez genes and PubMed (annotate package). Functions are also provided for incorporating the results of statistical analysis in HTML reports with links to annotation web resources. Software tools are available for assembling and processing genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, Entrez genes, UniGene, the UCSC Human Genome Project (AnnotationDbi package). Annotation data packages are distributed to provide mappings between different probe identifiers (e.g. Affy IDs, Entrez genes, PubMed). Customized annotation libraries can also be assembled.

Bioconductor short courses

The Bioconductor project has developed a program of short courses on software and statistical methods for the analysis of genomic data. Courses have been given for audiences with backgrounds in either biology or statistics. All course materials (lectures and computer labs) are available on this site. There is also a Bioconductor teaching committee to consolidate Bioconductor-focused training materials and establish a community of Bioconductor trainers.

Open source

The Bioconductor project has a commitment to full open source discipline, with distribution via a public git (version control) server. Almost all contributions exist under an open source license. There are many different reasons why open source software is beneficial to the analysis of microarray data and to computational biology in general. The reasons include:

  • To provide full access to algorithms and their implementation
  • To facilitate software improvements through bug fixing and software extension
  • To encourage good scientific computing and statistical practice by providing appropriate tools and instruction
  • To provide a workbench of tools that allow researchers to explore and expand the methods used to analyze biological data
  • To ensure that the international scientific community is the owner of the software tools needed to carry out research
  • To lead and encourage commercial support and development of those tools that are successful
  • To promote reproducible research by providing open and accessible tools with which to carry out that research (reproducible research is distinct from independent verification)

Open development

Users are encouraged to become developers, either by contributing Bioconductor compliant packages or documentation. Additionally Bioconductor provides a mechanism for linking together different groups with common goals to foster collaboration on software, often at the level of shared development.

Quick Statistics

A selection of statistics about the project.

Quick Stats

Last updated "2023-12-18"

3691 total Release Packages

45546715 distinct Software Downloads in 2023

18999 active Support Site Members in the last year

550 active Slack Members

122000 Google Scholar results

Request updated statistics from: [email protected]

Code of Conduct

Please refer to the Bioconductor Code of Conduct

Contact

Please reach out to [email protected]

Bioconductor sticker