!!ATTENTION!!

This website and code base are no longer maintained. Please visit our new home on GitHub at

Project Description

BL!P [blip], or BLAST in Pivot, is a computer program that automates the NCBI BLAST alignment of coding DNA or protein sequences and processes the results for visualization in the Microsoft Live Labs program Pivot. 
Download BL!P
Software requirements:
> Microsoft Windows XP or later.

Explore a sample data set

Watch a short video presentation

Question to ask, or bug to report?


Introduction

NCBI BLAST is a popular software program used to find regions of similarity between biological sequences, and can be used to infer functional and evolutionary relationships between sequences. An NCBI BLAST search using multiple query sequences (e.g. gene predictions from a genome sequencing project) typically generates a large dataset that must be explored for functional or evolutionary patterns of interest. Current approaches to exploring NCBI BLAST results include automated filtering of the dataset using a priori significance thresholds, followed by manual inspection. While this approach is satisfactory, novel data exploration and visualization software exists that allows patterns to be identified more easily and with less bias. One such program is Pivot, which can visualize the relationships between pieces of information, allowing for the discovery of hidden patterns. Pivot structures its data into “collections”, which combine groups of similar items based on the values of certain attributes (facet categories) and represent each item using an image. We have created a software application, BL!P, that automates the NCBI BLAST search of multiple biological sequences and converts the results into a Pivot collection. BL!P also provides an interface to construct custom image layouts for the collection of Pivot items.
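To make the idea of a collection more concrete, the following Python sketch shows items that carry facet values and are grouped by one facet category. It is only an illustration: the item fields (`name`, `facets`), the facet names, and the `group_by_facet` helper are hypothetical, and this is neither Pivot's actual collection format nor BL!P's own code.

```python
from collections import defaultdict

# Hypothetical items: each one stands for a BLAST hit and carries facet values.
items = [
    {"name": "hit_1", "facets": {"Organism": "Escherichia coli", "Query": "gene_A"}},
    {"name": "hit_2", "facets": {"Organism": "Homo sapiens", "Query": "gene_A"}},
    {"name": "hit_3", "facets": {"Organism": "Escherichia coli", "Query": "gene_B"}},
]


def group_by_facet(items, category):
    """Group item names by the value each item holds for one facet category."""
    groups = defaultdict(list)
    for item in items:
        # Items lacking the facet are collected under a placeholder value.
        value = item["facets"].get(category, "(no value)")
        groups[value].append(item["name"])
    return dict(groups)


by_organism = group_by_facet(items, "Organism")
by_query = group_by_facet(items, "Query")
```

Grouping the same items by a different facet category (here `Query` instead of `Organism`) gives a different view of the same data, which is roughly what switching facets in a collection browser does.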

BL!P was developed using C# and .NET 4.0, and uses the Microsoft Biology Foundation (MBF) bioinformatics toolkit to access NCBI resources such as NCBI BLAST and GenBank, as well as parsers to read/write biological sequence data.

BL!P automatically submits multiple FASTA-formatted coding DNA or amino acid sequences to an NCBI BLAST protein database. Submissions are polled until complete, and the results are saved to disk for later use. Upon completion of the NCBI BLAST search, the GenBank record for each BLAST hit that meets user-specified criteria is downloaded and saved to disk for later use. The results from BLAST and the information in the GenBank records are parsed and converted to a Pivot collection. Using data from the Pivot collection, a custom image layout is constructed to represent each BLAST hit. The results are saved to disk and can be loaded into Pivot for exploration. BL!P is a member of the Microsoft Biology Initiative.
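The poll-then-filter part of this workflow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not BL!P's implementation (which is written in C# on top of MBF): the helper names, the `"WAITING"`/`"READY"` status strings, the use of an E-value cutoff as the user-specified criterion, and the accession values are all hypothetical, and a stand-in iterator replaces the real network call.

```python
import time


def poll_until_complete(check_status, interval_seconds=0.0, max_attempts=10):
    """Call check_status() repeatedly until it returns 'READY' or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        if check_status() == "READY":
            return attempt
        time.sleep(interval_seconds)
    raise TimeoutError("search did not complete in time")


def filter_hits(hits, max_evalue):
    """Keep only the hits whose E-value is at or below the chosen threshold."""
    return [hit for hit in hits if hit["evalue"] <= max_evalue]


# Stand-in for a remote search: reports 'WAITING' twice, then 'READY'.
statuses = iter(["WAITING", "WAITING", "READY"])
attempts = poll_until_complete(lambda: next(statuses))

# Hypothetical parsed hits; the accessions are made up for illustration.
hits = [
    {"accession": "EXAMPLE_1", "evalue": 1e-50},
    {"accession": "EXAMPLE_2", "evalue": 0.5},
    {"accession": "EXAMPLE_3", "evalue": 1e-10},
]
selected = filter_hits(hits, max_evalue=1e-5)

# Only records for the selected hits would be downloaded in the next step.
to_download = [hit["accession"] for hit in selected]
```

In a real run the polling interval would be several seconds or more, so that the remote service is not queried too often; it is set to zero here only so the sketch finishes immediately.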

About the Author

Vince Forgetta, M.Sc.
Microsoft Intern: Summer 2010
Ph.D. Candidate, McGill University, Department of Human Genetics, Montreal, Quebec, Canada
vincenzo.forgetta@mail.mcgill.ca





Last edited Apr 14, 2013 at 3:16 AM by vforget, version 192