Minutes of the Bioinformatics Standards Committee Working Group
Oct. 19, 2004, 5:30 PM

Vicky Markstein (temp chair)
Betty Cheng (recorder)
Helen Berman
Richard Murai (sp)
Hanchuan Peng
Rama Balachrisnan
Bob Davis (chair, Microprocessor Standards Committee)
Dan Zuros (Advisor, IEEE)
Peter Markstein
Major Points from Discussion:
Roadmap of biomedical data is needed
Several standards communities exist – unknown if one is “ready-to-go”
Many databases need interoperability or data exchange – a common format is needed (a major point of agreement)
Annotation is problematic (an area for standards development)
Change management or version control for smooth database updates is needed
Need to address fundamental issues for image data – everything from formats to automated search tools (developmental work)
Standards validation is needed (for applications that claim to adhere to standards)
Next meeting:
Tues, Nov. 23rd at 5:30 PM
Items for agenda – regular schedule for meetings
Betty Cheng will take care of mailing list until we get a permanent person.

Details of Discussion:

Phil Bourne is the proposed chair but not yet a member of the IEEE Standards Association. Vicky to serve as interim chair.

1. Add Bioinformatics to the P1953 PAR name.
2. URL (server) provided by the IEEE
    a. Betty Cheng to receive messages until a webmaster is found (ownership transferred 11/14/04)
    b. Find a webmaster
3. Resend attachments from previous emails. They all came out in BinHex format.

The topic of this meeting is the definition of the working group’s goal, which will define the initial project for the Bioinformatics Standards Committee (BSC).

The working group shall define (for itself) a one-year project.

Phase I of the project shall be the definition of bioinformatics data structures for the standards group.

Phase II shall be to identify and contact representatives from groups which have developed or are in the process of developing appropriate standards for the field.

In Phase III, the working group will choose 1 or 2 areas for initial focus, which the Bioinformatics Standards Committee should develop into the first IEEE bioinformatics standard.

(This should be the extent of the working group’s mission. It is the job of the Bioinformatics Standards Committee (BSC) to produce the first standard. Clarification added 11/16/04)

The BSC will need a successful project in order to win community trust.

A) Is the Protein Data Bank (PDB) a potential ready-to-go standard?
Not a great example, yet one of the most integrated in terms of both community and data.

Multiple sites for the PDB – worldwide distribution – all open software – totally transportable

B) Open Biological Ontologies (OBO): a community established to develop ontologies for biosciences. GO, the Gene Ontology, is a successful example and might be a good candidate. Examples of databases that contribute/cooperate include yeast, Drosophila, WormBase

C) Create a universal data exchange – PDB a possibility?
GO not as mature
Get closer to the raw data – choose archives vs. databases, e.g. PDB & GenBank

D) The projects mentioned above are too advanced and too detailed.
Establish broad categories first

First establish a “framework” on which to hang other standards. Example – need to define GO within the framework to let others build on GO

But mapping to other databases exists, e.g. the Sequence Ontology

What about format? We need an exchangeable format.

Look at GMOD – the Generic Model Organism Database project

What is the biggest issue with databases?

Interoperability – need for this permeates the field at all levels (Agreement from all)

Need for a mechanism to manage change – version control

Validation for data submission – (agreement)

How to avoid constant re-engineering? Community Input? Notification?
Change management for databases needs to be improved

Definition of data is a must – problem with abstraction

First goal should be “snapshot” of field

For GO
Big problems – how to reach the users? IEEE group might help with education

Slow changes – these involve discussion with the group, which leads to less need for education; more mechanisms likewise mean less need for education

For all databases - Need for development? Better methods for distribution. Annotation – how to automate? Rate of data increase – need more automation for annotation

Software changes ⇒ Need for standardized language to describe annotations

Unable to grab data automatically

Change process needs to be well-documented

Education for database users needs improvement

Better methods for data management

Image data – standardized image formats needed – need searchable database for images. At LBL and Berkeley – large collection of images of Drosophila

Image GO – IMAGO – structures as 2D images, need to be searchable

Industry viewpoint from Richard – welcomes IEEE – a need to bring academia and industry together.

Areas that need development and integration with existing standards (communities)

