Minutes of the Bioinformatics Standards Committee Working Group
Oct. 19, 2004, 5:30 PM
- Vicky Markstein (temp chair)
- Betty Cheng (recorder)
- Helen Berman
- Richard Murai (sp)
- Hanchuan Peng
- Rama Balachrisnan
- Bob Davis (chair, Microprocessor Standards Committee)
- Dan Zuros (Advisor, IEEE)
- Peter Markstein
Major Points from Discussion:
- Roadmap of biomedical data is needed
- Several standards communities exist – unknown if one is “ready-to-go”
- Many databases need interoperability or data exchange – a common format is needed (a major point of agreement)
- Annotation is problematic (an area for standards development)
- Change management or version control for smooth database updates is needed
- Need to address fundamental issues for image data – everything from formats to automated search tools (developmental work)
- Standards validation is needed (for applications that claim to adhere to standards)
Next meeting:
- Tues, Nov. 23rd at 5:30 PM
- Items for agenda – regular schedule for meetings
- Betty Cheng will take care of mailing list until we get a permanent person.
Details of Discussion:
Phil Bourne is the proposed chair but not yet a member of the IEEE Standards Association. Vicky to
serve as interim chair.
1. Add Bioinformatics to the P1953 PAR name.
2. URL (server) provided by the IEEE
a. Betty Cheng to receive messages until a webmaster is found (ownership transferred 11/14/04)
b. Find a webmaster
3. Resend attachments from previous emails. They all came out in BinHex format.
The topic of this meeting is the definition of the working group’s goal, which will define the
initial project for the Bioinformatics Standards Committee (BSC).
The working group shall define (for itself) a one year project.
Phase I of the project shall be the definition of bioinformatics data structures for the standards group.
Phase II shall be to identify and contact representatives from groups which have developed or are in the process
of developing appropriate standards for the field.
In Phase III, the working group will choose 1 or 2 areas for initial focus, which the Bioinformatics Standards
Committee should develop into the first IEEE bioinformatics standard.
(This should be the extent of the working group’s mission. It is the job of the Bioinformatics Standards
Committee (BSC) to produce the first standard. Clarification added 11/16/04)
The BSC will need a successful project in order to win community trust.
A) Is the Protein Data Bank (PDB) a potential ready-to-go standard?
Not a great example and yet one of the most integrated in terms of both community and integration of data.
Multiple sites for the PDB – worldwide distribution – all open software – totally transportable
B) Open Biological Ontology (OBO): a community established to develop ontologies for biosciences. GO, the
Gene Ontology, is a successful example and might be a good candidate. Examples of databases that
contribute/cooperate include yeast, Drosophila, WormBase
C) Create a universal data exchange – PDB a possibility?
GO not as mature
Get closer to the raw data – choose archives vs. databases, e.g. PDB & GenBank
D) The projects mentioned above are too advanced and too detailed.
Establish broad categories first
First establish “framework” to hang other standards. Example – need to define GO within the framework
to let others build on GO
But mapping to other databases exists, e.g. Sequence Ontology
What about format? We need an exchangeable format.
Look at GMOD – Generic Model Organism Database
What is biggest issue with databases?
Interoperability – need for this permeates the field at all levels (Agreement from all)
Need for a mechanism to manage change – version control
Validation for data submission – (agreement)
How to avoid constant re-engineering? Community Input? Notification?
Change management for databases needs to be improved
Definition of data is a must – problem with abstraction
First goal should be “snapshot” of field
Big problems – how to reach the users? IEEE group might help with education
Slow changes – involve discussion with the group, which leads to less need for education – more mechanisms,
less need for education as a result
For all databases - Need for development? Better methods for distribution. Annotation – how to automate?
Rate of data increase – need more automation for annotation
Software changes ⇒ Need for standardized language to describe annotations
Unable to grab data automatically
Change process needs to be well-documented
Education for database users needs improvement
Better methods for data management
Image data – standardized image formats needed – need searchable database for images.
At LBL and Berkeley – large collection of images of Drosophila
Image GO – IMAGO – structures as 2D images, need to be searchable
Industry viewpoint from Richard – welcomes IEEE – a need to bring academia and industry together.
Areas that need development and integration with existing standards (communities)