HDL 2 Nov. 1998
G-2.1.6/84
July 13, 1998
Item 1 - Welcome and Introduction by Interim Chairman of IEEE G-2.1.6.
Interim Chairman Alan Godber called the meeting to order at 1:19 PM.
Item 2 – Approval of Draft Agenda.
David Fibush asked for time to make an announcement regarding ITU-R activities. This was added as Item 4A.
John Grigg requested five minutes for a report on liaison with the Video Service Providers Forum. This was included under Item 9.
Alan Godber suggested adding an item for Review and Approval of the Minutes of Meeting #7. This item was postponed at the previous meeting. Copies of the Meeting #7 Minutes were not available, so it was agreed to postpone this item until the next meeting.
Item 3 – Review and Approval of Minutes of the Previous Meeting #8, March 16th, 1998
The Draft Meeting Record, G-2.1.6, Compression and Processing Subcommittee, Meeting #8, March 16, 1998, NTIA/ITS, U.S. Department of Commerce, Boulder, CO, IEEE Doc. G-2.1.6/79, was approved by those present.
Item 4 – Matters Arising from the Minutes
There was a request that an electronic copy of Dr. Watson's presentation from the previous meeting be posted on the web site. Doug Lung will request a copy from Dr. Watson. [ACTION ITEM]
Item 4A - Announcement of the establishment of Joint Working Party 10-11Q on audio and video quality assessment and of its first meeting - David Fibush
David Fibush announced there is now a Joint Working Party 10-11Q for both audio and video. The first meeting is scheduled for three days in October. Contributions to the Joint Working Party can be made through David Fibush. Details are available in Coordination Letter - ITU-R JWP 10-11Q on Audio and Video Quality Assessment, David Fibush, June 3, 1998 Doc. G-2.1.6/80 and Circular Letter - Announcement of the establishment of joint working party 10-11Q on audio and video quality assessment and of its first meeting Robert W. Jones, Director, International Telecommunication Union, Radiocommunication Bureau, May 25, 1998, Doc. G-2.1.6/81. These documents will be posted on the IEEE G-2.1.6 web site. [Completed July 13, 1998]
Item 5 – Report of recent Meetings of ITU Video Quality Experts Group (VQEG) at NIST, MD, pertinent to video compression - Arthur Webster, David Fibush, Al Morton, Alan Godber & other participants.
Analog Interface Performance Specifications for Digital Video Teleconferencing/Video Telephony Service, NTIA/ITS, Arthur Webster T1A1.5/98-206, IEEE Doc. G-2.1.6/82, was distributed. The document was reviewed. The ITU VQEG Activity Report includes a summary of activity, the final list of proponents in the Validation test, and areas for future work of VQEG. Annex 1 is a report of the Gaithersburg VQEG Meeting May 27-29, 1998.
Compressed Source measurements, also called "In-service" measurements, were included as a Future Work area.
Proponents will have to pay for editing tapes and for the tape stock. The cost was estimated to be $1,000 to $1,500 per proponent. People have been found to run the HRCs. Bill Zou from General Instrument volunteered, as well as some others. At this time, there is no need for additional volunteers for HRC processing.
The schedule on Page 6 of the document was discussed. The July 3 deadline for adding patterns to the source sequences and sending them on D1 tapes to the HRC processing sites has not been met. Work is running a few days behind.
It was noted for clarification, regarding the proponents list, that TAPESTRIES is the European ACTS project and that QUOVADIS is tied in with that project. NASA will submit a proposal.
In response to a question concerning a revised subjective test plan, Arthur Webster replied that Phil Corriveau was working on it, but it hadn't been issued yet. David Fibush stated there is a revised objective test plan on the email reflector. Arthur Webster will post it on the ITU FTP site.
David Fibush observed that part of the meeting was to simplify things. The HRC list was reduced and the number of calculations made after data collection was compressed. Alan Godber commented that there was a change in emphasis during the meeting -- from one series of tests to additional series of tests for future work. Some items were deleted from the first series to make it manageable.
5.1 Further Discussion and Recommendations from the Subcommittee.
Discussion continued on the future work of VQEG. Two reasons were given for it to continue: 1) This is a somewhat limited test set and there is a desire for more, and 2) Some test sequences are available only to VQEG. As long as it continues, people can get hold of this data set and video material. If VQEG ends, access to this material could be a problem. It was noted that this issue has not been resolved, but there is no fear of VQEG going away within a year or two. When it does, perhaps the rights could be transferred.
During a discussion on the HRCs, Arthur Webster commented that two randomizations were recommended. With only one randomization, labs might artificially correlate higher, which would raise the bar for the objective tests. It was mentioned ATTC felt a minimum of two randomizations were required. There are no means to detect the effect of one particular sequence on another with only one randomization. There was concern that VQEG would not have the resources available to do two randomizations. The subjective labs aren't the problem. Nine labs have volunteered. The limitation is the cost of tape stock and editing.
There was a question on the number of labs involved in the subjective testing. Testing will be divided into four groups, using nine subjective labs. Some will do both 525 and 625 line tests; others will do only one. Philip Corriveau is still working on this. It was not known which proponents would be assigned to which labs.
There were several responses to a question concerning the goal of the tests. One answer was that the tests are to find one or more methods of objectively measuring video quality over this range or over a subset of this range of HRCs. Another was to come up with a recommended method, or, if multiple methods are similar, a combination method for objectively measuring video quality. It was noted that the preference is to select one system, per the test plan, if possible.
What is the secondary goal of the tests? Responses included: This experiment will provide a lot of data collected under controlled conditions. The tools developed will be very important for broadcasters making equipment decisions. Quality measurement provides a way to promote digital video quality through competition for better quality numbers. Day to day monitoring is an important consideration. By noticing changes, the system can identify something is about to fail. Anything that gives us consistency and repeatability will be a benefit.
Item 6 - Report of Task Force on Compression Measurements Information Gathering - Chair, Bill Zou.
Alan Godber reported that responsibilities at GI make it difficult for Bill Zou to continue as Chair or even attend these meetings.
6.1 Further Discussion and Action.
Arthur Webster and David Fibush mentioned information and some documents were available on the VQEG reflector. There was a comment that some of the descriptions were only one page with very general block diagrams. While there was concern that more information should have been given up front, it was noted the intent was to exclude oddball systems that weren't serious. If a system without a good description works well, the proponent will have to provide more information.
Alan Godber asked whether there was a need to continue this task force, given the information available from Arthur Webster's VQEG FTP site. He stated we need to make sure we have access to the information on all the systems. Arthur Webster, Alan Godber and Doug Lung will work on this. [ACTION ITEM] It was suggested that the ten proponents be listed on the web site with a pointer to where information on their system is available.
Noting that another purpose of the task force was to arrange for proponents to present their work here, Alan Godber requested comments on whether we need to have others present their proposals. KDD was offered as an example. There was little interest in this.
There was a discussion about VQEG's decision to identify only the top performers in the testing. Arthur Webster responded that he did not think anonymity will be a problem. If there is follow-on work, the proponents will remain anonymous.
Item 7 - Report of Task Force on "Defining A Unit of Measure & a Means of Calibration for Video Impairment", Leon Stanger
Report of Task Force to define A Unit of Measurement and Means of Calibration for Video Quality Analysis, Leon Stanger, 10 July 1998, IEEE Doc. G-2.1.6/83 was distributed. Leon Stanger reviewed the document. (Information presented in the document is not repeated here.) He commented that using the method described in the first item, which is a threshold system, it is difficult to put numbers of any significance on differences past the first threshold. With regards to the second item, finding one JND is straightforward. The problem is defining two JNDs or one-half JND. One idea is to first compare A to B, define B as one JND, then compare C to B, define that as the second JND and so forth. The third item, using a method based on various viewing distances, may be more of an eye test than a vision test. Small edge effects may be seen better if closer, luminance effects may be seen at a distance.
David Fibush described measurements he made, using both the PSNR and PQR methods, on a tape that ITS had run through many different systems for them. Most of the degradations were small. On many sequences, you couldn't see any difference, but it was still possible to measure the differences. Although the differences were well below the visual threshold and not in the one-JND area, looking at the data you could clearly tell the systems apart. Although thresholding is interesting, there are more powerful methods than thresholding. Stanger responded that this is what he meant by fractional JNDs. Fractional JNDs are important because small errors can accumulate with concatenation.
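For readers unfamiliar with PSNR, the following is a minimal sketch of the standard calculation, not the procedure used for the measurements described above. It assumes 8-bit samples (peak value 255) and represents each frame as a flat list of pixel values; the function name and the sample data are illustrative only.

```python
import math


def psnr(reference, test, peak=255.0):
    """Return the peak signal-to-noise ratio in dB between two frames.

    reference, test: equal-length sequences of pixel values.
    peak: maximum possible pixel value (255 assumed for 8-bit video).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


# Illustrative data only: two hypothetical systems with small errors.
source = [100, 100, 100, 100]
system_a = [101, 99, 101, 99]   # error of 1 level per pixel
system_b = [102, 98, 102, 98]   # error of 2 levels per pixel
print(round(psnr(source, system_a), 2))
print(round(psnr(source, system_b), 2))
```

Errors of one or two levels per pixel would usually be invisible, yet the two PSNR values differ by about 6 dB, which is consistent with the observation that systems can be told apart by measurement even below the visual threshold.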
Leon Stanger outlined the conclusions in his report. Item two showed the most promise - a JND based scale should be established as the calibration reference. The system should cover multiple and fractional JNDs. The focus should be on obtaining repeatable results for equipment testing rather than viewer tests that duplicate the home environment. This will require closer viewing distances and the use of trained observers.
There was concern about item three, regarding the viewer's eyesight. If a person doesn't have normal vision, perhaps we shouldn't be using them. It was suggested that viewing distance be used as an input to the measurement models we do have. Incorporating this should result in a better correlation to subjective quality as opposed to using an integrated error method. It would be another interesting way to test these models.
The definition of a JND was questioned. This led to a discussion about the goal of the task force. Originally it was looking for a definition of a "video volt", and then it was generation of calibrated test material. Now is the goal generation of test material that shows one JND? Stanger agreed it was. The test material would be used to check or calibrate objective test equipment. It was noted this might not work, since every subjective experiment is different from every other subjective experiment. Calibration can be lost depending on how observers are briefed. Another comment was that a JND should remain constant, but that was countered with arguments that different scenes, different viewing conditions and different instructions will change the JND.
There was a suggestion that Bellcore's VIRIS (Video Impairment Reference System) might be useful for adding specific amounts of degradation to an image by turning knobs. Leon Stanger said that up to this point, he hadn't worried about an electronic means of measuring things. The original idea was to take a tape and record what viewers see as one, two or three JNDs. These results may or may not fit the objective models. However, it gives a baseline. This comment led to another discussion concerning how JNDs add (or won't add). It was noted that vision model scientists, experimenting with different scales, have found most of these psychometric functions are log-linear. It would be useful to have a subjective experiment that determines how much change you need before the change is detectable. Data determined this way would be more structured than the VQEG data. The problem is that with different scenes, you would get a different set of readings. Stanger questioned whether this made a difference, noting that even if you couldn't explain the difference electronically, it is useful to know what values viewers put on the scenes. If necessary, the algorithm can be adjusted to consider this.
Leon Stanger suggested that using VIRIS or a similar system, monotonically increasing amounts of edge noise or blocking could be added to a scene which viewers would then rate to determine the degradation in JNDs.
Stanger agreed that two JNDs would be determined by comparing a new level of degradation with a previous level of degradation, but it would be a big step to call the distance between no JND and the second JND two JNDs. However, he believed that if the experiment is carefully set up with monotonically increasing degradation of one thing, we should be able to say this, by definition. After a discussion on adding JNDs and comparing non-adjacent groups of JNDs, it was agreed we shouldn't try to say four JNDs is twice as bad as two JNDs.
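The chained comparison discussed above (compare A to B, call B one JND, then compare C to B, and so on) can be sketched as follows. This is an illustration under stated assumptions, not a method adopted by the task force: the function name is hypothetical, and `is_noticeable` is a stand-in for a viewer panel's verdict, which in practice would come from a subjective test.

```python
def build_jnd_ladder(levels, is_noticeable):
    """Return the degradation levels that mark successive JND steps.

    levels: degradation settings in monotonically increasing order,
            with levels[0] being the undegraded reference.
    is_noticeable(a, b): hypothetical stand-in for the viewers' verdict
            that level b is just noticeably worse than level a.
    """
    if not levels:
        return []
    ladder = [levels[0]]          # position 0 is the reference (zero JND)
    for level in levels[1:]:
        # Always compare against the most recent step, never the original.
        if is_noticeable(ladder[-1], level):
            ladder.append(level)
    return ladder


# Illustrative only: assume viewers notice a change of 3 knob units.
knob_settings = list(range(11))
ladder = build_jnd_ladder(knob_settings, lambda a, b: b - a >= 3)
print(ladder)  # [0, 3, 6, 9]
```

The position of a setting in the ladder is its JND count by construction only. In line with the agreement recorded above, it is an ordinal label and does not imply that the fourth step is twice as bad as the second.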
It was also stated that if we test for a JND in mosquito noise or a JND in blocking, instead of a combination, we would have to go back and test combinations. Since end to end systems will have multiple impairments, a system for categorizing artifacts is needed.
Leon Stanger outlined his task force's recommendations. IEEE does not need to duplicate VQEG. Instead, it should focus on calibration. Several members suggested that we study each impairment separately.
7.1 Further Discussion and Action
Stanger outlined three areas for future activity: 1) Get specific about how to test - write down the test methods and conditions; 2) Get material with different degradations and monotonically increasing degradation; and 3) Identify labs willing to conduct viewer tests.
It was noted that when looking for thresholds, the tests could be done with one or two people. A simulation doesn't necessarily have to be as rigorous from a viewer standpoint and a random selection of viewers is not needed. John Libert commented that this falls directly into how the NIST project is working. They are already doing this for some tests. Instead of bringing in tens of viewers off the street, they are using the people they are working with. There was general agreement that this approach would work.
There was concern that VIRIS may not be the best way to introduce degradations. Some of the artifacts it creates are not the same as MPEG. Bellcore, the creator of VIRIS, was once interested in this work but may have lost funding for it. Varying an encoder's rate control and quantizer was suggested as a method for introducing degradation. It was questioned whether this approach would work, since some encoders truncate the data. It was also questioned whether it was reasonable to try to get continuous gradations in quality. Does MPEG behave linearly? Does it create a continuous gradation? Because it is scene dependent, the answer is probably no.
Leon Stanger recommended forming three task forces to work on the pieces of the puzzle: 1) Defining the scale in more detail; 2) Developing test material; and 3) Identifying labs for testing. A vote on forming the committees was requested. There was one opposing vote. There was concern whether it would result in something useful. We need a clear idea what the payoff is, a definition of what we are trying to achieve. We also need to find the resources. It was recommended that we start small. Take five of the CCIR scenes and four easier ones. Adjust the knobs to where viewers can see the difference. This would give 50 to 100 data points. Try it. If it is a waste of time, stop. This approach is different from the approach VQEG is taking, since VQEG is using DSCQS, which is not a threshold measurement.
Discussion continued along two paths: one concerned defining the test method and where it is headed, the other was about locating test material and labs and defining a way to run these tests.
Finding a real-time system for generating specific degradations seemed to be the biggest hurdle. It is also tied into finding material for the tests. It was agreed to combine the task force for developing material and the task force for identifying labs for testing. There was no objection to this.
[ACTION ITEM]
Leon Stanger agreed to chair the task force on Definition. John Libert will work on both groups and chair the second task force to figure out how to do the tests. Leon Stanger will ask people on the original task force which group(s) they want to join. David Fibush requested to be put on the mailing list if Mihir Revel doesn't help. Tektronix could not offer any resources except for the CCIR sequences. Arthur Webster asked to be included in the email list for both groups, but said he couldn't offer much at this time. [ACTION ITEM]
Leon Stanger will send out an email and try to get the task forces established.
Item 8 - Further Discussion of Compression Measurement Methodologies.
This was covered under the previous topic. There were no other observations or comments.
8.1 Discussion of Future Work, Additional Assignments, etc.
Item 9 - Any Other Business.
John Grigg described the elements of the Video Services Forum. The Video Service Providers Forum (VSPF) is for people who transport video. The Video Services Industry Forum (VSIF) consists of manufacturers of codecs and transport equipment. A Video Services Customers Forum (VSCF) is being considered. The Video Services Forum is now open to the industry. It is sponsoring the VidTrans98 conference in Miami October 4-7. Companies were invited to join.
John Grigg and Wallace Murray provide liaison to IEEE and T1A1. The Forum is looking at making new services, such as a 6-Mbps DS2 service, available across wider areas. One goal is to find out what broadcasters want before the 2002 Winter Olympics begin. If NTSC is discussed up until six months before the Olympics, but broadcasters decide they want 6-Mbps MPEG at the Olympics, telcos won't be able to deliver and will lose the business to satellite carriers.
Item 10 - Date(s) of Future Meeting(s).
There was no opposition to holding the next meeting in conjunction with the T1A1 meeting, November 2, 1998 at the Ramada Plaza Hotel, Kissimmee, Florida. The T1A1 meeting is being hosted by ECI Telecom.
The committee offered thanks to Bellcore for use of the facilities.
A Motion to Adjourn was seconded by John Libert and Al Morton.
The meeting was adjourned at 5:40 PM.
Submitted by:
H. Douglas Lung
Secretary
APPENDIX "A"
List of Documents Distributed
13 July 1998
Draft Agenda - IEEE Compression and Processing Subcommittee G-2.1.6, Ninth Meeting, Monday, July 13, 1998, Alan Godber, Chairman. (Word 2.0 file 216mt9an.doc)
Draft Meeting Record, G-2.1.6, Compression and Processing Subcommittee, Meeting #8, March 16, 1998, NTIA/ITS U.S. Department of Commerce, 325 Broadway, Boulder, CO, Doug Lung, Secretary, Doc. G-2.1.6/79, 6 July, 1998.
Coordination Letter - ITU-R JWP 10-11Q on Audio and Video Quality Assessment, David Fibush, June 3, 1998 Doc. G-2.1.6/80, 13 July 1998.
Circular Letter - Announcement of the establishment of joint working party 10-11Q on audio and video quality assessment and of its first meeting Robert W. Jones, Director, International Telecommunication Union, Radiocommunication Bureau, May 25, 1998, Doc. G-2.1.6/81, 13 July 1998.
Analog Interface Performance Specifications for Digital Video Teleconferencing/Video Telephony Service, NTIA/ITS, Arthur Webster T1A1.5/98-206, IEEE Doc. G-2.1.6/82, 13 July 1998.
Report of Task Force to define A Unit of Measurement and Means of Calibration for Video Quality Analysis, Leon Stanger, 10 July 1998, IEEE Doc. G-2.1.6/83, 13 July 1998.
APPENDIX "B"
ATTENDANCE RECORD
13 July 1998
Name | Affiliation | Telephone | Fax | E-mail
Chairman: Alan Godber | Consultant | (732) 846-4476 | (732) 846-4476 | agodber@idt.net
Secretary: H. Douglas Lung | Telemundo | (305) 884-9664 | | dlung@transmitter.com
David Fibush | Tektronix | (503) 627-6289 | (503) 627-1707 | davef@tv.tv.tek.com
John Grigg | U.S. West | (612) 531-6706 | (612) 536-2502 | jjgrigg@uswest.com
Paul W. Jones | Kodak | (716) 477-8048 | (716) 722-0160 | pjones@kodak.com
John Libert | NIST | (301) 975-3828 | (301) 926-3534 | libert@eeel.nist.gov
Al Morton | AT&T | (908) 949-2499 | (908) 949-1652 | acmorton@att.com
Michel Poulin | Leitch Technology | (416) 445-9648 | (416) 445-4762 | michel.poulin@leitch.com
James R. Redford | NBC | (212) 664-5222 | (212) 246-3650 | rick.redford@nbc.com
Leon Stanger | DirecTV | (310) 726-4676 | (310) 726-4535 | LStanger@compuserve.com
Arthur Webster | NTIA/ITS | (303) 497-3567 | (303) 497-5323 | webster@its.bldrdoc.gov