| Document Number: | T1A1.5/98-107 |
| TIBBS File: | 8A151070.doc |
| DATE: | October 26, 1998 |
| STANDARDS PROJECT: | Interface Performance Specifications and Coding Techniques for the Transmission of Component Television System Signals (T1A1-05) |
| TITLE: | Summary of October 19 - 21, 1998 Meeting of ITU-R Joint Working Party 10-11Q, Audio and Video Quality Assessment |
| EDITOR: | David Fibush |
| SOURCE: | Tektronix Inc. |
| CONTACTS: | David Fibush Tektronix Inc., MS 50-353 P.O. Box 500 Beaverton, OR 97077 (503)628-3040 (503)627-4486 (fax) davef@exgate.tek.com |
| DISTRIBUTION: | Working Group T1A1.5 |
ABSTRACT: The ITU Radiocommunication Bureau has established a new Joint Working Party on audio and video quality assessment. Included in the work of JWP 10-11Q will be questions previously assigned to Study Group 11 WP11E and some of the questions previously assigned to Study Group 10 WP10C. The first meeting was held in Geneva, October 19 - 21, 1998. This contribution provides a summary of the activities at that meeting.
Summary of October 19 - 21, 1998 Meeting of
ITU-R Joint Working Party 10-11Q, Audio and Video Quality Assessment
Plenary Session (continuing throughout the meeting series)
Abstracted from a statement by the chairman of SG11 (doc 10-11Q/9): "JWP 10-11Q was established to study many common problems concerning objective and subjective methods of image and sound quality assessment. This research will cover a wide range of questions in conjunction with new approaches to interactive TV and sound broadcasting as well as systems delivering multipurpose mass information." Discussions leading to the merging of SG10 (sound) and SG11 (video) are in progress within ITU management. This may be decided next year for implementation in 2000.
At the opening meeting, document status, question status, rapporteur groups and other work in progress were reviewed for each of the previous study groups. Detailed lists of questions and recommendations assigned to the JWP are available in documents 10-11Q/8 and 11.
Overlap between JWP 10-11Q work and that of ITU-T SG9 has been considered by ITU management, leading to rewriting the SG9 questions to "eliminate overlap". They don't appear to have considered ITU-T SG12. In any case, the measurement technology is the same, so continued support of VQEG is important. If all goes well, methods proposed by VQEG will be adopted with little modification by each SG. In the meantime it is reported that SG9 is finalizing recommendations for subjective and objective picture evaluation methods. The objective document is a framework, presumably to be completed based on results of the VQEG tests. I have been asked to be a JWP 10-11Q Special Rapporteur reporting on SG12 activities relating to JWP 10-11Q areas of interest.
There is a draft recommendation for objective audio quality measurements. The method is said to work well at high quality levels such as 64 kb/s and up for one channel of MPEG-1 layer 2 or MPEG-2 layer 3. At lower quality levels there is less correlation; however, that may be due to inaccuracies in the subjective methods. The draft recommendation is ITU-R 10/BL/9 (a mere 107 pages) and should be approved soon. It is interesting to note that the first round of testing (similar to VQEG) did not meet the requirements. A second round merged the more interesting proposals and produced two versions of "peaked perceptual evaluation of audio quality": a low complexity method and a high complexity method. It is reported that the AES is working on a document for measurement of audio quality over the internet.
It was decided to drop the question on objective audio quality in order to provide stability in the recommendation. Apparently, the speech quality method was changed immediately after approval in SG12. Removal of the question means the recommendation can't change unless there is a new question drafted and approved first. Therefore the audio work in JWP 10-11Q will emphasize further subjective testing methods and harmonization of audio and video methods looking towards combined audiovisual (multimedia) testing.
MPEG is working on audio and video quality over the internet based on their own measurement procedures. They are working on error resilience for 32 kb/s and 128 kb/s data rates with new error concealment strategies. They say that single stimulus methods are best for low quality, with double stimulus better for high quality. There is reported to be a version 2 of MPEG-4 because of a controversy over what to include in the in-process standard. This may be mostly related to new ways to compress audio. Vittorio Baroncini is the chair of the MPEG-4 sub-group on quality evaluation. Providing two sequences on the same monitor for subjective viewing is an advantage and is easy with the CIF format being used for MPEG-4 testing. There is concern that with high picture degradation the viewer must continuously review the reference, causing fatigue. This is not the case for higher quality pictures. There are no plans for joint audio/video testing. See doc 10-11Q/6.
Based on input documents relating to work for the JWP (10-11Q/9, 10-11Q/Temp1) and VQEG discussions, it is expected that more changes will be needed for Recommendation 500 (surprise). These will include such things as better definition of terms, resolution of some contentious items from the VQEG discussion, extension of continuous quality methods to include double stimulus, and requirements for operational type adjustments to processed sequences prior to assessment. Some harmonization with the relevant audio recommendations will also be appropriate.
ITU-R BT.1210 (unpublished) is a list of test sequences covering SDTV, conventional TV (NTSC/PAL) and HDTV. It includes the sequences for SDTV listed in BT.802 and attempts to provide more information about all listed sequences including availability. JWP 10-11Q would like to determine what sequences are truly available including possible new material from the VQEG and MPEG work. In addition, we would like to define test sequences to be used specifically for objective measurements. I reported on the SMPTE test materials work that we expect will be appropriate for these applications. Since the SMPTE is about to become a member of ITU-R, a liaison statement regarding co-operative work in this area was drafted. It was suggested that I be the liaison between the two groups which is quite appropriate as I am the chairman of the SMPTE technology committee that handles video test materials. An official rapporteur appointment was not made as was done for SG12, however I expect SMPTE will appoint me as an official liaison to the JWP 10-11Q.
Several task force groups were set up to work on issues to be discussed, and perhaps decided, during this series of meetings plus plans for future work. The four main groups were: VQEG test plan review and recommendations for future action, audio quality matters, possible changes to Recommendation 500, and response to liaison statements. I participated in the task force on VQEG matters.
VQEG Task Force
The primary objective was to review the subjective test plan and make recommendations to VQEG. Since several of the key players were present, a lot of time was spent trying to resolve test plan issues. Attending were: Alexander Schertz, Phil Corriveau, Stephane Pefferkorn, Massimo Visca and Vittorio Baroncini. As vice-chairman of the JWP, Vittorio was appointed chairman of the task force. Alexander had prepared a list of subjective test plan issues including comments from the email reflector. The items listed below are not decisions, only recommendations; however, they will most likely be accepted.
1. Although the requirement for viewer contrast sensitivity had been canceled, Vittorio said he will be proposing a method to be used in future testing and as a draft addition to Recommendation 500.
2. There is concern by Mr. Nishida that subjective scoring of 50 Hz sequences in 60 Hz countries will be distorted due to flicker. This is a very controversial point as double stimulus methods are supposed to eliminate such problems. Since there are more labs in 50 Hz countries available to do the tests it turns out it will be unnecessary to do 50 Hz testing in 60 Hz countries. Experiments may be undertaken to try to resolve the technical issue separate from the official VQEG testing as most of the experts present believe the flicker should not present a problem with the DSCQS method.
3. It is believed that short term memory effects may bias the scoring unless there is a 50-50 balance of reference first, reference second, presentations. One suggestion is to do all HRC/sequence tests both ways; however, that makes the test too long. Therefore, careful ordering is required. Due to the relatively small number of tests, to ensure each sequence and HRC is properly spread throughout the test, a manual method must be used following some appropriate set of rules. Although called randomization, this is really something else, as a true random process would have groups of similar items.
4. There is a definition problem in describing the order of events. The word "training" as presently used in the document really means stabilization: real trials at the start of the run whose scores are discarded. Training or demonstration teaches the viewer the basic procedure. This can be done with a separate tape. Presently there are two stabilization trials shown at the beginning of each of three tapes. Since time is available on the tapes, it is recommended this be increased to five. A new list of defined terms was developed to be offered to VQEG.
5. It was agreed that the monitor can be either 19" or 20" and need not be a specific Sony model.
6. Voting should be after the viewing of both A/B sequence pairs. Some viewers tend to score before the presentation of the last of the four sequences.
7. Due to the lack of experimental data regarding more precise monitor setup, Recommendation 500 requirements should be used.
8. Screening of observers' scores, for possible elimination, will be per Annex 2 of Recommendation 500.
9. Lab to lab data analysis may be extended to include the outlier ratio as defined in the objective test plan. JWP 10-11Q will propose additional statistical methods by the end of December.
10. Possible use of expert viewers created a very heated discussion. There is agreement on not using expert viewers in the basic VQEG test; the question was whether an overlay of testing should be added. It was finally agreed not to do so because it would confuse analysis of the data. It would also appear to change the rules for proponents after they have submitted their models. There was some discussion about doing expert trials if the non-expert data did not provide a clear winner. It's not clear such an approach would be agreed to even after the present tests are complete. The idea of using experts in the future is strongly (but not unanimously) supported. This falls into two categories: expert viewing using present Rec. 500 procedures, and critical viewing using either expert or non-expert viewers with a different procedure. Such a procedure might include viewer switching between source and processed sequence for an unlimited period of time, with the results being a ranking rather than the more traditional subjective score. JWP 10-11Q will consider definition of such procedures for future work.
11. Another issue that was hotly debated is the number of trials for each viewer. Remember we have four basic tests, high/low quality and 50/60 Hz. Each lab will only do either 50 or 60 Hz tests, so the discussion relates to whether one or two sets of viewers do the high and low quality tests. One point of view says viewers should see both ranges of quality since we are looking for one objective model to cover the entire range. Those holding this view are also concerned that the overlap HRCs would show different results for two sets of viewers, although the DSCQS method is supposed to eliminate such contextual effects. The argument against using one set of viewers for both tests is that they become fatigued and the scoring will drift as they become trained on what to look for. After hours of debate it was decided to let each lab make its own decision on this matter.
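As an illustration of item 3 only, the kind of rule-based ordering discussed there might be sketched as follows. This is not the procedure adopted by the task force or VQEG, which did not specify its rules in this report. The function name, the seed, the restart limit and the two rules used here (consecutive trials never share a sequence or an HRC, and reference-first presentations are as close to half of the trials as possible) are all assumptions made for the sketch.

```python
import random
from itertools import product


def build_presentation_order(sequences, hrcs, seed=0, max_attempts=1000):
    """Return a list of (sequence, hrc, reference_first) trials.

    Illustrative rules (assumptions, not taken from the VQEG test plan):
      * consecutive trials never share a sequence or an HRC;
      * reference-first trials make up half of the total
        (rounded up when the number of trials is odd).
    """
    rng = random.Random(seed)
    pairs = list(product(sequences, hrcs))

    for _ in range(max_attempts):
        remaining = pairs[:]
        rng.shuffle(remaining)
        order = []
        while remaining:
            # Keep only trials that differ from the previous one
            # in both sequence and HRC.
            candidates = [
                p for p in remaining
                if not order
                or (p[0] != order[-1][0] and p[1] != order[-1][1])
            ]
            if not candidates:
                break  # dead end: discard this attempt and restart
            choice = candidates[0]
            remaining.remove(choice)
            order.append(choice)

        if not remaining:
            # Assign reference position with a (near) 50-50 balance.
            flags = [i % 2 == 0 for i in range(len(order))]
            rng.shuffle(flags)
            return [(s, h, f) for (s, h), f in zip(order, flags)]

    raise RuntimeError("no valid order found; relax the rules")


if __name__ == "__main__":
    # Hypothetical labels; the real test uses its own sequences and HRCs.
    trials = build_presentation_order(
        ["seq1", "seq2", "seq3", "seq4"], ["hrc1", "hrc2", "hrc3", "hrc4"]
    )
    for seq, hrc, ref_first in trials:
        print(seq, hrc, "reference first" if ref_first else "reference second")
```

A restart-on-dead-end search is used here because it is simple to check by hand, which seems in keeping with the manual method the task force had in mind; a lab would substitute its own rules.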
Documents
Soft copies of the 10-11Q documents are available. Hard copies of the Temp and Info documents are available, except for Temp/5 and Temp/6, which are available as soft copies. Lists of questions and recommendations are shown at the end of this report.
10-11Q/1 Working Party 11B, Liaison Statement to JWP 10-11Q, Adaptive Image Quality Control in Future TV Systems. Describes the Mole technology. It is not clear what it has to do with quality assessment.
10-11Q/2 Working Party 11B, Liaison Statement to JWP 10-11Q, Assessment and Optimization of Quality of Colour Reproduction in Television. States that color plays an important role in picture quality.
10-11Q/3 Working Party 11A, Liaison statement to JWP 10-11Q, Working On A Preliminary Draft new Recommendation: Assessment and Optimization of Quality of Colour Reproduction in Television. A tutorial on color systems and use in various television systems, calls for study more than providing answers.
10-11Q/4 Special Rapporteur for the co-ordination of the activities of VQEG and JWP 10-11Q, Report on the Second Meeting of the Video Quality Experts Group (VQEG) (Gaithersburg, Maryland, 27-29 May 1998).
10-11Q/5 International Organisation for Standardisation - ISO/IEC JTC1/SC29/WG11, Liaison to ITU-R JWP 10-11Q, Coding of Moving Pictures and Audio on MPEG-4 audio verification tests. A detailed description of the MPEG-4 audio verification test plans.
10-11Q/6 Coding of Moving Pictures and Audio, Reply to liaison statement of ITU-R JWP 10-11Q on MPEG-4 video verification tests.
10-11Q/7 Special Rapporteur, Video Quality Experts Group: VQEG Subjective Test Plan.
10-11Q/8 Status of Texts of Study Group 11, Working Parties and Task Groups
10-11Q/9 The New Approaches to Quality Assessment and Measurements in Digital Broadcasting. This is a statement by the chairman of SG11. Multimedia and internet are mentioned along with the need for new subjective and objective test methods.
10-11Q/10 Evaluation of New Methods for Objective Testing of Video Quality: Objective Test Plan. The latest version of the VQEG objective plan.
10-11Q/11 Preparation of the First JWP 10-11Q Meeting. This is a list of prior and ongoing work from the previous groups plus proposed organization for work for the JWP.
10-11Q/12 BR, Secretariat Information Document. This discusses the changes made to ITU-T SG9 questions so as to not overlap with ITU-R activities. This seems more cosmetic than practical. There is no mention of the work going on in ITU-T SG12 which is working on quality evaluation for internet transmission.
10-11Q/13 List of Documents
10-11Q/14 Short Status Report on ITU-R TG10/4 Activities. This group has developed an audio objective testing method said to work well at high quality levels such as 64 kb/s and up for MPEG-1 layer 2. At lower quality levels there is less correlation; however, that may be due to inaccuracies in the subjective methods. The draft recommendation is ITU-R 10/BL/9.
Info/1 Levels of Quality of Television and Multimedia Images. This is an adaptation and extension of the video classes developed at T1A1.5 and VQEG. It was decided this proposed recommendation is not critical to our work and changes as new applications are developed. Therefore it will be used as part of the chairman's report rather than submitted for approval.
Info/2 The New Strategy of TV Image and TV Broadcasting Quality Assessment. A suggestion for organization of our work and possible modifications to Recommendation 500.
Info/3 Special Rapporteur for WP10C to AES SC-06-64. Defines AES-X74 Recommended Practices for Internet Audio Quality Descriptions.
Temp/1, 2, 3 (not approved)
Temp/4 Text for the Chairman's Report, The new strategy of TV images and TV broadcasting quality assessment. That is, new work for modification of Rec. 500.
Temp/5 Liaison Statement to ITU-T SG12, Development of Recommendations for Subjective and Objective Assessment of Television Audio and Video. Appoints D. Fibush as Special Rapporteur for SG12.
Temp/6 Liaison Statement to SMPTE Committee on Television Compression Technology, C24. Requests progress reports and contributions relating to test materials.
Temp/7 Chairman's Report for JWP 10-11Q.
Temp/8 Task Group on Audio. Reviews status and plans for audio recommendations.
Temp/9 Draft Annex to the Chairman's Report, Draft Proposal for Modification to Rec. 500, A Novel Method for Error Robustness Evaluation in Video Communication: Double Stimulus Continuous Quality Evaluation. This method uses two monitors to provide a reference for continuous quality evaluation. It was developed based on experiments performed at CSELT, FUB and CCETT and is being used for MPEG-4 evaluation. (The body of the document was removed since it had not been provided as an input contribution prior to the meeting.)
Temp/10 Liaison Statement to ITU-T SG9, Development of Recommendations for Subjective and Objective Assessment of Television Audio and Video. Appoints Mrs. Alina Karwowska-Lamparska as Special Rapporteur for SG9.
Temp/11 Task Group on VQEG. The official output of the task force discussed above.
Temp/__ Proposal for New Draft Question, Methodologies for Subjective Assessment and Optimization of Audio and Video Quality. This will be the significant new work on combined audio/video quality assessment.
Next Meetings
ITU-T SG12 Nov 30 - Dec 3, 1998
VQEG tentatively in March, very likely to be delayed
JWP 10-11Q May 17-21
ITU-R WP11B (digital coding and interconnect) May 24-28
List of Questions for WP10-11Q
| 10C | 85-2/10 | Subjective assessment of sound quality in broadcasting using digital techniques |
| 10C | 106-1/10* | Subjective assessment of sound quality |
| 10-4 | 210/10** | Objective perceptual quality assessment methods |
| 10C | 220/10 | Subjective assessment of small, medium and large impairments in sound quality |
| 10C | Draft New Question | Calibration of the listening level for headphones |
| 11E | 64-4/11 | Objective quality parameters and associated measurement and monitoring methods for digital television signals |
| 11E | 211-2/11 | Subjective assessments of the quality of television pictures including alphanumeric and graphic pictures |
| 11E | 234/11 | Subjective assessment of stereoscopic television pictures |
| 11E | 257/11 | Relationship between quality, quality evaluation methodology, and type of application, in a multimedia environment |
WP 11E and WP10C Recommendations
| Rec. ITU-R BS.562-3 1* | Subjective assessment of sound quality |
| Rec. ITU-R BS.644-1 * | Audio quality parameters for the performance of a high-quality sound-programme transmission chain |
| Rec. ITU-R BS.1283 | Subjective assessment of sound quality - A guide to existing recommendations (To be published) |
| Rec. ITU-R BS.1284 | Methods for subjective assessment of sound quality - General requirements (To be published) |
| Rec. ITU-R BS.1285 | Pre-selection methods for the subjective assessment of small impairments in audio systems (To be published) |
| Rec. ITU-R BS.1286 | Methods for the subjective assessment of audio systems with accompanying picture (To be published) |
| Draft new Rec. ITU-R BS. | Method for objective measurements of perceived audio quality (to be published) 10/BL/8 |
| Rec. ITU-R BS 1116-1 | Methods for the subjective assessment of small impairments in audio systems including multichannel sound systems |
| Rec. ITU-R BS 1285 | Simplified-and Preselection Methods for the Subjective Assessment of Small Impairments in Audio Systems |
| Rec. ITU-R BS 1283 | Methods for the subjective assessment of sound quality - A Guide to Existing Recommendations |
| Rec. ITU-R BS 1284 | Methods for the subjective assessment of sound quality - general requirements |
| Rec. ITU-R BS 1286 | Methods for the subjective assessment of audio systems with accompanying picture. |
| Rec. ITU-R BT.1127 2* | Relative quality requirements of television broadcast systems |
| Rec. ITU-R BT.500-8 | Methodology for the subjective assessment of the quality of television pictures (To be published) |
| Rec. ITU-R BT.654 * | Subjective quality of television pictures in relation to the main impairments of the analogue composite television signal |
| Rec. ITU-R BT.1128-2 | Subjective assessment of conventional television systems (To be published) |
| Rec. ITU-R BT.811-1 * | The subjective assessment of enhanced PAL and SECAM systems |
| Rec. ITU-R BT.710-3 * | Subjective assessment methods for image quality in high-definition television (To be published) |
| Rec. ITU-R BT.1129-2 | Subjective assessment of standard definition digital television (SDTV) systems (To be published) |
| Rec. ITU-R BT.812 * | Subjective assessment of the quality of alphanumeric and graphic pictures in Teletext and similar services |
| Rec. ITU-R BT.813 * | Methods for objective picture quality assessment in relation to impairments from digital coding of television signals |
| Rec. ITU-R BT.814-1 * | Specifications and alignment procedures for setting of brightness and contrast of displays |
| Rec. ITU-R BT.815-1 * | Specification of a signal for measurement of the contrast ratio of displays |
| Rec. ITU-R BT.1210-1 | Test materials to be used in subjective assessment (to be published) |
| Rec. ITU-R BT XX | Assessment of the Picture Quality of Multi-Programme Services |
| Rec. ITU-R BT XX | Subjective Assessment of Stereoscopic Television (May 1998) |
*These Recommendations can be found in the 1994 BS Series Volume
*These Recommendations can be found in the 1994 BT Series Volume