Just to make it clear: so far as I know, the COBOL standard does not address, and never has addressed, endianness for any form or encoding of data, from the COBOL report of 1960 through the current FCD, with a single POSSIBLE exception: the intrinsic function REVERSE accepts an alphabetic, alphanumeric, or national character string as its argument and returns data of the same form, with the character order reversed, as its result. That's it. I brought it up as an example/illustration, but from the COBOL standard's point of view, it's a red herring. It may not be for the implementor, but it is for the standard.
Using your concepts, yes, all data is "external". If you're going to add three fields together and put the sum in a fourth field, you need to know that, e.g., the first is packed decimal with an implied two fractional digits, the second is a decimal64 in decimal encoding, the third is a numeric literal representing a floating-point number, and the fourth is a binary64. SOMEBODY has to put them all in the same form in order to add them together, and SOMEBODY has to take the result, in whatever form it's in, and put it into the fourth field. By specifying the form of a "standard intermediate data item", the user tells the compiler what that form is.
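To make the "SOMEBODY has to put them all in the same form" step concrete, here is a rough Python sketch, not COBOL and not anything taken from the FCD. It assumes a decimal128-like intermediate form (34 digits), takes the decimal64 operand as already decoded into a Python Decimal (decoding DPD or BID bits is out of scope here), and uses helper names (`unpack_packed_decimal`, `add_in_intermediate_form`, `store_as_binary64`) that are purely hypothetical.

```python
from decimal import Decimal, Context, ROUND_HALF_EVEN
import struct

# Assumed intermediate form: decimal128-like (34 significant digits).
DECIMAL128 = Context(prec=34, rounding=ROUND_HALF_EVEN, Emin=-6143, Emax=6144)


def unpack_packed_decimal(raw: bytes, scale: int) -> Decimal:
    """Decode packed decimal: two BCD digits per byte, sign in the last nibble.

    `scale` is the number of implied fractional digits.
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign_nibble = nibbles.pop()           # last nibble carries the sign
    sign = 1 if sign_nibble == 0x0D else 0  # 0xD is negative; others treated as positive
    return Decimal((sign, tuple(nibbles), -scale))


def add_in_intermediate_form(packed: bytes, scale: int,
                             decimal_operand: Decimal, literal: str) -> Decimal:
    """Convert every operand to the common form, then add there."""
    a = unpack_packed_decimal(packed, scale)  # packed decimal operand
    b = decimal_operand                       # decimal64 operand, assumed already decoded
    c = Decimal(literal)                      # numeric literal such as "1.5E2"
    return DECIMAL128.add(DECIMAL128.add(a, b), c)


def store_as_binary64(value: Decimal) -> bytes:
    """Put the result into the receiving binary64 field (big-endian chosen arbitrarily)."""
    return struct.pack(">d", float(value))


total = add_in_intermediate_form(b"\x12\x34\x5C", 2, Decimal("0.55"), "1.5E2")
receiving_field = store_as_binary64(total)
```

The point of the sketch is only that each operand needs its own conversion into the intermediate form, and the result needs one more conversion into the receiving field; which intermediate form is used is exactly what the user is telling the compiler.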
Right now, in the draft, we provide for two IEEE forms into which, and out of which, operands are converted on their way to arithmetic: binary128, and decimal128 DPD. I am working hard to get a proposal together to add a third intermediate form, decimal128 BID, and to add the ability explicitly to declare items as being in decimal64 BID and decimal128 BID format.
The original issue that I raised had to do with "What do we do if a datum is described as decimal64 (implicitly DPD), and it actually contains BID?" As was pointed out more than once, you have to know what you're dealing with. If it doesn't match what you've been told it is, the results are undefined, and that's on the back of the sender. But if it's been described to you as decimal64 BID, it's a Really Bad Idea to treat it as decimal64 DPD.
Note that the COBOL standard does not care which "hardware" mode is most suitable for a given implementation. From that standpoint, I believe it would be in the best interests of the COBOL end user, if an implementation is going to provide ONE of the two IEEE decimal forms of arithmetic, that it provide BOTH, regardless of which one it prefers, so that programs imported from a machine that has one form WILL STILL RUN, WITHOUT CONVERSION, on machines that have the other.
No, it's not as efficient to specify a mode of arithmetic (or, if you will, a standard form of numeric conversion for operands) that's different from what the platform is happiest with, because the implementation will have to convert back and forth behind the program's back to get the information into the form the platform recognizes, but YES, the programs that run currently on a machine with one view of IEEE arithmetic WILL STILL RUN on machines with another.
What I'm planning to do is write a proposal to the COBOL working group to ADD explicit support for decimal64 BID and decimal128 BID data descriptions, and to ADD the capability of specifying that the "uniform form" into which arithmetic operands are to be converted may be decimal128 BID (in addition to the decimal128 DPD and binary128 modes now in the FCD). I believe doing this is in the best interest of the end user of the COBOL language, and I believe it satisfies the concerns of those who are not happy that decimal128 BID is not among the "uniform forms" that the FCD proposes to support. Allowing, in addition, the explicit description of "data spaces" as containing information in decimal64 BID and decimal128 BID form resolves the concern I raised originally: what happens if somebody ships you a decimal128 item, and it's in BID encoding, when COBOL only supports DPD?
Now, all I can say to the question "Why does COBOL need this?" is "Trust me. I've been at COBOL for a while now."
And I can go off and write the proposal -- in the perhaps-dim hope that we can actually get it into the next standard without delaying it any further -- or I can continue to try to explain why COBOL cares what form an implementor prefers his arithmetic argument in, to say nothing about further explanations that the COBOL standard doesn't care about endianness.
I think there are a number of participants in this forum that would rather I spend my time and energy -- very limited right now, on a variety of fronts -- on the former task!
> Date: Fri, 25 Feb 2011 09:53:43 -0500
> To: stds-754@xxxxxxxxxxxxxxxxx
> From: hack@xxxxxxxxxxxxxx
> Subject: RE: Two technical questions on IEEE Std 754-2008
> Chuck Stevens wrote:
> > I think COBOL needs separate arithmetic MODES for the two encodings
> > because, in COBOL terms, arithmetic includes "getting the data ready" for
> > the arithmetic operation, along with the arithmetic operation itself.
> Ok, so from the language's point of view ALL data are what I call
> "external", including what in other languages would be local variables.
> At least that's the only way I can conceptualize this.
> This is partially confirmed a bit later (I'll get back to something
> mentioned in-between):
> > > The only issue the language has to worry about is interchange formats,
> > Yes. All we talk about is interchange formats.
> > > and those have to address Endianness, character encoding, and other
> > > details
> > I don't think the standard needs to address these; that's the
> > implementor's job.
> Yes -- and so it is for BID vs DPD. What COBOL needs to know is whether
> the type is STANDARD-BINARY or STANDARD-DECIMAL (and what size thereof).
> > > of the same nature as BID vs DPD for STANDARD-DECIMAL.
> > Well, they're similar -- but in a given implementation, I would
> > think the arithmetic ops would expect the values presented to it
> > to be encoded the same way.
> No -- they are EXACTLY of the same nature. The arithmetic ops are
> not prepared to accept arguments of different Endianness either.
> So if COBOL has a mechanism to deal with data imported from machines
> with a different Endianness, it should be able to use the same mechanism
> to deal with decimal encodings. After all, BOTH issues involve nothing
> more than a blind transformation of bits, knowing the size and logical
> type of the item. Ok -- I'll grant that there is one small additional
> difference: Endianness conversion only needs to know the size of the
> type, but decimal re-encoding needs to know in addition that the field
> holds a DFP item. So yes, the support does not come entirely for free,
> and I can understand that the initial version of new DFP support would
> avoid the issue by only supporting DPD. HOWEVER -- please make sure
> that you don't encumber the LANGUAGE with additional types of arithmetic,
> unless the language already has explicit Endianness declarations as well.
> I'm actually trying to SIMPLIFY the job the COBOL standards committee has
> to do. Not knowing the details of how Endianness differences are handled
> it is difficult to make specific suggestions. So let me describe one way
> to do it, and COBOL experts can perhaps redirect this to match reality.
> Suppose we are importing a record that contains both character fields
> and numeric fields of various sizes. Character fields may be subject
> to Ascii/Ebcdic conversions, and numeric fields (other than perhaps
> packed decimal) may be subject to Endianness conversion.
> Therefore the interface handler must know, for each field, its logical
> type (character, packed decimal, or other numeric) as well as its size
> and offset, as there are three different applicable transformations:
> leave unchanged, translate characters, or reverse Endianness.
> What I'm suggesting is that there be a fourth logical type, namely DFP.
> There would then (for this purpose) be three numerical types instead
> of two: DFP (use DPD->BID if needed), BCD (leave alone), and OTHER (do
> Endianness conversion if needed). The BFP types fall under OTHER for
> this purpose.
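A minimal sketch of the per-field dispatch described in the quoted text above, in Python, under stated assumptions: the field layout, the `FieldKind` names, and `import_record` are all hypothetical, and the actual DPD<->BID re-encoding is left as a caller-supplied hook (`reencode_dfp`) rather than implemented.

```python
from enum import Enum
from typing import Callable, NamedTuple, Optional, Sequence


class FieldKind(Enum):
    CHARACTER = "character"          # translate characters if needed
    BCD = "packed decimal"           # leave alone
    DFP = "decimal floating-point"   # re-encode DPD<->BID if needed
    OTHER = "other numeric"          # reverse Endianness if needed (includes BFP)


class Field(NamedTuple):
    kind: FieldKind
    offset: int
    size: int


def import_record(record: bytes, layout: Sequence[Field], *,
                  source_charset: str, host_charset: str,
                  source_big_endian: bool, host_big_endian: bool,
                  reencode_dfp: Optional[Callable[[bytes], bytes]] = None) -> bytes:
    """Apply the transformation each field's logical type calls for."""
    out = bytearray(record)
    for field in layout:
        raw = record[field.offset:field.offset + field.size]
        if field.kind is FieldKind.CHARACTER:
            # Assumes a single-byte character set on both sides.
            converted = raw.decode(source_charset).encode(host_charset)
        elif field.kind is FieldKind.BCD:
            converted = raw
        elif field.kind is FieldKind.DFP:
            # Hypothetical hook: the real bit-level re-encoding is not shown.
            converted = reencode_dfp(raw) if reencode_dfp else raw
        else:
            converted = raw[::-1] if source_big_endian != host_big_endian else raw
        out[field.offset:field.offset + field.size] = converted
    return bytes(out)


layout = [Field(FieldKind.CHARACTER, 0, 2), Field(FieldKind.BCD, 2, 2),
          Field(FieldKind.OTHER, 4, 4), Field(FieldKind.DFP, 8, 8)]
record = b"\xC8\xC9" + b"\x12\x3C" + b"\x00\x00\x01\x02" + bytes(8)
imported = import_record(record, layout,
                         source_charset="cp037", host_charset="ascii",
                         source_big_endian=True, host_big_endian=False,
                         reencode_dfp=lambda raw: b"\xFF" * len(raw))
```

The handler needs only the logical type, size, and offset of each field plus a description of the source; the DFP case adds one more branch and one more fact about the source (its decimal encoding), which is the extra cost being discussed.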
> Imported files presumably have a magic number or other identifying tag to
> record the origin's character set and Endianness. If COBOL wanted to support
> both BID and DPD encodings for DFP, it would need another indication for this
> -- but if COBOL sticks with a prescribed interchange encoding, this would not
> be necessary.
> What am I missing here? Perhaps the answer is in the "in-between" part I
> promised to get back to:
> > I find it hard to envision that implementations conformant to IEEE 754
> > would allow different encodings for operands to a single arithmetic
> > operation -- on a theoretical fixed-word stack machine, Valuecall
> > (a-binary32), Valuecall (a-decimal128), ADD, for example. If they're all
> > of the same basic type -- Valuecall (a-binary32), valuecall (a-binary128),
> > I'd expect the arithmetic instruction to handle that.
> Well, although IEEE 754 discourages mixed-radix arithmetic it actually
> requires mixed-size arithmetic with a single rounding per "formatOf"
> operation. Same-radix widening (which is what COBOL does) is in fact
> not conforming (but then COBOL never said it would be) if the wide result
> is then narrowed to the target format. As for that hypothetical stack
> machine: the compiler knows the types, and would have an assortment of
> different ADD_format1_to_format2 instructions all corresponding to the
> generic ADD operation of the language. Such languages exist. It is of
> course simpler if the computing model is that ALL operands are converted
> to a single type -- preferably decimal128.
> > We in COBOL need to
> > know the form in which arithmetic operands are expected so we can put them
> > in that form before handing them off to the operations. For a given mode
> > of arithmetic, all operands are converted to the same form, program-wide.
> Right -- and here we come to the crux of the problem: Do these operands
> ALWAYS remain in the original encoding (possibly from an external source)?
> If so, the conversion step I described above must indeed be carried out at
> this point all the time, and there is no need to remember record layout
> because individual fields are being presented. These fields must however
> be presented with size and type information -- and I claim it is sufficient
> to identify DFP types as DFP: the implementation knows its own preferred
> encoding, and (if COBOL requires a DPD-based interchange format) it knows
> the source encoding. (Here Endianness may be more difficult, conceptually,
> because the implementation must somehow find out what the original encoding
> is -- that magic source-id must be accessible.)
> There is one situation where a choice of one or another flavour of DFP
> arithmetic (DFP-DPD vs DFP-BID) makes sense: when both are available
> only as software libraries, and the compiler could call either one.
> Perhaps that is the model that has been in everyone else's mind but
> mine, because I'm familiar with machines that support DFP in hardware.
> But since the two DFP arithmetics produce IDENTICAL results in all
> circumstances (unlike BFP vs DFP vs "native"), I find it difficult to
> understand why one would burden the LANGUAGE with the distinction, as
> opposed to the installation or compilation options.
> P.S. I think I'm running out of steam on this issue...
> It would perhaps help if somebody could tell me how COBOL
> implementations deal with Endianness and character-code
> issues, so I wouldn't have to guess like I did above.
> ---Sent: 2011-02-25 16:17:03 UTC