It seems that all metadata standards have their own unique problems. This page documents some significant structural problems found in some current metadata and file format specifications, and gives possible solutions to these problems. [Also see my Commentary on Meta Information Formats.]
A significant problem of the 1992 TIFF 6.0 specification is that there is no way to distinguish an IFD (image file directory) offset from a simple integer value. As a result, new IFD's cannot be created without risking corruption of the files by unaware software, which will copy the offset as a plain number without moving the data it points to. This is not only a problem for proprietary maker notes which commonly use a TIFF IFD structure, but is also a problem for extensibility of TIFF-based RAW image formats (as demonstrated by the DNG 1.3 specification -- see below).
A simple solution:
Use a TIFF field type of 13 (IFD) instead of 4 (LONG) for IFD offsets. This was first proposed in 1995 by Adobe in their PageMaker 6.0 TIFF Technical Notes[2], but unfortunately it never found its way into the TIFF specification. Even so, Olympus Optical Co. has shown some intelligence and is using this field type in the maker notes of their recent digital cameras.
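To illustrate the point, here is a minimal Python sketch of how a reader might tell the two cases apart once field type 13 is used. The 12-byte entry layout is standard TIFF; the tag number 0xABCD and the helper names are hypothetical and exist only for this demonstration.

```python
import struct

TIFF_LONG = 4  # TIFF 6.0: 32-bit unsigned integer
TIFF_IFD = 13  # PageMaker 6.0 TIFF Technical Notes: 32-bit IFD offset


def parse_ifd_entry(entry: bytes, byte_order: str = "<"):
    """Unpack one 12-byte IFD entry into (tag, field_type, count, value)."""
    return struct.unpack(byte_order + "HHII", entry)


def is_ifd_pointer(field_type: int) -> bool:
    """True only when the field type itself marks the value as an IFD offset."""
    return field_type == TIFF_IFD


# 0xABCD is an arbitrary, hypothetical tag number used only for this demo.
as_long = struct.pack("<HHII", 0xABCD, TIFF_LONG, 1, 0x1000)
as_ifd = struct.pack("<HHII", 0xABCD, TIFF_IFD, 1, 0x1000)

for raw in (as_long, as_ifd):
    tag, field_type, count, value = parse_ifd_entry(raw)
    kind = "IFD offset" if is_ifd_pointer(field_type) else "opaque value"
    print(f"tag 0x{tag:04X}: type {field_type}, value 0x{value:X} -> {kind}")
```

With type 4 the software has to know the tag in advance to realize that 0x1000 is a pointer. With type 13 even software that has never seen the tag can recognize the offset and either follow it or relocate the data safely.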
Another useful addition: (added 2009-09-27)
A number of camera and cell phone manufacturers (Concord, Kodak, Motorola, Nokia, Olympus, Pentax, Ricoh, Samsung and Sony) leave blank IFD entries in the maker notes of images from some models. Presumably this simplifies the embedded software by allowing the output file structure to be kept constant even when the number of maker note IFD entries changes. It would be useful if this feature were added explicitly to the official TIFF specification by defining a field type of 0 as a "free IFD entry" to be ignored. (Note that this ability already exists implicitly at a certain level in the specification, which states: "Readers should skip over fields containing an unexpected field type".)
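A reader that honours this proposal might look something like the following Python sketch. Treating field type 0 as a free entry is the suggestion made here, not something TIFF 6.0 defines, and the function name is made up for the example.

```python
import struct

FREE_ENTRY = 0  # proposed "free IFD entry" type; NOT defined by TIFF 6.0


def read_ifd_entries(data: bytes, ifd_offset: int, byte_order: str = "<"):
    """Return the (tag, field_type, count, value) entries of one IFD,
    silently skipping entries whose field type is 0."""
    (num_entries,) = struct.unpack_from(byte_order + "H", data, ifd_offset)
    entries = []
    for i in range(num_entries):
        pos = ifd_offset + 2 + 12 * i  # 2-byte entry count, then 12 bytes each
        entry = struct.unpack_from(byte_order + "HHII", data, pos)
        if entry[1] == FREE_ENTRY:
            continue  # blank slot left by the writer
        entries.append(entry)
    return entries


# Demo IFD: ImageWidth, one blank slot, ImageLength, then the next-IFD pointer.
# The value field is returned as a raw 32-bit integer for simplicity.
ifd = struct.pack("<H", 3)
ifd += struct.pack("<HHII", 0x0100, 3, 1, 640)
ifd += bytes(12)
ifd += struct.pack("<HHII", 0x0101, 3, 1, 480)
ifd += struct.pack("<I", 0)

for tag, field_type, count, value in read_ifd_entries(ifd, 0):
    print(f"tag 0x{tag:04X} type {field_type} count {count} value {value}")
```

The writer keeps a fixed three-slot directory, and the reader sees only the two entries that carry data.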
With the DNG 1.3 specification of June 2009, Adobe added a new Camera Profile IFD referenced by an offset using the standard (and unfortunate) TIFF LONG field type. This means that the new Profile IFD will be lost if the file is rewritten by any software which does not have explicit knowledge of the 1.3 specification. But to make things worse, Adobe didn't even use a standard IFD format for the data. Instead, the IFD begins with a TIFF-style header and uses relative instead of absolute offsets. This would have been a good idea if the IFD was stored as the value of an UNDEFINED tag rather than referenced from a LONG offset. (If done this way, the new information would have been preserved if the file was rewritten by unaware software.) But as implemented it just adds to the pain of parsing the file by requiring even more specialized code to be written in support of the DNG 1.3 format.
A simple solution:
Sack the Adobe developers who were responsible for this, and use field type 13 (as recommended above for TIFF 6.0) and a standard TIFF IFD structure when adding new IFD's in the future.
The EXIF specification has been the standard for digital camera metadata for many years, and while the digital camera technology has advanced significantly in these years, the EXIF specification has not. There are a number of significant problems with the EXIF specification which have never been addressed.
Some current problems with the EXIF specification are:
Simple solutions:
In February 2007 Microsoft proposed a new PhotoInfo tag called "OffsetSchema" (hex. 0xEA1D, dec. 59933) in an attempt to patch a deficiency in the EXIF maker note specification (see point 1 in EXIF 2.2 section above). This tag represents the offset difference between the original maker note location in the EXIF and the new location after editing, and is designed to allow the maker note tag values to be accessed after the location of the maker notes is changed by editing the EXIF. [Bless their little hearts for trying to improve this situation, but while the idea is good the implementation is flawed and ultimately unworkable.]
There are two main problems with the implementation, and the second is a show stopper:
1. For this new tag to be available to a single-pass metadata reader, it must come before the maker note data (hex. 0x927C, dec. 37500). But since the EXIF/TIFF format specifies that tags must be stored in numerical order, the maker note tag (hex. 0x927C) comes before the OffsetSchema tag (hex. 0xEA1D).
2. The OffsetSchema tag will be invalidated by any software that rewrites the EXIF and moves the maker notes without properly updating the tag. In an ideal world all application developers would release an updated version of their software which treats the OffsetSchema properly, and all users would update to this new version. But since this is the real world it just won't happen, which makes the value of OffsetSchema unreliable. Too bad, because this wouldn't have been a problem if Microsoft had specified that the new tag represented the original offset of the maker notes instead of the difference from the original position. With this change, the tag wouldn't need updating when the EXIF is edited, and the information would be much more reliable. The only problem here would be editing software that explicitly changes the maker note offsets. However, software with this ability is rare, and it is more reasonable to ask that the OffsetSchema tag simply be deleted by any software that updates the maker note offsets. (Software must be fairly advanced in the first place if it parses the proprietary maker note data structures and changes these offsets.)
A simple solution:
Create a new tag which comes before the maker notes (hex. 0x927B, dec. 37499 would be good) and represents the original offset of the maker notes.
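The advantage of storing the original offset is easiest to see in code. The Python sketch below assumes maker notes whose internal pointers are absolute offsets as written by the camera; the function name and the numbers are invented for the example, and the proposed tag 0x927B does not exist in any specification.

```python
def relocate_pointer(stored_pointer: int,
                     original_offset: int,
                     current_offset: int) -> int:
    """Translate a pointer stored inside the maker notes to the edited file.

    original_offset: value of the proposed tag (written once, never updated).
    current_offset:  where the reader actually finds the maker notes now.
    """
    return stored_pointer + (current_offset - original_offset)


ORIGINAL_OFFSET = 0x300  # hypothetical value of the proposed 0x927B tag
STORED_POINTER = 0x340   # hypothetical pointer found inside the maker notes

# The maker notes are moved twice by editors that know nothing about the tag.
for current_offset in (0x300, 0x500, 0x480):
    fixed = relocate_pointer(STORED_POINTER, ORIGINAL_OFFSET, current_offset)
    print(f"maker notes at 0x{current_offset:X}: pointer resolves to 0x{fixed:X}")
```

The reader always knows the current maker note location because it has just followed the 0x927C entry to get there, so the shift can be recomputed after any number of edits. A stored difference, by contrast, is only correct if every editor along the way updated it.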
The JPEG File Interchange Format version 1.02 was released in 1992. The biggest structural problem with this standard is that metadata in these files is stored in segments which have a maximum size of 65533 bytes. This limit has necessitated a number of creative solutions, each creating complications and problems of its own. [See my comments on the PreviewImage problem for example.]
A simple solution:
Since the value of the segment size word includes the 2 bytes of the segment size word itself, a value of 0 or 1 is not allowed by the current JPEG standard. The standard could be enhanced so a value of 1 indicates an extended JPEG segment where the 2-byte size word (with value 0x0001) is followed immediately by a 4-byte integer giving the size of the extended JPEG segment. This would allow segment sizes of up to 4294967291 bytes (assuming the size includes these 4 bytes). Further, a value of 0 could be defined for an 8-byte integer if one really wanted to support huge metadata segments. A JPEG file using either extension would break all current JPEG readers and writers, but the change is trivial and could easily be implemented.
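As a rough sketch of how little code this would take, the following Python function reads a size word under the proposed scheme. Nothing here is part of any JPEG standard: the extended forms are the proposal above, and the assumption that the 8-byte size also counts its own 8 bytes is mine, by analogy with the 4-byte case.

```python
import struct


def read_segment_size(data: bytes, pos: int):
    """Return (payload_start, payload_length) for a segment whose 2-byte
    size word begins at pos. JPEG integers are big-endian."""
    (size,) = struct.unpack_from(">H", data, pos)
    if size >= 2:
        # Standard segment: the size word counts its own 2 bytes.
        return pos + 2, size - 2
    if size == 1:
        # Proposed: a 4-byte size follows and counts its own 4 bytes.
        (ext,) = struct.unpack_from(">I", data, pos + 2)
        return pos + 6, ext - 4
    # Proposed (size == 0): an 8-byte size follows, assumed to count itself.
    (ext,) = struct.unpack_from(">Q", data, pos + 2)
    return pos + 10, ext - 8


standard = b"\x00\x05abc"
extended = b"\x00\x01" + struct.pack(">I", 70000 + 4)

print(read_segment_size(standard, 0))
print(read_segment_size(extended, 0))
print(0xFFFFFFFF - 4)  # largest payload of a 4-byte extended segment
```

The last line reproduces the 4294967291-byte limit quoted above.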
An alternative solution:
Define a new application marker segment which uses a 4-byte size word. This technique is already used for the extended JPEG2000 codestream MCT, MCC and MIC marker segments.
In February 2009 CIPA[10] released a "Multi-Picture Format" standard for storing large images in JPEG files. This format is yet another attempt to bypass the JPEG segment-size limit (see above) to store large preview images. But again, there is a significant problem with this standard: Pointers in the new APP2 MPF segment use offsets relative to the start of the MPF header in this segment to reference image data after the JPEG EOI. Unfortunately, these offsets are quickly broken if any data after the MPF segment changes length. This problem could have been avoided if offsets had been specified relative to the end of file, but it is too late for this now that the specification is public. However, another problem is that information after the JPEG EOI is often discarded by software when the file is edited.
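The difference between the two offset conventions can be shown with a few lines of Python. The file layout and numbers below are invented for the illustration, and the end-of-file convention is the alternative suggested above, not anything in the CIPA standard.

```python
def resolve_from_mpf_header(mpf_header_pos: int, stored_offset: int) -> int:
    """MPF convention: image position = MPF header position + stored offset."""
    return mpf_header_pos + stored_offset


def resolve_from_end_of_file(file_size: int, stored_distance: int) -> int:
    """Alternative convention: image position = file size - stored distance."""
    return file_size - stored_distance


# Hypothetical original file: MPF header at 100, second image at 5000,
# total size 9000.
mpf_header_pos, image_pos, file_size = 100, 5000, 9000
stored_offset = image_pos - mpf_header_pos  # 4900, written into the MPF segment
stored_distance = file_size - image_pos     # 4000, the alternative encoding

# An unaware editor grows a segment located AFTER the MPF segment by 200 bytes.
growth = 200
image_pos += growth
file_size += growth

print(resolve_from_mpf_header(mpf_header_pos, stored_offset), "expected", image_pos)
print(resolve_from_end_of_file(file_size, stored_distance), "expected", image_pos)
```

After the edit the header-relative offset points 200 bytes short of the image, while the end-relative distance still lands on it. The end-relative form would of course break in turn if data after the image changed length, but trailing image data is not what metadata editors normally resize.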
A possible work-around:
Enforce the rule that the MPF APP2 segment must come after all other APP segments. (It would have been smart if this had been specified in the CIPA standard, but sadly this wasn't the case.) If this is done, then metadata in the other APP segments (EXIF, IPTC, XMP, etc.) can safely be edited without breaking the MPF offsets, because everything from the MPF header onward shifts together. I suggest that all metadata editors employ this strategy, regardless of the segment order specified in the standard (which says that the MPF APP2 segment must come immediately after the EXIF APP1 segment).
Unfortunately this work-around has the same problems as the Microsoft OffsetSchema tag because the MPF information may easily be invalidated by an unaware editor, and it doesn't address the problem of losing data stored after the JPEG EOI.
A simple solution:
Change the JPEG specification to allow larger segments (as mentioned above in the JPEG section), and change the MPF specification to store all information inside a JPEG segment.
[2014-04-29: CIPA has changed the location of these standards documents, so the URL's referenced here are now broken.]