The structure of a DICOM file, the DICOM tags
As we already know, a DICOM file storing one image contains the image data, data belonging to the patient (name, age, etc.), data belonging to the examination (date of acquisition, manufacturer, etc.) and identifiers: the study UID, the series UID and the image UID.
The software that interprets the image has to be able to find, first of all, the part of the DICOM file containing the image, and also the identifiers and the other data contained in the file. For this purpose every piece of data in a DICOM file is labeled with a tag. By convention a tag is written as two four-digit hexadecimal numbers (the group number and the element number) enclosed in parentheses and separated by a comma. Each such pair uniquely identifies a specific DICOM field. For instance, the tag (0010,0010)
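is the identifier of the patient's name: "ten-ten is the patient name", as DICOM experts would say. The last thing we have to learn is how the data, in this case the patient name, is attached to its tag. Inside the file the tag is not stored as text with parentheses, and there is no closing tag. Each data element starts with the tag, stored as two 16-bit numbers, followed by the length of the value and then the value itself. Depending on the encoding (the transfer syntax), a two-letter code for the type of the value (the value representation, such as PN for a person's name) may sit between the tag and the length.
Within a patient name, ^ is another special character: it separates the components of the name, such as the family name and the given name. When the computer reads a tag, it learns which field comes next; the length then tells it how many bytes of value to read and where the following tag begins. If the lengths do not add up, or the file ends in the middle of a value, there must be an error in the DICOM file. If no error is found, one DICOM data element, like the patient name, has been learned. After decoding all the data, such as the patient's birth date and the part containing the actual image, the computer is able to display the result as an image on the screen.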
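The reading of one data element can be illustrated with a minimal Python sketch. It assumes the Explicit VR Little Endian encoding and a value type with a 16-bit length field (such as PN). The function name read_element and the patient name DOE^JOHN are made up for this illustration and do not come from any real file or library.

```python
import struct

def read_element(buf, offset=0):
    """Read one data element (assumed: Explicit VR Little Endian, short VRs only)."""
    # Tag: two unsigned 16-bit integers, the group number and the element number.
    group, element = struct.unpack_from("<HH", buf, offset)
    # Value representation: two ASCII letters, e.g. PN for a person's name.
    vr = buf[offset + 4:offset + 6].decode("ascii")
    # Value length: unsigned 16-bit integer for VRs such as PN, DA, CS, LO.
    (length,) = struct.unpack_from("<H", buf, offset + 6)
    start = offset + 8
    value = buf[start:start + length]
    # The last item is the offset at which the next tag begins.
    return (group, element), vr, value, start + length

# Build a made-up Patient's Name element so that there is something to parse.
name = b"DOE^JOHN"  # hypothetical name, even number of bytes
raw = struct.pack("<HH", 0x0010, 0x0010) + b"PN" + struct.pack("<H", len(name)) + name

tag, vr, value, next_offset = read_element(raw)
tag_text = "({:04X},{:04X})".format(*tag)       # conventional written form of the tag
family, given = value.decode("ascii").split("^")
print(tag_text, vr, family, given)
```

A real reader would call such a function repeatedly, starting each time at the returned offset, and would also have to handle the value types that use a longer length field.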
Here is a decoded segment of the DICOM information found in a DICOM file:
- Dicom-File-Format
- Dicom-Meta-Information-Header
- Used TransferSyntax: LittleEndianExplicit
(0002,0000) UL 182 # 4, 1 MetaElementGroupLength
(0002,0001) OB 00\01 # 2, 1 FileMetaInformationVersion
(0002,0002) UI =CTImageStorage # 26, 1 MediaStorageSOPClassUID
(0002,0003) UI 1.3.12.2.1107.5.1.1.20377.20031125114113176.4 # 46, 1 MediaStorageSOPInstanceUID
(0002,0010) UI =LittleEndianImplicit # 18, 1 TransferSyntaxUID
(0002,0012) UI 1.2.826.0.1.3680043.2.93.0.99 # 30, 1 ImplementationClassUID
(0002,0013) SH ERAD_60 # 8, 1 ImplementationVersionName
- Dicom-Data-Set
- Used TransferSyntax: LittleEndianImplicit
(0008,0005) CS ISO_IR 100 # 10, 1 SpecificCharacterSet
(0008,0008) CS ORIGINAL\PRIMARY\LOCALIZER\CT_SOM4 TOP # 38, 4 ImageType
(0008,0016) UI =CTImageStorage # 26, 1 SOPClassUID
(0008,0018) UI 1.3.12.2.1107.5.1.1.20377.20031125114113176.4 # 46, 1 SOPInstanceUID
(0008,0020) DA 20031125 # 8, 1 StudyDate
(0008,0021) DA 20031125 # 8, 1 SeriesDate
(0008,0022) DA 20031125 # 8, 1 AcquisitionDate
(0008,0023) DA 20031125 # 8, 1 ContentDate
(0008,0030) TM 113945.000000 # 14, 1 StudyTime
(0008,0031) TM 114003.384000 # 14, 1 SeriesTime
(0008,0032) TM 114109.299000 # 14, 1 AcquisitionTime
(0008,0033) TM 114109.299000 # 14, 1 ContentTime
(0008,0040) US 0 # 2, 1 ACR_NEMA_OldDataSetType
(0008,0041) LO IMA TOPO # 8, 1 ACR_NEMA_DataSetSubtype
(0008,0050) SH (no value available) # 0, 0 AccessionNumber
(0008,0060) CS CT # 2, 1 Modality
(0008,0070) LO SIEMENS # 8, 1 Manufacturer
(0008,0080) LO SE-AOK RAD.ONKOT.KLIN # 22, 1 InstitutionName
(0008,0090) PN FORGACS # 8, 1 ReferringPhysiciansName
(0008,1010) SH sict04 # 6, 1 StationName
(0008,1030) LO NATIV DR. FORGACS/ TA # 42, 1 StudyDescription
(0008,1090) LO SOMATOM PLUS 4 # 14, 1 ManufacturersModelName
(0009,0010) LO SPI RELEASE 1 # 14, 1 PrivateCreator
(0009,0012) LO SIEMENS CM VA0 CMS # 20, 1 PrivateCreator
(0009,0013) LO SIEMENS CM VA0 LAB # 20, 1 PrivateCreator
(0009,0020) LO SIEMENS CT VA0 IDE # 20, 1 PrivateCreator
(0009,0030) LO SIEMENS CT VA0 ORI # 20, 1 PrivateCreator
(0009,1010) LT SPI VERSION 01.00 # 18, 1 Comments
Each row describes one DICOM field. The first column contains the DICOM tag, the second a two-letter code for the type of the value (the value representation, for example DA for a date or PN for a person's name) and the third the actual content. After the # sign come the length of the stored content in bytes, the number of values the field contains (multiple values are separated by backslashes, as in ImageType, which has 4) and the official DICOM name of the field.
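To make the column layout concrete, here is a small Python sketch that splits one row of the listing into its parts. The regular expression and the function name parse_dump_line are assumptions written for this particular text listing; they are not part of the DICOM standard or of any DICOM library.

```python
import re

# One row of the listing: tag, value type, content, then "# length, count name".
LINE = re.compile(
    r"^\((?P<group>[0-9A-Fa-f]{4}),(?P<element>[0-9A-Fa-f]{4})\)\s+"
    r"(?P<vr>[A-Z]{2})\s+(?P<value>.*?)\s+"
    r"#\s*(?P<length>\d+),\s*(?P<vm>\d+)\s+(?P<name>\S+)$"
)

def parse_dump_line(line):
    """Return the fields of one listing row as a dict, or None for other lines."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    return {
        "tag": (int(m["group"], 16), int(m["element"], 16)),
        "vr": m["vr"],
        "values": m["value"].split("\\"),  # multiple values are backslash separated
        "length": int(m["length"]),
        "vm": int(m["vm"]),
        "name": m["name"],
    }

row = parse_dump_line(
    r"(0008,0008) CS ORIGINAL\PRIMARY\LOCALIZER\CT_SOM4 TOP # 38, 4 ImageType"
)
print(row["name"], row["values"])
```

Header lines such as "- Dicom-Data-Set" do not match the pattern, so the function returns None for them.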
The DICOM standard assigns an official name to each field identified by a tag. For instance, (0008,0060) is the tag identifying the field called Modality; its content above is CT. Another example is StudyDate, the official name of the field containing the date of the study: the tag is (0008,0020) and the content above is 20031125, that is November 25, 2003. As an exercise, find the tag and the DICOM data describing the manufacturer of the modality. What is the tag? What is the official DICOM name? How many characters does the content have? Who is the manufacturer?
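Reading a date such as 20031125 by eye is easy to get wrong, so a program would normally convert it. The following Python sketch decodes the StudyDate and StudyTime values from the listing above. The helper names parse_da and parse_tm are made up for this illustration, and the time parser assumes that the fractional seconds are always present, which the standard does not require.

```python
from datetime import datetime

def parse_da(text):
    """Convert a DICOM date (type DA, written YYYYMMDD) to a date object."""
    return datetime.strptime(text, "%Y%m%d").date()

def parse_tm(text):
    """Convert a DICOM time (type TM) written as HHMMSS.ffffff to a time object.

    Assumption: the fractional part is present, as in the listing above.
    """
    return datetime.strptime(text, "%H%M%S.%f").time()

study_date = parse_da("20031125")       # StudyDate, tag (0008,0020)
study_time = parse_tm("113945.000000")  # StudyTime, tag (0008,0030)
print(study_date.isoformat(), study_time.isoformat())
```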
The most important DICOM information is the sequence of numbers describing the image itself. The image information is also identified by its own tag, (7FE0,0010), called PixelData. Since there are different methods for the numeric representation of the pixels of an image, the DICOM file must also contain data describing which encoding is used. We are not going to go into further details regarding the different numeric image representation methods.
However, there are other DICOM data that the computer needs to find when actually rendering an image. In the following paragraph we are going to learn how DICOM describes the spatial location, orientation and physical size of an image.