Validating TIFF files with DPF Manager
Version of 24 Nov 2022, 13:16
File format validation verifies whether a digital file's contents and structure satisfy the requirements set by that file format's specification.
DPF Manager is a particularly user-friendly open source tool for checking TIFF files, with a simple interface to show whether your TIFF file satisfies the right TIFF specification. And if your file does not satisfy these requirements, the tool also explains why not.
Why validate?
Validating file formats is very important for long-term preservation. One major stumbling block when developing a digital preservation strategy is that we often lack a clear overview of the file formats in our digital archive, even though such an overview is needed to check regularly whether the files can still be opened with available software. After all, that software might cease to exist in the future. File identification and validation can help you detect in good time whether a format is at risk of becoming obsolete, so you can take action by converting the affected files into a different format.
It's also important to check that files delivered in an outsourced digitisation assignment satisfy the set quality requirements.
When to validate?
Quality requirements are set in advance of a digitisation project, for example with regard to which file format to use. Guidelines in the High-quality text and image content digitisation article recommend using Uncompressed baseline TIFF v6.0. When the digitisation is complete, you should therefore check that the TIFF files received satisfy this specification. If errors are discovered during file validation at that stage, the digitisation company can still convert the files into the correct format.
So you're not just checking that files with a .tif extension are actually TIFF files, but also that they satisfy the requirements of the Uncompressed baseline TIFF v6.0 specification. The file's structure is analysed for errors introduced when the file was created, which could prevent some software from reading it.
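The first of these checks, whether a .tif file really is a TIFF file, can be sketched in a few lines of Python. This is only an illustration of the idea and not how DPF Manager works internally: it relies on the fact that a classic TIFF file starts with a byte-order mark ('II' or 'MM') followed by the number 42. The function names are hypothetical, and passing this check says nothing about whether the file satisfies the baseline TIFF v6.0 specification.

```python
# Classic TIFF signatures: byte-order mark plus the number 42
# in little-endian ('II') or big-endian ('MM') order.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")


def looks_like_tiff(header: bytes) -> bool:
    """Return True if the first four bytes match a classic TIFF signature."""
    return header[:4] in TIFF_SIGNATURES


def file_looks_like_tiff(path: str) -> bool:
    """Read the first four bytes of a file and compare them with the TIFF signature."""
    with open(path, "rb") as handle:
        return looks_like_tiff(handle.read(4))
```

A file that fails this check is not a TIFF file at all, whatever its extension; a file that passes still needs full validation with a tool such as DPF Manager.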
DPF Manager for TIFF file validation
There is a DPF Manager tutorial on YouTube (https://www.youtube.com/watch?v=4rPFfjxKTO4).
Install DPF Manager
Download and install DPF Manager (http://dpfmanager.org/). It is available for Windows, macOS and Linux.
Select files to validate
Open the DPF Manager program on your computer.
Drag the folder containing the TIFF files that you want to validate to the 'Files/Folders' window.
...or click 'Select' to choose the folder containing the TIFF files for validation.
Select the 'Default' option, and click the 'Full check' button.
The 'Tasks' window opens below, where you can follow the progress of the validation process. The validation is finished when the bar is fully green. Close the window by clicking on 'Tasks' at the bottom left.
Analyse the results
When the validation is complete, you can view the report with validation results by clicking on 'Reports' in the menu bar at the top.
You will then see a general overview showing:
- when the validation was performed;
- how many TIFF files were validated;
- which folder was validated;
- how many errors were detected;
- how many warnings there are;
- how many TIFF files passed the validation;
- the score.
Click the folder symbol to go straight to the reports. You can check the results by clicking on the corresponding line in the overview.
You will then see an overview of the results per file. This shows a summary of the general report for the entire folder at the top, followed by summaries of the reports for the individual TIFF files. The overview shows you, for each TIFF file:
- a colour code indicating whether the validation was successful;
- which files have been validated;
- how many errors were detected;
- how many warnings there are.
Click on the HTML symbol to see a brief visual summary of the validation results for the entire folder.
All reports, both for the entire folder and for individual TIFF files, are available in four file formats: HTML, PDF, XML and JSON. Simply click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol. For the validation report for an individual TIFF file, click on the 'HTML', 'PDF', 'XML' and/or 'JSON' symbol next to that file.
The HTML validation report for the entire folder
An example validation report for a folder of TIFF files without any errors is available as a PDF (report_folder.pdf).
The HTML validation report for an individual file
An example validation report for an individual TIFF file without any errors is available as a PDF (10-GVH-19710101-0001.tif.pdf).
Example error messages
Not all file validations result in a report without any error messages. You will find a number of example error messages, with solutions for correcting them, below.
Example 1: use of special characters
The validation report indicates that the TIFF file does not comply with baseline TIFF v6.0 specifications. The error message is 'Only 7-bits ASCII-codes are accepted'. Hover your cursor over the error message to see more details.
ASCII is a character encoding for letters, numbers and punctuation marks. It consists of 128 characters in total, and you can find an overview on Wikipedia (https://en.wikipedia.org/wiki/ASCII). The error message indicates a problem with the embedded metadata in 'tag 33432 Copyright'. You can find the value of this tag higher up in the report, in the list of IFD tags: '© Rony Vissers'. The copyright symbol is not a 7-bit ASCII character, and that's the reason for the error message.
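To see which characters in a metadata value fall outside 7-bit ASCII, a small Python check along the following lines can help. It is a minimal sketch that works on a plain text value you have already copied from the report; the function name is hypothetical, and the sketch does not read the TIFF file itself.

```python
def non_ascii_characters(value: str) -> list[tuple[int, str]]:
    """Return (position, character) pairs for characters outside 7-bit ASCII."""
    return [(index, char) for index, char in enumerate(value) if ord(char) > 127]


print(non_ascii_characters("© Rony Vissers"))           # [(0, '©')]
print(non_ascii_characters("copyright: Rony Vissers"))  # []
```

An empty list means every character has a code between 0 and 127, so the value should not trigger this particular error.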
Fortunately, it's easy to rectify. If you open the file with image editing software (e.g. Adobe Photoshop or GIMP) and view the embedded metadata, you can simply change '© Rony Vissers' to 'copyright: Rony Vissers'. You can access the embedded metadata in Adobe Photoshop by clicking on 'File info' in the 'File' menu. In GIMP, access the embedded metadata by clicking on 'Metadata' in the 'Image' menu, and then 'Edit Metadata'. Don't forget to save the updated TIFF file once you have modified it. See also the embedded metadata article for information about modifying embedded metadata.
When you check the updated TIFF file with DPF Manager, you will see that the previously reported error has disappeared and the file is now valid.
If the TIFF files are the result of a digitisation project carried out by a specialist digitisation company, ask them to fix the errors rather than doing it yourself.
Example 2: use of compression
Even though TIFF is mainly known as a file format without compression, it does support compression. Compression is not recommended for digitisation, however. DPF Manager can detect TIFF files that have been compressed.
Here is a validation report from the same image: saved without compression on the left, and saved with JPEG compression on the right. The TIFF file with JPEG compression gives an error message.
To fix this error, perform the capture or scan again and save the result as Baseline TIFF v6.0 without compression. Alternatively, if the RAW file used to create the TIFF file is still available, you can use that to create a new Baseline TIFF v6.0 without compression.
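For a quick scripted check before running a full validation, the compression setting can also be read directly from the file. In TIFF 6.0 it is stored in tag 259 (Compression), where the value 1 means no compression and 1 is also the default when the tag is absent. The sketch below is an illustration under simplifying assumptions, not DPF Manager's own method: the function name is hypothetical, it only handles classic (non-BigTIFF) files, it only looks at the first IFD, and it does no bounds checking on damaged files.

```python
import struct

COMPRESSION_TAG = 259  # TIFF 6.0 'Compression' tag; value 1 means no compression


def read_compression(data: bytes) -> int:
    """Return the Compression value from the first IFD of a classic TIFF.

    Returns 1 (uncompressed) when the tag is absent, the TIFF 6.0 default.
    """
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for index in range(entry_count):
        # Each IFD entry is 12 bytes: tag, field type, count, value/offset.
        start = ifd_offset + 2 + index * 12
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag == COMPRESSION_TAG:
            # Compression is a single SHORT held in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `read_compression` returns 1 for an uncompressed file; any other value means some form of compression was applied.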