Use of Hash Values in OSINT Investigations: A Beginner's Guide
A beginner-friendly guide to using MD5 and SHA in OSINT investigations.
Paul Wright & Neal Ysart
3/12/2025 (15 min read)


Keeping digital data safe is vital in Open-Source Intelligence (OSINT) investigations, the techniques of which continue to evolve rapidly. The cryptographic hashing algorithms Message Digest Algorithm 5 (MD5[1]) and Secure Hash Algorithm 1 (SHA-1[2]) were once considered secure; however, their weaknesses are now widely recognised. Despite these concerns, they remain valuable in certain situations and can still be used to help demonstrate the integrity of digital evidence. SHA refers to a family of cryptographic hash functions (not encryption algorithms) that are used to hash information in digital signatures, certificates, and cryptocurrencies.
Previously used extensively, SHA-1 is now recognised as weak and has been superseded by the SHA-2 family, whose variants, such as SHA-256 and SHA-512[3], offer stronger security.
This article examines the use of MD5 and SHA in OSINT investigations, compares their advantages and disadvantages, and provides a step-by-step guide to hashing.
FORENSIC INVESTIGATION
Digital forensic investigators and analysts follow a systematic and structured approach to uncover evidence. However, experienced practitioners always make it a priority to stay informed about advancements in digital forensic technologies. This is especially important in incident response scenarios, where understanding and leveraging new tools and techniques is critical for successful forensic and OSINT investigations.
Many essential steps are required to help ensure that any investigation or response activity is legally compliant. One highly significant factor is the security of digital evidence, as in most cases, the success of an investigation could depend on being able to defend how the digital evidence was collected and stored.
IMPORTANCE OF MAINTAINING THE INTEGRITY OF DIGITAL DATA
Preserving the integrity of digital evidence, and being able to prove it, is vital in any legal case. If evidence is changed for whatever reason, its credibility is reduced, which can significantly affect the outcome of the case.
The proper management of digital evidence requires formal preservation and containment techniques. In the same way OSINT practitioners rely on fundamental approaches to data acquisition and preservation, investigators using digital forensic software must do so with appropriate safeguards to help guarantee that evidence is preserved securely, its integrity is maintained, and it can be retrieved with certainty. These techniques include secure storage environments, maintaining a transparent chain of custody, and using forensic software[4] to make exact bitwise data images.
UNDERSTANDING HASH VALUES
MD5 is a hash function that became popular for confirming data integrity in the early 1990s. It generates a 128-bit hash value. It quickly gained widespread use in cryptography and was subsequently routinely used in digital investigations to verify the integrity of digital evidence. It allowed investigators and third parties to establish swiftly whether any changes had been made to the evidence. SHA-1 performs a similar function but produces a longer hash value of 160 bits.
In simple terms, a hash is a fixed-length value derived from the contents of the data that acts as a digital fingerprint. If the data is altered, even slightly, the hash value changes, signalling that the data is no longer identical to the original and that tampering may have occurred. In other words, comparing hash values gives investigators a simple way to demonstrate that digital evidence has not been changed.
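The effect of a small change can be illustrated with a short Python sketch. It uses the standard hashlib module; the two sample strings and the helper name sha256_hex are chosen purely for illustration and are not part of any forensic tool.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = sha256_hex("hello")
altered = sha256_hex("hellp")  # a single character has been changed

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(a != b for a, b in zip(original, altered))

print(original)
print(altered)
print(f"{differing} of {len(original)} hex characters differ")
```

Although only one character of the input differs, the two digests bear no obvious resemblance to each other, which is what makes a hash useful for detecting alterations.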
ROLE OF MD5 AND SHA-1 IN OSINT INVESTIGATIONS
MD5 and SHA-1 are also used in digital forensics investigations to help recover lost data, identify known malware, and confirm that files are intact. For instance, when investigators copy data from a suspect's hard drive, they create a hash value of the data so that they can later show it has remained unchanged throughout their investigation. The same goes for network security, where investigators use hash values to identify changes made by attackers to data sent over the network. Many forensic tools, including Forensic Toolkit (FTK[5]), SIFT Workstation[6], CAINE[7] and PALADIN[8], employ hash values to verify that the digital evidence they collect remains unaltered.
Experienced OSINT practitioners strongly recommend applying hash values to give each piece of evidence a unique digital fingerprint. This enables any minor changes in the future to be detected by comparing the hash value at the time of collection with the hash value after the investigation has been completed. Crucially, it provides an effective way for OSINT practitioners to demonstrate that the evidence they have collected has not been altered during the inquiry.


THE PROCESS
Hashing a file makes it possible to verify its integrity and detect changes, and it is essential for forensic analysis, digital preservation, and demonstrating authenticity. It should be performed methodically, using redundancy mechanisms and a thorough documentation process. Below is a step-by-step guide to its use in OSINT investigations.
Choosing a Hashing Algorithm
The first task is selecting an appropriate hashing algorithm. The most widely used algorithms are MD5, SHA-1, and SHA-256. MD5 is fast but is deemed inappropriate for security-critical use because it is vulnerable to collision attacks. SHA-1 is stronger than MD5 but is also vulnerable to collision attacks and is no longer considered secure. The safest options are SHA-256 and SHA-512, which are widely employed in forensic research and integrity verification.
At least two different hash values, such as MD5 and SHA-256, should be used for verification and redundancy, and two different hashing tools should be used to provide reliability. This way, even if one of the algorithms is compromised in the future, another independent hash may be used for comparison.
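As a rough sketch of the two-algorithm approach, the Python function below reads a file once and feeds each chunk to every requested algorithm. The function name hash_file, its parameters, and the sample file are assumptions made for illustration. It covers only the "two algorithms" half of the advice; the results should still be cross-checked with a second, independent tool.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_file(path, algorithms=("md5", "sha256"), chunk_size=1024 * 1024):
    """Hash a file in one pass, updating every requested algorithm per chunk."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with Path(path).open("rb") as handle:
        # Reading in chunks keeps memory use low for large evidence files.
        while chunk := handle.read(chunk_size):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Demonstration on a throwaway file containing the five bytes "hello".
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.txt"
    sample.write_bytes(b"hello")
    for algorithm, value in hash_file(sample).items():
        print(f"{algorithm}: {value}")
```

Reading the file only once means both values describe exactly the same bytes, which avoids any doubt about whether the file changed between two separate hashing runs.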
Generating the Hash Value
Windows users can generate file hashes using either the Command Prompt (cmd) or PowerShell, but there are some nuances to consider:
Using the Command Prompt:
The following command generates an MD5 hash using certutil[9] in the Command Prompt[10]:
certutil -hashfile "C:\Path\To\File\[file name]" MD5
This command will generate and display the MD5 hash of the specified file.
Using PowerShell:
In PowerShell, a different cmdlet is typically used to generate file hashes:
Get-FileHash -Path "C:\Path\To\File\[file name]" -Algorithm MD5
This PowerShell command uses the Get-FileHash cmdlet, which is more versatile and supports various hash algorithms[11].
It's worth noting that certutil supports multiple hash algorithms, including MD5, SHA-1, SHA-256, SHA-384, and SHA-512. Similarly, the Get-FileHash cmdlet in PowerShell supports various algorithms[12].
For security-sensitive applications, it's recommended to use stronger hash algorithms like SHA-256 instead of MD5, as MD5 is considered cryptographically weak.
Linux and macOS Users:
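For Linux and macOS users, the terminal provides built-in commands for computing hashes. On Linux, the md5sum and sha256sum utilities are normally available, while macOS provides md5 and shasum.
On Linux:
sha256sum "/path/to/[file name]"
On macOS:
shasum -a 256 "/path/to/[file name]"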
Using GUI Tools (Cross-Platform)
Several tools are available for users who prefer graphical interfaces. HashCalc[13] is a Windows-based tool that supports MD5, SHA-1, and SHA-256. HashMyFiles[14] is another Windows-based tool that allows batch-hashing of multiple files for forensic investigations. OpenHashTab[15] integrates into Windows File Explorer, enabling quick hashing without needing a separate application. For Linux and macOS users, GtkHash[16] provides an easy-to-use graphical interface that supports multiple hashing algorithms.
Example Hash Values for Corroboration and Redundancy
To ensure accuracy, two different hashes should be generated and recorded. As an example, hashing the word "hello" using the MD5 algorithm produces the following hash value:
MD5: 5d41402abc4b2a76b9719d911017c592
SHA-256 can generate a second hash for additional verification, ensuring redundancy in case one algorithm becomes obsolete or compromised.
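For readers who want to reproduce the example, the following minimal Python sketch hashes the same word with both algorithms using the standard hashlib module. The variable names are illustrative only.

```python
import hashlib

# Hash the exact five bytes of "hello", with no trailing newline.
data = "hello".encode("utf-8")

md5_value = hashlib.md5(data).hexdigest()
sha256_value = hashlib.sha256(data).hexdigest()

print("MD5:    ", md5_value)     # 5d41402abc4b2a76b9719d911017c592
print("SHA-256:", sha256_value)  # 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

Note that hashing "hello" followed by a newline character, as some command-line pipelines do, produces entirely different values, so the exact input should always be recorded.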
Documentation of the Procedure and Process Used
Documentation of the procedure used while hashing a file is essential to ensure forensic reliability and reproducibility. First, the file details, including the file name, type, size, and file creation and modification timestamps, need to be recorded. The storage location of the original file must also be documented to ensure traceability.
Second, the hashing process itself should be documented. This includes the date and time of hashing, the operating system (OS) and version used, and the software or tool used. The exact command or process should be documented for future reference and verification.
Following the generation of hash values, they should be stored securely in a log file or a forensic report. It is also recommended that a backup copy of the recorded values be kept in case they are needed again. For further assurance, signed copies of the hash values may be submitted to a cryptographic timestamping service to provide additional evidence of integrity.
A second hash should be generated with a different algorithm for corroboration. This provides redundancy and a second checkpoint. If the file is hashed again later and the values obtained are identical to those recorded, its integrity is confirmed.
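Much of this information can be gathered automatically. The sketch below is one possible approach in Python, not a prescribed procedure: the function name describe_file and the field names are assumptions made for illustration. Creation time is deliberately left out because the way it is exposed differs between operating systems, so it should be recorded using a method appropriate to the platform.

```python
import hashlib
import json
import platform
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path):
    """Collect file details, hashing context and two hash values for a log."""
    path = Path(path)
    info = path.stat()
    data = path.read_bytes()  # acceptable for small files; stream large ones

    def as_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "file_name": path.name,
        "file_type": path.suffix,
        "size_bytes": info.st_size,
        "original_location": str(path.resolve()),
        "modified_utc": as_utc(info.st_mtime),
        "accessed_utc": as_utc(info.st_atime),
        "hashed_at_utc": datetime.now(timezone.utc).isoformat(),
        "operating_system": f"{platform.system()} {platform.release()}",
        "tool": f"Python {platform.python_version()} hashlib",
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Demonstration on a throwaway file; a real entry would point at the evidence.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "sample.txt"
    sample.write_bytes(b"hello")
    print(json.dumps(describe_file(sample), indent=2))
```

The JSON output can be pasted into a log file or report, and the same function can be re-run later to produce a second entry for comparison.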
Summary
Hashing a file or a piece of text is a simple but essential process for integrity checking. Users can make their records trustworthy and verifiable by using at least two different hash algorithms and documenting the entire process. Correct logging, using logging templates, and redundancy measures provide forensic precision, while the verifiable audit trail helps demonstrate that the file has not been altered.
FORENSIC HASHING LOG TEMPLATE
This forensic hashing log template provides an official format for documenting the digital file hashing process. It supports accuracy, integrity checking, and reproducibility in forensic exams. A log entry must be created for each hashed file to maintain a clear audit trail.
Case Information
The first section of the log records the key case information. The practitioner must record the Case ID for reference, which links the hash log to a specific investigation or analysis. The Investigator's Name must also be recorded so that it is clear who performed the hashing.
Additionally, the Date and Time of the hash should be noted to ensure a proper timeline for when the action was performed. The purpose of the hash should also be noted, stating plainly if it is for forensic examination, verification of file integrity, legal documentation, or otherwise.
File Information
The second section documents detailed information about the file being hashed. The File Name and Type should be noted to reference which file was being analysed. The file size should also be noted to help identify any unauthorised modifications. The practitioner should also document the file path (Original Location), pointing out where the file resided before hashing. All file creation dates, file modification dates, and file access dates should be included to detect any change or alteration of the file.
Hashing Details
This section documents the exact process used to generate the hash values. The OS should be recorded, specifying whether the hashing was performed on Windows, macOS, or Linux. The Hashing tool/software used must be identified, such as HashMyFiles or OpenHashTab. The exact Command or Procedure used should be written down, ensuring the process is fully repeatable.
Generated Hash Values
Once the file has been hashed, the investigator must enter the resulting hash values in this section. At least one hash value must be entered, but multiple hashing algorithms are desirable for corroboration and redundancy. The practitioner must enter the MD5 Hash, SHA-1 Hash, SHA-256 Hash, and SHA-512 Hash where needed. Entering more than one hash value enables the file's integrity to be validated using different cryptographic standards in the future.
Verification and Confirmation
To confirm the process's validity, the investigator must record whether a secondary hash was generated for redundancy. This section should also record whether two different hash algorithms were used and whether the hash values remained the same upon re-checking. If inconsistencies are found, the investigator must identify and document potential reasons, e.g., corruption of the file or unauthorised changes. A comparison should also be made with any previously stored hash values, if available.
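The re-checking step might be automated along the following lines. This is a minimal Python sketch; the function name verify_file and the layout of the recorded values are assumptions for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def verify_file(path, recorded):
    """Re-hash a file and compare the result with previously logged values.

    `recorded` maps algorithm names accepted by hashlib.new (for example
    "md5" or "sha256") to the hex values written down at collection time.
    Returns a dictionary of algorithm name -> True (match) or False.
    """
    data = Path(path).read_bytes()
    results = {}
    for algorithm, expected in recorded.items():
        actual = hashlib.new(algorithm, data).hexdigest()
        results[algorithm] = actual == expected.strip().lower()
    return results


# Demonstration: the logged values below are those of the five bytes "hello".
recorded = {
    "md5": "5d41402abc4b2a76b9719d911017c592",
    "sha256": "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
}
with tempfile.TemporaryDirectory() as folder:
    evidence = Path(folder) / "evidence.txt"
    evidence.write_bytes(b"hello")
    print(verify_file(evidence, recorded))  # both True while unchanged
    evidence.write_bytes(b"hello!")         # simulate an alteration
    print(verify_file(evidence, recorded))  # both False after the change
```

A mismatch on any algorithm should be treated as an inconsistency to be investigated and documented, not silently ignored.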
Storage and Documentation
The practitioner should confirm whether the hash values have been securely documented in an investigation report for proper forensic handling. A secure copy of the hash values should also be maintained so that the integrity check can be repeated later. It should be noted if a cryptographic timestamping service was used, since this provides an additional layer of proof by evidencing that the hash existed at a given point in time. Any other notes or observations, such as environmental or system conditions that could have affected the hashing, should also be documented.
Signature and Authentication
To confirm the process, the practitioner must sign to certify that the hashing was performed correctly and according to procedure. If a supervisor or reviewer is involved, they should also sign off to support the findings. Lastly, the ‘date of review’ should be recorded to document when the hashing log was formally verified.
Summary
This forensic hashing log methodically and uniformly documents file integrity checks. By carefully documenting every step, practitioners can maintain an open audit trail, which is crucial for forensic accuracy, legal compliance, and OSINT investigations.
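For teams that keep their logs electronically, the template could be represented as a simple data structure. The Python sketch below is only one possible layout: the class name HashLogEntry, the field names, and every value in the example entry are hypothetical placeholders and not part of any official standard.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HashLogEntry:
    # Case information
    case_id: str
    investigator: str
    hashed_at: str
    purpose: str
    # File information
    file_name: str
    file_size_bytes: int
    original_location: str
    # Hashing details
    operating_system: str
    tool: str
    command: str
    # Generated hash values (algorithm name -> hex value)
    hashes: dict = field(default_factory=dict)
    # Verification, storage and sign-off
    rechecked_ok: bool = False
    timestamping_service_used: bool = False
    notes: str = ""
    signed_by: str = ""
    reviewed_by: str = ""
    review_date: str = ""


# Placeholder values only; a real entry is completed by the practitioner.
entry = HashLogEntry(
    case_id="CASE-0001",
    investigator="A. Practitioner",
    hashed_at="2025-03-11T10:00:00Z",
    purpose="Verification of file integrity",
    file_name="sample.txt",
    file_size_bytes=5,
    original_location="/evidence/sample.txt",
    operating_system="Linux",
    tool="Python hashlib",
    command="hashlib.md5(data).hexdigest()",
    hashes={"md5": "5d41402abc4b2a76b9719d911017c592"},
)

print(json.dumps(asdict(entry), indent=2))
```

Storing each entry as JSON keeps the log readable by people and by software, and makes it straightforward to hash or sign the log itself.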
LIMITATIONS AND VULNERABILITIES
Collision attacks occur when two different inputs produce the same hash value, which makes digital proof less reliable. This flaw allows bad actors to substitute or change files without being detected because the original and changed files would have the same hash.
MD5 and SHA-1 have become increasingly insecure as cryptanalysis has advanced and processing capacity has grown. These algorithms are inappropriate for safeguarding sensitive digital forensics data, as contemporary attackers can more readily create collisions to exploit their flaws. Thus, they are no longer suitable for use in settings where security is crucial.
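The idea of a collision can be demonstrated safely with a deliberately weakened hash. The Python sketch below keeps only the first five hexadecimal characters (20 bits) of a SHA-256 value, so that two different inputs with the same shortened hash can be found in a fraction of a second. This is a toy illustration of the concept only; it is not an attack on MD5, SHA-1 or SHA-256, and the function names and message format are invented for the example.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes, hex_chars: int = 5) -> str:
    """A deliberately weak hash: the first few hex characters of SHA-256."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 5):
    """Try numbered messages until two of them share the same tiny hash."""
    seen = {}
    for number in count():
        message = f"message-{number}".encode()
        value = tiny_hash(message, hex_chars)
        if value in seen:
            return seen[value], message, value
        seen[value] = message


first, second, shared = find_collision()
print(f"{first!r} and {second!r} both hash to {shared}")
```

The shorter the output, the easier a collision is to find, which is one reason longer hashes such as SHA-256 are preferred. The full SHA-256 values of the two messages remain different.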
HISTORICAL CASES
In 1996, a flaw was found in MD5's design. While it was not deemed a fatal weakness then, cryptographers began recommending other algorithms, such as SHA-1, which was subsequently found vulnerable[17].
Researchers discovered the first collision for MD5 in 2004, and in 2017, Google and CWI Amsterdam announced the first practical SHA-1 collision[18]. These real-world examples demonstrated the weaknesses of these algorithms in protecting digital evidence from tampering, leading to widespread recommendations to adopt more secure hashing algorithms like SHA-256 and SHA-512.
In the well-known 2011 Sony PlayStation Network breach[19], forensic investigators used SHA-1 hash values to confirm that the stolen files were genuine. In the same way, MD5 hash values were used to confirm that no further changes were made to the retrieved data after the 2013 Target data breach[20].
SHOULD THESE HASHES BE MADE REDUNDANT?
MD5 and SHA-1, while still often used in digital forensics, have significant shortcomings, so stronger algorithms like SHA-256 and SHA-512 are now being used. These newer algorithms better resist collision attacks: SHA-256, for example, produces a 256-bit hash value, which provides greater protection against attacks in which an adversary tries to find two different inputs that yield the same output. The more robust options, such as SHA-256 and SHA-512, are recommended wherever cryptographic integrity matters in OSINT investigations, cybersecurity, and forensic analysis.
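The difference in output length can be checked directly. The Python sketch below prints the size of each algorithm's output together with the rough effort of a generic "birthday" collision search, which is about 2 to the power of half the output length. The helper name digest_bits is an assumption for illustration, and the figures are upper bounds: the published attacks on MD5 and SHA-1 need far less work than this, which is why those algorithms are considered broken.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the output length, in bits, of a hashlib algorithm."""
    return hashlib.new(name).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha512"):
    bits = digest_bits(name)
    # A generic birthday search needs on the order of 2^(bits/2) attempts.
    print(f"{name:<7} {bits:>3}-bit output, generic collision search ~2^{bits // 2}")
```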
As OSINT investigations evolve, so must the best practices, frameworks, guidelines and any standards that govern them in the future[21]. Current best practices in digital forensics emphasise using more secure algorithms like SHA-256 and SHA-512 to safeguard the integrity of digital evidence, and OSINT practitioners must echo this approach. Many practitioners and organisations, including law enforcement and computer forensics companies, have shifted to these modern standards to ensure data security. However, despite the known vulnerabilities of MD5 and SHA-1, they are still used in some cases for legacy systems or where speed is prioritised over security.
Many forensic tools still support MD5 and SHA-1 because of their historical usage in various investigations, even though contemporary algorithms are becoming increasingly popular.
OSINT practitioners can create hash values using traditional and contemporary techniques, which guarantees compatibility with earlier scenarios in which MD5 or SHA-1 evidence was gathered. However, as network forensics tools and other cutting-edge technologies progress, there is growing pressure to switch entirely to more secure SHA-2 algorithms like SHA-256 and SHA-512 for all facets of an investigation.
THE CASE FOR CONTINUED USE
Though MD5 and SHA-1 have known vulnerabilities, they are still utilised in cases where efficiency and speed are more desirable than security. Indeed, they are extensively used in systems that are too old to support other, safer algorithms or where an upgrade would be impossible or extremely costly. They also allow quick verification of the integrity of digital evidence in low-risk forensic analysis, where a collision attack would be impractical. What is needed is the ability to balance risk and utility in OSINT operations.
OSINT practitioners generally need to weigh the risks of using older methods against the benefits they offer. While MD5 and SHA-1 are less secure than newer options like SHA-256, there are instances where they can still be helpful when timeliness is a concern. In some situations, their ease of use and long track record may justify their continued use.
Those currently using MD5 or SHA-1 should consider switching to more current hashing algorithms. An incremental upgrade can be adopted, where systems transition to more secure algorithms like SHA-256 but remain compatible with older evidence files. This ensures that OSINT investigations can still use historical files but with greater security for newer evidence.
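One way such an incremental upgrade might look is sketched below in Python. The function name upgrade_record and the record layout are hypothetical. The sketch first confirms that the data still matches the legacy MD5 value, then adds a SHA-256 value alongside it. The added SHA-256 value only attests to the state of the data at the time of the upgrade, so the date of the upgrade should be logged as well.

```python
import hashlib


def upgrade_record(data: bytes, record: dict) -> dict:
    """Add SHA-256 to a legacy record, but only if the old MD5 still matches."""
    if hashlib.md5(data).hexdigest() != record["md5"].strip().lower():
        raise ValueError("legacy MD5 does not match; record not upgraded")
    upgraded = dict(record)  # copy, so the original record is left untouched
    upgraded["sha256"] = hashlib.sha256(data).hexdigest()
    return upgraded


# Demonstration with the five bytes "hello" and their known MD5 value.
legacy = {"md5": "5d41402abc4b2a76b9719d911017c592"}
print(upgrade_record(b"hello", legacy))
```

Keeping the legacy value in the upgraded record preserves compatibility with older reports that cite the MD5 hash.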
THE BENEFITS OF HASHING
Hash value algorithms play a crucial role in ensuring the integrity and authenticity of digital evidence in OSINT investigations. They help practitioners maintain the chain of custody and prevent unauthorised alterations. Here are some cases that demonstrate their importance:
Digital Forensics and Evidence Authentication
Hash values provide a robust mechanism for detecting any changes to digital evidence[22]. In formal proceedings, it's essential to demonstrate that digital evidence hasn't been tampered with since its acquisition. By comparing the current hash to the original, investigators can confirm a file's consistency, offering assurance against tampering.
Fraud Detection and Prevention
In fraud investigations, hash values can help identify suspicious users and protect businesses from fraudulent activities. For example, when onboarding new users or accepting card-not-present transactions, hash values of browser and device data can be used alongside other OSINT techniques to assess a user's risk profile.
Terrorist Network Investigations
When uncovering terrorist networks, digital forensic intelligence (DFINT) and OSINT work together to trace suspects' digital footprints. Hash values play a crucial role in analysing devices, communications, and financial transactions, allowing investigators to uncover hidden connections within the network[23].
Chain of Custody in Remote Investigations
Hash values become even more critical for maintaining the chain of custody in cases involving remote workers. When retrieving assets from remote locations, hash values can verify the integrity of digital evidence, helping to show that no changes occurred during transit or handling[24].
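When several files are transferred together, a hash manifest created before dispatch can be compared with one created on receipt. The Python sketch below illustrates the idea under simple assumptions: the function names build_manifest and compare_manifests are invented for the example, SHA-256 is used throughout, and each file is small enough to read into memory.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(folder):
    """Map each file's relative path to its SHA-256 value."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def compare_manifests(before, after):
    """List files that are missing, added, or changed after transfer."""
    return {
        "missing": sorted(set(before) - set(after)),
        "added": sorted(set(after) - set(before)),
        "changed": sorted(
            name for name in set(before) & set(after) if before[name] != after[name]
        ),
    }


# Demonstration: two files are "sent", then one is altered before "receipt".
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "a.txt").write_bytes(b"hello")
    (Path(folder) / "b.txt").write_bytes(b"world")
    sent = build_manifest(folder)
    (Path(folder) / "b.txt").write_bytes(b"world!")
    received = build_manifest(folder)
    print(compare_manifests(sent, received))
```

An empty result in all three lists indicates that every file arrived with the same contents it had when the first manifest was created.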
By using hash values, OSINT practitioners can:
1. Verify the integrity of collected data
2. Detect any unauthorised modifications
3. Establish a reliable chain of custody
4. Enhance the credibility of evidence in legal proceedings
These examples highlight how hash value algorithms are vital for maintaining the security and reliability of OSINT investigations. They help demonstrate that evidence has not been tampered with or altered during the inquiry.
CONCLUSION
Knowledge of hash values and their use in OSINT investigations is necessary to help safeguard the integrity of digital information. MD5 and SHA-1 help practitioners verify data integrity; however, these older algorithms are used less nowadays because they are vulnerable to collision attacks. With the advent of newer, more secure algorithms like SHA-256 and SHA-512, the continued use of MD5 and SHA-1 is increasingly questionable. They remain helpful in some situations, such as dealing with legacy systems or low-risk inquiries. Ultimately, adopting more modern algorithms is unavoidable. Still, it is possible to bridge the gap between the old and the new, and practitioners should understand how the older algorithms are used to calculate hash values. OSINT practitioners must transition to newer standards while balancing risk and practicality and ensuring that previous digital evidence can still be verified.
Authored by: The Coalition of Cyber Investigators.
© 2025 The Coalition of Cyber Investigators. All rights reserved.
The Coalition of Cyber Investigators is a collaboration between
Paul Wright (United Kingdom) - Experienced Cybercrime, Intelligence (OSINT & HUMINT) and Digital Forensics Investigator; and
Neal Ysart (Philippines) - Elite Investigator & Strategic Risk Advisor, Ex-Big 4 Forensic Leader.
With over 80 years of combined hands-on experience, Paul and Neal remain actively engaged in their field.
They established the Coalition to provide a platform to collaborate and share their expertise and analysis of topical issues in the converging domains of investigations, digital forensics and OSINT. Recognising that this convergence has created grey areas around critical topics, including the admissibility of evidence, process integrity, ethics, contextual analysis and validation, the coalition is Paul and Neal’s way of contributing to a discussion that is essential if the unresolved issues around OSINT derived evidence are to be addressed effectively. Please feel free to share this article and contribute your views.
[1] Shacklett, M. E., & Loshin, P. (2021, August 23). MD5. Search Security. https://www.techtarget.com/searchsecurity/definition/MD5 (Accessed 11 March 2025)
[2] GeeksforGeeks. (2024, July 18). SHA1 Hash. GeeksforGeeks. https://www.geeksforgeeks.org/sha-1-hash-in-java/ (Accessed 11 March 2025)
[3] GeeksforGeeks. (2024, July 18). SHA1 Hash. GeeksforGeeks. https://www.geeksforgeeks.org/sha-1-hash-in-java/ (Accessed 11 March 2025)
[4] Montini, H. (2024, July 15). What is Computer Forensics: Complete Guide. Proven Data. https://www.provendata.com/blog/what-is-computer-forensics/#:~:text= (Accessed 11 March 2025)
[5] FTK Forensics Toolkit - Digital Forensics Software Tools | Exterro. (2024, November 6). Exterro. https://www.exterro.com/digital-forensics-software/forensic-toolkit (Accessed 11 March 2025)
[6] Soni, A. (2025, February 11). SIFT Workstation | SANS Institute. https://www.sans.org/tools/sift-workstation/ (Accessed 11 March 2025)
[7] CAINE Live USB/DVD - computer forensics digital forensics. (n.d.). https://www.caine-live.net/ (Accessed 11 March 2025)
[8] SUMURI LLC. (2025, January 6). PALADIN Forensic : Best Digital Forensics Software. SUMURI. https://sumuri.com/software/paladin/#:~:text= (Accessed 11 March 2025)
[9] R. M. [MSFT]. (2024, February 4). How to generate file hash using Certutil - 250 Hello. 250 Hello - Random Musings on Security and Exchange. https://blog.rmilne.ca/2023/12/21/how-to-generate-file-hash-using-certutil/ (Accessed 11 March 2025)
[10] Generate SHA256 Hash of a STRING from Windows Command Line. (n.d.). Server Fault. https://serverfault.com/questions/1119066/generate-sha256-hash-of-a-string-from-windows-command-line (Accessed 11 March 2025)
[11] Jaya.Rayapati. (2024, April 11). PowerShell script to verify the file hash of a file on Windows devices. Hexnode Help Center. https://www.hexnode.com/mobile-device-management/help/powershell-script-to-verify-the-file-hash-of-a-file-on-windows-devices/ (Accessed 11 March 2025)
[12] Sdwheeler. (n.d.). Get-FileHash (Microsoft.PowerShell.Utility) - PowerShell. Microsoft Learn. https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/get-filehash?view=powershell-7.5 (Accessed 11 March 2025)
[13] CNET. (2025, January 10). HashCalc for Windows - Free download and software reviews - CNET Download. https://download.cnet.com/hashcalc/3000-2250_4-10130770.html (Accessed 11 March 2025)
[14] HashMyFiles: Calculate MD5/SHA1/CRC32 hash of files. (n.d.). NirSoft. https://www.nirsoft.net/utils/hash_my_files.html (Accessed 11 March 2025)
[15] OpenHashTab. (n.d.). https://www.majorgeeks.com/files/details/openhashtab.html (Accessed 11 March 2025)
[16] GtkHash. (n.d.). GtkHash. https://gtkhash.org/ (Accessed 11 March 2025)
[17] Dobbertin, H. (1996). "The Status of MD5 After a Recent Attack." RSA Laboratories. https://www.researchgate.net/publication/220132613_The_Status_of_MD5_After_a_Recent_Attack
[18] Google. (2017, February 23). Announcing the first SHA1 collision. Google Online Security Blog. https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html (Accessed 11 March 2025)
[19] Quinn, B., & Arthur, C. (2020, April 16). PlayStation Network hackers access data of 77 million users. The Guardian. https://www.theguardian.com/technology/2011/apr/26/playstation-network-hackers-data (Accessed 11 March 2025)
[20] Verizon DBIR (2014). "2013 Target Data Breach Report." Verizon Data Breach Investigations Report. https://www.verizon.com/business/resources/reports/dbir/
[21] Open Source Investigation Best Practices 2025. (n.d.). Neotas - Due Diligence and Employment Screening. https://www.neotas.com/open-source-investigation-best-practices/ (Accessed 11 March 2025)
[22] Callaghan, P. (n.d.). Why hash values are crucial in digital evidence authentication. https://blog.pagefreezer.com/importance-hash-values-evidence-collection-digital-forensics (Accessed 11 March 2025)
[23] Lerner, E. (2024, September 2). Digital Forensics and OSINT: Synergy in Modern Investigations. https://www.linkedin.com/pulse/digital-forensics-osint-synergy-modern-investigations-efim-lerner-aw8lf/ (Accessed 11 March 2025)
[24] Regan, B. (2022, September 9). Getting started with chain of custody for DFIR investigations. Medium. https://blueteamtactics.net/getting-started-with-chain-of-custody-for-dfir-investigations-9d1c902a5002 (Accessed 11 March 2025)