Digital Fingerprinting Technique

Digital fingerprinting is a technique that DeviceLock employs to identify data transmitted across various devices and network protocols. The technique maps a document or file to a collection of relatively short alphanumeric strings (hashes), referred to as digital fingerprints, that help to uniquely identify the data held in that document or file.

When using this technique, DeviceLock takes digital fingerprints from samples of sensitive documents and then compares them with the digital fingerprints of the documents being inspected. If the percentage of matching fingerprints exceeds the configured threshold, the inspected document is considered “sensitive” and is subjected to the configured security action.
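The comparison described above can be illustrated with a minimal sketch. It is not DeviceLock's actual algorithm, which this section does not specify. It assumes that fingerprints are truncated SHA-256 hashes of overlapping four-word windows and that the match percentage is the share of the sample's fingerprints found in the inspected document. The function names, the window size, and the default threshold are all hypothetical.

```python
import hashlib
import re

def fingerprints(text, k=4):
    """Return a set of short hashes, one per window of k consecutive words.

    Assumption: word windows ("shingles") stand in for whatever unit the
    product really fingerprints.
    """
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        windows = [" ".join(words)]
    else:
        windows = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    # Keep only the first 16 hex digits so each fingerprint stays short.
    return {hashlib.sha256(w.encode("utf-8")).hexdigest()[:16] for w in windows}

def match_percentage(sample_fp, inspected_fp):
    """Percentage of the sample's fingerprints present in the inspected set."""
    if not sample_fp:
        return 0.0
    return 100.0 * len(sample_fp & inspected_fp) / len(sample_fp)

def is_sensitive(sample_text, inspected_text, threshold=50.0):
    """True if the match percentage exceeds the (hypothetical) threshold."""
    pct = match_percentage(fingerprints(sample_text), fingerprints(inspected_text))
    return pct > threshold

sample = ("The quarterly revenue forecast for the northern region "
          "is strictly confidential")
outgoing = ("FYI: the quarterly revenue forecast for the northern region "
            "is strictly confidential, do not forward")

print(is_sensitive(sample, outgoing))                         # sample text is embedded in full
print(is_sensitive(sample, "Lunch menu for the coming week"))  # unrelated text
```

In this sketch, raising the threshold reduces false positives at the cost of missing heavily edited copies, while lowering it does the opposite.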

Digital fingerprints make it possible to identify and protect information held in files or transmitted over a network. For example, they can be used to identify financial data stored in MS Office documents, business information stored in PDF files, or source code stored in text files. Digital fingerprints can also be used to identify and protect non-text files (such as images, design drawings, and multimedia files), as well as to detect binary content being copied from one file to another.

Digital fingerprints can be used to identify full copies of documents as well as fragments of them, even if the document has been changed. They allow the contents of a document to be identified reliably even when it has been distorted by the addition of non-essential information (individual characters, insignificant words, etc.).
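The tolerance to small edits and the ability to recognize fragments can be demonstrated with a short sketch. As before, this is an illustration under stated assumptions rather than the product's method: fingerprints are taken to be truncated SHA-256 hashes of three-word windows, and the helper names shingle_hashes and containment are hypothetical.

```python
import hashlib
import re

def shingle_hashes(text, k=3):
    # Hash every run of k consecutive words (an illustrative choice only).
    words = re.findall(r"\w+", text.lower())
    return {
        hashlib.sha256(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()[:12]
        for i in range(max(len(words) - k + 1, 0))
    }

def containment(sample, candidate, k=3):
    # Share (in percent) of the sample's hashes that also occur in the candidate.
    s, c = shingle_hashes(sample, k), shingle_hashes(candidate, k)
    return 100.0 * len(s & c) / len(s) if s else 0.0

original = ("payment is due within thirty days of the invoice date "
            "and late fees apply")
# One insignificant word ("calendar") inserted into the original.
edited = ("payment is due within thirty calendar days of the invoice date "
          "and late fees apply")
# A piece copied out of the original.
fragment = "thirty days of the invoice date"

# Only the windows that span the insertion point change (2 of 12 here),
# so most fingerprints of the original still match the edited copy.
print(round(containment(original, edited), 1))

# Every window of the fragment also occurs in the original.
print(containment(fragment, original))
```

Because each fingerprint covers only a small window, a local change invalidates only the few fingerprints that overlap it, which is why the remaining fingerprints still identify the document.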

Digital fingerprints are especially effective for identifying standard documents that change only slightly from one copy to the next. For example, they make it easy to identify completed contracts that differ only in the details of the second party. By reliably identifying data held in documents and files, digital fingerprints help track and protect sensitive information, allowing protective controls to be applied at scale as that information flows across the corporate network and/or to peripheral user devices.