Computer Forensics / Electronic Discovery

HOW TO REDACT ELECTRONIC FILES

August 2006

Redaction (hiding information) of documents is needed to (i) protect legal privileges, (ii) protect information that is outside a discovery request, and (iii) avoid displaying Social Security numbers, birth dates, and other personal data that is protected by law (due to identity theft risks).

Redacting documents with a thick black marker does have its advantages.  Once the marked page has been photocopied to avoid bleed-through of the unwanted information, one can be certain that the undesired information is gone.  But paper has all the limitations of … well, paper.  Consequently, many want to perform their redactions, production, indexing, and retrieval with the efficiency of a computer.

The problem is that electronic data has its own tricks.  Unless proper measures are taken, the hidden data may not really be all that hidden.  This occurs because the data still exists in the file.  This article will explain how to perform electronic redactions properly.   

There are two main causes of failing to remove electronic information: 

  1. Attempting to hide content by obscuring or covering the information.  This works with paper, but does not work with electronic documents because the settings can be changed back, or the obstruction removed.  The solution involves not just hiding the sensitive information, but removing it from the file altogether.   

  2. Being unaware that metadata exists, or not knowing how to remove it properly.

Redaction in Microsoft Documents 

Microsoft has a Redaction tool for Word 2003 and later versions, available as a free download from Microsoft’s website.  Using Microsoft’s redaction tool, the redacted text appears to the eye as a black box, with the underlying information converted from the original information to a series of useless vertical lines.  As of this writing, the tool does not support the redaction of graphics, so the additional methods described below may be needed for such items.   

The Microsoft redaction tool automatically accepts any tracked changes, and turns off “Track Changes” in the redacted document.  This prevents the revision history from being rediscovered.  However, other metadata will still be available, which is why you will also need the tool described in the next paragraph.

If metadata is to be removed, Microsoft has a free program (“Remove Hidden Data”) that does this.  The add-on program is available for Microsoft’s XP operating system, and Office 2003 and later.  Once downloaded and installed, the feature is accessed from the “File” commands, as a separate choice.  Running this program removes practically all metadata.    

If you are using a version of the Microsoft Office suite earlier than Office 2003, or need to redact graphics, or are using a non-Microsoft word processor, then you will need to alter or remove the data in the original file.  The problem is that, without time-consuming efforts to restore the original spacing, the removal or alteration of the confidential information will usually cause an undesirable reflow of the text and graphics in the document.  Nevertheless, if this approach is necessary, turn off all “Track Changes” options, and then either remove the confidential information, or replace the information with meaningless marks such as a run of X characters (XXXXX).  The new information can then be copied into a blank new file to (i) alter the original metadata (except metadata associated with the default template), and (ii) mark the file as having been redacted.
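The replacement approach above can be sketched in a few lines of Python.  This is a minimal illustration, not a tool referenced by this article: the function name redact_terms and the sample text are hypothetical, and the sketch works only on plain text that has already been extracted from the document.  It overwrites each confidential term with the same number of X characters, which keeps the character count unchanged and so limits (but, with proportional fonts, does not eliminate) reflow.

```python
import re


def redact_terms(text, terms, mark="X"):
    """Replace every occurrence of each term with a run of `mark` of equal length.

    Hypothetical helper for illustration only. Matching ignores case.
    """
    # Try longer terms first so a short term never splits a longer one.
    ordered = sorted((t for t in terms if t), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    # The replacement has exactly as many characters as the matched text.
    return pattern.sub(lambda m: mark * len(m.group(0)), text)


sample = "Claimant SSN 123-45-6789, born 01/02/1960."
print(redact_terms(sample, ["123-45-6789", "01/02/1960"]))
# prints: Claimant SSN XXXXXXXXXXX, born XXXXXXXXXX.
```

Note that an equal-length run of marks reveals how long the removed item was.  Where that matters, a fixed-length replacement is the safer choice, at the cost of more reflow.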

Another way of sanitizing data is to (i) remove the confidential data in a copy of the file, then (ii) save the results using a plain-text editor (such as Microsoft Notepad).  Files saved in a TXT format cannot store hidden code, so what you see is what you get, and nothing more.  As with the approach in the preceding paragraph, once the TXT file has been saved, the new information can then be copied into a blank new file to (i) alter the original metadata, and (ii) mark the file as having been redacted.

Do not attempt to cover data with a black box, change the text color, reduce the font size, or otherwise alter the appearance of what is on the screen without also removing the undesired data.  This is true even if you intend to convert the file to a PDF format.     
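One way to test whether a cover-up left the data behind is to search the raw bytes of the finished file for the terms that should be gone.  The following Python sketch is an illustration under stated assumptions, not a complete forensic check: the function name find_leftovers is hypothetical, and the scan looks only for text stored uncompressed as 8-bit or UTF-16LE characters.  Text held inside compressed parts of a file will not be found this way, so a clean result does not prove the file is safe, while a hit proves it is not.

```python
def find_leftovers(data: bytes, terms):
    """Return the terms still present in `data` as UTF-8 or UTF-16LE bytes.

    Hypothetical helper for illustration only. Case is ignored for
    ASCII letters; compressed or encrypted content is not examined.
    """
    haystack = data.lower()  # bytes.lower() affects ASCII letters only
    found = []
    for term in terms:
        lowered = term.lower()
        needles = (lowered.encode("utf-8"), lowered.encode("utf-16-le"))
        if any(needle in haystack for needle in needles):
            found.append(term)
    return found


# In-memory stand-in for a file in which "covered" text is still stored.
doc = b"\x00\x01header" + "Secret Merger".encode("utf-16-le") + b"trailer"
print(find_leftovers(doc, ["secret merger", "123-45-6789"]))
# prints: ['secret merger']
```

To check a real file, read it in binary mode (for example, open(path, "rb").read()) and pass the resulting bytes to the function.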

Settings needed when creating PDF files 

Conversion of a file to the PDF format will remove (subject to the settings selected, as described here) a great deal of metadata, such as version information and tracked changes.    

When creating a PDF from within a Microsoft document, be sure that the settings used do not cause unwanted information to be retained.  Some PDF software can automatically copy document metadata and properties into the PDF file.  These features should be disabled when creating sanitized documents.  While in the native Microsoft document, select Adobe PDF>Change Conversion Settings.  The resulting dialog box contains multiple tabs.  Perform the following: 

  1. On the “Settings” tab, uncheck “Convert Document Information”.  Unchecking this option removes one source of metadata that would otherwise transfer to the PDF document (but it still does not remove all metadata).

  2. On the “Settings” tab, confirm that “Attach Source File to Adobe PDF” is unchecked.  Checking this box inserts a copy of the original Microsoft document into the output file.  This is rarely what is desired, so the box is unchecked by default.

  3. On the “Word” tab, uncheck “Convert displayed comments to notes in the PDF”.
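After conversion, it can be worth confirming that document properties did not travel into the PDF.  The Python sketch below is a rough heuristic, offered as an assumption-laden illustration rather than a substitute for a proper PDF inspection tool: the function name pdf_info_keys is hypothetical, and it searches the raw bytes only for the standard PDF document-information keys Title, Author, Subject, and Keywords followed by a string value.  It does not examine XMP metadata or compressed objects, and a Title key can also belong to a bookmark.

```python
import re

# Standard PDF document-information keys that typically carry
# the source document's properties.
INFO_KEYS = (b"/Title", b"/Author", b"/Subject", b"/Keywords")


def pdf_info_keys(data: bytes):
    """Return the information keys that appear with a string value in `data`.

    Hypothetical heuristic for illustration only. It looks for
    '/Key (' (literal string) or '/Key <' (hex string) in raw bytes.
    """
    found = []
    for key in INFO_KEYS:
        if re.search(re.escape(key) + rb"\s*[(<]", data):
            found.append(key.decode("ascii"))
    return found


# In-memory stand-in for the bytes of a converted PDF file.
pdf_bytes = b"%PDF-1.4\n1 0 obj\n<< /Title (Draft) /Author (J. Doe) >>\nendobj\n"
print(pdf_info_keys(pdf_bytes))
# prints: ['/Title', '/Author']
```

Any key reported here is a prompt to open the file's document properties and review what was carried over.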

Redaction in Adobe PDF documents 

Suppose that you only have an Adobe PDF file.  The easiest way of handling PDF documents is with an add-on program for Adobe Acrobat, such as Appligent’s Redax.

If you don’t want to purchase the redaction software, you can accomplish the same result through the following steps: 

  1. Cover each item of confidential information with a black rectangle, or by using black text highlighting.  At this point, all you have done is cover up the confidential information. 

  2. “Flatten” the file with the covered-up information by saving the file as a TIFF image.  Select File>Save As, and choose TIFF as the file type in the dialog box.  Each page will be saved as a separate, sequentially numbered TIFF file.

  3. Reassemble the TIFF files into a separate single PDF file from within Acrobat by selecting the multiple files that you want combined.

  4. The reassembled PDF file will have been created from a bitmap, so the underlying data that is to remain confidential will have been lost in the process of creating the TIFF files.   
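When the page images are reassembled with a script rather than by hand, the sequentially numbered files must be put back in page order.  A plain alphabetical sort places page 10 before page 2, so the Python sketch below sorts on the page number instead.  The file-naming pattern shown (memo_Page_N.tiff) and the function names are assumptions made for illustration; check the names your software actually produces.

```python
import os
import re


def page_number(name):
    """Return the last run of digits in a file name as an integer.

    Hypothetical helper for illustration only. Returns -1 when the
    name contains no digits, so such files sort first.
    """
    stem, _extension = os.path.splitext(name)
    digits = re.findall(r"\d+", stem)
    return int(digits[-1]) if digits else -1


def order_pages(names):
    """Sort page-image file names by page number, not alphabetically."""
    return sorted(names, key=page_number)


pages = ["memo_Page_10.tiff", "memo_Page_2.tiff", "memo_Page_1.tiff"]
print(order_pages(pages))
# prints: ['memo_Page_1.tiff', 'memo_Page_2.tiff', 'memo_Page_10.tiff']
```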

Profiting from your Opponent's Mistakes 

Fulcrum will provide a no-cost initial consultation if you receive a supposedly redacted file from your opponent, but you suspect that they have not followed the advice contained in this article.  In addition, there is software available to help quickly harvest metadata from large electronic data productions.  Please contact us so we can provide advice as to the possibilities that exist. 

Fulcrum Inquiry performs electronic discovery assistance, forensic investigations, and litigation consulting involving large data collections.