Creating Custom Pattern Groups
If the predefined content groups included with DeviceLock do not meet your requirements, you can define Content-Aware Rules based on custom content groups. Custom Pattern content groups let administrators specify any character pattern that identifies sensitive information within text data.
To create a custom Pattern group
1. If using the DeviceLock Management Console, do the following:
a) Open DeviceLock Management Console and connect it to the computer running DeviceLock Service.
b) In the console tree, expand DeviceLock Service.
If using the DeviceLock Service Settings Editor, do the following:
a) Open DeviceLock Service Settings Editor.
b) In the console tree, expand DeviceLock Service.
If using the DeviceLock Group Policy Manager, do the following:
a) Open Group Policy Object Editor.
b) In the console tree, expand Computer Configuration, and then expand DeviceLock.
2. Expand either the Devices or Protocols node.
3. Under the Devices or Protocols node, do one of the following:
Right-click Content-Aware Rules, and then click Manage.
- OR -
Select Content-Aware Rules, and then click Manage on the toolbar.
The dialog box for managing content-aware rules appears.
4. In the upper pane of the dialog box that appears, under Content Database, click the drop-down arrow next to Add Group, and then click Pattern.
This will display the Add Pattern Group dialog box.
5. In the Add Pattern Group dialog box, do the following:
Name - Specify the name of the group.
Description - Specify a description for the group.
Expression - Set a pattern by specifying one or more Perl regular expressions, one expression per line. The group detects a match if the data matches any of the specified expressions. For details on regular expressions, refer to the tutorials at perldoc.perl.org/perlrequick.html and perldoc.perl.org/perlretut.html.
Validate - Check regular expression syntax.
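The behavior of the Expression and Validate fields can be sketched in a few lines of Python. This is only an illustration: Python's re module is used as a stand-in for the Perl regular expressions that DeviceLock accepts (the two dialects differ in places), and the names compile_group, group_matches and the sample patterns are hypothetical.

```python
import re

def compile_group(expression_field: str) -> list[re.Pattern]:
    """Compile one pattern per non-empty line; raise ValueError on bad syntax."""
    patterns = []
    for line_no, line in enumerate(expression_field.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            patterns.append(re.compile(line))
        except re.error as exc:
            # Roughly what a syntax check would report to the user.
            raise ValueError(f"line {line_no}: {exc}") from exc
    return patterns

def group_matches(patterns: list[re.Pattern], text: str) -> bool:
    """The group matches if ANY of its expressions is found in the text."""
    return any(p.search(text) for p in patterns)

# Hypothetical sample: two expressions, one per line.
EXPRESSIONS = "\\b\\d{3}-\\d{2}-\\d{4}\\b\n\\bINV-\\d{6}\\b"
patterns = compile_group(EXPRESSIONS)
print(group_matches(patterns, "ref INV-123456 attached"))
```

Testing expressions outside the console in this way can help catch syntax errors early, but the result of the Validate button in DeviceLock remains the authoritative check.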
Validation - When validation is configured, the group detects a match only if the data both matches the specified expression and passes the selected validation type.
If No validation is selected in this field, the group does not perform validation. To match the group in this case, data only needs to match the expression specified.
To configure validation, select the desired type from the drop-down list in this field. The following types of validation are available: ABA Routing Number, American Name (Ex), Austria SSN, Bulgarian EGN, Canadian Social Insurance Number, China National ID, Credit Card Dump, Credit Card Number (All), Credit Card Number (American Express), Credit Card Number (Diners Club Carte Blanche), Credit Card Number (Diners Club En Route), Credit Card Number (Diners Club), Credit Card Number (Discover), Credit Card Number (JCB), Credit Card Number (Laser), Credit Card Number (Maestro), Credit Card Number (Master Card), Credit Card Number (MIR), Credit Card Number (Solo), Credit Card Number (Switch), Credit Card Number (Visa Electron), Credit Card Number (Visa), Danish Personal ID, Date, Date (ISO), Dominican Republic ID, Email Address, European VAT Number, Finnish ID, France INSEE Code, German eTIN, Health Insurance Claim, IBAN, IP Address, Irish PPSN, Japan: Social Security and Tax Number System, LUHN Checksum, Mexican Tax Id Number, Norwegian Birth Number, NPI, Polish ID, Polish National Identity Card, Quebec Healthcare Medical Number, Russian Bank Account Number, Russian classification of enterprises and organizations, Russian Correspondent Account, Russian Health Insurance Number, Russian KPP, Russian main state registration number, Russian OGRN, Russian OGRNIP, Russian OKATO, Russian OKFS, Russian OKOGU, Russian OKOPF, Russian Passport Issuer Department Code, Russian Pension Insurance Number, Russian Social Card Number, Russian Taxpayer Identification Number, South African Id Number, South Korean Resident Registration Number, Spanish NIF, Taiwan ID, Turkish Id Number, UK National Insurance Number, UK NHS Number, UK Phone Number, UK Post Code, UK Tax Code, URL, US Social Security Number.
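To illustrate how a validation type narrows the matches of an expression, the following Python sketch combines a digit pattern with a Luhn checksum test, the idea behind the LUHN Checksum type. The pattern, the function names and the sample numbers are assumptions made for this example; DeviceLock's own implementation is not documented here and may differ.

```python
import re

# Hypothetical expression: 13 to 19 ASCII digits, optionally separated
# by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b", re.ASCII)

def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in candidate pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdecimal()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def validated_matches(text: str) -> list[str]:
    """Keep only expression matches that also pass the validation."""
    return [m.group() for m in CARD_PATTERN.finditer(text)
            if luhn_valid(m.group())]

print(validated_matches("cards: 4111 1111 1111 1111 and 4111 1111 1111 1112"))
```

Both numbers in the sample match the expression, but only the first passes the checksum, so only the first counts as a match for the group.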
Condition - Select a condition for triggering content inspection rules that employ this group:
Less than or = - The rule is triggered if the number of matches to the regular expression is no more than the specified number.
Equal to - The rule is triggered if the number of matches to the regular expression is equal to the specified number.
Greater than or = - The rule is triggered if the number of matches to the regular expression is no less than the specified number.
Between - The rule is triggered if the number of matches to the regular expression is within the specified range.
Exact match - The rule is triggered if the regular expression matches the entire content provided for inspection.
Important: The group checks for an exact match within no more than the first megabyte of the content provided for inspection. If the content exceeds 1 MB, a rule with the Exact match condition is not triggered, even if the first megabyte of the content matches the group’s regular expression.
 
Note: When the Exact match condition is selected, the group detects a match only if its regular expression matches the entire sequence of characters that make up the content being inspected.
With any condition other than Exact match, the group searches for a character sequence that matches the given regular expression. A match is detected if somewhere in the content being inspected there is a character sequence matching that expression.
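The conditions described above can be summarized in a short Python sketch. Several details are assumptions made for illustration only: the condition names, the inclusive bounds of Between, counting non-overlapping matches, and measuring the 1 MB limit in UTF-8 bytes.

```python
import re

ONE_MEGABYTE = 1024 * 1024  # assumed size of the Exact match limit

def rule_triggered(pattern: str, content: str, condition: str,
                   n: int = 0, upper: int = 0) -> bool:
    """Evaluate one condition against the content being inspected."""
    regex = re.compile(pattern)
    if condition == "exact":
        # Content larger than the limit never triggers an Exact match rule.
        if len(content.encode("utf-8")) > ONE_MEGABYTE:
            return False
        return regex.fullmatch(content) is not None
    # All other conditions count matches found anywhere in the content.
    count = sum(1 for _ in regex.finditer(content))
    if condition == "less_or_equal":
        return count <= n
    if condition == "equal":
        return count == n
    if condition == "greater_or_equal":
        return count >= n
    if condition == "between":
        return n <= count <= upper  # inclusive bounds are an assumption
    raise ValueError(f"unknown condition: {condition}")

sample = "id 123-45-6789 and 987-65-4321"
ssn = r"\d{3}-\d{2}-\d{4}"
print(rule_triggered(ssn, sample, "greater_or_equal", 2))
print(rule_triggered(ssn, sample, "exact"))
```

The sample contains two matches, so a Greater than or = condition with a threshold of 2 is met, while Exact match is not, because the expression does not cover the whole string.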
Case sensitive - When this check box is selected, the group distinguishes between lowercase and uppercase characters. For example, the words Term and term will be treated differently, so the group can be configured to match the word Term but not term.
When this check box is cleared, the group does not differentiate between uppercase and lowercase characters. For instance, if Term matches the group, then term or even tErM will match it as well.
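The effect of the Case sensitive check box is comparable to toggling a case-insensitive flag on the regular expression. A minimal Python sketch, in which the function name and the sample pattern are hypothetical:

```python
import re

def matches(pattern: str, text: str, case_sensitive: bool) -> bool:
    """Search with or without distinguishing uppercase from lowercase."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, text, flags) is not None

print(matches(r"\bTerm\b", "term", case_sensitive=True))
print(matches(r"\bTerm\b", "tErM", case_sensitive=False))
```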
Visual anti-spoofing - When this check box is selected, the group identifies data matching its expression even if certain data characters are replaced with other ones similar in appearance or meaning, including:
Latin characters in the Russian text (such as Latin b in place of Russian ь)
Latin characters in place of certain numerals (such as Latin S in place of digit 5)
Russian characters in the English text (such as Russian п in place of Latin n)
Russian characters in place of certain numerals (such as Russian З in place of digit 3)
Certain symbols in place of Russian characters (such as * (asterisk) in place of Russian ж)
Numerals in place of certain Latin or Russian characters (such as digit 1 in place of Latin I or digit 4 in place of Russian Ч)
Arabic-Indic (Eastern Arabic) numerals in place of normal Arabic numerals (such as symbol ٣ in place of digit 3 or symbol ٨ in place of digit 8)
When this check box is cleared, the group strictly distinguishes characters regardless of whether or not they are similar in appearance or meaning.
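One way to picture visual anti-spoofing is as a normalization step that maps look-alike characters back to the characters they imitate before the expression is applied. The Python sketch below covers only digit look-alikes taken from the examples above; the table and the function name are illustrative assumptions, not DeviceLock's actual substitution rules.

```python
import re

# Illustrative table: characters that can stand in for digits.
DIGIT_LOOKALIKES = str.maketrans({
    "S": "5",        # Latin S in place of digit 5
    "\u0417": "3",   # Cyrillic capital Ze in place of digit 3
    # Arabic-Indic digits U+0660..U+0669 in place of 0..9
    **{chr(0x0660 + d): str(d) for d in range(10)},
})

def find_with_anti_spoofing(pattern: str, text: str) -> list[str]:
    """Normalize look-alikes, then search with ASCII-only digit matching."""
    normalized = text.translate(DIGIT_LOOKALIKES)
    return [m.group() for m in re.finditer(pattern, normalized, re.ASCII)]

spoofed = "12\u0417-4S-678\u0668"  # Ze for 3, S for 5, Arabic-Indic 8
pattern = r"\b\d{3}-\d{2}-\d{4}\b"
print(re.findall(pattern, spoofed, re.ASCII))     # no match on the raw text
print(find_with_anti_spoofing(pattern, spoofed))  # match after normalization
```

This sketch replaces every occurrence of a look-alike, including legitimate letters in ordinary words, and it returns the normalized text rather than the original. A production implementation would need to be more selective.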
Cyrillic transliteration - When this check box is selected, the group detects Cyrillic text regardless of whether it is written in Cyrillic or Latin letters. For example, if the Russian word Серия matches the group, then the word Seriya will match it as well.
When this check box is cleared, the match of the text to the group strictly depends upon the alphabet used to spell the text. For example, the group can be configured to match the word Серия but not Seriya.
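The idea of Cyrillic transliteration can be sketched as matching a keyword in both its Cyrillic spelling and a Latin rendering of it. The table below is a deliberately partial, hypothetical mapping that covers only the letters of the example word; transliteration schemes vary, and the scheme DeviceLock uses is not specified here.

```python
import re

# Partial, illustrative table (Cyrillic lowercase -> Latin).
CYRILLIC_TO_LATIN = {
    "\u0430": "a",   # a
    "\u0435": "e",   # ie
    "\u0438": "i",   # i
    "\u0440": "r",   # er
    "\u0441": "s",   # es
    "\u044f": "ya",  # ya
}

def transliterate(word: str) -> str:
    """Replace known Cyrillic letters with Latin ones, keeping capitals."""
    out = []
    for ch in word:
        latin = CYRILLIC_TO_LATIN.get(ch.lower(), ch)
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)

def matches_either_alphabet(keyword: str, text: str) -> bool:
    """Match the keyword in its Cyrillic or transliterated spelling."""
    variants = {keyword, transliterate(keyword)}
    pattern = "|".join(re.escape(v) for v in sorted(variants))
    return re.search(pattern, text) is not None

KEYWORD = "\u0421\u0435\u0440\u0438\u044f"  # the Russian word from the example
print(transliterate(KEYWORD))
print(matches_either_alphabet(KEYWORD, "Passport Seriya 4510"))
```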
OCR - Extract text from images for further checking against the regular expression defined in this content group. To do so, select the OCR check box and up to 8 languages.
 
Note: Selecting multiple Asian languages (marked with an asterisk (*) in the GUI), or a single Asian language together with any non-Asian languages, may degrade the performance of the OCR processing engine.
For optimal performance and recognition quality, we recommend selecting only those languages that are actually needed.
Count identical matches as one match - Combine duplicate matches returned by the regular expression into a single match. To do so, select the Count identical matches as one match check box.
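The difference this option makes to the match count can be shown with a small Python sketch. It assumes that "identical" means the matched strings are exactly equal; the function name and the sample pattern are hypothetical.

```python
import re

def count_matches(pattern: str, text: str, identical_as_one: bool) -> int:
    """Count matches, optionally collapsing duplicates into one."""
    found = [m.group() for m in re.finditer(pattern, text)]
    return len(set(found)) if identical_as_one else len(found)

text = "a@x.com, b@x.com, a@x.com"
email = r"[\w.]+@[\w.]+"
print(count_matches(email, text, identical_as_one=False))
print(count_matches(email, text, identical_as_one=True))
```

Because the count feeds the Condition setting, enabling this option can change whether a rule with a threshold such as Greater than or = is triggered.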
Advanced - Quickly test the regular expression pattern on sample data. Click Advanced to display or hide the Test sample box.
Test sample - Enter a test string and view the result. DeviceLock supports real-time color highlighting of test results. All matches are highlighted in green, while strings that do not match the pattern are highlighted in red.
6. Click OK to close the Add Pattern Group dialog box.
The new content group is added to the list of content groups under Content Database in the upper pane of the dialog box for managing content-aware rules.