Recently, I’ve been tasked to create a data mask function.
- A number will be a number
- An upper case letter will be upper case
- A lower case letter will be lower case
- Punctuation and symbols will remain the same
An example would be:
123-00-6789 would become 456-88-4321
Mark would become Trxz
To enhance obfuscation, it will only be used one-way, and a 1:1 character swap is not necessary. Therefore, an uppercase A could be an upper case Z and an upper case X in the same sentence.
Thanks to this article at Novick Software, I was able to come up with the following example for future use.
--Create a table variable
DECLARE @Tbl TABLE (Name VARCHAR(100))
--Insert same name with different case
INSERT INTO @Tbl VALUES ('mark')
INSERT INTO @Tbl VALUES ('Mark')
–Gets the 1st (case-insensitive) result
SELECT DISTINCT Name FROM @Tbl
–Gets case-sensitive distinct results
SELECT DISTINCT Name COLLATE sql_latin1_general_cp1_cs_as FROM @Tbl
–Gets the distinct count (case-insensitive)
SELECT COUNT(DISTINCT Name) FROM @Tbl
–Gets the distinct count (case-sensitive)
SELECT COUNT(DISTINCT NAME COLLATE sql_latin1_general_cp1_cs_as) FROM @Tbl