How to Scrub Identity Data For Privacy Engineering Purposes

by Harv Gill May 24, 2022

The Liability of Testing with PII Data

How do you create sample data to test your data pipeline in a privacy-preserving way? Do you have sensitive production data lingering in other environments? Do you have employees walking around with real personally identifiable information (PII) on their laptops? Can you imagine what a liability this would be under any company’s cybersecurity and data privacy policy, or what would happen in the event of a hack?

Building the Tool In-House Was Our Solution

While researching existing third-party tools to scrub PII from data, the Engineering team at Spokeo found it necessary to build an in-house solution. The Software Development/Design Engineer in Test (SDET) team at Spokeo built a data scrubber tool that takes input data containing real values from the production environment and creates anonymized output data in the original format. The input can be a text, CSV, or Parquet file, or even a JSON object with multiple levels of nested structure. The data scrubber has a configuration file where the user specifies which fields contain sensitive data. Those fields are scrubbed and replaced with synthetic values that look real. Examples include firstName, lastName, phoneNumber, dateOfBirth, mailingAddress, and numerous other data fields that might be sensitive to the consumer or your business.
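To make the configuration idea concrete, here is a minimal sketch in Python. It is not Spokeo's code: the dotted-path config format, the `SCRUB_CONFIG` and `PLACEHOLDERS` names, and the fixed placeholder values are all assumptions made for illustration. It shows how a list of configured field paths can drive replacement inside a nested JSON-like structure while leaving every other field untouched.

```python
import copy

# Hypothetical configuration: dotted paths to the sensitive fields.
SCRUB_CONFIG = {
    "fields": ["firstName", "contact.phoneNumber", "addresses.mailingAddress"],
}

# Hypothetical fixed placeholders; a real tool would generate realistic values.
PLACEHOLDERS = {
    "firstName": "Alex",
    "phoneNumber": "555-0100",
    "mailingAddress": "123 Example St",
}


def _scrub_path(node, parts):
    """Follow `parts` into `node` and overwrite the leaf value in place."""
    if isinstance(node, list):
        # Apply the same path to every element of a list.
        for item in node:
            _scrub_path(item, parts)
        return
    if not isinstance(node, dict) or parts[0] not in node:
        return  # Path does not exist in this record; nothing to scrub.
    key = parts[0]
    if len(parts) == 1:
        node[key] = PLACEHOLDERS.get(key, "REDACTED")
    else:
        _scrub_path(node[key], parts[1:])


def scrub(record, config=SCRUB_CONFIG):
    """Return a scrubbed deep copy of `record`; the input is not modified."""
    scrubbed = copy.deepcopy(record)
    for path in config["fields"]:
        _scrub_path(scrubbed, path.split("."))
    return scrubbed


record = {
    "id": 7,
    "firstName": "Jane",
    "contact": {"phoneNumber": "626-555-1234", "email": "jane@example.com"},
    "addresses": [{"mailingAddress": "1 Real Rd", "type": "home"}],
}
scrubbed = scrub(record)
```

Because only the configured paths are rewritten, the output keeps the shape of the input, which matches the goal of producing anonymized data in the original format.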

This project started out as a much simpler script. The use cases grew very quickly, calling for a well-engineered tool. We created a factory that can produce a value for any given “column_name” or “key” in JSON. The user has the option to fine-tune the synthetic data generated by the factory by specifying certain parameters for each of the fields. We wanted to make sure that data integrity stays intact even as we scrub the data.
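One way to picture such a factory is sketched below in Python. This is an assumption-laden illustration rather than the actual tool: the `ValueFactory` class, its `register` and `produce` methods, the generator functions, and their parameters (`choices`, `area_code`, `min_age`, `max_age`) are all hypothetical names.

```python
import random
from datetime import date, timedelta


class ValueFactory:
    """Maps a column name (or JSON key) to a generator of synthetic values."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)  # Seedable for repeatable test data.
        self._generators = {}

    def register(self, column_name, generator):
        self._generators[column_name] = generator

    def produce(self, column_name, **params):
        """Generate one value; `params` fine-tune the chosen generator."""
        if column_name not in self._generators:
            raise KeyError(f"no generator registered for {column_name!r}")
        return self._generators[column_name](self._rng, **params)


def first_name(rng, choices=("Alex", "Jordan", "Sam", "Taylor")):
    return rng.choice(choices)


def phone_number(rng, area_code="212"):
    # Numbers 555-0100 through 555-0199 are reserved for fictional use.
    return f"{area_code}-555-01{rng.randint(0, 99):02d}"


def date_of_birth(rng, min_age=18, max_age=90, today=None):
    # Assumes min_age < max_age. Using 366 days per year for the lower bound
    # guarantees the person is at least `min_age` years old.
    today = today or date.today()
    return today - timedelta(days=rng.randint(min_age * 366, max_age * 365))


factory = ValueFactory(seed=42)
factory.register("first_name", first_name)
factory.register("phone_number", phone_number)
factory.register("date_of_birth", date_of_birth)

name = factory.produce("first_name")
phone = factory.produce("phone_number", area_code="626")
dob = factory.produce("date_of_birth", min_age=21, today=date(2022, 5, 24))
```

Keeping each generator as a plain function that receives the shared random source makes it easy to add a new column type without touching the factory itself.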

Key Challenges

We hit several challenges as we completed the proof of concept (POC) with limited features and then used its design as the basis for implementing the remaining features. A few of the challenges that the team resolved quickly are listed below:

Related columns: Each record must maintain related column values. For example, the values in first_name, last_name, and full_name must agree with each other. We couldn’t simply create a random value for each column, because some columns hold partial data that has to match other columns.
Dependent columns: Certain columns should only have a value if the value of another column is a specific string. For example, “date_of_marriage” should only have a date if the value for the “married” column is “Yes”.
Logic within the values: Several columns needed to be logically correct. An expiration date cannot fall before the issue date, and updated_date cannot fall before created_date. The date_of_birth of an adult must put their age above 18 years old and below a reasonable number, unless the person is deceased.
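The three kinds of constraints above can be sketched as a single record generator. The Python below is a hypothetical illustration, not the team's implementation: the `synthetic_record` function, the name lists, the 90-year upper age bound, and the rule that a marriage date falls after the person reaches adulthood are all assumptions, and the deceased-person exception is not handled.

```python
import random
from datetime import date, timedelta

FIRST_NAMES = ["Alex", "Jordan", "Sam"]
LAST_NAMES = ["Rivera", "Chen", "Okafor"]


def synthetic_record(rng, today):
    """Build one synthetic record whose columns stay mutually consistent."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)

    # Logic within the values: an adult between 18 and 90 years old.
    # 366 days per year on the lower bound guarantees at least 18 full years.
    age_in_days = rng.randint(18 * 366, 90 * 365)
    date_of_birth = today - timedelta(days=age_in_days)

    # Dependent column: date_of_marriage only exists when married == "Yes".
    married = rng.choice(["Yes", "No"])
    date_of_marriage = None
    if married == "Yes":
        adult_on = date_of_birth + timedelta(days=18 * 366)
        days_as_adult = (today - adult_on).days
        date_of_marriage = adult_on + timedelta(days=rng.randint(0, days_as_adult))

    # Logic within the values: updated_date never precedes created_date.
    created_date = today - timedelta(days=rng.randint(0, 3650))
    days_since_created = (today - created_date).days
    updated_date = created_date + timedelta(days=rng.randint(0, days_since_created))

    return {
        "first_name": first,
        "last_name": last,
        # Related column: derived from its parts, never generated on its own.
        "full_name": f"{first} {last}",
        "date_of_birth": date_of_birth,
        "married": married,
        "date_of_marriage": date_of_marriage,
        "created_date": created_date,
        "updated_date": updated_date,
    }


record = synthetic_record(random.Random(2022), today=date(2022, 5, 24))
```

The ordering matters in this sketch: each value that others depend on is generated first, and the dependent value is then drawn from a range bounded by it, so no invalid combination has to be detected and retried.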

Future Uses for the Tool

After tackling the problem of scrubbing PII from test data, we have started working on generalizing the tool to fit other teams’ needs. Some of the teams at Spokeo work with large files of data where values across different records are related, for instance a group of family members sharing the same address. If we scrub one person’s address, we must maintain consistency by giving the other family members the same scrubbed address. Otherwise, we might end up with data integrity issues within the test/sample data. The tool is still in its early stages, but as soon as it becomes a valuable enough project that other teams can benefit from it, we’d love to explore the opportunity to open-source it.
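A common way to keep values consistent across records is to remember which synthetic value was issued for each real value and reuse it. The Python sketch below assumes that approach; the `ConsistentReplacer` class, the `scrub_addresses` helper, and the "Example Ave" address format are hypothetical and are not taken from the tool described here.

```python
import itertools


class ConsistentReplacer:
    """Returns the same synthetic value every time it sees the same real value."""

    def __init__(self, make_value):
        self._make_value = make_value  # Callable that produces a fresh value.
        self._seen = {}                # Real value -> synthetic value.

    def replace(self, real_value):
        if real_value not in self._seen:
            self._seen[real_value] = self._make_value()
        return self._seen[real_value]


def scrub_addresses(records, replacer):
    """Return copies of `records` with the address column replaced."""
    return [{**r, "address": replacer.replace(r["address"])} for r in records]


_house_numbers = itertools.count(100)
replacer = ConsistentReplacer(lambda: f"{next(_house_numbers)} Example Ave")

family = [
    {"name": "Person A", "address": "1 Real Rd"},
    {"name": "Person B", "address": "1 Real Rd"},
    {"name": "Person C", "address": "9 Other Ln"},
]
scrubbed = scrub_addresses(family, replacer)
# Person A and Person B share "100 Example Ave"; Person C gets "101 Example Ave".
```

Two caveats apply to this sketch. The lookup treats differently formatted copies of one address as different values, so inputs would need normalizing first. The lookup table also holds real values, so it should live only in memory for the duration of a run.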

Join the Spokeo Engineering Team

Spokeo is tackling various big-data challenges across Engineering, QA, Product, and other disciplines to help redefine digital identity. If you are interested in solving meaningful problems about digital identity, please check out our job opportunities: https://www.spokeo.com/careers
