Clean up and secure WordPress data with WP Hammer

When making copies of a website for development and testing, populating a thorough content and data set is vital for Quality Assurance (QA). The most efficient path typically involves mirroring the entire production site’s database. But this can be problematic: a large site can have tens of thousands of posts (each with many revisions and healthy doses of metadata) and many user accounts.

Those user accounts (and sometimes the site’s content) can contain sensitive data that, if mishandled, can put clients at risk. On top of that, testing, development, and initial imports—often executed on lightweight virtual machines—can be painfully slow when working with very large datasets. Cleaning up imported production data is a must, but has been a tedious, inefficient task.

Enter 10up’s WP Hammer: an open source developer tool that quickly and efficiently reduces—or completely removes—production data and sensitive client information like email addresses and hashed account passwords from a WordPress installation.


Why cleanse this data?

Storing sensitive or private data on local or staging sites is a security risk. These environments rarely carry the same level of protective monitoring as a live production site. A staging site containing production credentials offers an easier (and often overlooked) target for a malicious attack.

Overtly private data—say, personal information within a medical or financial community—may even require scrubbing within an organization, by law, before being passed around for development or testing. While user tables are clear targets, some sites may also store sensitive personal data in content objects; e.g., a custom post type for “Medical Records” with custom fields.

Exposure of private information isn’t the only cause for concern. When working on a new feature, it’s far too easy for a test message or notification to accidentally be sent out to registered users brought over from the production site. It’s best to not keep contact information like email addresses in test environments.

Further, most test environments don’t require all client data—it isn’t a backup of production—just a workable subset. Why store and run complex, taxing queries on 100,000 posts when 100 posts is a sufficient sample set? In some cases, pruning posts to an exact number can even help developers test features like pagination.

How can I set up WP Hammer?

WP Hammer assumes basic command-line familiarity and a Linux- or UNIX-based environment (like VVV).

To install WP Hammer, begin by fetching the package and ensuring it’s built by running the following commands:

    cd $PROJECT_WORKING_DIR
    git clone https://github.com/10up/wp-hammer.git
    cd wp-hammer
    composer install

Once available, there are several options for installation:

  • Install it as a plugin
    cd wp-content/plugins
    mv $PROJECT_WORKING_DIR/wp-hammer .
    wp plugin activate wp-hammer
    wp hammer
  • Call it from the command line
    wp --require=$PROJECT_WORKING_DIR/wp-hammer/wp-hammer.php
  • Add it to your WP-CLI config
  • Add it as an alias in your .bashrc
    alias hammer='wp --require=$PROJECT_WORKING_DIR/wp-hammer/wp-hammer.php'
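
The "Add it to your WP-CLI config" option has no command attached, so here is a minimal sketch of one way to do it. It relies on WP-CLI's `require` config key, which lists PHP files to load before a command runs. The helper name `add_hammer_to_config`, the default file name, and the fallback to the current directory when `PROJECT_WORKING_DIR` is unset are assumptions made for illustration, not part of WP Hammer.

```shell
# add_hammer_to_config is a hypothetical helper, not something WP Hammer ships.
# It writes a new WP-CLI config file whose "require" key loads wp-hammer.php.
add_hammer_to_config() {
  # Assumption: a project-level wp-cli.yml unless another path is given.
  config_file="${1:-wp-cli.yml}"
  # Assumption: fall back to the current directory if PROJECT_WORKING_DIR is unset.
  hammer_path="${PROJECT_WORKING_DIR:-$PWD}/wp-hammer/wp-hammer.php"

  # Never overwrite an existing config; ask for a manual edit instead.
  if [ -e "$config_file" ]; then
    echo "$config_file already exists; add this entry under require: $hammer_path" >&2
    return 1
  fi

  printf 'require:\n  - %s\n' "$hammer_path" > "$config_file"
}
```

Running `add_hammer_to_config` from the site root should make `wp hammer` available there without activating the plugin. If a config file already exists, the function leaves it untouched and prints the path to add by hand.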

How does WP Hammer work?

WP Hammer adds the wp hammer command (or, if you prefer to save keystrokes, wp ha) to the popular WP-CLI (WordPress Command Line Interface) toolkit. With it, you can easily make sitewide changes to your data, but be aware that all database modifications are final. Be sure to back up your database before running any commands.
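
Because the changes are final, it can help to make the backup part of the command itself. The sketch below wraps WP-CLI's built-in `wp db export` around `wp hammer`. The function name `backup_then_hammer` and the backup file naming are hypothetical choices for illustration.

```shell
# backup_then_hammer is a hypothetical wrapper, not part of WP Hammer.
# It exports the database first and runs wp hammer only if the export succeeded.
backup_then_hammer() {
  # Assumption: a timestamped dump in the current directory is acceptable.
  backup_file="backup-$(date +%Y%m%d-%H%M%S).sql"

  # wp db export is WP-CLI's database dump command.
  wp db export "$backup_file" || {
    echo "Backup failed; not running wp hammer." >&2
    return 1
  }

  echo "Backup written to $backup_file"
  # Pass all arguments through to wp hammer unchanged.
  wp hammer "$@"
}
```

For example, `backup_then_hammer -l posts=100` would write a dump and then prune posts. If something goes wrong, the dump can be restored with `wp db import`.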

With these basic WP Hammer commands, you can:

  • Clean up user emails
    wp ha -f users.user_email='[email protected]'
  • Clean up user passwords
    wp ha -f users.user_pass=auto
  • Remove extra users
    wp ha -l users=10
  • Remove extra posts
    wp ha -l posts=100
  • Replace post content with dummy content
    wp ha -f posts.post_content=markov,posts.post_title=random
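
The individual commands above can also be run in sequence as a cleanup step after importing a production database. The following is a sketch only: `scrub_copy` is a hypothetical helper name, and the limits are simply the ones used in the list above. Pruning comes first on the assumption that the later rewrite steps then have fewer rows to touch.

```shell
# scrub_copy is a hypothetical helper; each wp ha call is taken from the list above.
# The && chain stops at the first failing step so a partial scrub does not go unnoticed.
scrub_copy() {
  wp ha -l users=10 &&
  wp ha -l posts=100 &&
  wp ha -f users.user_pass=auto &&
  wp ha -f posts.post_content=markov,posts.post_title=random
}
```

Call `scrub_copy` from the WordPress root of the development or staging copy, never from production.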

You can also chain tasks together:

    wp ha -f posts.post_author=auto users.user_pass=__user_email__UMINtHeroJEreAGleC users.user_email='[email protected]' posts.post_title=ipsum posts.post_content=markov

The above string results in the following changes:

  • posts.post_author is set to a random user ID for all remaining users
  • users.user_pass is set to the user email followed by UMINtHeroJEreAGleC
  • users.user_email='[email protected]': __ID__ is replaced by the user ID
  • posts.post_title=ipsum replaces all Post Titles with auto-generated Lorem Ipsum
  • posts.post_content=markov replaces all Post Content with randomly generated content, using Markov chains

Contributions welcome!

10up is actively developing WP Hammer, but we’ve also released it on GitHub for the entire open source community to advance. We encourage you to experiment with the tool, submit issues, and make pull requests.
