Saving & Sharing Things

In this episode, we will explore how to document and export your data cleaning workflows in OpenRefine. As seen, as data projects grow more complex, keeping track of each transformation becomes essential—not just for your own reference, but also for collaborating with others and ensuring reproducibility.

OpenRefine offers powerful features that allow you to capture every step of your data cleaning process in a structured format. You’ll learn how to export this process as a reusable script and share both your cleaned dataset and the workflow that produced it. This ensures that your work can be replicated, audited, or continued by colleagues or future you.

Exporting & Saving

As previously mentioned you can export your clean dataset or subsets of the cleaned dataset using the Export menu and choosing you preferred extension/format. As a reminder the raw data was never touched,

One of OpenRefine’s key strengths is that the raw data remains untouched, while all changes made to the working copy are meticulously tracked—except for modifications to project metadata, such as the project name. By this point, anyone following along should have a fairly extensive list of actions recorded in the Undo/Redo panel, beginning from the project’s creation.

Clicking Extract.. should prompt the extract operating history, which is essentially a compilation of all your cleaning actions in a JSON (JavaScript Object Notation) file, which stores data in a structured, human-readable text format using key-value pairs and arrays.

You can choose to export specific actions or the entire file.

Supporting Reproducibility

By extracting this set of operations (edits) from a single project, you make your data cleaning process more transparent and reproducible. These operations can then be reapplied to other projects—or reused within the same project—enabling consistent transformations across datasets. This approach also makes it easier to share your workflow with others or revisit it yourself in the future.

This export is useful not only for reapplying edits to similar datasets, but also for enhancing transparency and reproducibility by clearly documenting the transformation from raw to analysis-ready data. You may reuse operations, by clicking Apply... and uploading the file.

Let’s see how does that work? Export the file and save it. Then, go to the Undo/Redo panel and go back to the step where the project was created and apply the file with all the actions you have performed. Cool, right?

Upgrades & Backups

OpenRefine is continuously evolving—bringing new features, performance upgrades, and sometimes changes to how projects are handled. While these updates improve the tool, they may also introduce compatibility issues with older project files.

If you’re upgrading from an older version of OpenRefine and already have projects saved on your computer, it’s strongly recommended that you back them up before installing the new version.

During these transitions, to protect your work, we advise you to first locate your workspace directory (this is where all your project files are stored). Then, copy everything in that folder and paste it into a separate backup location on your computer. This simple step ensures that your data remains safe, no matter what changes come with the upgrade.