Reprocessing Data
Rebuild your dataset with updated calculations and structure
When you make structural changes to a dataset—like adding calculated columns, modifying formulas, or changing column mappings—you need to reprocess the dataset to apply those changes to all your historical data.
What Reprocessing Does
Reprocessing rebuilds your dataset from scratch:
- Clears existing data - All processed rows are removed from the dataset
- Replays source reports - Every source file/report is re-ingested in chronological order
- Applies current calculations - Your new or modified calculated columns run against all data
- Rebuilds snapshots - If snapshotting is enabled, snapshot history is preserved and recalculated
This ensures your entire dataset—including historical data—reflects your current column definitions and formulas.
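As a conceptual illustration only, the rebuild described above can be modeled in a few lines of Python. This is a sketch of the idea, not the product's implementation or API: the `SourceReport` type, the `reprocess` function, and the `margin` column are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable

Row = dict[str, float]


@dataclass
class SourceReport:
    """Hypothetical stand-in for one ingested source file/report."""
    received: date
    rows: list[Row]


def reprocess(reports: list[SourceReport],
              calculated: dict[str, Callable[[Row], float]]) -> list[Row]:
    dataset: list[Row] = []  # 1. start from an empty dataset
    # 2. replay every source report, oldest first
    for report in sorted(reports, key=lambda r: r.received):
        for raw in report.rows:
            row = dict(raw)  # copy so the source rows stay untouched
            # 3. apply the current calculated columns to every row
            for name, formula in calculated.items():
                row[name] = formula(row)
            dataset.append(row)
    return dataset


# Reports are listed out of order on purpose; reprocess() sorts them by date.
reports = [
    SourceReport(date(2024, 2, 1), [{"revenue": 200.0, "cost": 150.0}]),
    SourceReport(date(2024, 1, 1), [{"revenue": 100.0, "cost": 60.0}]),
]
rebuilt = reprocess(reports, {"margin": lambda r: r["revenue"] - r["cost"]})
print(rebuilt)
```

Because the calculated columns are applied during the replay, a formula changed today also shows up on rows that came from the oldest reports.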
When to Reprocess
You need to reprocess when you:
- Add a new calculated column
- Modify an existing calculated column formula
- Change column include/exclude settings
- Update a VLOOKUP table used in calculations
- Fix a formula error
You don't need to reprocess when:
- New source data arrives (it is processed automatically)
- You create charts or visualizations
- You change the dataset name or description
- You modify column display settings (like hiding columns in Explorer)
How to Reprocess
Step 1: Make Your Changes
First, make your structural changes in the dataset settings:
- Go to your dataset
- Click Settings → Calculated Columns
- Add or modify your columns and formulas
- Click Save Changes
Step 2: Initiate Reprocessing
After saving, click the Reprocess Data button.
Step 3: Monitor Progress
A progress indicator appears on the Datasets page showing:
- Number of source reports being processed
- Current processing stage
- Estimated time remaining
Step 4: Verify Results
Once complete:
- Go to the Preview Data tab to spot-check your calculations
- Open Explorer to verify your new columns appear correctly
- Check that historical data has the expected values
Processing Time
Reprocessing time depends on:
| Factor | Impact |
|---|---|
| Number of source reports | More reports = longer processing |
| Dataset size (rows) | Larger datasets take more time |
| Formula complexity | Complex VLOOKUPs add processing time |
| External sheet references | Google Sheets lookups add latency |
Typical processing times:
- Small dataset (< 10K rows, 10 reports): 30 seconds to 2 minutes
- Medium dataset (10K to 100K rows, 50 reports): 2 to 10 minutes
- Large dataset (100K+ rows, 100+ reports): 10 to 30 minutes
For very large datasets, consider processing during off-peak hours.
Best Practices
Test Before Full Reprocess
Before reprocessing a large dataset:
- Use the Preview Data feature to test your formula on sample rows
- Verify the formula logic is correct before committing to a full reprocess
- Fix any errors in preview mode—it's much faster than reprocessing twice
Plan for Blended Datasets
If your dataset is a source for other blended datasets:
- Those downstream datasets may also need reprocessing
- Check the "Reprocess when source updates" option on blended datasets to automate this
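To see which datasets might be affected, it can help to sketch the blend relationships as a graph. The example below is a minimal illustration in Python, assuming you keep your own map of which datasets feed which blends. The dataset names, the `FEEDS` map, and the `downstream_of` helper are all hypothetical and not part of the product.

```python
from collections import deque

# Hypothetical map: dataset -> blended datasets that use it as a source.
FEEDS = {
    "Orders": ["Orders + Returns"],
    "Returns": ["Orders + Returns"],
    "Orders + Returns": ["Executive Summary"],
    "Executive Summary": [],
}


def downstream_of(dataset: str, feeds: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk over every dataset that directly or
    indirectly consumes `dataset`."""
    seen: set[str] = set()
    found: list[str] = []
    queue = deque(feeds.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already listed via another path
        seen.add(current)
        found.append(current)
        queue.extend(feeds.get(current, []))
    return found


print(downstream_of("Orders", FEEDS))
# ['Orders + Returns', 'Executive Summary']
```

Any dataset in the returned list is a candidate for a follow-up reprocess after the source changes.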
Snapshot Considerations
Reprocessing a snapshot dataset:
- Preserves your snapshot history
- Recalculates values within each snapshot based on new formulas
- Does NOT change which records were captured in each snapshot
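The distinction between recalculating values and changing snapshot membership can be shown with a small Python model. This is only a sketch under assumed data shapes: the snapshot dictionary, the `recalculate_snapshots` function, and the `total` column are hypothetical and do not reflect how the product stores snapshots.

```python
from typing import Callable

Row = dict[str, float]


def recalculate_snapshots(snapshots: dict[str, list[Row]],
                          formula: Callable[[Row], float],
                          column: str) -> dict[str, list[Row]]:
    """Apply `formula` to every row in every snapshot.

    The set of snapshots and the rows inside each one are left as they
    were; only the value of `column` is (re)computed.
    """
    return {
        label: [{**row, column: formula(row)} for row in rows]
        for label, rows in snapshots.items()
    }


snapshots = {
    "2024-01-01": [{"id": 1, "price": 10.0, "qty": 2}],
    "2024-02-01": [{"id": 1, "price": 12.0, "qty": 2},
                   {"id": 2, "price": 5.0, "qty": 4}],
}
updated = recalculate_snapshots(
    snapshots, lambda r: r["price"] * r["qty"], "total"
)
print(updated)
```

Each snapshot keeps the same records it captured originally; only the calculated `total` value is new.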
Avoid Reprocessing Loops
If Dataset A blends with Dataset B, and both have automatic reprocessing enabled (the "Reprocess when source updates" option):
- Avoid circular dependencies, where each dataset triggers a reprocess of the other
- Make one dataset the "source" and the other the "consumer"
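One way to check for a loop before enabling automatic reprocessing is to write down which dataset triggers which and search that map for a cycle. The sketch below uses a standard depth-first search in Python. The `find_cycle` helper and the trigger maps are hypothetical and maintained by you; nothing here calls the product.

```python
from typing import Optional


def find_cycle(triggers: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of dataset names, or None if none exists."""
    visiting: list[str] = []  # the path currently being explored
    done: set[str] = set()    # datasets fully explored without finding a cycle

    def visit(node: str) -> Optional[list[str]]:
        if node in done:
            return None
        if node in visiting:
            # We came back to a dataset on the current path: that is a loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in triggers.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in triggers:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Hypothetical maps: dataset -> datasets it automatically reprocesses.
looped = {"Dataset A": ["Dataset B"], "Dataset B": ["Dataset A"]}
one_way = {"Dataset A": ["Dataset B"], "Dataset B": []}

print(find_cycle(looped))   # ['Dataset A', 'Dataset B', 'Dataset A']
print(find_cycle(one_way))  # None
```

The `one_way` map corresponds to the recommended setup: one dataset acts as the source and the other only consumes it.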
Troubleshooting
Reprocessing Takes Too Long
- Simplify complex formulas where possible
- Consider splitting a large calculated-column sheet into smaller, focused columns
- Check if external VLOOKUP references are slowing things down
Calculated Column Shows Errors After Reprocess
- Open the Calculated Columns editor and check the Values preview row
- Look for #REF!, #VALUE!, or #N/A errors
- Verify that column references still exist (columns may have been renamed in source)
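If you have copied a column's values out of the dataset, for example as a list of strings, a short script can tally the error markers listed above. This is a minimal sketch: the `count_errors` helper and the sample `column` values are hypothetical, and how you obtain the values is left open.

```python
from collections import Counter

# Spreadsheet-style error markers to look for.
ERROR_MARKERS = ("#REF!", "#VALUE!", "#N/A")


def count_errors(values: list[str]) -> Counter:
    """Count how often each error marker appears in a column of cell values."""
    return Counter(v.strip() for v in values if v.strip() in ERROR_MARKERS)


# Hypothetical sample column, including one blank cell.
column = ["12.5", "#N/A", "8.0", "#REF!", "#N/A", ""]
print(count_errors(column))
```

A high count of `#REF!` usually points at a missing column reference, while `#N/A` typically comes from a lookup that found no match.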
Data Looks Different After Reprocess
- Check if source report format changed between uploads
- Verify column mappings are still correct
- Review any VLOOKUP tables to ensure they have the expected values
Related Topics
- Adding Calculated Columns - How to create formulas
- Blending Two Datasets - Combining data from multiple sources
- Snapshot Datasets - Tracking data changes over time