Best Practices for Source Reports
Design reports that maximize Scoop analysis capabilities
The quality of your analysis depends on the quality of your source data. Well-designed source reports enable powerful snapshotting, accurate change tracking, and seamless blending with other data sources. This guide covers best practices for creating reports that work optimally with Scoop.
Core Principles
| Principle | Why It Matters |
|---|---|
| Include unique identifiers | Enables snapshotting and change tracking |
| Add relevant dates | Powers time-based analysis |
| Keep data granular | Aggregation can happen in Scoop |
| Use consistent naming | Ensures reliable matching and blending |
| Filter appropriately | Balances completeness with performance |
Essential Fields to Include
Unique Identifiers
Every report should include a column that uniquely identifies each record:
| Data Type | Typical ID Field |
|---|---|
| Sales opportunities | Opportunity ID |
| Support tickets | Ticket Number |
| Customers | Account ID |
| Orders | Order Number |
| Inventory | SKU or Item ID |
Why it matters: Without a unique identifier, Scoop cannot:
- Track how individual records change over time
- Enable process analysis (Sankey diagrams)
- Calculate accurate conversion rates
Date Fields
Include all relevant dates for comprehensive time analysis:
| Date Type | Analysis Enabled |
|---|---|
| Created Date | Age analysis, volume trends |
| Modified Date | Activity tracking, freshness |
| Close/Complete Date | Cycle time calculation |
| Stage Change Date | Velocity analysis |
Status/Stage Fields
For process analysis, include fields that track lifecycle:
- Current status or stage
- Previous status (if available)
- Substatus or reason codes
Owner/Assignment Fields
For workload and performance analysis:
- Assigned user or team
- Created by
- Last modified by
Report Design Patterns
For Snapshot Datasets
When tracking entities that change over time (deals, tickets, inventory):
Recommended Filter Pattern
Include records that are either:
- Currently active/open, OR
- Recently modified (last 1-7 days)
Example: Salesforce Opportunities
Filter: IsClosed = FALSE OR LastModifiedDate >= LAST_N_DAYS:2
Why: This captures both current state and recent transitions.
Include All Lifecycle Stages
Don't filter out closed/completed records entirely—you need to see the final transition:
❌ Bad: Stage != "Closed Won" AND Stage != "Closed Lost"
✅ Good: IsOpen = TRUE OR CloseDate >= LAST_N_DAYS:7
For Transactional Datasets
When tracking events/transactions that don't change:
Include Transaction Date
Always include when the transaction occurred:
| Transaction Type | Date Field |
|---|---|
| Orders | Order Date |
| Payments | Payment Date |
| Events | Event Timestamp |
| Activities | Activity Date |
Consider Date Range
For large transaction volumes:
- Filter to relevant time period
- Use incremental loading for ongoing ingestion
For Reference Data
When loading lookup tables or master data:
Include All Records
Don't filter reference data—you need complete lookup coverage:
✅ Include: All products, all regions, all categories
❌ Don't filter: Only active products (breaks historical lookups)
Add Effective Dates
If values change over time, include effective dates:
| Field | Purpose |
|---|---|
| Effective From | When this value became valid |
| Effective To | When this value expired |
Column Best Practices
Naming Conventions
| Practice | Example |
|---|---|
| Clear, descriptive names | Customer_Region not CRgn |
| Consistent prefixes | Opp_Amount, Opp_Stage |
| Avoid special characters | Underscores OK, avoid #, %, / |
| No leading numbers | Q1_Revenue not 1Q_Revenue |
Data Types
Ensure columns have appropriate data types:
| Data | Correct Type | Avoid |
|---|---|---|
| Amounts | Number | Text with currency symbols |
| Dates | Date/DateTime | Text strings |
| IDs | Text (consistent format) | Mixed formats |
| Percentages | Number (0.15) | Text ("15%") |
Avoid These Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Pivoted data | Hard to analyze | Keep data in rows |
| Merged cells | Breaks parsing | Use flat structure |
| Multiple header rows | Confuses column detection | Single header row |
| Totals/subtotals | Duplicates values | Remove summary rows |
| Empty rows | Breaks continuity | Remove blank rows |
Application-Specific Guidance
Salesforce
Essential fields for Opportunities:
Id(unique key)StageNameAmountCloseDateOwnerIdorOwner.NameCreatedDateLastModifiedDate
Recommended filter:
(IsClosed = FALSE) OR (LastModifiedDate >= LAST_N_DAYS:7)
HubSpot
Essential fields for Deals:
hs_object_id(unique key)dealstageamountclosedatehubspot_owner_idcreatedatehs_lastmodifieddate
Jira
Essential fields for Issues:
key(unique key)statusissuetypepriorityassigneecreatedupdatedresolutiondate
NetSuite/ERP
Essential fields for Transactions:
internalid(unique key)typetrandatestatusentity(customer/vendor)amount
Testing Your Report
Before automating, verify your report works well:
Check for Unique Key
- Export a sample
- Count total rows
- Count distinct values in your ID column
- If counts don't match, investigate duplicates
Verify Filter Coverage
- Manually close/complete a test record
- Run report the next day
- Confirm the transition is captured
Validate Data Types
- Check dates parse correctly
- Verify numbers don't have formatting issues
- Confirm IDs are consistent format
Common Pitfalls
Pitfall: Missing Transitions
Symptom: Process analysis shows gaps or missing conversions
Cause: Recently closed items not in report
Fix: Adjust filter to include recently modified records
Pitfall: Duplicate Records
Symptom: Counts are inflated, unique key not detected
Cause: Report includes line items or is at wrong granularity
Fix: Aggregate to correct level or filter to parent records
Pitfall: Changing Column Names
Symptom: New table created, historical data disconnected
Cause: Column renamed in source report
Fix: Keep column names consistent; use calculated columns to map if needed
Pitfall: Inconsistent Filters
Symptom: Data volume fluctuates unexpectedly
Cause: Dynamic filters that change over time
Fix: Use absolute date ranges or relative ranges consistently
Related Topics
- Snapshot Datasets - Track entity changes
- Handling Report Changes - When reports change
- Adding Calculated Columns - Transform and enrich data
- Blending Datasets - Combine multiple sources
Updated about 1 month ago