Dataset Mastery: Your Gateway to Insights

Learn how to find, select, and manage datasets so you can get the most out of Scoop's analytics.

Understanding the Dataset Ecosystem

🌐 Three Types of Datasets

1. Organization Datasets 🏢

Company-wide data sources
Live connections to business systems
Automatic refresh schedules
Shared across teams
Examples: CRM, ERP, Marketing platforms

2. Personal Datasets 👤

Your uploaded files
Private by default
Full control over sharing
Perfect for ad-hoc analysis
Examples: Excel reports, CSV exports

3. Channel Datasets 📣

Auto-mapped to specific channels
Context-aware selection
Team-aligned data
Admin-configured
Examples: Sales data in #sales

![Screenshot: Dataset selector showing different dataset types]

Navigating Datasets

🎯 Quick Selection Commands

See Available Datasets

@Scoop show datasets
@Scoop list all data sources
@Scoop what data can I analyze?

Switch Datasets

@Scoop use sales dataset
@Scoop switch to marketing data
@Scoop change to customer analytics

Check Current Dataset

@Scoop current dataset
@Scoop what am I analyzing?
@Scoop status

![Screenshot: Dataset selection dropdown interface]

📊 Understanding Dataset Cards

Each dataset card shows its type, source, size, freshness, and key metrics:

📊 Customer Analytics Dataset
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: 🏢 Organization Dataset
Source: Salesforce + Support System
Records: 45,832 customers
Updated: 2 hours ago (Live sync)
Quality: 98% complete

Key Metrics:
• Total Revenue: $45.2M
• Active Customers: 3,421
• Avg Customer Value: $13,200
• Churn Rate: 12%

Top Tables:
• accounts (customer master)
• opportunities (sales pipeline)
• cases (support tickets)
• activities (engagement log)

[📥 Use This Dataset] [ℹ️ More Info]

Organization Datasets Deep Dive

🔗 Connected Systems

CRM Platforms

Salesforce: Accounts, Opportunities, Leads
HubSpot: Contacts, Deals, Activities
Pipedrive: Deals, Organizations, People
Microsoft Dynamics: Customers, Sales

Support Systems

Zendesk: Tickets, Satisfaction, Agents
Intercom: Conversations, Users, Tags
Freshdesk: Tickets, Contacts, Groups

Marketing Tools

Google Analytics: Traffic, Conversions
Marketo: Campaigns, Leads, Programs
Mailchimp: Campaigns, Subscribers

Financial Systems

QuickBooks: Invoices, Customers
Stripe: Payments, Subscriptions
NetSuite: Transactions, Accounts

🔄 Data Freshness

Dataset: Sales Pipeline
Last Sync: 10 minutes ago
Next Sync: In 20 minutes
Sync Status: ✅ Healthy

Recent Changes:
• 12 new opportunities
• 34 updated stages
• 5 closed deals

[🔄 Refresh Now] [⚙️ Sync Settings]
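
The card above is consistent with a fixed sync interval: a sync 10 minutes ago and the next one in 20 minutes suggests a 30-minute schedule. The Python sketch below shows only that arithmetic. The function name `next_sync_in` and the fixed-interval model are assumptions for illustration, not Scoop's actual scheduler.

```python
from datetime import datetime, timedelta

def next_sync_in(last_sync: datetime, now: datetime, interval: timedelta) -> timedelta:
    """Time remaining until the next sync, assuming a fixed interval (hypothetical model)."""
    remaining = interval - (now - last_sync)
    # An overdue sync is reported as due now rather than as a negative wait.
    return max(remaining, timedelta(0))

now = datetime(2024, 12, 15, 9, 30)
last_sync = now - timedelta(minutes=10)
wait = next_sync_in(last_sync, now, timedelta(minutes=30))
print(wait)  # 0:20:00
```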

🔐 Permission Model

Access Levels:

Full Access: All data, no restrictions
Department: Your team's data only
Role-Based: Based on Slack groups
Custom: Admin-defined rules

Security Features:

Row-level security
Column masking for PII
Audit trail of access
Compliance controls

Personal Dataset Management

📤 Creating Personal Datasets

From File Upload:

You: [Uploading quarterly_review.xlsx]

Scoop: 📊 Creating personal dataset...
✅ "Q4 Review Data" ready for analysis

This dataset includes:
• 15,420 records
• 12 analysis-ready columns
• Date range: Oct-Dec 2024

What would you like to explore?

From Analysis Results:

You: Save this filtered view as a dataset

Scoop: 💾 Saved as "High-Value Customers"
This personal dataset contains:
• 342 customers
• Filtered: LTV > $50,000
• All original columns preserved

[📊 Switch to New Dataset] [🔙 Keep Current]

![Screenshot: Personal dataset created from uploaded file]

🗂️ Organizing Personal Datasets

Naming Best Practices:

✅ Good Names:
• "2024_Q4_Sales_Analysis"
• "Customer_Segmentation_Dec"
• "Marketing_Campaign_Results"

❌ Avoid:
• "data"
• "test"
• "final_final_v2"

Dataset Actions:

@Scoop rename dataset to "Executive Dashboard Data"
@Scoop add description "Monthly KPIs for board meeting"
@Scoop tag dataset with #finance #monthly
@Scoop delete old datasets

🔄 Dataset Lifecycle

Personal Dataset: Marketing Leads
Created: Dec 1, 2024
Last Used: Dec 15, 2024
Size: 2.4 MB

⚠️ This dataset hasn't been used in 14 days

Options:
[📊 Use Dataset] [🔄 Update Data] [🗑️ Delete] [📤 Share]
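
The warning in the card above amounts to a simple date comparison. A minimal Python sketch, assuming a 14-day threshold taken from the example; `STALE_AFTER_DAYS` and `is_stale` are hypothetical names, and Scoop's real threshold may differ.

```python
from datetime import date

STALE_AFTER_DAYS = 14  # assumed threshold, taken from the example warning

def days_unused(last_used: date, today: date) -> int:
    """Whole days since the dataset was last used."""
    return (today - last_used).days

def is_stale(last_used: date, today: date, threshold: int = STALE_AFTER_DAYS) -> bool:
    """True once the dataset has gone unused for at least `threshold` days."""
    return days_unused(last_used, today) >= threshold

print(is_stale(date(2024, 12, 15), date(2024, 12, 29)))  # True
print(is_stale(date(2024, 12, 15), date(2024, 12, 20)))  # False
```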

Channel-Mapped Datasets

🎯 Automatic Context

How Mapping Works:

#sales-team → CRM Dataset
#marketing → Campaign Dataset
#support → Ticket Dataset
#finance → Revenue Dataset
#product → Usage Dataset

Smart Detection Signals:

Channel name analysis
Member role detection
Historical query patterns
Admin preferences

🔧 Configuration

For Admins:

@Scoop map dataset "Enterprise CRM" to #enterprise-sales
@Scoop set default dataset for #analytics
@Scoop show channel mappings

Override When Needed:

In #sales channel:
@Scoop use marketing dataset for this query
@Scoop temporarily switch to support data
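
The mapping and override behavior above can be pictured as a lookup with an optional override. This is a minimal Python sketch, not Scoop's implementation; `CHANNEL_DATASETS` and `resolve_dataset` are hypothetical names, and the mapping simply mirrors the examples above.

```python
from typing import Dict, Optional

# Hypothetical mapping that mirrors the channel examples in this guide.
CHANNEL_DATASETS: Dict[str, str] = {
    "#sales-team": "CRM Dataset",
    "#marketing": "Campaign Dataset",
    "#support": "Ticket Dataset",
    "#finance": "Revenue Dataset",
    "#product": "Usage Dataset",
}

def resolve_dataset(channel: str, override: Optional[str] = None) -> Optional[str]:
    """An explicit per-query override wins; otherwise use the channel's default."""
    if override is not None:
        return override
    # Unmapped channels have no default, so the caller must ask the user to pick.
    return CHANNEL_DATASETS.get(channel)

print(resolve_dataset("#support"))                         # Ticket Dataset
print(resolve_dataset("#sales-team", "Campaign Dataset"))  # Campaign Dataset
print(resolve_dataset("#random"))                          # None
```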

Advanced Dataset Features

🔍 Dataset Discovery

You: Which dataset has customer churn data?

Scoop: 🔍 Found churn data in 3 datasets:

1. Customer Analytics (Primary)
   • churn_flag column
   • churn_date column
   • churn_reason field

2. Sales CRM
   • account_status (includes "Churned")
   • close_lost_reason

3. Support Tickets
   • cancellation_tickets table

Recommend using Customer Analytics for comprehensive analysis.

[Use Customer Analytics] [Compare All Three]

🔄 Dataset Relationships

You: Show relationships in this dataset

Scoop: 🔗 Analyzing data structure...

Key Relationships Found:
• Customer ID links to multiple orders
• Product categories have hierarchies
• Time-based patterns detected

Understanding these helps you:
- Ask more sophisticated questions
- Navigate data relationships
- Build better queries

[Explore Relationships] [View Schema]

📊 Dataset Profiling

@Scoop profile this dataset

Dataset Profile: Sales Pipeline
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Table Structure:
• 8 tables, 127 total columns
• Primary table: opportunities (45K rows)
• Most joined: accounts (92% of queries)

Data Quality:
• Completeness: 94%
• Missing: phone (45%), website (23%)
• Duplicates: 0.2% (78 records)

Usage Patterns:
• Top queries: Revenue, Pipeline, Forecast
• Peak usage: Mon 9-11 AM
• Power users: Sarah, Mike, Lisa

Optimization Suggestions:
• Add index on close_date
• Archive records > 2 years
• Standardize industry field
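
The data quality figures in a profile come down to two counts: how often a column is empty, and how many records repeat a key. A minimal Python sketch of both, using made-up records; the function names and the `id`, `phone`, and `website` fields are assumptions, not Scoop's profiler.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

def missing_rate(records: List[Record], column: str) -> float:
    """Fraction of records where `column` is absent, None, or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if not r.get(column))
    return missing / len(records)

def duplicate_count(records: List[Record], key: str) -> int:
    """Number of records whose `key` value was already seen earlier in the list."""
    seen = set()
    dupes = 0
    for r in records:
        value = r.get(key)
        if value in seen:
            dupes += 1
        else:
            seen.add(value)
    return dupes

records = [
    {"id": "1", "phone": "555-0100", "website": "a.test"},
    {"id": "2", "phone": None, "website": "b.test"},
    {"id": "3", "phone": None, "website": None},
    {"id": "3", "phone": "555-0101", "website": "c.test"},
]
print(missing_rate(records, "phone"))    # 0.5
print(missing_rate(records, "website"))  # 0.25
print(duplicate_count(records, "id"))    # 1
```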

🎨 Custom Views

Create Filtered Views:

You: Create a view of enterprise accounts only

Scoop: 🎨 Creating custom view...

View: "Enterprise Accounts"
Filters: 
- revenue > $1M
- employees > 500
- type = "Enterprise"

Results: 342 accounts (8% of total)

This view is now available as a virtual dataset.
[Use Enterprise View] [Edit Filters]
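
A filtered view like the one above is a predicate applied to every row. The Python sketch below assumes the three filters are combined with AND, which the example does not state explicitly; the `Account` class and sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Account:
    name: str
    revenue: float
    employees: int
    type: str

def is_enterprise(a: Account) -> bool:
    """All three filters from the example view must hold (assumed AND semantics)."""
    return a.revenue > 1_000_000 and a.employees > 500 and a.type == "Enterprise"

accounts: List[Account] = [
    Account("Acme", 2_500_000, 1200, "Enterprise"),
    Account("Globex", 1_500_000, 300, "Enterprise"),  # fails the employees filter
    Account("Initech", 400_000, 80, "SMB"),
]
view = [a for a in accounts if is_enterprise(a)]
share = len(view) / len(accounts)
print([a.name for a in view], f"{share:.0%}")  # ['Acme'] 33%
```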

Saved Calculations:

You: Add customer health score to this dataset

Scoop: 🧮 Adding calculated field...

Health Score = 
  (Usage × 0.4) + 
  (NPS × 0.3) + 
  (Support × 0.2) + 
  (Payment × 0.1)

✅ Field added and available for all queries
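
The health score above is a weighted sum whose weights add up to 1.0. A minimal Python sketch, assuming each component is already on a common 0-100 scale (the example does not say how the inputs are normalized); `WEIGHTS` and `health_score` are illustrative names.

```python
# Weights from the example formula; they sum to 1.0.
WEIGHTS = {"usage": 0.4, "nps": 0.3, "support": 0.2, "payment": 0.1}

def health_score(components: dict) -> float:
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

score = health_score({"usage": 80, "nps": 60, "support": 90, "payment": 100})
print(round(score, 1))  # 78.0
```

Because the weights sum to 1.0, the result stays on the same scale as the inputs, which makes scores comparable across customers.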

Dataset Best Practices

🎯 Choosing the Right Dataset

Match Dataset to Question:

Revenue questions → Financial dataset
Customer behavior → CRM dataset
Campaign performance → Marketing dataset
Product usage → Analytics dataset

Start Broad, Then Narrow:

1. Start with a comprehensive dataset
2. Explore the available fields
3. Create a filtered view if needed
4. Save the view as a personal dataset for reuse

🔍 Dataset Exploration

First Time with Dataset:

"describe this dataset"
"show me all tables"
"what are the key metrics?"
"show sample records"
"what questions can I answer?"

Understanding Relationships:

"how are tables connected?"
"show me the data model"
"what are the primary keys?"
"explain the relationships"

⚡ Performance Tips

Query Optimization:

Start with recent data
Use dataset-specific filters
Leverage pre-aggregated fields
Avoid SELECT * patterns
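
Starting with recent data, and capping how many rows you look at first, can be pictured as two small steps. This Python sketch uses generated rows; `recent` and `sample_rows` are hypothetical helpers, and the 30-day window and 1,000-row cap are example values only.

```python
import random
from datetime import date, timedelta
from typing import Dict, List

def recent(rows: List[Dict], today: date, days: int = 30) -> List[Dict]:
    """Keep rows dated within the last `days` days (inclusive of the cutoff day)."""
    cutoff = today - timedelta(days=days)
    return [r for r in rows if r["date"] >= cutoff]

def sample_rows(rows: List[Dict], n: int = 1000, seed: int = 0) -> List[Dict]:
    """Return at most `n` rows; a fixed seed keeps the sample reproducible."""
    if len(rows) <= n:
        return list(rows)
    return random.Random(seed).sample(rows, n)

today = date(2024, 12, 31)
# 9,000 synthetic rows spread evenly over the previous 90 days.
rows = [{"id": i, "date": today - timedelta(days=i % 90)} for i in range(9000)]
recent_rows = recent(rows, today)
subset = sample_rows(recent_rows, n=1000)
print(len(recent_rows), len(subset))  # 3100 1000
```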

Large Dataset Strategies:

"sample 1000 records first"
"analyze last 30 days only"
"use the summary table"
"what's the optimized query?"

Troubleshooting Datasets

🚨 Common Issues

"Dataset not found"

Solutions:
1. Check exact dataset name
2. Verify permissions
3. Confirm workspace access
4. Try partial name match
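
Step 4 can be pictured as a two-stage lookup: try a substring match first, then fall back to fuzzy matching for typos. The sketch below uses Python's standard `difflib` module; `suggest_datasets` and the dataset names are hypothetical, and Scoop's own matching may work differently.

```python
import difflib
from typing import List

def suggest_datasets(query: str, available: List[str], limit: int = 3) -> List[str]:
    """Suggest dataset names for a query that did not match exactly."""
    q = query.lower()
    # Stage 1: case-insensitive substring match.
    substring = [name for name in available if q in name.lower()]
    if substring:
        return substring[:limit]
    # Stage 2: fuzzy match to catch typos, best match first.
    lowered = {name.lower(): name for name in available}
    close = difflib.get_close_matches(q, list(lowered), n=limit, cutoff=0.6)
    return [lowered[c] for c in close]

available = ["Customer Analytics", "Sales Pipeline", "Marketing Leads"]
print(suggest_datasets("sales", available))          # ['Sales Pipeline']
print(suggest_datasets("Sales Pipline", available))  # typo still finds Sales Pipeline first
```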

"No data returned"

Debug steps:
@Scoop check dataset status
@Scoop when was last refresh?
@Scoop show sample data
@Scoop verify my filters

"Access denied"

Options:
• Request access from admin
• Check Slack group membership
• Verify organization settings
• Use personal dataset instead

🔧 Dataset Maintenance

For Admins:

@Scoop audit dataset usage
@Scoop clean unused datasets
@Scoop optimize slow datasets
@Scoop check sync health

For Users:

@Scoop report data issue
@Scoop suggest new dataset
@Scoop request refresh
@Scoop why is data missing?

Advanced Workflows

🔄 Dataset Comparison

You: Compare sales in CRM vs Finance dataset

Scoop: 📊 Comparing datasets...

CRM Sales Dataset:
• Total: $12.4M
• Deals: 1,234
• Period: Real-time

Finance Dataset:
• Total: $11.9M
• Invoices: 1,198
• Period: Month-end close

Variance: $500K (4%)
Missing in Finance: 36 deals (pending invoice)

[View Details] [Reconcile] [Export Differences]
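
The variance and missing-deal figures above follow directly from the two totals and counts. A minimal Python sketch of that arithmetic, assuming the percentage is taken relative to the CRM total (the example does not say which total is the base); `compare_totals` is a hypothetical helper.

```python
def compare_totals(crm_total: int, finance_total: int,
                   crm_count: int, finance_count: int) -> dict:
    """Difference between two dataset totals, as an amount, a percentage, and a record gap."""
    variance = crm_total - finance_total
    pct = variance / crm_total * 100  # assumed base: the CRM total
    return {
        "variance": variance,
        "variance_pct": round(pct, 1),
        "missing": crm_count - finance_count,
    }

result = compare_totals(12_400_000, 11_900_000, 1_234, 1_198)
print(result)  # {'variance': 500000, 'variance_pct': 4.0, 'missing': 36}
```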

🎭 Dataset Personas

Create Role-Based Views:

Executive View:
- High-level KPIs only
- Trends and forecasts
- Exception reporting

Analyst View:
- All detailed data
- Custom calculations
- Raw access

Sales View:
- Territory filtered
- Real-time pipeline
- Commission-relevant fields

🤖 Smart Dataset Selection

You: Analyze customer satisfaction

Scoop: 🤔 Found relevant data in multiple datasets:

1. Support Tickets (CSAT scores)
2. NPS Survey Results  
3. Product Usage (engagement)
4. CRM (renewal data)

Would you like to:
[Analyze Support CSAT] [Combine All Sources] [Compare Datasets]

Next Steps

Ready to become a dataset power user?

📤 Upload Your First File - Create personal datasets
📊 Master Visualizations - Beautiful charts from any dataset
🤖 ML on Datasets - Advanced analytics
🚀 Advanced Features - Deep reasoning capabilities

Pro tip: The right dataset makes all the difference. Spend 30 seconds choosing the correct dataset and save 30 minutes of analysis time! 🎯