Working with Datasets in Scoop for Slack

Dataset Mastery: Your Gateway to Insights

Master the art of dataset management to unlock the full power of Scoop's analytics capabilities.

Understanding the Dataset Ecosystem

🌐 Three Types of Datasets

1. Organization Datasets 🏢

  • Company-wide data sources
  • Live connections to business systems
  • Automatic refresh schedules
  • Shared across teams
  • Examples: CRM, ERP, Marketing platforms

2. Personal Datasets 👤

  • Your uploaded files
  • Private by default
  • Full control over sharing
  • Perfect for ad-hoc analysis
  • Examples: Excel reports, CSV exports

3. Channel Datasets 📣

  • Auto-mapped to specific channels
  • Context-aware selection
  • Team-aligned data
  • Admin configured
  • Examples: Sales data in #sales

![Screenshot: Dataset selector showing different dataset types]

Navigating Datasets

🎯 Quick Selection Commands

See Available Datasets

@Scoop show datasets
@Scoop list all data sources
@Scoop what data can I analyze?

Switch Datasets

@Scoop use sales dataset
@Scoop switch to marketing data
@Scoop change to customer analytics

Check Current Dataset

@Scoop current dataset
@Scoop what am I analyzing?
@Scoop status

![Screenshot: Dataset selection dropdown interface]

📊 Understanding Dataset Cards

Each dataset displays rich metadata:

📊 Customer Analytics Dataset
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Type: 🏢 Organization Dataset
Source: Salesforce + Support System
Records: 45,832 customers
Updated: 2 hours ago (Live sync)
Quality: 98% complete

Key Metrics:
• Total Revenue: $45.2M
• Active Customers: 3,421
• Avg Customer Value: $13,200
• Churn Rate: 12%

Top Tables:
• accounts (customer master)
• opportunities (sales pipeline)
• cases (support tickets)
• activities (engagement log)

[📥 Use This Dataset] [ℹ️ More Info]

Organization Datasets Deep Dive

🔗 Connected Systems

CRM Platforms

  • Salesforce: Accounts, Opportunities, Leads
  • HubSpot: Contacts, Deals, Activities
  • Pipedrive: Deals, Organizations, People
  • Microsoft Dynamics: Customers, Sales

Support Systems

  • Zendesk: Tickets, Satisfaction, Agents
  • Intercom: Conversations, Users, Tags
  • Freshdesk: Tickets, Contacts, Groups

Marketing Tools

  • Google Analytics: Traffic, Conversions
  • Marketo: Campaigns, Leads, Programs
  • Mailchimp: Campaigns, Subscribers

Financial Systems

  • QuickBooks: Invoices, Customers
  • Stripe: Payments, Subscriptions
  • NetSuite: Transactions, Accounts

🔄 Data Freshness

Dataset: Sales Pipeline
Last Sync: 10 minutes ago
Next Sync: In 20 minutes
Sync Status: ✅ Healthy

Recent Changes:
• 12 new opportunities
• 34 updated stages
• 5 closed deals

[🔄 Refresh Now] [⚙️ Sync Settings]

🔐 Permission Model

Access Levels:

  • Full Access: All data, no restrictions
  • Department: Your team's data only
  • Role-Based: Based on Slack groups
  • Custom: Admin-defined rules

Security Features:

  • Row-level security
  • Column masking for PII
  • Audit trail of access
  • Compliance controls
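
As a rough illustration of what row-level security and column masking mean in practice, the sketch below filters rows to a single department and masks PII columns. It is not Scoop's implementation: the function `apply_access`, the `department` field, the `***` mask, and the sample rows are all hypothetical.

```python
from typing import Dict, Iterable, List, Set

MASK = "***"  # placeholder shown instead of a masked value (assumption)


def apply_access(
    rows: Iterable[Dict[str, object]],
    allowed_department: str,
    pii_columns: Set[str],
) -> List[Dict[str, object]]:
    """Keep only the caller's department rows and mask PII columns."""
    visible = []
    for row in rows:
        # Row-level security: drop rows outside the caller's department.
        if row.get("department") != allowed_department:
            continue
        # Column masking: replace PII values, keep everything else as-is.
        visible.append(
            {col: (MASK if col in pii_columns else val) for col, val in row.items()}
        )
    return visible


# Made-up sample rows, for illustration only.
rows = [
    {"department": "sales", "account": "Acme", "email": "a@example.com"},
    {"department": "support", "account": "Globex", "email": "g@example.com"},
]
result = apply_access(rows, "sales", {"email"})
```

With the sample rows, a caller limited to the sales department sees only the Acme row, and its email value is replaced by the mask.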

Personal Dataset Management

📤 Creating Personal Datasets

From File Upload:

You: [Uploading quarterly_review.xlsx]

Scoop: 📊 Creating personal dataset...
✅ "Q4 Review Data" ready for analysis

This dataset includes:
• 15,420 records
• 12 analysis-ready columns
• Date range: Oct-Dec 2024

What would you like to explore?

From Analysis Results:

You: Save this filtered view as a dataset

Scoop: 💾 Saved as "High-Value Customers"
This personal dataset contains:
• 342 customers
• Filtered: LTV > $50,000
• All original columns preserved

[📊 Switch to New Dataset] [🔙 Keep Current]

![Screenshot: Personal dataset created from uploaded file]

🗂️ Organizing Personal Datasets

Naming Best Practices:

✅ Good Names:
• "2024_Q4_Sales_Analysis"
• "Customer_Segmentation_Dec"
• "Marketing_Campaign_Results"

❌ Avoid:
• "data"
• "test"
• "final_final_v2"
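
If you want to check names before uploading a batch of files, the guidance above can be approximated with a small helper. This is a minimal sketch under stated assumptions: the rules (a short list of vague names and a "repeated final" pattern) are inferred from the examples above and are not rules that Scoop enforces, and `is_descriptive_name` is a hypothetical function.

```python
import re

# Hypothetical heuristics derived from the "Avoid" examples above.
VAGUE_NAMES = {"data", "test"}
REPEATED_FINAL = re.compile(r"(final[_\- ]?){2,}", re.IGNORECASE)


def is_descriptive_name(name: str) -> bool:
    """Return False for names that match the 'Avoid' patterns."""
    cleaned = name.strip()
    if cleaned.lower() in VAGUE_NAMES:
        return False  # a single generic word says nothing about the content
    if REPEATED_FINAL.search(cleaned):
        return False  # "final_final..." signals version noise, not content
    return True
```

For example, `is_descriptive_name("2024_Q4_Sales_Analysis")` passes, while `is_descriptive_name("final_final_v2")` does not.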

Dataset Actions:

@Scoop rename dataset to "Executive Dashboard Data"
@Scoop add description "Monthly KPIs for board meeting"
@Scoop tag dataset with #finance #monthly
@Scoop delete old datasets

🔄 Dataset Lifecycle

Personal Dataset: Marketing Leads
Created: Dec 1, 2024
Last Used: Dec 15, 2024
Size: 2.4 MB

⚠️ This dataset hasn't been used in 14 days

Options:
[📊 Use Dataset] [🔄 Update Data] [🗑️ Delete] [📤 Share]
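
The staleness warning in the card above comes down to simple date arithmetic. The sketch below assumes a 14-day threshold, taken from the example warning rather than from any documented Scoop setting; `days_unused` and `is_stale` are hypothetical helper names.

```python
from datetime import date


def days_unused(last_used: date, today: date) -> int:
    """Whole days elapsed since the dataset was last used."""
    return (today - last_used).days


def is_stale(last_used: date, today: date, threshold_days: int = 14) -> bool:
    """True once the dataset has gone unused for at least threshold_days."""
    return days_unused(last_used, today) >= threshold_days


# The card shows "Last Used: Dec 15, 2024"; 14 days later is Dec 29, 2024.
stale_on_dec_29 = is_stale(date(2024, 12, 15), date(2024, 12, 29))
stale_on_dec_20 = is_stale(date(2024, 12, 15), date(2024, 12, 20))
```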

Channel-Mapped Datasets

🎯 Automatic Context

How Mapping Works:

#sales-team → CRM Dataset
#marketing → Campaign Dataset
#support → Ticket Dataset
#finance → Revenue Dataset
#product → Usage Dataset

Smart Detection:

  • Channel name analysis
  • Member role detection
  • Historical query patterns
  • Admin preferences

🔧 Configuration

For Admins:

@Scoop map dataset "Enterprise CRM" to #enterprise-sales
@Scoop set default dataset for #analytics
@Scoop show channel mappings

Override When Needed:

In #sales channel:
@Scoop use marketing dataset for this query
@Scoop temporarily switch to support data
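
One way to picture channel mapping with per-query overrides is a lookup table with an optional override argument. This is only a sketch: the dictionary mirrors the example mappings above, and `resolve_dataset` is a hypothetical function, not part of Scoop.

```python
from typing import Optional

# Hypothetical channel-to-dataset table, mirroring the examples above.
CHANNEL_DEFAULTS = {
    "#sales-team": "CRM Dataset",
    "#marketing": "Campaign Dataset",
    "#support": "Ticket Dataset",
    "#finance": "Revenue Dataset",
    "#product": "Usage Dataset",
}


def resolve_dataset(channel: str, override: Optional[str] = None) -> Optional[str]:
    """Pick the dataset for a query: explicit override first, then channel default."""
    if override:
        return override  # "use marketing dataset for this query"
    # Unmapped channels return None, meaning the user has to choose a dataset.
    return CHANNEL_DEFAULTS.get(channel)
```

An override applies to a single lookup only; the channel's default mapping is left untouched for the next query.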

Advanced Dataset Features

🔍 Dataset Discovery

You: Which dataset has customer churn data?

Scoop: 🔍 Found churn data in 3 datasets:

1. Customer Analytics (Primary)
   • churn_flag column
   • churn_date column
   • churn_reason field

2. Sales CRM
   • account_status (includes "Churned")
   • close_lost_reason

3. Support Tickets
   • cancellation_tickets table

Recommend using Customer Analytics for comprehensive analysis.

[Use Customer Analytics] [Compare All Three]

🔄 Dataset Relationships

You: Show relationships in this dataset

Scoop: 🔗 Analyzing data structure...

Key Relationships Found:
• Customer ID links to multiple orders
• Product categories have hierarchies
• Time-based patterns detected

Understanding these helps you:
- Ask more sophisticated questions
- Navigate data relationships
- Build better queries

[Explore Relationships] [View Schema]

📊 Dataset Profiling

@Scoop profile this dataset

Dataset Profile: Sales Pipeline
━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Table Structure:
• 8 tables, 127 total columns
• Primary table: opportunities (45K rows)
• Most joined: accounts (92% queries)

Data Quality:
• Completeness: 94%
• Missing: phone (45%), website (23%)
• Duplicates: 0.2% (78 records)

Usage Patterns:
• Top queries: Revenue, Pipeline, Forecast
• Peak usage: Mon 9-11 AM
• Power users: Sarah, Mike, Lisa

Optimization Suggestions:
• Add index on close_date
• Archive records > 2 years
• Standardize industry field
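
To make the Data Quality figures in the profile concrete, here is a minimal sketch of how a missing-value share and a duplicate count could be computed over rows held in memory. How Scoop actually derives these numbers is not documented here; the helper names and sample rows are made up for illustration.

```python
from typing import Dict, List, Optional, Sequence


def missing_share(rows: List[Dict[str, Optional[str]]], column: str) -> float:
    """Fraction of rows where `column` is absent, None, or an empty string."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return missing / len(rows)


def duplicate_count(rows: List[Dict[str, Optional[str]]], key_columns: Sequence[str]) -> int:
    """Number of rows that repeat an earlier row on the given key columns."""
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(col) for col in key_columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


# Made-up sample rows, for illustration only.
sample = [
    {"name": "Acme", "phone": "555-0100", "website": "acme.example"},
    {"name": "Globex", "phone": "", "website": "globex.example"},
    {"name": "Initech", "phone": None, "website": None},
    {"name": "Acme", "phone": "555-0100", "website": "acme.example"},
]
phone_missing = missing_share(sample, "phone")
name_duplicates = duplicate_count(sample, ["name"])
```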

🎨 Custom Views

Create Filtered Views:

You: Create a view of enterprise accounts only

Scoop: 🎨 Creating custom view...

View: "Enterprise Accounts"
Filters: 
- revenue > $1M
- employees > 500
- type = "Enterprise"

Results: 342 accounts (8% of total)

This view is now available as a virtual dataset.
[Use Enterprise View] [Edit Filters]
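
Conceptually, the view above is a filter over the account rows. The sketch below assumes the three filters are combined with AND, which the example does not state explicitly; the field names follow the filter list, while `enterprise_view` and the sample accounts are hypothetical.

```python
from typing import Dict, List


def enterprise_view(accounts: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return accounts that satisfy all three filters from the example view."""
    return [
        account
        for account in accounts
        if account["revenue"] > 1_000_000      # revenue > $1M
        and account["employees"] > 500         # employees > 500
        and account["type"] == "Enterprise"    # type = "Enterprise"
    ]


# Made-up accounts, for illustration only.
accounts = [
    {"name": "Acme", "revenue": 5_000_000, "employees": 1200, "type": "Enterprise"},
    {"name": "Globex", "revenue": 2_000_000, "employees": 300, "type": "Enterprise"},
    {"name": "Initech", "revenue": 800_000, "employees": 900, "type": "Mid-Market"},
]
view = enterprise_view(accounts)
```

Only the first sample account passes all three conditions; the other two each fail at least one filter.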

Saved Calculations:

You: Add customer health score to this dataset

Scoop: 🧮 Adding calculated field...

Health Score = 
  (Usage × 0.4) +
  (NPS × 0.3) +
  (Support × 0.2) +
  (Payment × 0.1)

✅ Field added and available for all queries
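
The health score above is a weighted sum whose weights add up to 1.0. A minimal sketch of the calculation follows, assuming each component has already been normalized to the same 0-100 scale (the example does not say how the components are scaled); `health_score` is a hypothetical helper.

```python
from typing import Dict

# Weights taken from the formula above; they sum to 1.0.
WEIGHTS = {"usage": 0.4, "nps": 0.3, "support": 0.2, "payment": 0.1}


def health_score(components: Dict[str, float]) -> float:
    """Weighted sum of component scores (assumed to share a 0-100 scale)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Made-up component scores: 32 + 18 + 18 + 10.
score = health_score({"usage": 80, "nps": 60, "support": 90, "payment": 100})
```

Because the weights sum to 1.0, the result stays on the same scale as the inputs, so a customer scoring 100 on every component gets a health score of 100.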

Dataset Best Practices

🎯 Choosing the Right Dataset

Match Dataset to Question:

Revenue questions → Financial dataset
Customer behavior → CRM dataset
Campaign performance → Marketing dataset
Product usage → Analytics dataset

Start Broad, Then Narrow:

1. Use comprehensive dataset
2. Explore available fields
3. Create filtered view if needed
4. Save as personal dataset for reuse

🔍 Dataset Exploration

First Time with Dataset:

"describe this dataset"
"show me all tables"
"what are the key metrics?"
"show sample records"
"what questions can I answer?"

Understanding Relationships:

"how are tables connected?"
"show me the data model"
"what are the primary keys?"
"explain the relationships"

⚡ Performance Tips

Query Optimization:

  • Start with recent data
  • Use dataset-specific filters
  • Leverage pre-aggregated fields
  • Avoid SELECT * patterns

Large Dataset Strategies:

"sample 1000 records first"
"analyze last 30 days only"
"use the summary table"
"what's the optimized query?"

Troubleshooting Datasets

🚨 Common Issues

"Dataset not found"

Solutions:
1. Check exact dataset name
2. Verify permissions
3. Confirm workspace access
4. Try partial name match

"No data returned"

Debug steps:
@Scoop check dataset status
@Scoop when was last refresh?
@Scoop show sample data
@Scoop verify my filters

"Access denied"

Options:
• Request access from admin
• Check Slack group membership
• Verify organization settings
• Use personal dataset instead

🔧 Dataset Maintenance

For Admins:

@Scoop audit dataset usage
@Scoop clean unused datasets
@Scoop optimize slow datasets
@Scoop check sync health

For Users:

@Scoop report data issue
@Scoop suggest new dataset
@Scoop request refresh
@Scoop why is data missing?

Advanced Workflows

🔄 Dataset Comparison

You: Compare sales in CRM vs Finance dataset

Scoop: 📊 Comparing datasets...

CRM Sales Dataset:
• Total: $12.4M
• Deals: 1,234
• Period: Real-time

Finance Dataset:
• Total: $11.9M
• Invoices: 1,198
• Period: Month-end close

Variance: $500K (4%)
Missing in Finance: 36 deals (pending invoice)

[View Details] [Reconcile] [Export Differences]
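
The figures in the comparison above can be reproduced with a few lines of arithmetic. This sketch assumes the percentage is taken relative to the CRM total (the example does not say which total is the base) and uses hypothetical deal IDs to show how the unmatched deals could be listed.

```python
from typing import Iterable, Set, Tuple


def compare_totals(crm_total: float, finance_total: float) -> Tuple[float, float]:
    """Return (absolute variance, variance as a percentage of the CRM total)."""
    variance = crm_total - finance_total
    variance_pct = variance / crm_total * 100
    return variance, variance_pct


def missing_in_finance(crm_deal_ids: Iterable[str], invoiced_deal_ids: Iterable[str]) -> Set[str]:
    """Deals present in the CRM that have no matching invoice yet."""
    return set(crm_deal_ids) - set(invoiced_deal_ids)


# Totals and counts from the example above.
variance, variance_pct = compare_totals(12_400_000, 11_900_000)
missing_count = 1_234 - 1_198

# Hypothetical deal IDs, for illustration only.
pending = missing_in_finance({"D1", "D2", "D3"}, {"D1", "D3"})
```

The variance works out to $500,000, or roughly 4% of the CRM total, and the count difference matches the 36 deals reported as pending invoice.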

🎭 Dataset Personas

Create Role-Based Views:

Executive View:
- High-level KPIs only
- Trends and forecasts
- Exception reporting

Analyst View:
- All detailed data
- Custom calculations
- Raw access

Sales View:
- Territory filtered
- Real-time pipeline
- Commission-relevant data

🤖 Smart Dataset Selection

You: Analyze customer satisfaction

Scoop: 🤔 Found relevant data in multiple datasets:

1. Support Tickets (CSAT scores)
2. NPS Survey Results  
3. Product Usage (engagement)
4. CRM (renewal data)

Would you like to:
[Analyze Support CSAT] [Combine All Sources] [Compare Datasets]

Next Steps

Ready to become a dataset power user?

  1. 📤 Upload Your First File - Create personal datasets
  2. 📊 Master Visualizations - Beautiful charts from any dataset
  3. 🤖 ML on Datasets - Advanced analytics
  4. 🚀 Advanced Features - Deep reasoning capabilities

Pro tip: The right dataset makes all the difference. Spend 30 seconds choosing the correct dataset and save 30 minutes of analysis time! šŸŽÆ