Analyzing Data with Explorer


Once data has been ingested into Scoop, it can be analyzed and summarized to provide insights into the data itself. The tool for doing this is Explorer, the second icon in the main toolbar. Explorer has two goals:

  1. Allow you to do ad hoc exploration of your dataset
  2. Allow you to create highly formatted visual summaries of slices and cuts of your data in either charts or tables that can be placed on your canvases

When using Explorer, it is important to understand how it summarizes dataset data. To create a summary, Explorer has to do two things:

  1. Allow you to specify which slice of data interests you: which metrics you care about, how they are aggregated and filtered, and which attributes you want to group them by. Scoop has a deep understanding of time and how it relates to your data, and the extra intelligence it applies to dates embedded in your data makes time-based analysis easy. See Visualizing Time Series Data to learn more about creating data summaries over time. Sometimes, however, you want to summarize a defined dataset not by date/time but by something else, and Scoop gives you other options for that case. See Visualizing Non-time-series Data for more.
  2. Allow you to control how that data summary is visualized: chart vs. table, colors, fonts, chart types, table header formats, etc. These options let you create compelling visuals to display the data summary that you have specified.

Essentially, every data summary is composed of the following:

  • Source metrics: These are the numbers in your dataset. Scoop detects numeric columns in your datasets and allows you to summarize them by aggregating them. Scoop attempts to intelligently assess whether a column in your data should be summed, counted, averaged, etc. It does this especially well when your source reports include aggregated totals, because it can see how you like to aggregate them. If the defaults are not what you need, you can create a KPI on that source column to change the aggregation. See Creating KPIs for more.
  • Attribute columns: These are columns by which you generally want to group your data. They are typically Strings (textual data) or Dates.
  • Key Performance Indicators (KPIs): These elements allow you to precisely control how a data column is aggregated, and even to create formulas that combine simpler KPIs into more complex aggregations. See Creating KPIs for more detail.
  • Filters: When you are looking at data from a dataset, you often want to include only certain parts of it or exclude others. You can create filters on your queries to narrow down the data you are summarizing. You can also save those filters for later use as a reusable definition of a set within your data.
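
To make these four pieces concrete, here is a minimal conceptual sketch in plain Python. It is not Scoop's API or implementation: the `summarize` function, the sample rows, and the column names (`region`, `status`, `amount`) are all hypothetical and only illustrate how a metric, an attribute, an aggregation, and a filter combine into one summary, and how a KPI formula can combine simpler aggregations.

```python
from collections import defaultdict

# Hypothetical rows standing in for an ingested dataset.
rows = [
    {"region": "East", "status": "Won", "amount": 100.0},
    {"region": "East", "status": "Lost", "amount": 50.0},
    {"region": "West", "status": "Won", "amount": 200.0},
    {"region": "West", "status": "Won", "amount": 300.0},
]


def summarize(rows, metric, attribute, aggregate, row_filter=None):
    """Group rows by an attribute column and aggregate a metric column.

    row_filter is an optional predicate; rows it rejects are excluded.
    """
    groups = defaultdict(list)
    for row in rows:
        if row_filter is None or row_filter(row):
            groups[row[attribute]].append(row[metric])
    return {key: aggregate(values) for key, values in groups.items()}


def average(values):
    return sum(values) / len(values)


def won_only(row):
    # A reusable filter: keep only rows whose status is "Won".
    return row["status"] == "Won"


# Same metric and attribute, two different aggregations, one shared filter.
total_won = summarize(rows, "amount", "region", sum, won_only)
avg_won = summarize(rows, "amount", "region", average, won_only)

# A KPI-style formula that combines two simpler aggregations (counts).
count_all = summarize(rows, "amount", "region", len)
count_won = summarize(rows, "amount", "region", len, won_only)
win_rate = {region: count_won.get(region, 0) / count_all[region]
            for region in count_all}

print(total_won)
print(avg_won)
print(win_rate)
```

Changing only the `aggregate` argument corresponds to changing how a source metric is rolled up, while the `win_rate` line shows the kind of ratio that a combined KPI would express.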

The process of creating a data summary differs slightly depending on whether the summary is over time or does not include time. See Visualizing Data by Time and Visualizing Data not by Time to see how each works. When analyzing by time, some powerful additional capabilities are available, but the way you choose how to group data is slightly different because the data is a time series.
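
As a rough illustration of that difference, the sketch below sums the same metric two ways: once grouped by a time period derived from a date column, and once grouped by an ordinary attribute. The `sum_by` helper, the sample orders, and the column names are hypothetical, and this is plain Python rather than anything Scoop exposes.

```python
from collections import defaultdict
from datetime import date

# Hypothetical dataset with one date column and one textual attribute.
orders = [
    {"order_date": date(2024, 1, 15), "product": "A", "amount": 10.0},
    {"order_date": date(2024, 1, 28), "product": "B", "amount": 20.0},
    {"order_date": date(2024, 2, 3), "product": "A", "amount": 30.0},
]


def sum_by(rows, key_func, metric):
    """Sum a metric column, grouping rows by whatever key_func returns."""
    totals = defaultdict(float)
    for row in rows:
        totals[key_func(row)] += row[metric]
    return dict(totals)


# Time-based summary: the grouping key is a period (year-month) derived
# from the date, not the raw date value itself.
by_month = sum_by(orders, lambda r: r["order_date"].strftime("%Y-%m"), "amount")

# Non-time summary: the grouping key is simply an attribute column.
by_product = sum_by(orders, lambda r: r["product"], "amount")

print(by_month)
print(by_product)
```

The only thing that changes between the two calls is the grouping key, which is why time-based summaries need a choice of period (day, month, quarter, and so on) that non-time summaries do not.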