Understanding Scoop Datasets

This section explains in more detail how Scoop turns raw reports into well-structured analytical datasets. When Scoop ingests a report, it attempts to determine as much information as it can from the structure of the report as well as from the data itself. It then creates a dataset definition that can hold the report as well as any future reports that are ingested. That definition includes a "fingerprint" of the report based on its contents. Because of this, Scoop can automatically do things that would require detailed configuration and setup with any other analytical tool set. The goal is to automatically handle most of the work a data analyst would otherwise have to do, eliminating the need for a deeply technical data team.
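To make the fingerprint idea concrete, here is a minimal sketch, assuming the report is a CSV file. The file names, the type inference, and the hashing scheme are illustrative assumptions, not Scoop's actual implementation; the point is simply that two reports with the same column names and types produce the same fingerprint and can therefore flow into the same dataset.

```python
import csv
import hashlib

def infer_type(value: str) -> str:
    """Very rough type inference for a single cell value (illustrative only)."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"

def report_fingerprint(path: str) -> str:
    """Hash the report's structure (column names plus inferred types) so that
    future reports with the same shape map to the same dataset definition."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        first_row = next(reader, [""] * len(header))
    signature = "|".join(
        f"{name.strip().lower()}:{infer_type(cell)}"
        for name, cell in zip(header, first_row)
    )
    return hashlib.sha256(signature.encode()).hexdigest()

# Hypothetical monthly exports of the same report: because their structure
# matches, both produce the same fingerprint and land in the same dataset.
print(report_fingerprint("pipeline_2024_01.csv"))
print(report_fingerprint("pipeline_2024_02.csv"))
```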

To understand this, we will first explain how Scoop analyzes a report and turns it into a dataset. We will then explain how Scoop intelligently manages that dataset over time. Next, we will explore what Scoop does when snapshotting data so that changes can be analyzed. Lastly, we will discuss how Scoop handles dates in data to allow maximum flexibility in analyzing data over time.

Additionally, Scoop can connect to databases to extract data directly from queries. In some cases this is required because no reporting layer has been built on top of that data. See Connecting to a Database for more detail.
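As a hedged illustration of that extraction path (the database file, table, and column names below are hypothetical, and the details of Scoop's connectors live in the Connecting to a Database section), a query result can be flattened into the same kind of tabular report that Scoop ingests from any other source:

```python
import csv
import sqlite3

# Illustrative only: run a query against a database and write the result
# as a flat report that can then be ingested like any other report.
conn = sqlite3.connect("crm.db")  # hypothetical database
cur = conn.execute("SELECT owner, stage, amount, close_date FROM opportunities")

with open("opportunities_report.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([col[0] for col in cur.description])  # column names as the header row
    writer.writerows(cur)                                  # query rows as the report body

conn.close()
```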