NEW: Managing your data sources

Overview


To ensure the stability and performance of your Klipfolio dashboards, there is a 10 MB limit on Klipfolio data sources. A Klipfolio data source is a subset of data for one or more metrics from a service like Google Analytics that is used in a Klip or multiple Klips. Surpassing the limit results in data sources becoming unavailable. There are steps you can take to configure your data source to be less than 10 MB and avoid disrupting your dashboard.


Klipfolio does not specify limits for the number of rows and columns in Excel or CSV files or for the number of arrays and objects in XML and JSON data sources. The overall size and memory required for the data source is what is important.


If your data source is increasing in size over time, it is advisable to proactively take action to maintain your data source below the 10 MB limit. At a high level, some of the most common things that impact the size of your data sources include: number of metrics included (are all the metrics required for the associated Klip(s)?), time-frame (are all time periods required for this source at the same time or can they be divided into historical versus current or by time period). Below, we provide some options for reducing or segmenting your data source depending on the data source type.


Query-based data sources

Query-based data sources may be static or dynamic. A static data source is based on a static query with hard-coded values that retrieve data concurrently. For example, you can build a static query that will retrieve all of your sales data at once. A dynamic data source, on the other hand, uses one or more variables in its query instead of hard-coded values to enable the dynamic retrieval of data based on changing variable values - each value in the variable creates a new instance of the data source that was filtered by the variable, which results in smaller data source instance sizes than a similar unbounded static query.


Both static and dynamic data sources can be reduced by removing query dimensions, removing metrics, applying a filter, or changing the granularity of the data (for example, from daily to monthly data). These steps will essentially reduce the number of rows or columns which in turn decreases the overall size of the data source.


If you do need all the data, but not all at once, a dynamic data source may be a good option. If you have a static data source that is growing too large, you could replace it with a dynamic data source. To achieve this, you will need to create a new data source that is configured as dynamic, and update your formulas to replace the static data source references with references to the new dynamic data source. Examples of manipulating both static and dynamic data sources are provided in this article.


Query-based data sources include:

  • Adobe SiteCatalyst

  • comScore

  • Facebook*

  • Google Adwords (Custom query mode)*

  • Google Analytics*

  • Marketo

  • Radian6

  • Salesforce

  • Searchmetrics

  • SQL*

  • Web Accessible Resource*

  • XMLA


*Support dynamic data source queries



  1. Query-based static data sources

If you are using query-based data sources, you can apply filters to your query to reduce the size of the data set returned. If the data source exceeds the 10 MB limit, create multiple data sources grouped by a supported filter (for example, a time period filter may be applied).


A static data source query without a filter applied may look like the following query. In your query you could change the granularity of the data in the data source (for example, changing from daily to monthly aggregation of the data).


In this example, a large amount of data will be returned with this static query:


https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date=365daysAgo&end-date=today


To reduce the size of the data source, we created a static data source query with a filter applied to limit the amount of data returned. In this case, we are creating one data source per month. Each one of the following queries is a separate data source:


https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date={date.startOfMonth.addMonths(-12).format()}&end-date={date.endOfMonth.addMonths(-12).format()}


https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date={date.startOfMonth.addMonths(-11).format()} &end-date={date.endOfMonth.addMonths(-11).format()}



https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date={date.startOfMonth.addMonths(-1).format()} &end-date={date.endOfMonth.addMonths(-1).format()}



https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date={date.startOfMonth.addMonths().format()}&end-date=today


  1. Query-based dynamic data sources

Variables are used to facilitate reducing the size of your dynamic data sources instead of hard-coded filter parameters. This results in one data source definition that will generate multiple instances based on the values provided via the variable. For example, rather than one large data source that needs filtering using formulas in your Klips, include a drop down user input control indicating the various time periods important to your Klips. The value is used to populate the variable in the dynamic data source query definition and will create an individual instance for the relevant time period. Each instance can be significantly smaller than one large data source.


In the Query URI, a variable is used in place of a hard-coded value, using the format {props.varname}.


For example, in this query:


https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,ga:country&metrics=ga:users,ga:sessions&start-date=date.startOfYear&end-date={date.today}


replace ga:country with {props:country} and date.startOfYear with {props.vardate}


https://www.googleapis.com/analytics/v3/data/ga?ids=ga:12345&dimensions=ga:isMobile,{props.country}&metrics=ga:users,ga:sessions&start-date={props.vardate}&end-date={date.today}



About uploadable data sources


If you are using XLS or CSV files, you will need to segment the files if they are near to or exceed the 10 MB limit. See below for instructions on how to segment your data source file.


Uploadable data sources supported by Klipfolio include:

  • Upload a File

  • Google Drive

  • Email Attachment

  • Dropbox

  • Box

  • FTP/SFTP


Segmenting CSV or XLS data sources by column


In this example, we show you how to segment a single data source (shown below) into two files using the LOOKUP function to combine the data.


A table with three columns - client ID, hours and category - was created using a single data source called Sample Data:




A table with the same three columns, client ID, hours and category, is created using two data sources, Sample Data 1 and Sample Data 2:



Divide the original data source into two or more (as needed) data sources, all of which contain the same key or index column. In this example, the client ID column exists in both data sources.


The data you need from Sample Data 1 is selected the same way as in the single data source example. To select data from Sample Data 2, you need to ensure it is aligned correctly with the data in Sample Data 1. To align the data, use the LOOKUP function as shown below:


LOOKUP(SLICE(A:A), SLICE(A:A), SLICE(C:C))


where A:A is selected from  Sample Data 1

and A:A and C:C are selected from Sample Data 2


For further instructions on using the LOOKUP and SLICE functions, see the following articles:


LOOKUP function

SLICE function


Segmenting multiple data sources by rows


In this example, we show you how to segment a single data source (shown below) into two files using the ARRAY function to combine the data.


A table with three columns - client ID, hours and category - created using a single data source, Sample Data:




A table with the same three columns - client ID, hours & category - is created using two data sources, Sample Data 1 and Sample Data 2:


Divide the original data source into two or more (as needed) data sources by rows. A recommended approach is to divide into logical groupings. In this example, the data sources are divided by month (in column B).



Use the ARRAY function to combine your data sources: select the data required from one data source, Sample Data 1, and wrap the ARRAY function around it. Then add data from the next data source, Sample Data 2.


ARRAY(SLICE(A:A), SLICE(A:A))


where A:A is selected from  Sample Data 1

and A:A is selected from Sample Data 2


For further instructions on using the SLICE and ARRAY functions, see the following articles:


SLICE function

ARRAY function


Spreadsheets with multiple sheets or workbooks


Spreadsheets with multiple worksheets can be reduced in size by converting the worksheets into individual files.  


Important concepts





Feedback and Knowledge Base