If you want more direct access to the raw analytics data that Treefort collects, you can set up an integration with Google BigQuery. Treefort then syncs your analytics directly to your BigQuery project, where you can run custom queries.
This guide assumes that you already have a BigQuery project set up. If you don't, please first follow the instructions found here.
You will be charged by Google based on the amount of data being stored by BigQuery. The cost will vary based on your users' usage. If you are concerned about the amount of data to be transferred, reach out to Treefort support and we can provide an estimate of roughly how much data will be sent per day.
Treefort will make best-effort attempts to retry analytics batches that fail to reach your BigQuery instance, but you are responsible for ensuring that the connection details are correct and that the Treefort service account has the necessary permissions. We will retry failed analytics batches for at least 3 days, but if your connection remains down for longer than that, analytics events will be lost. We cannot backfill analytics data that is lost due to misconfiguration.
Setup Instructions
Create a new dataset:
In the BigQuery explorer, expand your project, click "Datasets", then click the "+ Create dataset" button on the right.
Enter a valid dataset ID and hang on to it - you'll want it later.
Select the location of your choice to store data, but remember that location - you'll need it later.
You can accept the defaults and click "Create dataset".
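Since the dataset ID is needed again later, it can help to sanity-check it before typing it in. The sketch below assumes the commonly documented BigQuery naming rules (letters, digits, and underscores only, up to 1,024 characters). The function name is_plausible_dataset_id is hypothetical, and the BigQuery console remains the authority on what it accepts.

```python
import re

# Assumed rule set: ASCII letters, digits, and underscores, 1 to 1,024 characters.
_DATASET_ID_RE = re.compile(r"[A-Za-z0-9_]{1,1024}")


def is_plausible_dataset_id(dataset_id: str) -> bool:
    """Return True if dataset_id matches the assumed BigQuery naming rules."""
    return _DATASET_ID_RE.fullmatch(dataset_id) is not None


if __name__ == "__main__":
    # Hypothetical candidates: the dashed one is rejected by the assumed rules.
    for candidate in ("treefort_analytics", "treefort-analytics"):
        print(candidate, is_plausible_dataset_id(candidate))
```

Note that dashes, which are fine in service account and bucket names, are rejected here, so an ID such as treefort_analytics is a safer choice than treefort-analytics.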
Add a service account for Treefort:
Navigate to "IAM & Admin" -> "Service accounts"
Click "+ Create service account"
Give the account a descriptive name (such as treefort-analytics, but this can be whatever you like)
Click "Create and continue"
Add the following roles:
BigQuery Data Editor
BigQuery Job User
Storage Object User
Click "Continue".
Click "Done".
Generate a key for the service account: you should now be on the "Service accounts" page in "IAM & Admin" with a table of accounts and the newly-created account listed there.
Click the three-dot menu to the right of the service account
Click "Manage keys"
Click "Add key"
Click "Create new key"
Select "JSON"
The new key will be automatically downloaded to your computer. Note its location, as you'll need it later.
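Because the whole key file is pasted into the Treefort dashboard later, a quick local check that the download is intact may save a failed save. This is a minimal sketch that assumes the usual layout of a Google service account JSON key (a "type" of "service_account" plus "project_id", "private_key", and "client_email"). The function name check_service_account_key is hypothetical, and the check does not contact Google or verify permissions.

```python
import json

# Fields assumed to be present in a Google service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list[str]:
    """Return a list of problems found in the key text; empty means none found."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A key of another credential type (for example a user credential) will not work here.
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems
```

To use it, read the downloaded file as text and pass the contents to check_service_account_key; an empty list means the sketch found nothing wrong. Treat the file as a secret and avoid printing its contents.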
Create a Google Cloud Storage bucket. Treefort uploads analytics event batches to this bucket, and Google BigQuery ingests them from it.
Go to "Cloud Storage" -> "Buckets".
Click "Create".
Enter a suitable bucket name, and take note of it.
Click "Continue".
Select "Region" and choose the same region you selected for your BigQuery dataset earlier.
Click "Continue" to accept the default storage class of "Standard".
Click "Continue" to accept the default access control settings, which makes your bucket non-public.
Click "Create" to accept the default data protection policy.
Click "Confirm" if prompted to indicate that public access will be prevented.
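Bucket names follow different rules from dataset IDs, so a name that works for one may not work for the other. The sketch below checks a deliberately conservative subset of the Cloud Storage naming rules, as assumed here: 3 to 63 characters, lowercase letters, digits, dashes, and underscores, starting and ending with a letter or digit, with no "goog" prefix and no "google" substring. Dotted, domain-style names are not handled, and the function name is_plausible_bucket_name is hypothetical.

```python
import re

# Assumed subset of Cloud Storage naming rules: 3-63 characters, lowercase
# letters, digits, dashes, and underscores, starting and ending with a letter
# or digit. Dotted (domain-style) names are deliberately rejected by this sketch.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9_-]{1,61}[a-z0-9]")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if name passes the assumed subset of bucket naming rules."""
    if name.startswith("goog") or "google" in name:
        return False
    return _BUCKET_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidates: uppercase letters are rejected.
    for candidate in ("treefort-analytics-events", "Treefort-Analytics"):
        print(candidate, is_plausible_bucket_name(candidate))
```

Bucket names are also globally unique, which a local check like this cannot confirm; the console will report a conflict if the name is already taken.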
Go to the BigQuery service integration in your Treefort dashboard.
Enter your BigQuery Project ID (this is the top-level project that contains the dataset)
Enter the dataset ID that you created earlier
Enter the storage bucket name
Copy the entire contents of the JSON file you downloaded earlier and paste it into the JSON Private Key Credentials field
Click Save. This will validate the connection and immediately create the necessary tables in your BigQuery dataset.
Set up a Cloud Storage event-driven transfer to move data from Google Cloud Storage into Google BigQuery
Go to "BigQuery" -> "Data transfers".
Click "+ Create Transfer".
Select "Google Cloud Storage" from the "Source type" dropdown.
Give the transfer config a name, e.g. "treefort-analytics"
Select a desired transfer frequency. (This is how often BigQuery will check for new events in the Cloud Storage bucket; it does not control how often Treefort sends batches to the bucket. The Treefort frequency is not guaranteed, but batches are usually sent every 30 seconds to a few minutes.)
Under "Dataset", select the dataset you created earlier.
In "Destination table", enter analytics_event.
In "Cloud Storage URI":
Click "Select bucket"
Click the right-arrow chevron next to the bucket you created earlier
Enter * under "Filename"
Click "Select"
Check "Delete source files after transfer"
Change "File format" to "JSON"
If prompted, authenticate and grant permission to BigQuery Data Transfer Service to access your cloud storage bucket.
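Once the transfer has run, the pieces chosen during setup fit together as sketched below. This is an illustration only: the helper names and the sample values (my-project, treefort_analytics, my-treefort-events) are hypothetical, and because this guide does not describe the columns of analytics_event, the example query only counts rows.

```python
def transfer_source_uri(bucket: str, filename_pattern: str = "*") -> str:
    """Build the Cloud Storage URI that the transfer reads from."""
    return f"gs://{bucket}/{filename_pattern}"


def destination_table(project_id: str, dataset_id: str, table: str = "analytics_event") -> str:
    """Build the fully qualified table name for use in a query."""
    return f"{project_id}.{dataset_id}.{table}"


def row_count_query(project_id: str, dataset_id: str) -> str:
    """Build a query that counts the rows ingested so far."""
    return f"SELECT COUNT(*) AS event_count FROM `{destination_table(project_id, dataset_id)}`"


if __name__ == "__main__":
    # Hypothetical values; substitute the ones noted during setup.
    print(transfer_source_uri("my-treefort-events"))
    print(row_count_query("my-project", "treefort_analytics"))
```

The generated query can be pasted into the BigQuery console. A count that grows between transfer runs suggests that events are flowing from the bucket into the table.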
