CoolaData provides an integration option that enables you to upload the content of any CSV or JSON file from a Google Cloud Storage bucket into your CoolaData project. This type of integration is typically used to enrich your CoolaData events with your own data.
CoolaData creates a dedicated bucket for you in Google Cloud Storage and then, every hour, automatically uploads the file that matches the file name and file type that you specify.
To upload content into CoolaData from a Google Cloud Storage Bucket –
- Contact your CoolaData customer success representative or write to support@CoolaData.com. Ask them to create a Google Cloud Storage bucket for you into which you can dump your CSV and JSON files to be integrated into CoolaData.
- Upload your files into this Google Cloud Storage bucket, for example, by using the UPLOAD FILES button in the Google Cloud Platform user interface. You can also use the gsutil tool to automate this process. For some useful commands, see https://cloud.google.com/storage/docs/gsutil
- Follow the instructions in Integrating with Predefined Data Sources and then select Google Cloud Storage. The following displays –
- Fill in the following –
- Integration Name – The name of this integration.
- Load data only once – Check this option to specify that CoolaData only loads the data from your CoolaData bucket once. Otherwise, data is loaded every hour, if available.
- Upload Data Format – Select CSV or JSON to specify the format of the files to be uploaded from your CoolaData bucket. JSON files must be flat (not nested) and newline-delimited.
- File Name – The name of the file to be uploaded. Make sure to include the file extension, such as .csv or .json.
- Use File Name as Table Name – Check this box to specify that the name of the table that is created in CoolaData is the same as the File Name (described above). This is the table name to be used in the queries that you will perform on the uploaded data. If you choose this option, then make sure that the file name is a valid table name.
- Table Name – If you did not select the option above, then enter the name of the table to be created in CoolaData to contain the data that is uploaded. Like all table names, it is case sensitive and cannot include spaces or special characters.
- Append Date to the Table Name – Appends the date when the table is created to the table name. A new table partition is created for each date. Its format is TableName_YYYYMMDD. Selecting this feature enables you to use Google’s BigQuery Data Partitioning feature. Contact your CoolaData customer success representative to learn more about this feature.
- File Scheme – Define the schema of the table to be uploaded by defining the name and data type of each column. Click the Add + button to add each new column. The following data types are supported – string, integer, float, boolean and timestamp. For example, name:STRING, id:INTEGER, birthdate:TIMESTAMP.
- In the Insert Strategy field, select either –
- Append – New data is added to the table each time data is uploaded.
– OR –
- Replace – The table is overwritten each time the data is uploaded.
- In the Emails to Notify If the Failure field, enter the email addresses to which integration upload and failure notifications are sent.
- [Optional] Use the Google Project ID and Google Dataset fields to have CoolaData upload data from your CoolaData bucket into your own Google project instead of into your CoolaData project. Contact your CoolaData customer success representative for more information.
- Click the Save button. Each hour CoolaData will then integrate the files that are dropped into this bucket into your project. The first integration process should take place within a few minutes.
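The constraints described in the steps above can be checked locally before a file is dropped into the bucket. The following is a minimal sketch, not CoolaData code: the function names and the bucket name are hypothetical, the table-name rule assumes that "no spaces or special characters" means letters, digits and underscores only, and "flat" is assumed to mean that no value is an object or an array. The only external command referenced is `gsutil cp`, and it is only built here, not executed.

```python
import json
import re
from typing import List

# Assumption: a valid table name contains only letters, digits and underscores.
TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name has no spaces or special characters."""
    return TABLE_NAME_RE.fullmatch(name) is not None


def find_invalid_json_lines(text: str) -> List[int]:
    """Return the 1-based numbers of lines that are not flat JSON objects."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)
            continue
        # Assumption: nested objects and arrays both count as "not flat".
        if not isinstance(record, dict) or any(
            isinstance(value, (dict, list)) for value in record.values()
        ):
            bad.append(lineno)
    return bad


def gsutil_cp_command(local_path: str, bucket_name: str) -> List[str]:
    """Build (but do not run) a gsutil command that copies a file to a bucket."""
    return ["gsutil", "cp", local_path, f"gs://{bucket_name}/"]


if __name__ == "__main__":
    sample = '{"name": "a", "id": 1}\n{"name": "b", "tags": ["x"]}'
    print(find_invalid_json_lines(sample))
    # "my-cooladata-bucket" is a placeholder for the bucket created for you.
    print(" ".join(gsutil_cp_command("stats.json", "my-cooladata-bucket")))
```

To actually upload, the command list could be passed to `subprocess.run`, provided that gsutil is installed and authenticated for the bucket.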
After CoolaData has integrated the files into your CoolaData project, the CoolaData Google bucket is emptied. The files are moved to a subfolder named Uploaded in the bucket and a timestamp is prepended to the file name. The syntax is – YYYYMMDD_filename.filetype. For example, 20160823_stats.csv.
The Status column of the Integrations list changes to show Data Received.
If the integration process fails for any reason, then the file is moved to a subfolder in the bucket named Failed. An email is sent to the specified recipient(s) (described above) alerting them to the integration failure. The same file name convention is applied in the Failed folder.
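The naming conventions described above (YYYYMMDD_filename.filetype for processed files and TableName_YYYYMMDD for dated tables) can be reproduced when you need to locate a processed file or a dated table. This is a minimal sketch with hypothetical function names. It assumes that the subfolder and file name are joined with a forward slash, and the date passed in is whatever date CoolaData applied, which this sketch does not determine.

```python
from datetime import date


def archived_object_name(file_name: str, day: date, succeeded: bool = True) -> str:
    """Return the expected object path after an integration run.

    Assumption: the Uploaded/Failed subfolder is joined to the
    file name with a forward slash.
    """
    folder = "Uploaded" if succeeded else "Failed"
    return f"{folder}/{day:%Y%m%d}_{file_name}"


def dated_table_name(table_name: str, day: date) -> str:
    """Return the table name with the date suffix (TableName_YYYYMMDD)."""
    return f"{table_name}_{day:%Y%m%d}"


if __name__ == "__main__":
    run_day = date(2016, 8, 23)
    print(archived_object_name("stats.csv", run_day))
    print(archived_object_name("stats.csv", run_day, succeeded=False))
    print(dated_table_name("stats", run_day))
```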
- Data can now be queried using the following syntax:
SELECT * FROM tablename LIMIT 100
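If you generate such preview queries from a script, a small helper can build the statement and reject table names that contain spaces or special characters. This is only a sketch: the function name is hypothetical, the table-name rule assumes letters, digits and underscores only, and how the query is submitted to CoolaData is outside its scope.

```python
import re

# Assumption: a valid table name contains only letters, digits and underscores.
TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def build_preview_query(table_name: str, limit: int = 100) -> str:
    """Build a SELECT statement that previews the first rows of a table."""
    if TABLE_NAME_RE.fullmatch(table_name) is None:
        raise ValueError(f"invalid table name: {table_name!r}")
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {table_name} LIMIT {limit}"


if __name__ == "__main__":
    print(build_preview_query("stats"))
```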