Sending Data

Is there any size limit for each event I send?

Event size should not exceed 100KB.
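
A client-side check can catch oversized events before they are sent. The following is a minimal sketch, assuming the limit applies to the UTF-8 encoded JSON payload and that 100KB means 100 * 1024 bytes; neither detail is stated above, and the function name is hypothetical.

```python
import json

# Assumption: "100KB" is read as 100 * 1024 bytes of UTF-8 encoded JSON.
MAX_EVENT_BYTES = 100 * 1024


def event_size_ok(event: dict, limit: int = MAX_EVENT_BYTES) -> bool:
    """Return True if the JSON-encoded event fits within the size limit."""
    payload = json.dumps(event, ensure_ascii=False).encode("utf-8")
    return len(payload) <= limit


small = {"event_name": "login", "user_id": "1234"}
large = {"event_name": "upload", "user_id": "1234", "blob": "x" * 200_000}
print(event_size_ok(small))
print(event_size_ok(large))
```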

I’ve noticed events on ‘Live Events’ view but cannot see them when I query.

There are two reasons that this is likely to happen:

  1. The events haven’t been loaded into the data warehouse (Google BigQuery) yet, and thus cannot be queried. In this case, wait about an hour until data loading completes.
  2. Some of the events’ properties were not sent in the correct format (wrong type, missing value, etc.), so the events were stored as invalid. You can query all invalid events from the previous 7 days using the following query:

Why am I getting the following error, when I am sure this property exists in my project: “Field ‘session_duration’ not found in project, Coolalog no:5522517”?

This error occurs when a query tries to pull data from two different partitioned tables: the events table and the sessions table. To solve it, join the two tables on a property that appears in both, e.g. user_id.
For instance, the following query will produce this error, because event_name is an event-scope property (saved in the events table), whilst session_duration is a session property (saved in the sessions table):

Does Cooladata support multiple customer identities?

A user can start as an anonymous user (with an automatically generated hash key) and later become a registered user. We support one old identity per user.
Once an event is sent with both identities, Cooladata knows how to map the old identity to the new one.

Is session_id mandatory?

No.

Which columns are automatically generated?

Based on session_ip, we generate ip_country, ip_region, ip_city, ip_longitude and ip_latitude.
Based on DUA (device user agent), we generate brand and model.
Based on the timestamp, we generate multiple columns (hour, day, month, year, week…).

Is the API Token the same as the App Key for using SDK implementation?

No. The AppKey is used to send events. It is the same regardless of how you send the events (REST, JavaScript etc.). The API token, however, is your user token and is used for querying the system through the Query API. You can retrieve it by logging into Cooladata and clicking on the avatar at the top right-hand side of the screen.

How can I differentiate test data and organic data sent from my users?

There are several ways of differentiating between test data and real data:
  • A different project
  • A property
Both of the above methods require code intervention (in your app) to distinguish between real and test data.
In addition, if you have a distinct (and not too large) set of either devices or users that are generating test data, you can build segments that reflect those “test users”/“test devices” and filter them out in the dashboard slicers.
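
The property approach can be sketched as follows. This is only an illustration: the property name is_test_event, the helper names and the list of test users are all hypothetical, and in Locked mapping mode the property would first need to exist in the project schema.

```python
# Hypothetical set of user IDs known to generate test traffic.
TEST_USER_IDS = {"qa-1", "qa-2"}


def tag_event(event: dict, is_test: bool) -> dict:
    """Return a copy of the event with a hypothetical 'is_test_event' property."""
    tagged = dict(event)
    tagged["is_test_event"] = is_test
    return tagged


def build_event(event_name: str, user_id: str) -> dict:
    """Build an event and mark it as test data when it comes from a test user."""
    event = {"event_name": event_name, "user_id": user_id}
    return tag_event(event, user_id in TEST_USER_IDS)


print(build_event("login", "qa-1"))
print(build_event("login", "u-9"))
```

Reports can then filter on the property instead of maintaining a separate project.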

Handling Personally Identifiable Information (PII)

Cooladata takes the utmost precautions to ensure the security of your data in the cloud and continually upgrades with the latest security options.

Cooladata accepts any event properties that you send without filtering them. Even so, we advise you not to send sensitive personal information (such as credit card numbers) that may help a malicious entity identify someone.

Here are a few tips for protecting personal information:

  • Conceal personally identifiable information, for example by scrambling, cloaking, encrypting, faking or hashing it.
  • Send a person’s location, instead of their IP address.
  • Send only partial information, such as a person’s country instead of their IP address.
  • Do not send combinations of information that may help someone piece together who the person is, such as session IP, address, gender and age.
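
As an illustration of the hashing and partial-information tips, the sketch below replaces an email address with a keyed hash and drops the IP address before the event is sent. The property names email, email_hash and country, the secret key and the helper names are assumptions made for the example, not part of the Cooladata API.

```python
import hashlib
import hmac

# Hypothetical secret; keep the real one out of source control.
SECRET_KEY = b"replace-with-your-own-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest so the raw value never leaves the app."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_event(event: dict) -> dict:
    """Return a copy of the event with the email hashed and the IP removed."""
    clean = dict(event)
    if "email" in clean:
        clean["email_hash"] = pseudonymize(clean.pop("email"))
    clean.pop("session_ip", None)  # keep only coarser data such as country
    return clean


raw = {
    "event_name": "signup",
    "user_id": "1234",
    "email": "someone@example.com",
    "session_ip": "203.0.113.7",
    "country": "DE",
}
print(scrub_event(raw))
```

A keyed hash (HMAC) is used rather than a plain hash so that the digest cannot be reproduced by anyone who merely guesses the input.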

Handling Invalid Events

All events arriving at CoolaData are marked as either valid or invalid. Invalid events are stored in a separate table, not included in queries.

Validity information can be seen in the Live Events page. The following three columns can be selected from the columns list:

  • validation: valid/invalid/pending (a new event that will be added to the schema if valid)
  • invalidComments: will specify the reason if the event is invalid.
  • extraComments: will specify any changes made to the event data.

Once you resolve the validity issues, newly received events will be valid. Previously received events remain invalid in the CoolaData database.

 

Reasons an event could be invalid and method of handling

  • Event Structure (JSON): If the event JSON is not a valid JSON structure the event will be invalid.
  • Missing Data: CoolaData trackers manage user_id and event_timestamp_epoch automatically. However, when sending events via REST API all properties must be included. Each event must be sent with the following 3 mandatory properties:
    • event_name
    • user_id
    • event_timestamp_epoch
  • Wrong Data Type: If the data received with an existing property is of a different data type, the property will be invalid. For example, an integer property sent with quotation marks (as a string) will be invalid. To update/add properties go to the Project – Properties page.
  • Property Quota: Each project is set up with a certain quota on the number of properties that can be used in the Data Scheme. If you reach the limit of the Data Scheme new properties will be invalid. To extend your Data Scheme quota, please contact your customer success manager.
  • Date Out of Valid Range: A time stamp in the future or in the far past will be considered an invalid time stamp and the entire event will be invalidated. The default limits are 30 days in the past and 1 hour in the future. These limits can be configured for your specific project. Contact support@cooladata.com for further assistance.
  • Mapping Modes: A project can have one of two mapping modes:
    • In Locked mode, new events and properties are marked as invalid and will not be stored. If any of the properties in the event is invalid for any reason, the entire event will be marked invalid and will not be saved. To update/add events go to the Project – Events page. To update/add properties go to the Project – Properties page.
    • In Self-Learned mode, new events and properties are automatically added to the project schema. Recognizing and registering new events and properties to the schema in Self-Learned mode can take up to 5 minutes, during which time new events will be marked as invalid.
      For more information on mapping modes see Project Configuration.
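
When sending events via the REST API, some of these checks can be run before the event leaves your server. The sketch below covers only the mandatory properties and the default date range. It assumes event_timestamp_epoch is expressed in milliseconds, which is not stated above, and the function name is hypothetical.

```python
import time

MANDATORY = ("event_name", "user_id", "event_timestamp_epoch")
PAST_LIMIT_MS = 30 * 24 * 60 * 60 * 1000  # default: 30 days in the past
FUTURE_LIMIT_MS = 60 * 60 * 1000          # default: 1 hour in the future


def validate_event(event: dict, now_ms=None) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    problems = [f"missing {name}" for name in MANDATORY if name not in event]
    ts = event.get("event_timestamp_epoch")
    if ts is not None:
        # Assumption: the timestamp is an integer number of milliseconds.
        if not isinstance(ts, int) or isinstance(ts, bool):
            problems.append("event_timestamp_epoch is not an integer")
        elif ts < now_ms - PAST_LIMIT_MS:
            problems.append("timestamp too far in the past")
        elif ts > now_ms + FUTURE_LIMIT_MS:
            problems.append("timestamp too far in the future")
    return problems


now = 1_700_000_000_000
print(validate_event({"event_name": "login", "event_timestamp_epoch": "123"}, now))
```

Data type and quota checks depend on your project schema and are not covered here.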

 

Querying Invalid Events

  1. Wait an hour or two after sending the events to CoolaData using a Tracker.
  2. Open a new CQL report (click the CQL button in the main menu).
  3. Select a timeframe at the top right of the window.
  4. Paste in the following query:
  5. To group events by failure reason, use the following query:

Common reasons for data discrepancies

Companies often use several tools to understand their users’ behaviour. Alongside Cooladata, most companies also use Google Analytics or their own DB for comparison.

We suggest investigating discrepancies if there is more than a 5% difference between Cooladata and other tools. Any less is likely not material enough to warrant a full tracking audit, since often analytics are used to identify trends (e.g. how fast are we growing?), rather than exact numbers. If the difference is greater than 5% across all events, or specific events don’t match between systems, then further investigation is called for.
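
The 5% rule of thumb can be expressed as a small check. This sketch assumes the difference is measured relative to the other tool's count; the text above does not say which side is the baseline, and the function names are hypothetical.

```python
def relative_difference(cooladata_count: int, other_count: int) -> float:
    """Absolute difference as a fraction of the other tool's count."""
    if other_count == 0:
        raise ValueError("other_count must be non-zero")
    return abs(cooladata_count - other_count) / other_count


def needs_investigation(cooladata_count: int, other_count: int,
                        threshold: float = 0.05) -> bool:
    """Return True when the discrepancy exceeds the threshold (default 5%)."""
    return relative_difference(cooladata_count, other_count) > threshold


print(needs_investigation(9_700, 10_000))  # 3% difference
print(needs_investigation(9_000, 10_000))  # 10% difference
```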

Check this guide to identify common reasons for data discrepancies between various systems:

Timezone

Cooladata’s default timezone is UTC. When comparing data between Cooladata and other systems such as Google Analytics, consider the time zone differences.
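
To see how the time zone alone can move an event between daily buckets, consider the sketch below. It assumes timestamps in epoch milliseconds and uses a fixed UTC offset, which ignores daylight saving time; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_day(epoch_ms: int, utc_offset_hours: float) -> str:
    """Return the calendar day of a timestamp in a fixed-offset time zone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_ms / 1000, tz=tz).date().isoformat()


ts = 1704076200000  # 2024-01-01 02:30:00 UTC
print(local_day(ts, 0))   # day as bucketed in UTC
print(local_day(ts, -5))  # same event in a UTC-5 reporting time zone
```

The same event is counted on 2024-01-01 in UTC but on 2023-12-31 in a tool reporting at UTC-5, so daily totals can differ even when overall totals match.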

Double-check your query

When weird numbers appear in a report, it’s not always a data issue. Sometimes the report is just not querying what we intend it to query. Make sure you are querying the correct date range, and that no applied filter is excluding relevant data.

Invalids

Cooladata automatically validates the data sent to Cooladata in order to prevent Garbage In – Garbage Out situations. Events marked as invalid are not stored with the rest of the valid events, but in a separate, designated table for invalid events, which you can query to check what went wrong. See Handling Invalid Events to learn more.

Data Sampling

Google Analytics sometimes uses sampled data in reports, causing its numbers to differ from Cooladata’s. Google Analytics sampling occurs automatically when more than 500K sessions are collected for a report. Google Analytics states in text above the report that the report is based on sampling. When comparing Cooladata to a sampled Google Analytics report, discrepancies are to be expected.

Session Definition

A session in Cooladata starts when someone visits your site or app and sends an event, and ends after thirty minutes of inactivity. The session duration is calculated as the difference between the first and last event in that session. This thirty-minute timeframe is a configurable parameter. If you are the project’s admin, you can see this parameter under ‘Session timeout’ in your project settings page. Most analytics tools, such as Google Analytics, use this thirty-minute definition, so setting a different value in Cooladata might cause discrepancies when comparing session duration or number of sessions.

Also, consider that Google Analytics will count additional sessions for clicks on AdWords campaigns and will hard stop all sessions at midnight, whereas in Cooladata a session that crosses midnight (starting before midnight and ending afterwards) is stored as one session. Some mobile analytics platforms also end a session if the user moves the app to the background for more than a minute.
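
The effect of the timeout on session counts and durations can be illustrated with a simplified sessionization routine. This is a sketch of the general inactivity-gap technique, not Cooladata's implementation; it assumes epoch-millisecond timestamps for a single user and treats a gap exactly equal to the timeout as part of the same session.

```python
def sessionize(timestamps_ms, timeout_minutes=30):
    """Group one user's event timestamps into sessions split by inactivity gaps."""
    timeout_ms = timeout_minutes * 60 * 1000
    sessions = []
    for ts in sorted(timestamps_ms):
        if sessions and ts - sessions[-1][-1] <= timeout_ms:
            sessions[-1].append(ts)  # still within the inactivity window
        else:
            sessions.append([ts])    # gap too long: start a new session
    return sessions


def session_durations(sessions):
    """Duration of each session: last event time minus first event time."""
    return [s[-1] - s[0] for s in sessions]


MIN = 60 * 1000
events = [0, 10 * MIN, 50 * MIN, 55 * MIN]
print(len(sessionize(events, timeout_minutes=30)))  # two sessions
print(len(sessionize(events, timeout_minutes=60)))  # one session
```

With the same four events, a 30-minute timeout yields two short sessions while a 60-minute timeout yields a single 55-minute session, which is the kind of gap to expect between tools configured differently.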

Events are sent differently

A common cause for discrepancy is the way the events are sent to Cooladata and other tools. For instance, if one tool is receiving events from the server-side and the other from the client side, differences in numbers will most likely occur.

Even if both tools are sent from the client side, the code needs to be checked. Sometimes, there is a logical condition for sending an event to one tool which is not the same as the code sending the event to the other tool. When using JS SDK, the location of the trackEvent code is important. If the call sending the event to one tool is at the top of the code and the call sending the event to another tool is at the bottom of the code, there might be some discrepancies, due to an error in the code or if the user manually closed the window before the trackEvent function was called.

Bots and Test Users

Some tools automatically filter out events created by bots; Cooladata does not. Cooladata does have several solutions available if you wish to slice out bad IPs or bots. To find the best solution for you, contact your Customer Support Manager.

If your operational DB automatically cleans test users’ activity, make sure you filter out test users in Cooladata as well.

Funnel and Conversion Definitions

Funnels in Cooladata count distinct (unique) users who completed the funnel in the date range in question, in the time window set in the report. Conversions are sometimes defined differently in other tools. For instance, Google Analytics counts the number of sessions in which the funnel’s steps were completed.

Notice that if you choose to set the funnel in Cooladata to show users who completed the funnel by X days, the funnel will only include users who did the first event X days before the end of the report date range when looking at the last week or month. This is done in order to give users who came in at the beginning of the date range, the same “chance” to complete the funnel as users who came in at the end of the date range.
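
The cutoff described above can be worked through with dates. In this sketch the boundaries are assumed to be inclusive and the function names are hypothetical; it only illustrates which users remain eligible, not how Cooladata computes the funnel.

```python
from datetime import date, timedelta


def last_eligible_start(range_end: date, completion_days: int) -> date:
    """Latest day a user can perform the first step and still be included."""
    return range_end - timedelta(days=completion_days)


def eligible_users(first_step_dates: dict, range_start: date,
                   range_end: date, completion_days: int) -> set:
    """Users whose first step leaves a full completion window before range_end."""
    cutoff = last_eligible_start(range_end, completion_days)
    return {user for user, day in first_step_dates.items()
            if range_start <= day <= cutoff}


first_steps = {
    "a": date(2024, 1, 2),
    "b": date(2024, 1, 24),
    "c": date(2024, 1, 28),  # too close to the end of the range
}
print(eligible_users(first_steps, date(2024, 1, 1), date(2024, 1, 31), 7))
```

For a January report with a 7-day completion window, user "c" is left out because fewer than 7 days remain after their first event.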

Redirects and Self Referrals

When looking at Cooladata’s “referring_url” and “referring_domain” you should see the URL and domain the user was referred from. Sometimes, you see URLs and domains that you do not expect, such as your own URL. This happens when your site uses redirecting rules, usually set up by your site admin.

Cooladata Sessions Table

Cooladata stores your data in two separate tables: one is your events table, in which every row is an event, and the other is your sessions table, in which every row is a session. The events table holds all the event-level data and event-scope properties, as well as user and session scope properties. The sessions table holds session-specific properties (such as session duration and the session path) as well as session and user scope properties.

In order to optimize performance, Cooladata automatically shifts your queries to run on top of the sessions table if all the data you are searching for is there. For instance, Daily Active Users (DAU), which counts the number of unique users per day, will run over the sessions table, but if you want to count Daily Active Payers (pDAU), the query will have to run over the events table in order to add a filter for the payment event.

Shifting between the sessions and events tables might cause slight differences, because when running over the sessions table the date range is filtered according to the session start time (session_start_time_ts), whereas when running over the events table the date range is filtered according to the event timestamp (event_time_ts).
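
A session that crosses midnight shows how the two date filters can disagree. The sketch below uses the field names session_start_time_ts and event_time_ts from the text, but assumes they hold epoch milliseconds and buckets days in UTC; the helper functions are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def day_of(epoch_ms: int) -> str:
    """UTC calendar day of an epoch-millisecond timestamp."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).date().isoformat()


def daily_users(rows, ts_field: str) -> dict:
    """Count distinct users per day, bucketing rows by the given timestamp field."""
    users = defaultdict(set)
    for row in rows:
        users[day_of(row[ts_field])].add(row["user_id"])
    return {day: len(ids) for day, ids in users.items()}


# One session starting 2024-01-01 23:50 UTC with an event after midnight.
sessions = [{"user_id": "u1", "session_start_time_ts": 1704153000000}]
events = [
    {"user_id": "u1", "event_time_ts": 1704153000000},  # 2024-01-01 23:50
    {"user_id": "u1", "event_time_ts": 1704153900000},  # 2024-01-02 00:05
]
print(daily_users(sessions, "session_start_time_ts"))
print(daily_users(events, "event_time_ts"))
```

Counted from the sessions table the user is active only on 2024-01-01; counted from the events table the same user is active on both days.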


Managing User IDs

There are times when the same user may be active on several devices (Mobile, PC, Tablet, etc.). Being able to associate user activities, across different devices or applications, with the same user may reveal important insights and have an effect on overall business performance.

Cooladata provides a mechanism to associate user activities from various devices to the same user identity. We will detail this mechanism below. You are welcome to contact support@cooladata.com with any additional questions you might have.

The Challenge

Consider the following business scenario:

  1. A user installs a game on their Mobile device.
  2. The user plays the game and registers.
  3. The user then installs the same game onto their Tablet.
  4. The user plays and logs in with their Facebook identity.

How many users would you count in this scenario? The worst case scenario is that your analytics platform treats the interactions of this single user as four different users:

  1. Interactions on Mobile pre-registration would be counted as one user (1).
  2. Interactions on Mobile post-registration would be counted as one user (2).
  3. Interactions on Tablet pre-login would be counted as another user (3).
  4. Interactions on Tablet post-login would be counted as another user (4).

A better strategy would be to track all these interactions as a single user.

How should you track this with Cooladata?

Any Event that is sent to Cooladata must provide the Property user_id. Cooladata expects you to use the user_id property to send a constant anonymous user ID, which is available for the application at any time.

In addition to that, with any event you can provide a user_alternative_id (Alias) as an Event Property. Cooladata expects you to provide the user_alternative_id once the user has registered and a registered user ID is available.

Once an event is sent with a user_id and user_alternative_id together, Cooladata will associate all events from this registered user, based on user_alternative_id, to the pre-registered user, based on user_id.

Let’s review the four-step scenario as Cooladata expects you to track it:

Scenario Step | user_id | user_alternative_id (Alias) | uid
1             | 1234    | NA                          | CoolaData generated 'abc'
2             | 1234    | 999                         | 1234 and 999 events are mapped to 'abc'
3             | 6789    | NA                          | CoolaData generated 'xyz'
4             | 6789    | 999                         | Mapped to 'abc'
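
The mapping in the table can be reproduced with a small in-memory model. This is only an illustration of the outcome shown above, not Cooladata's actual user management module; the class name, the method name and the generated uid format are hypothetical.

```python
import itertools


class IdentityMap:
    """Toy model that maps user_id and alias values to a single uid."""

    def __init__(self):
        self._uid_by_id = {}
        self._counter = itertools.count(1)

    def resolve(self, user_id, user_alternative_id=None):
        """Return the uid for an event, creating or merging mappings as needed."""
        alias_uid = None
        if user_alternative_id is not None:
            alias_uid = self._uid_by_id.get(user_alternative_id)
        # Prefer the uid already tied to the alias, then the one tied to user_id.
        uid = alias_uid or self._uid_by_id.get(user_id) or f"uid-{next(self._counter)}"
        self._uid_by_id[user_id] = uid
        if user_alternative_id is not None:
            self._uid_by_id[user_alternative_id] = uid
        return uid


ids = IdentityMap()
print(ids.resolve("1234"))         # step 1: new uid generated
print(ids.resolve("1234", "999"))  # step 2: alias joins the same uid
print(ids.resolve("6789"))         # step 3: second device gets its own uid
print(ids.resolve("6789", "999"))  # step 4: mapped to the first uid
```

Steps 1, 2 and 4 resolve to the same uid, matching the 'abc' rows of the table, while step 3 temporarily gets a separate one ('xyz').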

When you use one of CoolaData’s Trackers, CoolaData automatically enriches each event with a user_id that uniquely identifies the end user who performed the event. This user_id is saved as a CoolaData cookie on the device that generated the event and is uniquely associated with that user_id.

Sending your own user ID

In addition, you have the option to add your own user_id to each event. To assign your own user_id to an event add a property named user_id to the JSON sent to CoolaData.

Note – When you use the CoolaData REST API to send events, CoolaData does not automatically generate a user_id. In order to manage user IDs, you must add the user_id property to each event, or send it with the tracker init.

If you add your own user_id to an event, CoolaData stores the user_id that you sent as customer_user_id and the user_id that CoolaData automatically generated (as a cookie on the device) as internal_user_id. Both IDs are treated as the same user, with a one-to-one relationship between them.

 

Consolidating multiple users

In addition to the above, you may want CoolaData to manage multiple users as a single entity. This may be useful when you want CoolaData to consolidate the events of users that log in anonymously (or with a different login name) on the same computer or users that log in using the same name on a different computer.

To consolidate multiple users:

  • Add an alternative_user_id property containing your website/app login name to an event along with a user_id property. This will map multiple customer user IDs to the same internal_user_id.
  • The two IDs only need to be sent once to make the connection. Sending them again will have no impact.

Tips:

  • To display the user_id that you sent to CoolaData, query the customer_user_id property.
  • For better performance, in CQL queries, use user_id whenever possible instead of the customer_user_id or alternative_user_id.

 

User management at Cooladata

Cooladata’s data model manages three user_id fields: customer_user_id, user_id and user_alternative_id.

  • customer_user_id is a mandatory property in any event that is being sent to Cooladata. Use this Property to send the anonymous ID, which is available at all times. In most cases this is an ID that is saved in a cookie called cd_user_id.

  • user_alternative_id (Alias) is an additional user property. You can send it with any event you choose and it’s not mandatory. This Property should be used when your user identifies themselves uniquely for the first time (e.g. via logging-in), and you can send it when available. It should be sent together with the user_id you send with any event.

  • user_id is managed by Cooladata and is used to keep a coherent user identity. The user management module of the Cooladata platform generates the user_id to keep track of the various interactions and map all interactions to a global user ID.

    Important note: the user matching mechanism (user_alternative_id) is currently in beta. To use it, contact support@cooladata.com. This feature will not be active prior to contacting us.

Other scenarios that can impact counts:

iOS – iTunes & App Stores – Automatic Downloads

Apple may install an app on multiple devices from a single install event. When a user installs the app on their mobile device, the user’s iOS configuration (Settings -> iTunes and App Store) may allow Apple to automatically install the same app on additional devices owned by the same user, such as tablets or Macs.

iOS settings

Many customers send an install event from their mobile measurement provider (such as AppsFlyer or Kochava). In this case, you will see activity from devices that never had an install event.

If you are using IDfV (ID for Vendor) to identify the individual user, you will also see two users where there is actually one user with two different devices.

As a result, you may prefer to generate an ID per user per app and send the device as a property instead of using the IDfV.

This may also help when users continue a session between different devices using iOS 8 Handoff.

How User IDs are Assigned in CoolaData:

CoolaData assigns a user_id so that it will persist across sessions. If applicable, we also persist the user ID across apps and across installs and uninstalls. The following text details how the user ID is assigned on various platforms.

JavaScript

The CoolaData JavaScript SDK automatically generates a unique user ID. The user ID is generated the first time the library is called, and stored in a cookie for future use. By default the cookie is stored for one year and applies to the top-level domain.

iOS

If the app is using the AdSupport.framework, we’ll use the Advertising Identifier.
For apps that do not have the AdSupport.framework included, CoolaData will default to using the identifierForVendor as the user_id.

Android

If the app is using Google Play services 4.0+, then we’ll use AdvertisingIdClient.
If the app has the READ_PHONE_STATE permission, then we use Secure.ANDROID_ID; otherwise, we use UUID.randomUUID().

 

User Time to Expire

The data you tracked is saved in Cooladata indefinitely.

However, user ID associations are kept for up to 90 days. Users that have not been active for more than 90 days will be treated as new users. This means that they will not be associated with any data previously provided about them. Note that users can be assigned multiple IDs, and logging in using any of the IDs associated with them will keep them active in Cooladata for an additional 90 days. To update a user’s data after more than 90 days, send the data again with one of the first events once they’ve logged in to the system.
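
The 90-day rule amounts to a simple comparison against the user's last activity. The sketch below assumes "more than 90 days" is measured from the last event received for any of the user's IDs; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

ASSOCIATION_TTL = timedelta(days=90)


def treated_as_new(last_active: datetime, now: datetime) -> bool:
    """Return True when the user has been inactive for more than 90 days."""
    return now - last_active > ASSOCIATION_TTL


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(treated_as_new(datetime(2024, 4, 1, tzinfo=timezone.utc), now))  # 61 days ago
print(treated_as_new(datetime(2024, 1, 1, tzinfo=timezone.utc), now))  # 152 days ago
```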


Managing Sessions

What is a session?

In CoolaData, a session starts when a user accesses your website/app and ends after that user has not performed any event for a specified amount of time.

 

Sessions table

CoolaData maintains an aggregated table of session information called the Session Table. This is the source from which CoolaData extracts behavioral analytics, meaning analysis and reporting of the sequence of user events, for example in Funnel and Cohort reports. Some reports extract data from this Session Table instead of from the Events Table. Extraction from the Session Table is faster than from the Events Table. However, the Session Table does not maintain information about the properties of an event.

 

Session ID

CoolaData automatically assigns a unique identifier (Session ID) to each session.

You can specify your own session ID by including the session_id property in the event sent to CoolaData. If you send your own session ID, then this information is stored as a property called customer_session_ID. CoolaData still maintains its own CoolaData session ID in addition. You can then query using either customer_session_ID or CoolaData’s session_ID.

Note: Properties that are calculated per session are based on CoolaData’s session ID (session_id), and not on your session ID (customer_session_ID).

 

Session properties

Some CoolaData properties are collected per session:

  • session_duration – The duration of the session, in milliseconds.
  • session_path – A session path is a sequence of event names (funnel) in a session. CoolaData provides various visualization options showing the path of events in a user session, such as a Funnel report, Cohort report or Path report.
  • session_id – A unique ID assigned to a specific user for the duration of that user’s visit (session).

 

Ending a Session

CoolaData automatically stores the events of a session when it ends. Therefore, it is important to ensure that a session ends.

CoolaData automatically ends a session after no events have been sent for that user for a specified amount of time – by default, 30 minutes. This can be changed from the Project – Settings page.

If a session does not end within six hours, then CoolaData automatically ends it and stores its data. Therefore, if a session is still getting events, then it may take a while until you are able to query that data.

Best Practices:

  • For a website session, we recommend a timeout of 30 minutes.
  • For an app session, we recommend a timeout of 5 minutes.
  • If your website has videos that include advertisements: if ads are sent as events we recommend using the 30-minute timeout. However, if the ads are not sent as events, we recommend you lengthen the session timeout to longer than the video, which might run more than an hour before sending an event.

 

Include in Path

Each event that you define in CoolaData has an Include in Path setting that specifies whether an event is shown in the CoolaData behavior analytics visualizations (Funnel, Cohort and Path Analysis reports) or not. The Include in Path field should be off for certain types of repetitive events that are not needed for a session analysis.

For example, a keep_alive event is generated when the device periodically verifies that it is still connected. Because keep_alive and page_load events are sent repeatedly, CoolaData keeps the session active for 30 minutes (default) after each one, so the session will only end after six hours unless you turn the Include in Path option off.

Regardless of the value selected in the Include in Path field (meaning even if it is off), the event still affects the session_duration (which is the duration of the session, in milliseconds) and is still counted in the session_event_counter (which is the number of events in the session).
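
The relationship between the three session properties can be sketched as follows. This is a simplified model for illustration: the function name and the dictionary of per-event Include in Path flags are hypothetical, and events without a flag are assumed to be included.

```python
def summarize_session(events, include_in_path):
    """Compute path, event count and duration for one session.

    events: list of (event_name, epoch_ms) tuples.
    include_in_path: dict mapping event_name to its Include in Path flag.
    """
    ordered = sorted(events, key=lambda e: e[1])
    return {
        # Only events with Include in Path on appear in the path.
        "session_path": [name for name, _ in ordered
                         if include_in_path.get(name, True)],
        # Every event is counted, whatever its flag.
        "session_event_counter": len(ordered),
        # Duration spans all events, in milliseconds.
        "session_duration": ordered[-1][1] - ordered[0][1] if ordered else 0,
    }


events = [("login", 0), ("keep_alive", 60_000),
          ("purchase", 120_000), ("keep_alive", 180_000)]
print(summarize_session(events, {"keep_alive": False}))
```

Here the path contains only login and purchase, while the counter is 4 and the duration runs to the last keep_alive event.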
