If you run a WordPress site, read how to download and install our plugin.

The Content Insights (CI) tracking code is a piece of JavaScript that sends information about your articles and reader behavior to the CI servers.

Detailed and customized instructions are sent to each client once they agree to test CI on their site.

The specifics of configuration for your domain can always be found in the application, under Settings → Tracking code.

The CI tracking code loads asynchronously, so it does not slow down your website.

Before you start

Put the code only on article pages. It should not be on the homepage, category/tag/section pages, etc. Also make sure that your CMS does not load the tracking code in preview mode or on any temporary pages. A proven best practice is to omit our code when staff members are logged in. Be careful not to omit the code for all logged-in users if you have a subscription system or any other feature that lets readers register on your website.
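The placement rules above can be folded into a single check in your page templates. The sketch below is only an illustration: the page object and its fields (type, isPreview, userRole) and the list of staff roles are hypothetical names, not part of the CI tracking code, and your CMS will expose this information differently.

```javascript
// Hypothetical helper: decides whether the CI snippet should be rendered.
// The shape of `page` and the role names are assumptions made for this example.
function shouldLoadTracker(page) {
  var staffRoles = ["admin", "editor", "author"]; // assumed staff roles
  if (page.type !== "article") return false;      // article pages only
  if (page.isPreview) return false;               // no preview or temporary pages
  // Skip staff members, but keep registered readers and subscribers.
  return staffRoles.indexOf(page.userRole) === -1;
}

console.log(shouldLoadTracker({ type: "article", isPreview: false, userRole: "subscriber" })); // true
console.log(shouldLoadTracker({ type: "homepage", isPreview: false, userRole: null }));        // false
```

Note that a logged-in subscriber still gets the tracker; only the assumed staff roles are excluded.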

If you have a staging/test server please let us know and we’ll assist you.

The basic principle of the JavaScript tracking code is to search the page for the global _ain object and send the information found in that object to the Content Insights servers. Setting up the tracking code means making sure that the _ain object is populated with the correct metadata about the article.

Code snippet

This is an example of an empty tracking code snippet that you should put on every article page. It should contain your domain ID.

<script type="text/javascript">
  /* CONFIGURATION START */
  var _ain = {
    id: "1234",
    postid: "",
    maincontent: "",
    title: "",
    pubdate: "",
    authors: "",
    sections: "",
    tags: "",
    comments: "",
    access_level: "",
    article_type: "",
    reader_type: "",
    image: ""
  };
  /* CONFIGURATION END */
  (function (d, s) {
    var sf = d.createElement(s);
    sf.type = 'text/javascript';
    sf.async = true;
    sf.src = (('https:' == d.location.protocol)
      ? 'https://d7d3cf2e81d293050033-3dfc0615b0fd7b49143049256703bfce.ssl.cf1.rackcdn.com'
      : 'http://t.contentinsights.com') + '/stf.js';
    var t = d.getElementsByTagName(s)[0];
    t.parentNode.insertBefore(sf, t);
  })(document, 'script');
</script>

You can set this up however you like. It can be put directly on the page, it can be part of some JS file that you already have or it can be set through Google Tag Manager.

If you put this code into some library on your page, make sure that the _ain object is globally accessible. This can be achieved by declaring it somewhere in the global scope before populating it in the library.
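As a minimal sketch of that advice, the global object can be declared once and then filled in from inside a library. The values below are placeholders. globalThis is the standard name for the global object (window in browsers); in older browsers that lack it, window._ain works the same way.

```javascript
// Declare the object once in the global scope, before any library code runs.
globalThis._ain = globalThis._ain || {};

// Later, inside a module or an IIFE, populate the same object
// instead of creating a new local variable that the tracker cannot see.
(function () {
  var ain = globalThis._ain;
  ain.id = "1234";   // placeholder domain ID, as in the snippet above
  ain.postid = "13"; // example value
})();

console.log(typeof globalThis._ain); // "object"
```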

Parameters

URL

In most cases, sites use a canonical URL, so you don’t have to specify this parameter; the tracker will add it automatically.

<link rel="canonical" href="https://contentinsights.com/blog/changing-the-way-we-approach-editorial-analytics/" />

Specify it manually only if your site has neither a canonical link nor an og:url tag.

It needs to be the full URL, including the domain, like this:

url: "https://contentinsights.com/blog/changing-the-way-we-approach-editorial-analytics/",

It should be the unique, canonical link to the article. Variations in the URL of the same article can cause problems in calculations later.
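If you have to build the url value yourself, one way to avoid variations is to strip the fragment and common campaign parameters before sending it. This is a sketch under stated assumptions: the function name and the list of parameters to drop (utm_*, fbclid, gclid) are our illustration, and parameters that identify the article (such as ?p=13) are deliberately kept.

```javascript
// Sketch: derive one stable URL per article by dropping the fragment and
// common tracking parameters. The parameter list is an assumption.
function normalizeArticleUrl(href) {
  var url = new URL(href);
  url.hash = "";
  var drop = [];
  url.searchParams.forEach(function (value, key) {
    if (key.indexOf("utm_") === 0 || key === "fbclid" || key === "gclid") {
      drop.push(key); // collect first, delete after iterating
    }
  });
  drop.forEach(function (key) { url.searchParams.delete(key); });
  return url.toString();
}

console.log(normalizeArticleUrl("https://example.com/?p=13&utm_medium=email#comments"));
// https://example.com/?p=13
```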

Id (required)

The domain ID is a 4-digit number that the CI application assigns automatically after domain registration.

You can always get it from Settings → Tracking code.

Postid (required)

Post ID is the unique identifier of the article. We will combine all the statistics for articles with the same postid. It doesn’t need to be numerical. You can use the article ID from the CMS, the slug, or even the whole URL.

For example, WordPress sites have a shortlink tag with the post ID:

<link rel='shortlink' href='https://contentinsights.com/blog/?p=13' />

In this case, postid parameter should have this value:

postid: "13"
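If the shortlink is the only place where the ID is available, it can be read from the p query parameter. The helper name below is hypothetical, and the sketch assumes the shortlink has the ?p=... form shown above.

```javascript
// Sketch: extract the WordPress post ID from a shortlink of the form ...?p=13.
function postIdFromShortlink(href) {
  var id = new URL(href).searchParams.get("p");
  return id === null ? "" : id; // empty string when there is no p parameter
}

console.log(postIdFromShortlink("https://contentinsights.com/blog/?p=13")); // 13
```

In the browser, the href could be taken from the link element with rel="shortlink".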

Maincontent (required)

In Content Insights, we have a metric called Read depth.

To calculate how far a reader got while reading the article, we locate the content, calculate its height, count the number of words, and then measure the time spent and how far the user scrolled while reading. All these metrics are then put into a formula that tells us how much of the content the user read in the given time.

To know where the content is, so that we can count the words and calculate the height and the position relative to the window, we need to know exactly which elements of the page contain the content of the article. To identify those elements, you must specify their CSS selectors in the maincontent parameter.

Example:

maincontent: "#article-title, .article-body"
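Before going live, it can help to confirm that every selector in maincontent actually matches something on the page. The function below is a hypothetical helper for the browser console, not part of the tracker. It assumes that no individual selector contains a comma of its own.

```javascript
// Sketch: report how many elements each maincontent selector matches.
// `root` is anything with querySelectorAll, normally `document`.
function checkMainContent(root, maincontent) {
  return maincontent.split(",").map(function (selector) {
    var trimmed = selector.trim();
    return { selector: trimmed, matches: root.querySelectorAll(trimmed).length };
  });
}

// In the browser console:
//   console.table(checkMainContent(document, _ain.maincontent));
```

A selector that reports 0 matches points to content that Read depth cannot measure.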

Pubdate (required)

This is the date and time when the article was published. It has to be in ISO 8601 format.

Example:

pubdate: "2017-03-22T13:24"
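If your templates are generated with JavaScript, the built-in Date.prototype.toISOString() method produces an ISO 8601 timestamp in UTC. The function name below is our own, used only for illustration.

```javascript
// Sketch: format a publication date as an ISO 8601 string (always UTC).
function toPubdate(date) {
  return date.toISOString();
}

var published = new Date(Date.UTC(2017, 2, 22, 13, 24)); // months are zero-based
console.log(toPubdate(published)); // 2017-03-22T13:24:00.000Z
```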

Title

Simple article title.

You can use the value from the title or og:title tag, but you should remove the site title, if present.

If there are quotes in the title they need to be escaped. For example, this title will break the JavaScript:

title: "Did you watch new "Alien" movie?"
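When the snippet is generated by code, the safest approach is to let a serializer do the escaping instead of concatenating quotes by hand. The sketch below assumes a " | Site Name" suffix convention for page titles. Both function names are hypothetical.

```javascript
// Sketch: remove an assumed " | Site Name" suffix from a page title.
function cleanTitle(rawTitle, siteName) {
  var suffix = " | " + siteName;
  if (rawTitle.slice(-suffix.length) === suffix) {
    return rawTitle.slice(0, -suffix.length);
  }
  return rawTitle;
}

// Sketch: turn any string into a safe JavaScript string literal.
function toJsStringLiteral(value) {
  // JSON.stringify adds the surrounding quotes and escapes inner ones.
  // "<" is escaped too, so a "</script>" inside a title cannot end the block.
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

var title = cleanTitle('Did you watch new "Alien" movie? | Example Site', "Example Site");
console.log("title: " + toJsStringLiteral(title));
// title: "Did you watch new \"Alien\" movie?"
```

The printed line is the escaped form of the broken example above and can be pasted into the _ain object as is.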

Authors

Author name or comma-separated list of authors.

If authors sign articles differently each time (e.g. John Smith, J. Smith, J.S.), it is best to always send us just one version. There is an option in the system to group them manually afterwards, but it’s best if the data is clean from the start.

authors: "John Smith"

When multiple authors contributed to the article, it’s important that you send us the authors as a comma-separated string. You cannot use a different separator.

Example:

authors: "Jack Black, Jane White"
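One way to keep author data clean is to normalize bylines before they reach the tracking code. The alias table and function name below are hypothetical. The idea is simply to map known variants to one canonical name, drop duplicates, and join with commas.

```javascript
// Hypothetical alias table mapping byline variants to one canonical name.
var AUTHOR_ALIASES = { "J. Smith": "John Smith", "J.S.": "John Smith" };

// Sketch: build the comma-separated authors value from a list of bylines.
function toAuthorsParam(bylines) {
  var seen = Object.create(null);
  return bylines
    .map(function (name) { return name.trim(); })
    .map(function (name) {
      var known = Object.prototype.hasOwnProperty.call(AUTHOR_ALIASES, name);
      return known ? AUTHOR_ALIASES[name] : name;
    })
    .filter(function (name) {
      if (name === "" || seen[name]) return false; // skip blanks and repeats
      seen[name] = true;
      return true;
    })
    .join(", ");
}

console.log(toAuthorsParam(["J. Smith", " Jane White ", "John Smith"]));
// John Smith, Jane White
```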

Our system can work without authors, but valuable insights will be lost.

Sections

Sections are the primary way of organizing content in Content Insights.

We use sections as containers for comparing articles against each other. This parameter should be filled with a single section, sections with hierarchy, or multiple sections as a comma-separated string.

Sections in a hierarchy usually reflect the site structure, for example:

  • News

  • Sports

  • Life

  • Fashion

  • Entertainment

Sections can be grouped (nested) with the > separator.

For example, if an article belongs to the section Football, which is part of the Sport section, you can send us this value:

sections: "Sport>Football"

If the article contains an embedded video and you have, for example, a Video section, you can send us this value:

sections: "Sport>Football, Video"
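If your CMS stores sections as lists, the value can be assembled with two joins. In this sketch each section is represented as an array describing its path from the top level down. That representation and the function name are assumptions made for the example.

```javascript
// Sketch: each section is a path from the top level down,
// e.g. ["Sport", "Football"]. Paths are nested with ">" and
// separate sections are joined with ", ".
function toSectionsParam(paths) {
  return paths
    .map(function (path) { return path.join(">"); })
    .join(", ");
}

console.log(toSectionsParam([["Sport", "Football"], ["Video"]]));
// Sport>Football, Video
```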

Tags

Values from the tags parameter are called Topics in Content Insights. Just like the sections, it should be a comma-separated string of topics.

tags: "Premier League, Chelsea, Antonio Conte"

Comments

Number of comments for the article.

Sometimes it can be very tricky to get the number of comments, especially if you use 3rd party services. In that case, you should send an empty string:

comments: ""

Access level

Represents the access level of the article. The value can be one of the following standardized options: “free”, “preview”, “paid”.

access_level: "free"

Article type

Represents the type of article as specified by the publisher (e.g. 'news', 'gallery', 'blog', 'essay').

article_type: "news"

Reader type

Represents the type of visitor/reader currently browsing your site.

The value can be one of the following standardized options: “anonymous”, “registered”, “subscribed”.

reader_type: "anonymous"

Use 'anonymous' for all guest visitors. Use 'registered' for registered or logged-in visitors. Use 'subscribed' for subscribed or paid visitors.
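The mapping above can be sketched as a small function. The user object and its hasActiveSubscription field are hypothetical; substitute whatever your membership system provides.

```javascript
// Sketch: map a (hypothetical) user object to one of the three reader types.
function readerType(user) {
  if (!user) return "anonymous";                       // guest visitor
  if (user.hasActiveSubscription) return "subscribed"; // paying reader
  return "registered";                                 // logged in, not paying
}

console.log(readerType(null));                            // anonymous
console.log(readerType({ hasActiveSubscription: false })); // registered
console.log(readerType({ hasActiveSubscription: true }));  // subscribed
```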

Image

The URL of the main/lead/featured image associated with the article.

image: "http://www.example.com/path_to_image.png"

AJAX powered sites

If your site uses AJAX to load content dynamically (e.g. infinite scroll, sliders, etc.), you’ll need to take some additional steps. AJAX-injected content changes the parameters of the page, so you need to update the data sent to the tracking code accordingly. Do this by updating the _ain object with the new parameters. Once the parameters are changed, call the _ain.track() method to register those changes and start tracking the page using the modified data.

Example:

_ain.authors = "Jon Johnson, Tom Tomphson";
_ain.url = "http://www.example.com/news/articlename";
_ain.postid = "1234";
_ain.maincontent = "#main-content";
_ain.title = "Article title";
_ain.pubdate = "2015-03-10T13:04:50Z";
_ain.comments = "10";
_ain.sections = "News, Politics";
_ain.tags = "news, politic, white house";
//call the track method
_ain.track();
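If you update the object in several places, it may be easier to wrap the assignments in one function. The sketch below is an illustration only: trackArticle and the article object are hypothetical. Resetting missing fields to an empty string is our own choice, made so that values from the previous article are not sent again.

```javascript
// Sketch: copy the metadata of a newly injected article into _ain and re-track.
// `article` is a hypothetical object built by your own AJAX code.
function trackArticle(ain, article) {
  var fields = ["url", "postid", "maincontent", "title", "pubdate",
                "authors", "sections", "tags", "comments"];
  fields.forEach(function (field) {
    // Fields the new article does not provide are reset (an assumption).
    ain[field] = article[field] !== undefined ? article[field] : "";
  });
  ain.track();
}

// Stand-in for the real global _ain object, used here only to show the call.
var demoAin = { id: "1234", title: "Old title", track: function () { this.tracked = true; } };
trackArticle(demoAin, { postid: "5678", title: "Article title", maincontent: "#article-5678" });
console.log(demoAin.title, demoAin.tracked); // Article title true
```

On a real page you would pass window._ain instead of the stand-in object.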

Waiting for the content to be ready

By default, our tracking code starts to track the page from the moment it’s loaded. If you want to prevent this behaviour, set the parameter trackauto: false when initializing the _ain object.

If trackauto is set to false, you must notify the tracking code when the data is ready. This is done by calling the _ain.track() method.

Example:

var _ain = {
  id: "1234", // your domain ID
  // ... other parameters, as in the code snippet above
  trackauto: false
};

When content is loaded, you manually call the _ain.track() method. This will most probably happen in some callback function provided by your AJAX script. Just make sure that you call _ain.track() when all content is loaded and all scripts are finished.
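As a rough sketch of that flow, the call to _ain.track() can live inside the completion callback of your loader. Both startTrackingWhenReady and the loader passed to it are hypothetical names. The stand-ins at the bottom exist only to make the example runnable outside a browser.

```javascript
// Sketch: with trackauto: false, start tracking from the "content ready" callback.
// `loadContent` is a hypothetical function that renders the article and then
// invokes the callback it was given.
function startTrackingWhenReady(ain, loadContent) {
  loadContent(function onContentReady() {
    ain.track(); // all content is in place, start tracking now
  });
}

// Stand-ins for the real _ain object and your AJAX loader.
var stubAin = { trackCalls: 0, track: function () { this.trackCalls += 1; } };
var stubLoader = function (done) { done(); };

startTrackingWhenReady(stubAin, stubLoader);
console.log(stubAin.trackCalls); // 1
```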

Dynamic maincontent parameter

When more than one article is loaded, each article must have unique CSS selector(s), so you have to make sure there are no duplicate ids or classes in the loaded content.

Example:

Each article has a <div id="article"> tag that contains all of the content (ideal case). This means that static pages would have "#article" as the maincontent parameter. However, with dynamically loaded content, there will be more articles on the same page.

One solution is to add some kind of suffix to the id. This can be the postid or some other string that makes an article unique. For example, the first article would have #article-1, the second #article-2, and so on.
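A small pair of helpers can keep the element id and the maincontent selector in sync. The function names are hypothetical. Replacing unusual characters in the postid with hyphens is an assumption made so that the result is always a simple, valid id selector.

```javascript
// Sketch: build a unique element id from the postid.
// Characters outside letters, digits, "_" and "-" are replaced with "-".
function articleElementId(postid) {
  return "article-" + String(postid).replace(/[^A-Za-z0-9_-]/g, "-");
}

// The matching value for the maincontent parameter.
function mainContentSelector(postid) {
  return "#" + articleElementId(postid);
}

console.log(mainContentSelector("1"));       // #article-1
console.log(mainContentSelector("news/42")); // #article-news-42
```

Use articleElementId when rendering each article's container and mainContentSelector when setting _ain.maincontent for that article.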

AMP

AMP is an open-source initiative to improve the mobile web experience by allowing web pages to load instantly on mobile devices.

To track your AMP articles, please add the following script to every AMP page:

<amp-analytics config="https://1e32b3109a3889d6eb04-114932bc2bae9698d2e445432680b599.ssl.cf1.rackcdn.com/amp.json">
    <script type="application/json">
        {
            "vars": {
                "id": "ID",
                "postid": "POSTID"
            }
        }
    </script>
</amp-analytics>

The id parameter has a fixed value for a single domain.

postid must match the _ain.postid value from the desktop version of the article.

Once AMP tracking has been implemented, you will be able to see google-amp as a new referrer.

AMP referrer URLs will show various sources of AMP traffic.

Notes:

At the moment AMP data is excluded from the Engagement and Loyalty CPIs, in order to prevent a significant impact of AMP traffic on those components.

FIA

Facebook Instant Articles is a mobile publishing format that enables news publishers to distribute articles to Facebook which load up to 10 times faster than the standard mobile web.

To follow these instructions you need to know how Content Insights (CI) tracking code works for web sites and understand its parameters.

The tracking code should be set like this in the source of the instant article:

<figure class="op-tracker">
    <iframe>
        <script type="text/javascript">
            /* CONFIGURATION START */
            var _ain = {
                referrer: "http://ia.facebook.com", // this must be exactly like this on all requests
                id: "__ID__",
                url: "", // URL of the article
                postid: "", // must match the _ain.postid value from the desktop version of the article
            };
            /* CONFIGURATION END */
            (function (d, s) {
                var sf = d.createElement(s);
                sf.type = 'text/javascript';
                sf.async = true;
                sf.src = (('https:' == d.location.protocol)
                    ? 'https://d7d3cf2e81d293050033-3dfc0615b0fd7b49143049256703bfce.ssl.cf1.rackcdn.com'
                    : 'http://t.contentinsights.com') + '/stf.js';
                var t = d.getElementsByTagName(s)[0];
                t.parentNode.insertBefore(sf, t);
            })(document, 'script');
        </script>
    </iframe>
</figure>

IMPORTANT: If you are copying the code, please make sure that the SSL URL string is not broken across two lines.

The <figure> and <iframe> tags are needed per FB documentation on FIA.

Since there must be a version of the article already existing on your website, other article metadata parameters are omitted.

The referrer field must be specified as "http://ia.facebook.com".

Domain ID, URL and Post ID must match the ones from the web version of the article. This is needed so that the Content Insights platform can assign traffic to the proper article.

For details about importing article templates please refer to original Facebook documentation.

Due to limitations imposed by Facebook, it’s not possible to properly track attention time on FIA. Our system is aware of this and takes it into account when it calculates the CPI.

ATEE

ATEE stands for Automatic Topic Extraction Engine, our own solution based on an adaptive machine learning algorithm for natural language processing.

Our experience with publishers shows that the topics/tags they provide themselves are often of low quality.
Usually, this is the result of:

  • no tag classification in their system

  • using keywords as tags

  • the human factor, where authors assign a wrong or irrelevant value for a topic, misspell it, etc.

This directly influences the quality of data and insights we are able to provide.

ATEE helps publishers eliminate these factors in order to produce meaningful data and insights.

How to enable ATEE?

ATEE is a core feature, but it’s disabled by default.
If you want to use it, please send us written consent and specify the domain you’d like to enable ATEE for. After that, ATEE will be enabled at no additional cost.

Usage and presentation

Detected topics are treated in the same way as those specified by the publisher.
ATEE won’t affect provided topics in any way. If the engine recognizes a topic that already exists, the detected topic is discarded.
The number of detected topics is limited to the 5 most relevant within a single post.
Detected topics are clearly distinguished from those sent through our tracker: each has a laboratory flask icon beside the topic name.

Technical requirements

The Content Insights crawler is an additional service that can be turned on at the client’s request.
The crawler uses the Apify web scraping and automation platform, which runs on AWS servers.
Sometimes crawler requests are blocked on the client’s side, so the client has to allow our crawler to access their content. There are two ways to solve this, and clients can use one or both of them:

  1. Filter by fixed IP range
    The crawler runs from a fixed range of IPs listed in this JSON file.
    Those IPs have to be whitelisted so that our crawler can access the content and extract relevant information.

  2. Filter by User-Agent
    Our crawler identifies itself with this User-Agent string:
    Mozilla/5.0 (compatible; contentinsights.com data-extractor/1.0; +http://contentinsights.com)
    If possible, the client should allow all requests with this header.
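For the User-Agent option, the check on the client’s side can be as simple as looking for the product token from the string above. The constant and function names are hypothetical, and how the result is wired into a firewall or web server is left to the client. Keep in mind that a User-Agent header can be set by anyone, so combining this check with the IP range filter is the stricter option.

```javascript
// Product token taken from the User-Agent string documented above.
var CI_CRAWLER_MARKER = "contentinsights.com data-extractor";

// Sketch: returns true when a request's User-Agent identifies the CI crawler.
function isContentInsightsCrawler(userAgent) {
  return typeof userAgent === "string" &&
    userAgent.indexOf(CI_CRAWLER_MARKER) !== -1;
}

console.log(isContentInsightsCrawler(
  "Mozilla/5.0 (compatible; contentinsights.com data-extractor/1.0; +http://contentinsights.com)"
)); // true
console.log(isContentInsightsCrawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")); // false
```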

Special cases

Drupal

Please make sure that the _ain object is global so that our script can access the values properly. This can easily be done by declaring the _ain object somewhere in the global scope: var _ain = {};

WordPress

You can get all values from the global post object. Please refer to official guide from the WordPress Codex: https://codex.wordpress.org/Class_Reference/WP_Post

Google Tag Manager

Google Tag Manager provides one of the most flexible ways to add any piece of code to your site. You can choose on which pages to load the tracker, create custom variables and easily make any changes when you switch layout or redesign the site.

WordPress Plugin Installation

  1. Download the Content Insights plugin for WordPress.

  2. Go to Plugins and click the Add New button.

  3. Click the Upload Plugin button, then Choose File, and select the plugin file that you downloaded in Step 1.

  4. Go to Plugins in the WordPress menu and activate the Content Insights plugin.

  5. Go to Settings in the WordPress menu and click the Content Insights item.

  6. Enter your Site ID (mandatory) and Maincontent parameter (optional).

  7. Click the Save Changes button to apply the settings.

FAQ

Where is tracking code stored?

Our tracker code (the actual JavaScript) is served through a CDN, so it is delivered from the location closest to the visitor.

How does the CI tracking code influence page loading time?

The CI tracking code is loaded asynchronously, so it does not block normal page loading or any third-party scripts on the page, asynchronous or not. Our JS is outside the normal queue of the page load (document, images, styles, your own scripts…) and outside the loading queue of any other third-party scripts, so it cannot block anything.

How to test CI code on our staging/test server first?

If your staging server is on the same domain as the main site, you can use the same instructions. Otherwise, we need to register the domain of the staging server and assign a new domain ID to it.

We recommend Chrome for testing, but you can use any browser in a similar way.

  • Load any article page on the site.

  • Open Chrome DevTools (F12)

  • Choose Network tab

  • Enter p? or a? in the search box to filter all requests on the page

The tracker triggers two types of requests:

  1. p request

    First p request triggers a Pageview.

    Second p request triggers after 10 seconds and it gives Article Read.

  2. a request

    One a request is triggered every 5 seconds if there was any user activity on the article page (cursor move, click, scroll, etc.), and it tracks Attention Time.

    This means that if the user switches to another browser tab or leaves to do something else, attention time will not be measured until they come back and perform another reading activity.

Click on any of the fired requests and look for the Query String Parameters at the bottom of the Headers tab.

Another option is to just type _ain in the DevTools Console tab and hit Enter. You will see how the _ain object is filled.
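While you are in the console, a quick check for the parameters marked as required in this guide (id, postid, maincontent, pubdate) can save a round trip. The function below is a hypothetical helper that you can paste into the console. It is not part of the tracker.

```javascript
// Sketch: list the required parameters that are missing or empty in _ain.
function missingRequiredParams(ain) {
  var required = ["id", "postid", "maincontent", "pubdate"];
  return required.filter(function (name) {
    var value = ain[name];
    return value === undefined || value === null || String(value).trim() === "";
  });
}

// In the browser console: missingRequiredParams(_ain)
console.log(missingRequiredParams({ id: "1234", postid: "13", maincontent: "", title: "Hello" }));
// [ 'maincontent', 'pubdate' ]
```

An empty array means all required parameters are filled in.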