Updated Below

If Rediff Is Number 5 on Komli’s Vizisense, Will They Validate It?

Komli Media has launched Vizisense, an online audience measurement system. Comscore, despite its reported inaccuracy and the lack of transparency in its methodology, is an accepted audience measurement system in India, and IMRB International is also planning to launch its own audience measurement system in the next few months. Both of these systems are likely to be paid, so what’s really interesting is that Komli has decided to make Vizisense available for free.

We spoke to Amar Goel, CEO of Komli, who told us that Vizisense is using three levels of data sources:
– Panel: they’re tracking the Internet usage of over 10,000 users
– Analytics: statistics from the websites which install the analytics
– Census Data: from the Indian census, in order to help triangulate findings

Now, these data sets may not overlap, since not all publishers will be using Komli’s data tracker, but Goel said that they can still draw valuable conclusions about websites based on consumer behaviour. For example: “Let’s say you go to a site that is 90% male, and another site which is 90% male…we can conclude with some probability that you’re male.”
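
As a rough illustration of the kind of inference Goel describes, the sketch below combines the male share of each site a user visits using Bayes' rule. It is not Komli's actual method, which has not been disclosed: the function name `posterior_male`, the 50% prior and the assumption that site visits are independent given gender are all ours.

```python
def posterior_male(site_male_shares, prior_male=0.5):
    """Estimate P(user is male) from the male share of each site visited.

    Hypothetical naive Bayes sketch: assumes visits are independent given
    gender and that prior_male is the male share of the whole population.
    """
    prior_odds = prior_male / (1.0 - prior_male)
    odds = prior_odds
    for share in site_male_shares:
        if not 0.0 < share < 1.0:
            raise ValueError("each share must be strictly between 0 and 1")
        # Likelihood ratio P(visit | male) / P(visit | female), obtained by
        # applying Bayes' rule to the site's observed male share.
        odds *= (share / (1.0 - share)) / prior_odds
    return odds / (1.0 + odds)


# Two sites that are each 90% male, with an assumed 50/50 population.
print(round(posterior_male([0.9, 0.9]), 3))
```

Under these assumptions a single 90% male site simply returns 0.9, while a second such site pushes the estimate higher, which is the "some probability" in the quote.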

The data is available publicly as well…search for a site here for its audience data.

The Differentiator: Demographic Data

Komli is thus providing publishers with key demographic data like Gender, Age, Income, Education level and Marital Status, apart from the usual stats that a package like Google Analytics or Piwik may offer. Mind you, no two statistics packages give exactly the same information either. Satish Kadu, VP (Product Development), told MediaNama that Komli is planning to expand the sample base to 100,000 within a few weeks.

According to Goel, users have agreed to download software onto their desktops and anonymously share their surfing habits. Komli will verify their demographic data using contest forms, or by asking them to share inputs on a regular basis. The users were recruited using online marketing. Komli expects publishers to adopt Vizisense because it gives them additional information from this panel.

Accurate Data?

It’s important here to keep in mind that the demographic mix of the panel will have a significant impact on the data available to publishers. More than horizontals, demographic mix has a significant impact on niche portals. I spoke at length with Kadu, to get a better understanding of how this demographic mix is defined. He said that this is an iterative process, and the data will improve with time. They’re doing the maximum to ensure more accuracy:

1. Bigger sample – Planning to expand the panel size from 10,000-15,000 to 50,000-100,000 in a few weeks
2. Targeting varied audiences for the panel signups, for a well-diversified sample
3. Removing the self-selection bias by triangulating information from the panel data and actual publisher data
4. Using advanced techniques like Monte Carlo simulation on multiple randomly picked test samples.
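
Komli has not said which Monte Carlo technique it uses. One common option is bootstrap resampling, sketched below under that assumption: the panelists who visited a site are resampled many times to see how much a demographic share could move by chance. The function name `bootstrap_share`, the 0/1 encoding and the 95% interval are illustrative choices, not details from Komli.

```python
import random


def bootstrap_share(flags, n_resamples=2000, seed=0):
    """Return (point estimate, low, high) for the share of 1s in flags.

    flags is a list of 0/1 values, one per panelist who visited the site
    (1 means the panelist belongs to the group of interest, e.g. male).
    low and high bound an approximate 95% bootstrap interval.
    """
    rng = random.Random(seed)
    n = len(flags)
    estimates = []
    for _ in range(n_resamples):
        # Draw n panelists with replacement and recompute the share.
        resample = [flags[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    estimates.sort()
    low = estimates[int(0.025 * n_resamples)]
    high = estimates[int(0.975 * n_resamples) - 1]
    return sum(flags) / n, low, high


# Hypothetical site visited by 100 panelists, 90 of them male.
point, low, high = bootstrap_share([1] * 90 + [0] * 10)
print(point, low, high)
```

The fewer panelists a site attracts, the wider this interval becomes, which is why the demographic mix matters more for niche portals than for large horizontals.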

Some Hurdles
1. Acceptance from larger publishers: Rediff claims to be number one in India, according to Comscore. At the same time, many other publishers have told us that their own Comscore information does not tally with their internal statistics. Yet, listed entities like Info Edge (disclosure: I own shares of Info Edge) and Rediff continue to cite Comscore in their earnings reports because it is an accepted and independent third party measurement system…The problem that Komli will face is – larger publishers will accept the information if they rank high, else they’ll reject it. So will Rediff accept being ranked Number 5 in India, when Comscore ranks them as Number 1? Not likely. Kadu responded, saying that Vizisense is not positioned to be a ratings system. (I wonder why they’ve got rankings then…)
 
In that context, the key to the acceptance of Vizisense then lies with smaller publishers who haven’t bought Comscore’s paid services, and the media planners who will probably use Vizisense for demographic inputs on sites – at present they use everything from Alexa, Google Ad Planner, Compete, and inputs from publishers.
 
2. Publishers Want Ambiguity: Not all publishers will want to share exact statistics with advertisers and media planners. While traffic data is key to monetization, it is competitive information, and they would prefer some ambiguity to remain. According to Kadu, “we believe that this information is of clear benefit to publishers, since it will be used to reach out to media planners and advertisers, and it acts as a part of their media kit. At the same time, we’re considering mechanisms to incorporate privacy.”  

*

Apart from the demographic information that we’ve mentioned above, Vizisense has two other differentiators – that they could soon be measuring information from 50,000-100,000 users, and the fact that it is free.  
IMRB International will most certainly have to rethink their strategy, but at the end of the day, it boils down to what the media buyers accept as the norm.

Update: Kadu had told MediaNama 4-5 times that Komli is targeting a panel of 100,000 users. Komli Media has since written in asking for that number to be changed to 50,000-100,000 users, with the following explanation: “Komli would like to have as many panelists as possible, but is targeting a minimum range of 50,000 to 100,000, which is deemed sufficient for its statistical uses.” We’ve updated the post accordingly.

Update 2: Alok Kejriwal, CEO of Games2Win, points out in the comments to this post: there is a fundamental defect in the Vizisense business model in the sense that they want websites to place ad codes on their sites to track the same!

how can anyone believe reports when web site owners have the ability to manipulate the same?

globally, the best practice is to maintain external panels like comScore who measure traffic and demographics and psychographics by inference of the panel behaviour – NOT INFLUENCED by what the website wants!

i had asked the very same question to the 3 folks of Vizisense almost 3 weeks back and i am still awaiting their reply!




16 Comments until now.

Alok Kejriwal + December 2nd, 2008 (#):

there is a fundamental defect in the Vizisense business model in the sense that they want websites to place ad codes on their sites to track the same!

how can anyone believe reports when web site owners have the ability to manipulate the same?

globally, the best practice is to maintain external panels like comScore who measure traffic and demographics and psychographics by inference of the panel behaviour – NOT INFLUENCED by what the website wants!

i had asked the very same question to the 3 folks of Vizisense almost 3 weeks back and i am still awaiting their reply!

Abhishek J + December 2nd, 2008 (#):

Rediff is No. 4 in the Comscore rankings; they only say they are the number 1 Indian internet site
check here
http://ideasmarkit.blogspot.com/2008/07/internet-traffic-in-india-comscore-may.html

Abhishek J + December 2nd, 2008 (#):

and just another fact: you are comparing it to the wrong guys. They have copied Quantcast’s business model completely, not Comscore’s

Kumar + December 2nd, 2008 (#):

The reports are completely whacky. They are not at all accurate.

Akshay + December 2nd, 2008 (#):

Folks, hi. I’m with Komli. I’d like to respond to some of the comments above.

Alok: You argue above that panel data is the best. With panels alone, how do you account for cyber-cafe usage? 30-40% of India’s internet usage happens from cyber-cafes, whose computers either don’t have the panel software installed or, if they do, end up polluting the data. This is one big reason that we use publisher-side tags, which can give us an accurate picture on the publisher side.

Let’s face it, there’s no perfect method out there. But based on the unique circumstances in India, we believe our tripartite approach, using data from panels, publishers and the census, is the most appropriate.

An important clarification: you say that Komli wants publishers to place ad codes on their sites. Where do we say that? A publisher needn’t put up Komli’s ad tags at all. For more accurate data, we are only asking publishers to put up our tracking tag. You are correct that a determined publisher can load his pages repeatedly to increase his PV’s, but come on, let’s be realistic.

(a) How much of the other data is fudgable?

(b) Our estimates are arrived at through a combination of panel and publisher-side data. The whole point is that if one is incredibly different from the others, our algorithms normalize the data and/or show the site’s data with low confidence.

(c) We have a number of checks within our back-end to catch publishers who might be fudging data. It’s a bit like catching click-fraud; it’s not perfect, but over time, after analyzing traffic patterns, we can track (and we actively do!) and identify those publishers who might be fudging data.

Thanks for your support. Please feel free to reach out to me at akshay(at)komli.com if you have any further questions. Thanks.

Aki + December 2nd, 2008 (#):

My concern about toolbars, be it alexa or vizisense, is that the typical Indian internet user will NOT install these things. (Walk into the cyber on your block to meet this person.)

India apparently has 81M internet users today. Komli says they need 50,000 to 1,00,000 of those users to install the toolbar to get statistically reliable figures.

I would not at all be surprised if those 50,000 to 1,00,000 users will be college educated, fluent in English, and employed in IT/media/etc. Is that the typical Indian internet user?

Alok Kejriwal + December 2nd, 2008 (#):

Akshay: Thanks for your response.

Note:

You can enrich your panel data by constantly investing on the ground. All global audience reporting agencies who also measure TV, Radio and Hoardings do so. No one stops you from collecting data post usage from cyber cafe users. That’s exactly how post-voting polls (exit polls) are done. You cannot justify that ‘cyber cafe’ usage is the reason for asking publishers to add tags. Say in 2 years when consumers start using more computers at home, will you change your ‘core’ methodology? Or the other way around, if everyone starts using public computers, then what happens to panels? Do they become redundant?

I was in a meeting with your Vijay Iyer, Kartik Poddar and who suggested adding tags to games2win.com so that Vizisense could work better. I clearly expressed my viewpoint that there is a COLD STEEL WALL between publishers and agencies that represent data, since the former in no way should influence the latter. It’s like TAM asking Zee and Sony to add people meters to their channel subscribers to help ‘identify’ their users.

Today, i went upto Invizisense to check on games2win.com. Post getting a userid, i was directed to page that says ‘add this code’ to your site! So, isn’t there a way for a publisher to check his site status without adding code?

There is a discrepancy in your data and that of comScore already emerging (as per my reports) and i am assuming this is because some of your publishers who are adding your tags are influencing your report!

Vijay + December 2nd, 2008 (#):

“Let’s say you go to a site that is 90% male, and another site which is 90% male…we can conclude with some probability that you’re male.”

This doesn’t make any sense.

Let me understand the logic here.

1. You need to come to the first conclusion that Site A has 90% male visitors. Which is the hard part.

2. Once you reach there, you pretty much conclude that the number never changes because you are manipulating all the future data based on that first impression.

3. The first impression is probably going to be using the panel or known data, which is probably going to be assumed, warped or inaccurate, and time doesn’t heal the data in any way.

After that statement, you pretty much lost me. Is it just a wrong example? They’ve gotta do better than that. Alok is probably right.

Kartik Poddar + December 2nd, 2008 (#):

@Vijay – I am with Komli. Let me try and answer your question.

We’ll come to the first-level approximation on the basis of our panel. So, if the panel is large enough, we can be very confident of the data for large sites that attract a lot of users from our panel. So if you need, say, 1,000 members from the panel visiting a site to accurately estimate its demographic information, then smaller sites with a reach of less than 0.1% of India’s online population will need a panel size of more than 1M, which is obviously unfeasible to implement.

Here’s where pixel-level data will help in increasing the accuracy for smaller sites with less traffic, as it will help us with estimates of people outside our panel. That’s why we say, “Let’s say you go to a site that is 90% male, and another site which is 90% male…we can conclude with some probability that you’re male.” We can’t reveal the secret sauce to you for obvious reasons, but if we combine publisher-level data with panel data, we can offer a much higher level of accuracy for small sites, which are intended to be the main beneficiaries of ViziSense.

You can write to me at kartik(at)komli.com if you have further questions.
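
The panel-size arithmetic in the comment above can be checked with a short sketch. The function name `required_panel_size` is hypothetical, and the calculation assumes panelists visit a site in proportion to its reach across the online population.

```python
import math


def required_panel_size(panelists_needed, site_reach):
    """Panel size needed so that about panelists_needed members visit a site.

    site_reach is the fraction of the online population that visits the
    site (0.001 means 0.1%). Assumes the panel mirrors that population.
    """
    # Rounding first guards against tiny floating-point overshoot before ceil.
    return math.ceil(round(panelists_needed / site_reach, 6))


# 1,000 panelists needed on a site that reaches 0.1% of users.
print(required_panel_size(1000, 0.001))
```

Any site with reach below 0.1% would need an even larger panel, which is the case Poddar describes as unfeasible.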

Yogi + December 3rd, 2008 (#):

It’s very clear from the above discussion that Vizisense has been mired in claims and counterclaims from day one.
There is no doubt that the current Vizisense data is quite shady. It remains to be seen, though, whether that is because of a lack of enough data points, or because of fundamental flaws in the methodology. Right now both seem to be true to me!

Wonder what opinion do people out here have for Google tools like Adplanner (google.com/adplanner ) or trends for websites (trends.google.com/websites ). Given that Google would have huge data on people’s surfing behaviour as well as websites’ traffic coming through its adsense program as well as search engine, it actually may be in the best position to provide a good picture of usage across sites…

Sumant + December 5th, 2008 (#):

Let’s see. Discounting the panel size and composition for the moment (no difference from ComScore right now), tagged websites provide ample opportunities for data manipulation. All it requires is a script that pings the website (or a selection of pages on a site) over a period of time, periodically flushing out its cookies and spoofing its IPs, and voila! you have a site that has tremendously over-reported its traffic.

I’m willing to give the benefit of doubt to Komli and say that they’ve licked the problem, but I’d like more transparency there before I trust my marketing budgets to their data.

My experience with Google Ad Planner is that its data is very flawed, but is directionally believable. I just looked up their Help page, and that’s all they’re offering too. So I’d say don’t quibble over a few millions, focus on whether rankings are reasonably consistent, as well as the percentages in traffic differentials.

Alok Kejriwal + December 5th, 2008 (#):

Thanks to Amar (Goel’s) clarification, i could check the site stats on the toolbar on the top of the page so that was not a problem as i had mentioned earlier in my posting.

Weekly Digest (2-5 Dec) | MediaNama + December 8th, 2008 (#):

[...] demographic information, which we feel is the primary differentiator for Vizisense. The launch sparked off a debate on the efficacy of the data, and whether an agency should collect facts directly via a tracker on [...]

amit + December 17th, 2008 (#):

“ViziSense has recruited a panel of 10,000 people across varying demographic backgrounds and income levels whose online movements are tracked through the application downloaded on their computers upon consent.”

There are around 81 million web users in India, as pointed out by someone, and you are taking a sample of 0.0123%, which is way too small to reach any conclusion. Even if you take 100k users as your sample, the percentage would still be very low.
You need to have at least 1% as your sample, especially when you have so many segments for each category.

For example (assuming evenly breakup):
In the 10,000 sample space:
10 states – 1000 in each
Connection type 5 more segment – 200 in each
Male / Female – 100 in each
6 Age buckets – 100 / 6 =17
6 Income BUckets – 17/6 =3
5 Education buckets: oops, we ran out of sample

Even if you multiply your sample by 10, we will have around 6 people in each segment.

6 people per segment are far too few to infer anything from their visits

The intelligent way to do this would be using web analytics and Modeling
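
The segment arithmetic in the comment above can be reproduced with a short sketch. The bucket counts are the commenter's own illustrative figures, and the function name `average_per_segment` is hypothetical; an even split across segments is assumed, as in the comment.

```python
def average_per_segment(sample_size, bucket_counts):
    """Return (number of segments, average panelists per segment).

    Assumes panelists are spread evenly over every combination of buckets.
    """
    segments = 1
    for count in bucket_counts:
        segments *= count
    return segments, sample_size / segments


# States, connection types, genders, age, income and education buckets.
buckets = [10, 5, 2, 6, 6, 5]
for size in (10_000, 100_000):
    segments, per_segment = average_per_segment(size, buckets)
    print(size, segments, round(per_segment, 2))
```

With these buckets there are 18,000 segments, so a 10,000-person panel averages less than one panelist per segment and a 100,000-person panel about six.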

Nikhil Pahwa + December 18th, 2008 (#):

Amit: tomorrow someone will say that there are 150 million Internet users in India…the basis – sampling which gives them an extrapolation of the number of Internet users in India, say, 50,000. Then there are 100 million mobile ENABLED handsets in India…then they go out and say that there are 150 million Internet users in India.

I think that both 10,000 and 100,000 are good samples for indicative usage behaviour, as long as one doesn’t claim this sample represents the entire ecosystem. The industry is short of data anyway, but data is emerging as a means of one-upmanship, not of actually improving user experience or content.