On the 12th of June this year, Google filed for a patent in India for “Detecting Repeating Content In Broadcast Media”. The reason for filing this patent is not known, but we think it’s significant:

Google has been in trouble in India over copyrighted content being published on YouTube – particularly TV shows and Bollywood films that users upload. Often, users upload an entire Bollywood film in 10-15 parts. The “Safe Harbor” argument, which is available in the US under the Digital Millennium Copyright Act, isn’t applicable in India, and Google has even asked the Indian telecom regulator TRAI for immunity for platforms.

In the past, legal notices have been sent to Google by media companies like STAR TV (News Corp), Sony Entertainment Television and Saregama. Bollywood music major T-Series has even taken them to court. I’ve attended a couple of these court sessions, during one of which Google asked T-Series to give them “5000 of their copyrights” (i.e. copyrighted content), so that YouTube could compare uploads against it and remove the matching content.

So how does the technology work? Essentially, Google generates a database of audio statistics from content, then compares the statistics of incoming content with that stored data to determine whether the content is already in the database. If a match is found in the stored data, the content is identified as “repeating content”.
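To make the idea concrete, here is a minimal Python sketch of the general approach: derive compact statistics from audio, store them, and look up new content against the store. It is not the method described in Google's filing. The frame-energy fingerprint, the names `FingerprintDB`, `fingerprint` and `synthetic_audio`, and the constants `FRAME`, `WINDOW` and the 0.8 threshold are all assumptions made up for this illustration.

```python
import random
from collections import defaultdict

FRAME = 256   # samples per frame (illustrative value)
WINDOW = 16   # frames per hashed window (illustrative value)


def frame_energies(samples):
    """Sum of squared samples for each full, non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + FRAME])
            for i in range(0, len(samples) - FRAME + 1, FRAME)]


def fingerprint(samples):
    """Set of WINDOW-length bit patterns: 1 where energy rose between frames."""
    e = frame_energies(samples)
    bits = [1 if b > a else 0 for a, b in zip(e, e[1:])]
    return {tuple(bits[i:i + WINDOW]) for i in range(len(bits) - WINDOW + 1)}


class FingerprintDB:
    """Hypothetical store mapping bit-pattern windows to content IDs."""

    def __init__(self):
        self.index = defaultdict(set)

    def add(self, content_id, samples):
        for window in fingerprint(samples):
            self.index[window].add(content_id)

    def match(self, samples, threshold=0.8):
        """Return the best-matching content ID, or None if nothing repeats."""
        windows = fingerprint(samples)
        if not windows:
            return None
        hits = defaultdict(int)
        for window in windows:
            for content_id in self.index.get(window, ()):
                hits[content_id] += 1
        if not hits:
            return None
        best, count = max(hits.items(), key=lambda kv: kv[1])
        return best if count / len(windows) >= threshold else None


def synthetic_audio(seed, n_frames):
    """Seeded noise standing in for real audio samples (illustration only)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_frames * FRAME)]


db = FingerprintDB()
original = synthetic_audio(seed=1, n_frames=200)
db.add("film-part-1", original)

excerpt = original[FRAME * 50:FRAME * 150]   # frame-aligned clip of the original
print(db.match(excerpt))                     # looks up the clip in the store
print(db.match(synthetic_audio(seed=2, n_frames=200)))  # unrelated content
```

This toy version only recognises clips that start on a frame boundary and are otherwise unmodified. A production system would need statistics that survive re-encoding, noise and arbitrary offsets.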

View the patent details here.

Google Faces More Legal Issues In India; Pushes For Immunity For Platforms
— ContentSutra: YouTube And T-Series Given Time To Settle Copyright Infringement Dispute
— ContentSutra: T-Series Obtains Restraining Order Against Google And YouTube From Delhi High Court