The harvesting of political opinions at scale: An unprecedented development in Sri Lanka?


A bizarre article published in today’s Daily Mirror caught my attention – not for what it said about three leading presidential candidates, but the basis upon which the article came to its conclusions.

The article, penned by the Daily Mirror’s Editor, claims “an extensive popularity poll on all presidential candidates is being conducted by an international IT company who conducted a similar poll and predicted the victory of Narendra Modi at the just concluded Indian election”, and that the “Daily Mirror reliably learns that this company run by Indian Americans have been invited here by some organizations and have begun conducting their survey in Sri Lanka by studying the social media accounts of an estimated 9.3 million citizens”.

Neither the company nor those who invited it to Sri Lanka are named in the article.

This data is ostensibly being used to provide predictive political analysis, and “updates to the relevant authorities”. Who these ‘relevant authorities’ in Sri Lanka are is also not mentioned in the article.

The article goes on to note “The predictions have been made by studying the trends on X, formerly known as Twitter, Facebook and Instagram and also by studying the public comments.”

Methodology?

There are fundamental concerns, and questions around all this that the Daily Mirror does not, for whatever reason, ask.

How are 9.3 million social media accounts belonging to citizens studied? What is the basis of selection? How does a foreign company establish accurate interpretation, and meaning around a very complicated question (to ascertain undecided voters) based on content in Sinhala, and Tamil? What language competencies, and political awareness do staff have? Who trained them?

9.3 million accounts would result in exponentially more commentary, and content added on a daily basis – rendering this an exercise well beyond any human-led categorisation, and study. This means machine learning must be involved. That in turn raises the question as to how ML models were trained on local language content to determine if a Sri Lankan voter is undecided – which is a non-trivial problem that would challenge the biggest social media companies in the world.

Since there’s no methodology noted, it’s impossible to determine if this study is based on written expression only or included memes. If it does include memes, the machine learning problem is even harder, since it requires a contextual appreciation of sardonic, satirical content that’s impossible to translate outside a very bounded socio-political culture, and of language that includes expletives, ribald, dark humour, and references that have no translation to or comparison with anything in English-speaking discourse. If the study doesn’t include memes, then the findings in the Daily Mirror make even less sense, given how much of socio-political critique in Sri Lanka is based on the production of, and engagement with memes.

This isn’t a hypothetical. I studied hundreds of pink slime, junk news, and meme pages for my PhD. The 435 meme pages on Facebook I still track published 54,200 posts from 1 January to 11 June – an average of 2,216 a week. In that period, followers grew by 1.95 million (across all the pages).

In this period, there were over 6.7 million comments against the posts published – an average of 124 per post.
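To give a rough sense of that scale, here is a minimal back-of-the-envelope sketch using only the two figures quoted above. The 10-seconds-per-comment reading rate is purely my assumption for illustration, not a measured value, and the comment total is treated as a lower bound since the count was "over" 6.7 million.

```python
# Figures quoted in the text; the comment count is "over 6.7 million",
# so every result below is a lower bound.
POSTS = 54_200
COMMENTS = 6_700_000

# Average number of comments per post across the 435 tracked meme pages.
comments_per_post = COMMENTS / POSTS

# ASSUMPTION (illustrative only): a human reviewer needs about 10 seconds
# to read and categorise a single comment.
SECONDS_PER_COMMENT = 10
review_hours = COMMENTS * SECONDS_PER_COMMENT / 3600

print(f"Comments per post: {comments_per_post:.1f}")
print(f"Hours to read every comment once: {review_hours:,.0f}")
```

Even under that generous assumption, and counting only the meme pages I track on a single platform, reading each comment once would take well over 18,000 hours.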

The study presented in the Daily Mirror article would have had to study every single one of them, and in relation to a grounded appreciation of the content presented in memetic form. If the post was a video, a critical appreciation of every frame, and the transcript would be required to answer if the comments in response were from an ‘undecided voter’.

Memes were central to 2022’s aragalaya. After the killing of an unarmed protestor in Rambukkana, memes, and comments in response expressed complete outrage. They continue to be central in political commentary, and will – like in 2019’s presidential, and 2020’s general election – be central to propaganda, and counter-propaganda in 2024’s presidential election campaigns. This is why it’s crucial to determine if the study mentioned in the Daily Mirror article studied memes, and the manifold responses to them.

Which brings me to another question – why isn’t YouTube mentioned in the study? The platform is massive in Sri Lanka, and dwarfs Instagram, and X/Twitter. Ada Derana had a million subscribers as far back as 2019. It currently has 2.73 million. Other private, state, and citizen-led accounts have equivalent numbers. YouTube engagement during 2022’s aragalaya broke records.

This high engagement persists. If we map just the Ada Derana channels – based on https://www.youtube.com/@AdaDeranaNews – there’s an entire constellation of political, news, and entertainment channels that would run political content. The study mentioned in the Daily Mirror captures nothing from this constellation of channels, and the comments each video, across every channel would receive. Now imagine similar constellations across all major state, and private TV networks on YouTube, and the comments each video would generate. None of this is included in the study the Daily Mirror quotes as a comprehensive capture of voter intention.

Let’s not even talk about why TikTok – a primary vector of news, and information for a specific first-time voter demographic in Sri Lanka – isn’t included in the study.

What of PDPA?

But there’s a bigger story in the Daily Mirror’s presentation of this clandestine study, conducted by an unnamed company, for undisclosed sums of money, and reporting to mysterious ‘authorities’. It’s around the harvest of data at scale, and the possible violations of the country’s (very strong) Personal Data Protection Act (PDPA).

The very large-scale, sustained monitoring of social media data for political, and potentially partisan purposes, without any disclosure, consent, transparency, and appropriate safeguards, prima facie, contravenes multiple obligations imposed on data controllers by the PDPA to protect the rights of data subjects (i.e., citizens).

This is potentially a huge, and unprecedented problem for those who publicly comment, without expecting their commentary to feed into what’s essentially a country-scale harvesting of opinion that feeds into political analysis by, and for undisclosed parties, run by a foreign company. There’s no control over how this data is retained or re-used.

Potential issues include:

  • Lack of consent: The Daily Mirror doesn’t mention that the 9.3 million citizens whose social media accounts are being qualitatively studied (for political views, and partisan bias) have provided consent for this processing of personal data. The Act requires consent of data subjects as a key condition for lawful processing.
  • Excessive data collection: Studying the social media accounts of 9.3 million citizens to predict election results risks going far beyond what is adequate, relevant and proportionate for the stated purpose. The Act requires personal data processing be limited to what is necessary.
  • Potential profiling without safeguards: Analysing trends, and making political or electoral predictions based on citizens’ social media activity could constitute pervasive, and invasive citizen-specific profiling. This is extremely chilling in a country that is defined by surveillance, state-led reprisals, extrajudicial killings, and a democratic deficit in general. The PDPA requires getting consent, and conducting impact assessments for any profiling activities.
  • Lack of transparency: There are no indications in the Daily Mirror article that the 9.3 million citizens who are being actively profiled were informed about the processing of their social media data, the identity of the controller (i.e., the name of the foreign company), and their rights under the Act to withdraw consent. The Act mandates transparency and providing specific information to data subjects.
  • Sensitive personal data: Social media activity could reveal sensitive categories of personal data like political opinions, which require extra conditions and safeguards for processing under the Act. What the Daily Mirror article suggests is an unprecedented harvest of public commentary, pegged to individual accounts, that in turn feeds into clandestine political analysis.

I don’t know quite where to start around the implications of all this.

But given that this is Sri Lanka, a national daily running a bizarre, incredibly worrying story like this won’t raise any eyebrows. The only interest in this article will be around what it says about the three presidential candidates. Nothing else will really matter.