Elections 2014

DJ Anti-Showcase: Mismatching Story and Visualization

Traditional media seems to have suddenly woken up to the need of having its “presence” in data journalism. The irony is, few understand the opportunity and how it could potentially impact journalism. But no one wants to give it a miss; it is the coolest thing in the business.

Visualization, the most tangible face of data journalism, is the fanciest thing to do. And plenty of easy-to-use, cheap or free tools are available to do it. So why not?

This story from Hindustan Times, BJP’s loss in UP worse than it seems, is a great example of how a visualization should not be made. The story, which analyzes the recent bypoll data to argue that BJP actually lost a lot of vote share, is a decent one. It achieves one thing: it negates any argument that BJP’s loss came from some other party pulling in more votes because of other factors, or that the loss was due to arithmetic rather than a fall in vote share. That’s a fairly timely story.

But the visualization that goes with it neither proves this point nor adds any information to the story. Instead, it compares how many votes BJP got in different constituencies, a comparison that has nothing to do with the story’s argument and is actually meaningless.

In short, the visualization that accompanies the story is not only unrelated to the storyline, it is of no use. A good visualization, says visualization guru Alberto Cairo, should be beautiful, functional and insightful. If it is not insightful, it is still a story; a boring story. But this one is not even functional. What’s the point of comparing votes in different constituencies?

A visualization depicting a side-by-side comparison of BJP’s vote in the last election and in this one would have communicated the idea simply. In fact, what has been presented as two visuals could be combined to make the point that the story is making.






DJ Showcase: LiveMint (11 AUGUST 2014)

Were the Afghan elections rigged?

From the headline, it looks like yet another political story from Afghanistan. But take a closer look and you will find it is one of the finest examples of data journalism. The fact that it is in an Indian newspaper, and one that primarily covers business and economy at that, just adds to its charm.

There are several reasons why we like this one so much.

First of all, it sets the expectation straight away. The blurb says what it is: one popular method used to determine whether a data set is doctored is to look at the last digits of the values. The simple sentence does two things: it raises the interest level of those looking forward to a data story; at the same time, it turns away those who are uncomfortable with data but are looking for a spicy story on some new evidence of malpractice being caught on camera. In short, it sets an accurate expectation. That is good journalism.

Second, and this is primarily why it is here, it actually tests the limits of data journalism. While most data journalism stories are about analyzed results, this one is about the nature of data sets. While most are about data analysis, it is about statistics and, yes, probability. It looks at election data from India, Iran and Afghanistan to suggest that the elections in Iran and Afghanistan were probably rigged at the counting stage.


In an election, for example, there is no reason that the distribution of the last digits of the vote counts of various candidates should not be uniform—given the large number of votes that each candidate gets, the last digit is essentially random, and there is no reason that the probability of a 1 in the units place is more than that of a 2 in the units place. Thus, in a free and fair election, it is likely that the last digit is distributed uniformly.
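The last-digit check described in the passage above can be sketched in a few lines. This is only a minimal illustration, not the method Mint actually used: it assumes a chi-square goodness-of-fit test against a uniform distribution, the function names are hypothetical, and the vote counts are made up.

```python
from collections import Counter

# Standard table value: chi-square critical value for 9 degrees of freedom
# (10 possible digits minus 1) at the 5% significance level.
CHI2_CRITICAL_DF9_5PCT = 16.919


def last_digit_chi_square(vote_counts):
    """Chi-square statistic comparing last-digit frequencies with a uniform spread."""
    digits = Counter(count % 10 for count in vote_counts)
    expected = len(vote_counts) / 10  # each digit should appear equally often
    return sum((digits.get(d, 0) - expected) ** 2 / expected for d in range(10))


def looks_suspicious(vote_counts):
    """True if the last digits deviate from uniform more than chance would suggest."""
    return last_digit_chi_square(vote_counts) > CHI2_CRITICAL_DF9_5PCT


# Made-up counts, for illustration only.
uniform_counts = list(range(1000, 1200))           # every last digit appears 20 times
skewed_counts = [n * 10 for n in range(100, 300)]  # every count ends in 0

print(looks_suspicious(uniform_counts))  # False
print(looks_suspicious(skewed_counts))   # True
```

A high statistic only says that the digits are unlikely to be random; it does not say who altered the counts or how.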

Third, and this aspect is often ignored by new-age data journalism champions, many of whom understand data very well but are not familiar with the basic premises of good journalism: you have to be fair and balanced, even if that takes a little interest away from the story. You cannot sacrifice these basic journalism values to make a story more interesting. This story adds the ‘note of caution’ in a very clear and prominent way.

Finally, a note of caution. There are several ways in which an election can be rigged. Speaking broadly, it can be rigged at either the voting or the counting stages. This method of looking at the last digits only gives us an indication of the probability of rigging in the counting stages. Methods such as “ballot stuffing” (reportedly not uncommon in India) cannot be caught with such methods.

Great work.

Is Twitter the new official communication channel of the Modi government?

After Prime Minister Narendra Modi asked his ministers not to speak to the media directly, there have been media reports that he may follow his Gujarat Model here too. This is what Scroll.in wrote:

These instructions may well be the first serious step to turn New Delhi into Gandhinagar, where during Modi’s three terms as chief minister, members of his cabinet would not speak to the press unless they had obtained permission from him. Even the customary press briefings after the state cabinet meetings – which in other states are addressed by ministers – are either not held at all in Gujarat or are addressed by spokesmen of the state government.

The opposition has, naturally, seen it as disempowering ministers. But the question is: is it really about disempowering ministers, or is it an attempt to cut the media out of the communications channel, other than, of course, the “official” policy communications that ministry spokespersons will continue to handle?

What the media has failed to notice is that Modi may be directly or indirectly influencing his ministers to “communicate directly” with people, as he himself has done, choosing to speak to the media only when he wants and “on his terms”.

Twitter, his preferred channel, is largely one-sided communication. It can be managed better than a press conference to show what you want to show, and once in a while, genuine compliments and suggestions, as well as planted ones, can be responded to, to give that “interaction with common people” feeling.

Here is a list of ministers with Twitter accounts and their number of followers. While Modi himself leads the pack with Sushma Swaraj (a veteran tweeple) as a distant No 2, others are way behind, though some of them are catching up fast. Arun Jaitley, for example, started his account only in November 2013 and already has close to 350,000 followers.

Union Ministers on Twitter


The bar graph, of course, shows the number of followers as of 9th June 2014 (the bars are not exactly to scale, but the numbers mentioned are actual).

But what is more interesting is how long they have been on Twitter, represented in the chart through the use of different shades. Out of 32 cabinet ministers and ministers of state with independent charge, as many as 24 (that is 75%) are on Twitter. That is fairly high compared to the UPA government.

But what is interesting, and supports the theory that they may be trying to impress their leader, is the time of their joining. As many as 10 of them joined Twitter after Modi was officially anointed the chief of the campaign committee in June 2013. As many as 14 have joined after he emerged as the No 1 prime ministerial candidate. And if you leave out genuine prolific users such as Modi himself, Sushma Swaraj, Nirmala Sitharaman and Smriti Irani, there are just a handful who joined after 2011 but before Modi emerged as the No 1 leader; in other words, for natural reasons. Not a single minister has been on Twitter for between two and three years.

In fact, for those who are wondering what made Modi choose Smriti Irani as a cabinet minister with a plum portfolio (the others are either veterans, come with a professional background like Gen VK Singh, or have proven themselves in party work, such as Dharmendra Pradhan and Piyush Goyal), the Twitter stats may give a clue. In Modi’s “virtually real” world, she has delivered the best performance, creating the largest follower base on Twitter after Modi and Sushma Swaraj, the latter herself an aspirant for the PM post before Modi’s emergence.

Here is a list of all the cabinet ministers and ministers of state with independent charge with their age and Twitter handle, just for your reference.

Minister | Age | Twitter Handle
Narendra Modi | 63 | @narendramodi
Rajnath Singh | 62 | @BJPRajnathSingh
Sushma Swaraj | 62 | @Sushmaswaraj
Arun Jaitley | 61 | @arunjaitley
Venkaiah Naidu | 64 | @MVENKAIAHNAIDU
Nitin Gadkari | 58 | @nitin_gadkari
D. V. Sadananda Gowda | 61 | @DVSBJP
Uma Bharti | 55 | @umasribharti
Najma Heptullah | 74 | NA
Ram Vilas Paswan | 67 | @iramvilaspaswan
Maneka Gandhi | 57 | @ManekaGandhi
Ananth Kumar | 54 | @AnanthKumar_BJP
Ravi Shankar Prasad | 59 | @rsprasad_bjp
Ashok Gajapati Raju | 62 | NA
Anant Geete | 62 | NA
Harsimrat Kaur | 47 | @harsimrat_badal
Narendra Singh Tomar | 56 | NA
Jual Oram | 53 | @jualoram
Thawar Chand Gehlot | 66 | NA
Kalraj Mishra | 73 | @Kalraj_Mishra
Radha Mohan Singh | 64 | @singhradhamohan
Harsh Vardhan | 59 | @drharshvardhan
Smriti Irani | 38 | @smritiirani
Vijay Kumar Singh | 63 | @Gen_VKSIngh
Inderjit Singh Rao | 63 | @Rao_InderjitS
Santosh Kumar Gangwar | 66 | NA
Shripad Yasso Naik | 61 | NA
Dharmendra Pradhan | 44 | @dpradhanbjp
Sarbananda Sonowal | 51 | NA
Prakash Javadekar | 63 | @PrakashJavdekar
Piyush Goyal | 49 | @PiyushGoyal
Dr. Jitendra Singh | 42 | NA
Nirmala Sitharaman | 54 | @nsitharaman


Which Lok Sabha is more qualified: this one or the last one?

Some debates never end. Not just the Mumbai versus Delhi types. But even the knowledge versus action types.

Take for instance, the latest controversy surrounding HRD minister Smriti Irani’s qualification—or rather the alleged lack of it. She proclaims that she should be judged by her work and not qualification and there are quite a few takers for her stance. It seems action is winning this round of battle over knowledge. Not exactly surprising, considering the new prime minister considers and markets himself as a karma yogi.

But does it mean that the current (16th) Lok Sabha lags behind the previous one, when it comes to qualification of its MPs?

It does not look so, if you consider that this Lok Sabha has 33 PhDs compared to just 9 in the last one, and even has a slight edge in the number of postgraduates. But when it comes to graduates and above, it is a little behind.

In the 15th Lok Sabha, four out of every five members were graduates or above. That is 80% of the members. The figure is a little lower, at 76%, for the current Lok Sabha.

This Lok Sabha is also a little heavier at both ends. On one hand, the PhDs and postgraduates together account for 34% of the total members, compared to 28% in the last Lok Sabha. On the other, the share of members who have not studied beyond class 10 is also higher, at 13%, compared to 10% in the last Lok Sabha. The comparison is not too different even if you add class 12 pass-outs.

Yes, it is the middle (the graduates) that ruled the last Lok Sabha. As many as 52% of all members were just graduates in the last Lok Sabha; that number is drastically lower at 42% in the current Lok Sabha.
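The percentages quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures given in the text; the dictionary keys and function names are hypothetical, and the "remainder" is simply whatever is left after the quoted categories (class 12 pass-outs and any others), which the text does not break down.

```python
# Shares of members (in %) as quoted in the text above.
shares = {
    "15th Lok Sabha": {"phd_and_postgraduate": 28, "graduate_only": 52, "up_to_class_10": 10},
    "16th Lok Sabha": {"phd_and_postgraduate": 34, "graduate_only": 42, "up_to_class_10": 13},
}


def graduates_and_above(house):
    """Postgraduates and PhDs plus plain graduates."""
    return house["phd_and_postgraduate"] + house["graduate_only"]


def remainder(house):
    """Share not covered by the quoted categories (an assumption, not a quoted figure)."""
    return 100 - graduates_and_above(house) - house["up_to_class_10"]


for name, house in shares.items():
    print(name, graduates_and_above(house), remainder(house))
```

The two top-end figures add up to the 80% and 76% quoted earlier, so the numbers in this post are at least internally consistent.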


And here is how the major parties stack up when it comes to how qualified their MPs are. The figures represent the percentage of MPs who are graduates and above.


Yes, regional parties like TMC, ADMK and BJD are at the top, while some other regional parties like Shiv Sena and TDP are at the bottom.


DJ Showcase (Scroll.in-25 May 2014)

NOTA may have affected outcome in 19 constituencies

This is a simple but very effective example of data journalism from the new Indian news site, Scroll.in.

In a story, NOTA may have affected outcome in 19 constituencies, the site has analyzed the voting data from the General Elections 2014 to come up with a list of 19 constituencies across India where the results may have been affected by the NOTA (None of the Above) option, introduced for the first time in these General Elections.
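A screen of this kind is easy to sketch. The post does not spell out the rule Scroll.in applied, so the example below assumes the common one: a constituency is flagged when NOTA votes exceed the winner's margin over the runner-up. The function name and all figures are made up for illustration.

```python
def nota_may_have_mattered(results):
    """Flag a constituency where NOTA votes exceed the winning margin.

    `results` maps each candidate (and the key "NOTA") to votes polled.
    """
    nota_votes = results.get("NOTA", 0)
    candidate_votes = sorted(
        (votes for name, votes in results.items() if name != "NOTA"),
        reverse=True,
    )
    if len(candidate_votes) < 2:
        return False  # no runner-up, so no margin to compare against
    margin = candidate_votes[0] - candidate_votes[1]
    return nota_votes > margin


# Hypothetical constituencies, not real 2014 data.
constituencies = {
    "Seat A": {"Candidate 1": 410_000, "Candidate 2": 402_000, "NOTA": 12_500},
    "Seat B": {"Candidate 1": 520_000, "Candidate 2": 310_000, "NOTA": 9_000},
}

flagged = [name for name, res in constituencies.items() if nota_may_have_mattered(res)]
print(flagged)  # ['Seat A']
```

Note the word "may": the check shows only that NOTA voters were numerous enough to change the result, not that they would have voted for the runner-up.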

The authors, Santosh Sunderesan and Nikita Saxena, have analyzed the data to come up with interesting insights, such as:

  • most of these are rural constituencies
  • all of these are reserved constituencies
  • they all, with one exception, recorded high voter turnout.

Since many of these are also Naxalite-hit areas, the question to ask is: did the ultra-left militants have a role to play in this high NOTA preference?

Elections 2014: The Showcase for Data Journalism

If data journalism were a religion, national elections in a democracy would be its biggest festival. Never ever does mainstream media get so obsessed with data, data analysis and “deriving insights from data”, as it does during the elections season.

In India, the effort stands out even more, as otherwise there is very little data that journalists care about. Sure, heavy economic parameters such as GDP growth or growth in industrial production are regular features in business media, but few normal people even glance through them, not even those who otherwise consume regular business and corporate news. The only category of people who have some fascination for numbers are the ardent cricket fans who follow cricket statistics.

What adds both color and complexity to the elections data in India is the country’s diversity: in its demographics, in its issues and, not to forget, in the number of national and regional parties.

Color, quite literally. Just contrast, in your mind, two visualizations: a pie with two colors representing Democrats and Republicans, and another with at least 7-8 colors, representing BJP, Congress, AIADMK, TMC, BJD, Shiv Sena, TDP, TRS…you can drag it on if you like.

With so many parties fighting it out, the vote share that each gets and how that translates into seats is recommended study for those wanting to understand paradoxes. Add to that the many ways you can analyze the data: party-wise, region-wise, state-wise, constituency-wise, reserved constituency-wise, in terms of which party is ruling the state…and so on.

Last week was that once-in-five-years festival. The election results came on 16 May.

Not surprisingly, every TV channel, every newspaper and every online news portal worth its name had a dedicated data analysis section on its website. Here are links to a few among the major media brands in India.

  • The Economic Times: The top business newspaper in India, and one of the largest in Asia, has an old-fashioned tabular representation of the results. Of course, with so many other elements and ads, it is also extremely cluttered.


  •  Firstpost: A sleek site, great for those getting deeper into their own analysis. Scores on presentation, breadth of coverage, comprehensiveness, with fairly good ease of use. The only negative: it is a little intimidating for many.


  • The Hindu: Often hailed as the only general newspaper that is actively into data journalism, The Hindu’s election data page disappoints, despite having functionality. It is not at all intuitive; all that you are greeted with is a map of India.


  • The Hindustan Times: It has sleekness of design, all the functionality, and a fairly clean page. Yet, it is not intuitive to use. Still, it is the best among all major general newspapers.


  • IBN Live: Another cluttered site, with a tabular representation of overall results, in sharp contrast to the group site, Firstpost.



  • India Today: A simple, clean website. Almost a clone of the NDTV site, with the same style of visualization. But it is not as good in terms of presentation or even functionality.


  • Mint: Mint, clearly well ahead of any other media brand in India when it comes to data journalism, disappointed. That too after advertising its election coverage heavily.



  • NDTV: By far the simplest presentation, strictly focused on election results and nothing else. Scores on ease of use as well as presentation.
  • The Times of India: The Times of India clearly decided to play up the news and pushed the data to the bottom of the page. It had a simple, easy-to-understand format but with no functionality to drill down further.



Apart from the result coverage, there have been some interesting analyses of results by some of these media brands. Especially noteworthy is this analysis in Mint which creates a sweep index.

This chart, which calculates the seats each party would have won on the basis of its vote share had there been proportional representation, has drawn a lot of criticism, both from supporters of the Bharatiya Janata Party and from supporters of regional parties such as the Biju Janata Dal, which swept their own states but whose share of the national vote is small. On the other hand, parties such as BSP, which drew a blank but have a fairly high vote share spread across the country, tend to gain from this. Many have used this as an example to attack data journalism per se.

But it must be noted that it is not data journalism that is at fault here. The analysis is at odds with the reality of diversity in India and the framework of a federal structure with provision for regional parties. A state like Odisha, under this method, would have just one or two representatives in the Lok Sabha.
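To see why such a calculation helps parties with thin but widespread support and hurts parties concentrated in one state, here is a minimal sketch of a proportional allocation. It is not Mint's actual method: it assumes a single national pool of 543 elected seats and the largest-remainder rule, and the party names and vote shares are entirely hypothetical.

```python
TOTAL_SEATS = 543  # elected seats in the Lok Sabha


def proportional_seats(votes, total_seats=TOTAL_SEATS):
    """Allocate seats in proportion to votes using the largest-remainder rule."""
    total_votes = sum(votes.values())
    quotas = {party: v * total_seats / total_votes for party, v in votes.items()}
    seats = {party: int(quota) for party, quota in quotas.items()}  # whole seats first
    leftover = total_seats - sum(seats.values())
    # Hand the remaining seats to the parties with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda p: quotas[p] - seats[p], reverse=True)
    for party in by_remainder[:leftover]:
        seats[party] += 1
    return seats


# Hypothetical national vote shares (in %), for illustration only.
votes = {
    "National A": 45,
    "National B": 30,
    "Regional C": 2,   # sweeps one state, tiny national share
    "Scattered D": 5,  # wins no constituency, but has votes everywhere
    "Others": 18,
}

print(proportional_seats(votes))
```

In this made-up example, "Scattered D" ends up with more than twice as many seats as "Regional C", regardless of how many constituencies each actually won.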

At the time of writing, interesting insights based on analysis of the data are still coming in. We will report anything interesting that comes in, either here or through our Twitter handle @datajournoin.