DJ Showcase: (20 January 2014) is back with a fairly good data journalism story once again, though the analysis is by another site, which has created an infographic depicting living costs worldwide for expats. Of the 119 nations featured, India, Nepal and Pakistan have the lowest living costs, while Switzerland, Norway and Venezuela have the highest.

Yes, you read it right. Despite all the talk of inflation, India is still one of the cheapest places on earth. Here’s a link to the story, and below is the link to the infographic.


DJ Showcase: IndiaSpend

In an earlier post on data journalism in India, DataJourno’s founder had pointed out that though data journalism is still nascent in India, some of the work that Indian data journalists were doing was noteworthy.

The post had given a list of top data journalism sites in India, placing IndiaSpend right at the top but lamenting the fact that it was not being updated frequently.

We are happy to see the site once again becoming active with lots of new stories. DataJourno specifically wanted to showcase two stories.

The first, Is Data Noise Drowning Out The Chinkara’s Sneeze?, goes deeper than just data analysis to raise a bigger question that is extremely relevant in the Indian context: whose data do you believe when it comes to public issues?

The Rajasthan forest department’s chinkara census shows an 11% rise in the animal’s numbers over three years to 2013, while counting done by biologist Dr Sumit Dookia, who has spent about 15 years studying the chinkara, shows a 43% decline in its numbers in six representative sample sites.

The second, on its site factchecker.in, Congress-NCP Govt Has Built More Houses Than It Claims!, digs up data to show that the previous Congress-NCP government in Maharashtra actually built more houses than it claimed. The former ruling combine took credit for building more than 4 lakh homes during its rule. In reality, it built more than a million homes, according to data obtained by IndiaSpend.

On the face of it, it is irrelevant now, as the elections are over and the former ruling alliance was decisively defeated. But it is important because this shows that if the political parties are sensitized, they can make more credible claims and counter each claim with data. Data is a great tool to fight propaganda, far better than raising your voice or making sarcastic comments.

Meanwhile, DataJourno has faltered on regular updates. Our apologies. We expect to be more regular from the fourth week of December onwards.

And you thought it’s BJP that builds all the roads?

Its supporters claim a BJP rule is the best bet for rapid infrastructure development; its critics say it does so to “show off” its achievement at the expense of social sectors. But both tend to agree that BJP governments do build infrastructure as a priority.

That could be one of the biggest myths, if one goes by real data rather than perceptions. According to the recently released Infrastructure Statistics 2014 report by the Ministry of Statistics and Programme Implementation (MOSPI), five of the seven large states with the highest road density have never been ruled by the BJP. Of the remaining two, UP was ruled by the BJP long ago, while Punjab has been ruled by its ally, the Shiromani Akali Dal (SAD).

Here are the states with the best road infrastructure, as measured by road density (kilometres of road length per 1,000 square km of area).
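As a minimal sketch of how such a road density figure can be derived, the snippet below divides road length by area and scales the result to 1,000 square km. The function name, the state labels and all the numbers are hypothetical illustrations, not figures from the MOSPI report.

```python
def road_density(road_length_km: float, area_sq_km: float) -> float:
    """Return kilometres of road per 1,000 square km of area."""
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    return road_length_km * 1000 / area_sq_km


# Hypothetical states with made-up figures, NOT MOSPI data:
# name -> (road length in km, area in square km)
sample_states = {
    "State A": (200_000, 100_000),
    "State B": (50_000, 200_000),
}

# Rank the states from highest to lowest road density.
ranked = sorted(
    ((name, road_density(length, area))
     for name, (length, area) in sample_states.items()),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, density in ranked:
    print(f"{name}: {density:.0f} km per 1,000 sq km")
```

Because the measure is normalized by area, a small state with a dense network can rank above a much larger state that has more road in absolute terms.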


The data about the states with the lowest road density reveal a completely different picture, though. Of the seven large states with the lowest road density in India, three (Gujarat, MP and Chhattisgarh) have been ruled by the BJP for a long time. Three more (Jharkhand, Himachal Pradesh and Rajasthan) have been ruled by the BJP and others alternately. Only Jammu & Kashmir, a largely hilly state, has never been ruled by the party.


As they say, there are three sides to every story: your side, my side, and the facts.


Is North Indian School Education Really Inferior?

It is a common belief that North Indian school education—such as in the Hindi heartland—lags behind that in South Indian states. There are lots of myths and active myth making around that.

One of the reasons is the literacy rate. South Indian states have higher literacy rates than North Indian states, and this information is widely available and often analyzed and quoted by government, academia, policy think tanks and the media. In the absence of any other available data on quality of education, many conveniently take the literacy rate as synonymous with quality. That just fuels an existing prejudice that extends beyond quality of education to include, at times, even intellectual capability. All of us are all too familiar with jokes and one-liners conforming to this prejudice.

The results of the latest round of the National Achievement Survey (NAS) conducted by the National Council of Educational Research and Training (NCERT) for Class VIII, released earlier this year, most definitely bust that myth.

The survey, designed to provide a kind of health check of the school education system, is one of the largest national-level educational assessment surveys anywhere in the world. This round, conducted in 2012, used tests and questionnaires to gather information from 188,647 students in 6,722 schools across 33 states and union territories (UTs). The Class VIII survey measures students’ ability in four areas: science, maths, language and social science.

The state scores of Tamil Nadu and Andhra Pradesh are significantly below the overall national score in all four subjects. On the other hand, the best performance is by Uttar Pradesh, whose score is significantly above the national score in three of the four subjects. States like Madhya Pradesh and Punjab do not do badly either.

Absolute scores, though “shocking” to many, only tell half the story. When juxtaposed with the literacy rates, they explain why the North Indian states are so underrated.

In this chart, we have plotted literacy rates against the NAS scores. The Y-axis shows literacy rates. The X-axis shows the arithmetic mean of the four subject scores (reading comprehension, maths, science and social science).

The chart explains how North Indian states are underrated when it comes to quality of education

The chart shows literacy rate versus NAS scores for 32 states and UTs (Assam data was not available). For 21 states/UTs (inside the ellipse), the literacy rates and test scores are fairly proportional. There are four states/UTs with high (more than 80%) literacy but low (less than 240) mean scores. The three large ones among them are TN, Puducherry and Delhi.

There are seven states/UTs with low (less than 70%) literacy but high (more than 245) mean scores. Two of them are very small UTs. The rest are all from the Hindi-speaking belt: UP, Bihar, Jharkhand, MP and Rajasthan. These are often considered laggards when it comes to school education.

But as the chart shows clearly, their low literacy rate could be the reason behind the perception. In other words, these five states are the most underrated when it comes to quality of their school education.
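The grouping described above can be sketched in a few lines of Python. The thresholds (80% and 70% literacy, mean scores of 240 and 245) come from the text. The function name, the group labels other than “underrated”, and the sample regions with their numbers are hypothetical and are not NAS or census data.

```python
from statistics import mean

# Thresholds taken from the text.
HIGH_LITERACY = 80.0   # literacy rate (%) above which a state counts as "high"
LOW_LITERACY = 70.0    # literacy rate (%) below which a state counts as "low"
LOW_SCORE = 240.0      # mean NAS score below which a state counts as "low"
HIGH_SCORE = 245.0     # mean NAS score above which a state counts as "high"


def classify(literacy_pct, subject_scores):
    """Place a state in one of three groups, using the mean of its subject scores."""
    avg = mean(subject_scores)
    if literacy_pct > HIGH_LITERACY and avg < LOW_SCORE:
        return "high literacy, low score"
    if literacy_pct < LOW_LITERACY and avg > HIGH_SCORE:
        return "underrated"  # low literacy but high score
    return "roughly proportional"


# Hypothetical regions with made-up numbers, NOT actual NAS or census data:
# name -> (literacy %, [four subject scores])
sample = {
    "Region X": (85.0, [230, 235, 238, 236]),
    "Region Y": (65.0, [250, 248, 252, 246]),
    "Region Z": (75.0, [242, 243, 241, 244]),
}

groups = {name: classify(lit, scores) for name, (lit, scores) in sample.items()}
for name, group in groups.items():
    print(f"{name}: {group}")
```

Anything that falls between the thresholds lands in the third group, which is a simplification of the ellipse drawn on the chart.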

We just hope that policy decisions are not being taken on the basis of those perceptions!

Indians feel safe and like where they stay…relationships are the issue

Gallup has just released what it calls the inaugural Gallup-Healthways Global Well-Being Index (2013). Positioned as a “global barometer of individuals’ perceptions of their well-being”, the research measures the perceptions of 133,000 individuals spread across 135 countries. India ranks 71st in the list of 135 countries.

The well-being index is organized into five elements:

  • Purpose: liking what you do each day and being motivated to achieve your goals
  • Social: having supportive relationships and love in your life
  • Financial: managing your economic life to reduce stress and increase security
  • Community: liking where you live, feeling safe, and having pride in your community
  • Physical: having good health and enough energy to get things done daily

In analyzing the results of the index, Gallup classifies responses as “thriving” (well-being that is strong and consistent), “struggling” (well-being that is moderate or inconsistent), or “suffering” (well-being that is low and inconsistent).

This chart shows the percentage of Indians who think that they are thriving in each of the five elements that the research measures, shown against the percentage of people in South Asia and in the world who think the same.


The chart depicts percentage of people who think they are thriving

This challenges some of the assumptions that we have about ourselves. Let’s leave aside the two elements where India’s perception compares well with the global/regional perception and focus on the three where it differs significantly.

  1. Social: We always take pride in our strong family life. But the findings challenge that, as far fewer Indians think they are thriving in having “supportive relationships” and “love” in their lives, compared with even our neighbors.
  2. Financial: Once again, Indians have a far lower perception about their financial well-being than the world and their neighbors. But that is not surprising considering this survey was conducted in the thick of the “policy paralysis” and economic slowdown.
  3. Community: This also surprises, though pleasantly. Despite all the government bashing that we do and the extraordinary self-criticism of our society that we indulge in, we have the best perception of our well-being where government and society have a role to play. We like where we live more than others do; we feel safer where we live than others do; and we have more pride in our community than others have in theirs.




DJ Anti-Showcase: Mismatching Story and Visualization

Traditional media seems to have suddenly woken up to the need to have a “presence” in data journalism. The irony is, few understand the opportunity and how it could potentially impact journalism. But no one wants to give it a miss; it is the coolest thing in the business.

Visualization, the most tangible face of data journalism, is the fanciest thing to do. And most of the easy-to-use and cheap/free tools to do that are also available. So why not?

This story from Hindustan Times, BJP’s loss in UP worse than it seems, is a great example of how a visualization should not be made. The story, which analyzes the recent bypoll data to argue that the BJP actually lost a lot of vote share, is a decent one. It achieves one thing: it negates any argument that the BJP’s loss was due to some other party gaining for unrelated reasons, or that the loss came not from a fall in vote share but from electoral arithmetic. That’s a fairly timely story.

But the visualization that goes with it neither proves this point nor adds any information to the story. It shows something else entirely: how many votes the BJP got in different constituencies, compared against one another, which is actually meaningless.

In short, the visualization that accompanies the story is not only unrelated to the storyline, it is of no use. A good visualization, says visualization guru Alberto Cairo, should be beautiful, functional and insightful. If it is not insightful, it is still a story, though a boring one. But this one is not even functional. What’s the point of comparing votes in different constituencies?

A visualization depicting a side-by-side comparison of the BJP’s vote in the last election and in this one would have communicated the idea simply. In fact, what has been represented as two visuals could be combined to make the point that the story is making.
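The data behind such a side-by-side comparison could be prepared roughly as follows. This is only a sketch: the seat names and vote counts are made up, the helper name is hypothetical, and none of it is taken from the Hindustan Times story or from actual election results.

```python
def vote_share(party_votes: int, total_votes: int) -> float:
    """Return a party's share of the total valid votes, in percent."""
    if total_votes <= 0:
        raise ValueError("total votes must be positive")
    return party_votes * 100 / total_votes


# Hypothetical constituencies with made-up numbers, NOT real results:
# seat -> {election: (party votes, total valid votes)}
results = {
    "Seat 1": {"previous": (60_000, 150_000), "bypoll": (45_000, 150_000)},
    "Seat 2": {"previous": (80_000, 160_000), "bypoll": (60_000, 150_000)},
}

# One row per seat: share last time, share this time, and the change.
comparison = []
for seat, polls in results.items():
    before = vote_share(*polls["previous"])
    after = vote_share(*polls["bypoll"])
    comparison.append((seat, before, after, after - before))

for seat, before, after, change in comparison:
    print(f"{seat}: {before:.1f}% -> {after:.1f}% ({change:+.1f} points)")
```

Each row pairs the same constituency across two elections, which is the comparison the story argues for, rather than comparing raw vote counts across different constituencies.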





DJ Showcase: Final Work of Participants in ICFJ Data Journalism Workshop

The three-day ICFJ Data Journalism workshop in Delhi, held from 5th to 7th September 2014, culminated in the participants being divided into groups to work on real-life stories using freshly learnt techniques in data scraping, cleaning and visualization. Ideas ranged from Narendra Modi’s popularity on Twitter to the changing pattern of media ownership, and from the transformation of India into a cashless economy to the changing definition of the middle class in India.

Here are examples of some of the work that various teams produced at the end of the workshop.

Complaints against police (An infographic on complaints against police and conviction rates)

Cashless in India (A data-based story on how India is turning to electronic transactions)

Class Calculator (A tool to calculate which economic category a consumer belongs to, based on consumption pattern)

Inside Media (An investigation into how ownership patterns have changed in the six most valuable media companies in India)

Terror Statistics (A data-based story on the cost of conviction on terror cases)

There were more. Some of them were:

Online Video Advertising: The Reality

Crime Against Women in India

Narendra Modi vs Other Global Leaders: Popularity on Twitter

Modi Teri Ganga Maili (Ganga Action Plan)

Either these teams do not have their work available online or, if they do, DataJourno does not have the links.

First Data Journalism Bootcamp Starts in Delhi

A 3-day data journalism bootcamp, co-hosted by the International Center for Journalists, the Hindustan Times, Hacks/Hackers New Delhi, Data{Meet} and the 9.9 School of Communication started on Friday in the National Capital Region of Delhi, India. The program—the first of its kind—has managed to attract a mix of veteran data journalists, young media professionals, data science enthusiasts, developers and designers.

More than 60 people are participating in the workshop, which is seeing some good cross-fertilization of ideas among techies and journalists from different types of media. The speakers and mentors are the who’s who of the Indian open data/data journalism community, while some of the most well-known data journalists and professionals are participating in the program.

Over the next two days, the participants will work on real life stories involving data.

David Lemayian from Code for Africa giving a few smart tips


DJ Showcase: Times of India (03 September 2014)

The Times of India’s regular STATOISTICS column in its print edition is a consistent effort to popularize infographics-based stories. A good infographic, says visualization guru Alberto Cairo, should be beautiful, functional and insightful. Most of the TOI infographics are beautiful and functional. But the “insight” or the “story” is often missing.

What’s a story? Something that is unusual (“man bites a dog”) or counter-intuitive or, at the other extreme, something that establishes what people have somehow believed but for which there was no direct evidence.

Rarely does a great story come from one source. You may get an idea. But then you form a hypothesis and test it by getting more information from new sources or by verifying some of the information already obtained.

Data journalism is no different. Once in a while, if you are lucky, you can get a good story from a single dataset. More often, you have to juxtapose a couple of datasets; maybe some investigation is required. The “insight” or the “USP” of the story often comes from that. Even some basic observations about exceptions or predominant trends are a good starting point.

Look at this infographic.

In almost all food items (and these are not basic food items like rice, wheat, vegetables or dal), urban India outscores rural India. That is not surprising per se. But there are exceptions. Fish is something where rural India scores higher. Apple remains primarily an urban fruit, while tropical fruits like guava or mango (the desi fruits) are consumed equally by rural and urban India.

A good starting point for a great story is often: why? And this (or any single) dataset won’t answer that. Some of the best data journalism ideas come from single datasets, but great ideas need great execution to make them great stories.



Which state is India’s biggest drinking state?

Data journalism is journalism first, and last. The basic principles of good journalism apply to data journalism as well. One of those fundamental principles is to check the credibility of information. It starts with knowing where to get the most authentic information on a particular topic. Yes, even in the days of Google.

This story in The Hindu, by Rukmini S, one of the very, very few practicing data journalists in India, beautifully illustrates that.

The topic is alcohol consumption in different states, and the context is Kerala’s decision to move towards prohibition. Some basic research, as the euphemism for a Google search goes, convinced the media that Kerala is indeed the top per capita alcohol-consuming state in India. And can you beat it? The top drinking state heading towards prohibition…

But is Kerala really India’s biggest drinking state? It took a real data journalist to ask that question and bust the myth.

As Kerala takes the first steps towards prohibition, here’s a question: is Kerala really India’s biggest drinker? The media sure seems to think so; here’s the Times of India, saying so today (but giving no source), The Indian Express said it in 2008 but the source study is nowhere on the internet and the Economist said so in 2013 citing a Kerala-based advocacy group director. Various other reports cite Kerala’s 2008 Economic Review but this isn’t available online either.

Anyone with any interest in tracking consumption patterns in India would know that the biggest agency tracking that information is the National Sample Survey Office (NSSO) under the Ministry of Statistics and Programme Implementation, through its various “rounds” of surveys. Rukmini used that data to show that it is not Kerala but Andhra Pradesh that is the biggest drinker.

Despite being a great reminder of what should not be passed off as data journalism, the story fails to excite. A simple and direct headline like “And you thought Kerala is the biggest drinker” could have been far more engaging than a text-bookish headline like “India’s biggest drinkers.”

Nevertheless, it reassures us that data journalism in India is in good hands.