Showcase

DJ Showcase: Scroll.in (20 January 2014)

Scroll.in is back with a fairly good data journalism story, though the analysis is by another site, movehub.com, which has created an infographic depicting living costs worldwide for expats. Of the 119 nations featured, India, Nepal and Pakistan have the lowest living costs, while Switzerland, Norway and Venezuela have the highest.

Yes, you read it right. Despite all the talk of inflation, India is still one of the cheapest places on earth. Here’s a link to the story, and below is the link to the infographic.


DJ Showcase: IndiaSpend

In an earlier post on data journalism in India, DataJourno’s founder had pointed out that though data journalism is still nascent in India, some of the work that Indian data journalists were doing was noticeable.

The post had actually given a list of top data journalism sites in India, listing IndiaSpend right on top but lamenting the fact that it was not being updated frequently.

We are happy to see the site once again becoming active with lots of new stories. DataJourno specifically wanted to showcase two stories.

The first, Is Data Noise Drowning Out The Chinkara’s Sneeze? goes deeper than just data analysis to raise a bigger question, extremely relevant in the Indian context. That is – whose data do you believe, when it comes to public issues?

The Rajasthan forest department’s chinkara census shows an 11% rise in the animal’s numbers over three years to 2013, while counting done by biologist Dr Sumit Dookia, who has spent about 15 years studying the chinkara, shows a 43% decline in its numbers in six representative sample sites.

The second, on its site factchecker.in, Congress-NCP Govt Has Built More Houses Than It Claims!, digs up data to show that the previous Congress-NCP government in Maharashtra actually built more houses than it claimed. The former ruling combine took credit for building more than 4 lakh homes during its rule. In reality, it built more than a million homes, according to data obtained by IndiaSpend.

On the face of it, it is irrelevant now, as the elections are over and the former ruling alliance was decisively defeated. But it is important because this shows that if the political parties are sensitized, they can make more credible claims and counter each claim with data. Data is a great tool to fight propaganda, far better than raising your voice or making sarcastic comments.

Meanwhile, DataJourno has faltered on regular updates. Our apologies. We expect to be more regular from the fourth week of December onwards.

DJ Showcase: Final Work of Participants in ICFJ Data Journalism Workshop

The three-day ICFJ Data Journalism workshop, held in Delhi from 5 to 7 September 2014, culminated in the participants being divided into groups to work on real-life stories using freshly learnt techniques in data scraping, cleaning and visualization. Ideas ranged from Narendra Modi’s popularity on Twitter to the changing pattern of media ownership; from the transformation of India into a cashless economy to the changing definition of the middle class in India.

Here are examples of some of the work that various teams produced at the end of the workshop.

Complaints against police (An Infographic on statistics on complaints against police and conviction rates)

Cashless in India (A data-based story on how India is turning to electronic transactions)

Class Calculator (A tool to calculate which economic category a consumer belongs to, based on consumption pattern)

Inside Media (An investigation into how ownership patterns have changed in the six most valuable media companies in India)

Terror Statistics (A data-based story on the cost of conviction on terror cases)

There were more. Some of them were:

Online Video Advertising: The Reality

Crime Against Women in India

Narendra Modi vs Other Global Leaders: Popularity on Twitter

Modi Teri Ganga Maili (Ganga Action Plan)

Either these are not available online or, if they are, DataJourno does not have the links.

DJ Showcase: LiveMint (11 August 2014)

Were the Afghan elections rigged?

From the headline, it looks like yet another political story from Afghanistan. But take a closer look and you will find it is one of the finest examples of data journalism. The fact that it is in an Indian newspaper, and one that primarily covers business and economy at that, just adds to its charm.

There are several reasons why we like this one so much.

First of all, it sets the expectation straight away. The blurb says what it is: one popular method used to determine whether a data set is doctored is to look at the last digits of the values. The simple sentence does two things: it raises the interest level of those looking forward to a data story; at the same time, it turns away those who are uncomfortable with data but are looking for a spicy story on some new evidence of malpractice being caught on camera. In short, it sets an accurate expectation. That is good journalism.

Second, and this is the primary reason it is here, it actually tests the limits of data journalism. While most data journalism stories are about analyzed results, this one is about the nature of data sets. While most are about data analysis, it is about statistics and, yes, probability. It looks at election data from India, Iran and Afghanistan to suggest that elections in Iran and Afghanistan were probably rigged at the counting stage.

 

In an election, for example, there is no reason that the distribution of the last digits of the vote counts of various candidates should not be uniform—given the large number of votes that each candidate gets, the last digit is essentially random, and there is no reason that the probability of a 1 in the units place is more than that of a 2 in the units place. Thus, in a free and fair election, it is likely that the last digit is distributed uniformly.
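The reasoning in the quoted passage can be sketched in a few lines of Python. This is only an illustration, not the method used in the Mint story: the chi-square goodness-of-fit test, the 5% significance level and the function names below are all assumptions made for this sketch. It counts how often each digit appears in the units place of a list of vote counts and measures how far those counts stray from a uniform spread.

```python
from collections import Counter

# Standard table value: chi-square critical value for 9 degrees of freedom
# at the 5% significance level. The choice of 5% is an assumption.
CHI2_CRITICAL_DF9_5PCT = 16.919


def last_digit_counts(vote_counts):
    """Return a list of 10 counts: how often each digit 0-9 is the last digit."""
    counts = Counter(abs(v) % 10 for v in vote_counts)
    return [counts.get(d, 0) for d in range(10)]


def chi_square_uniform(observed):
    """Chi-square statistic of observed digit counts against a uniform spread."""
    total = sum(observed)
    if total == 0:
        raise ValueError("need at least one vote count")
    expected = total / 10  # each digit is expected equally often
    return sum((o - expected) ** 2 / expected for o in observed)


def looks_non_uniform(vote_counts):
    """True if the last digits deviate from uniform at the 5% level."""
    stat = chi_square_uniform(last_digit_counts(vote_counts))
    return stat > CHI2_CRITICAL_DF9_5PCT


# Made-up data for illustration only, not real election results.
evenly_spread = list(range(1000, 2000))               # every last digit appears 100 times
suspicious = [n * 10 + 5 for n in range(100, 200)]    # every count ends in 5

print(looks_non_uniform(evenly_spread))  # False
print(looks_non_uniform(suspicious))     # True
```

A large statistic only says that the last digits are unlikely to be random. As the story itself cautions, it says nothing about rigging at the voting stage.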

Third, and this aspect is often ignored by new-age data journalism champions, many of whom understand data very well but are not familiar with the basic premises of good journalism: you have to be fair and balanced, even if that takes a little interest away from the story. You cannot sacrifice these basic journalism values to make a story more interesting. This story adds the ‘note of caution’ in a very clear and prominent way.

Finally, a note of caution. There are several ways in which an election can be rigged. Speaking broadly, it can be rigged at either the voting or the counting stages. This method of looking at the last digits only gives us an indication of the probability of rigging in the counting stages. Methods such as “ballot stuffing” (reportedly not uncommon in India) cannot be caught with such methods.

Great work.

Data Journalism in India: Nascent but noticeable

Shyamanuja Das

In a well-discussed (and well-tweeted) article published recently on the Global Investigative Journalism Network website, India’s Media – Missing the Data Journalism Revolution, journalist and academician Priya Rajasekar argues that Indian media, by and large, is yet to wake up to the opportunity of data journalism.

Probably the first in-depth analysis of the subject in India, the article is fairly complete in capturing the viewpoints of the entire spectrum of stakeholders: academicians, practitioners and other influencers. It even goes on to explore the reasons behind what it calls the Indian media’s “not subscribing to the idea (of data journalism)”.

The basic assumption — that Indian media has not really taken to data journalism seriously — is not exactly way off the mark, if one takes into account only the traditional media. India surely does not have the likes of a Guardian Datablog and NYT Upshot.

But then, how many of the traditional media brands even in the developed markets have such initiatives? India’s The Hindu actually has a dedicated section on data stories, though it is nowhere near the Guardian and NYT sites.

There are online ventures, though. Though not quite the Vox and FiveThirtyEight of India, some of them are making an impact. A few are dedicated to data journalism, while others are news and analysis sites but do have a few good data journalism stories. Some traditional media houses have also started exploring the area in a more focused manner, which interestingly, is being noticed by even the common readers.

Two events in the recent past have helped the cause in a big way.

One, of course, was the General Elections held in April – May 2014. India’s elections are the largest on earth, not just in terms of the size of the electorate but also in terms of the number of political parties. India has more than a thousand political parties, of which about 60 are recognized national and state parties. That makes analyzing vote shares and linking them to seats won fairly complex and interesting. With the Election Commission sharing raw data, we saw a lot of good analysis this time. DataJourno carried a round-up of election coverage here.

The other was the release of crime data by the National Crime Records Bureau (NCRB). Though NCRB has been sharing this data for many years now, thanks to the growing awareness about data analysis, almost all newspapers did multiple stories this time, analyzing the data. Crime against women and regional trends in crime dominated the coverage.

Here is a round-up of some of the data journalism initiatives in India. These are among the most noticeable efforts, though the list is not exactly comprehensive. One clarification: there are quite a few other sites that have fairly decent content based on analyzing data. But they are not really journalistic stories, for there are no ‘stories’ in most of them. In fact, that is a big confusion that exists in data journalism—what is journalism and what is not. But then, that is a topic by itself and is not restricted to India. So, we will keep that for another day.

Here is the list, with examples wherever possible.

In addition, two other newspapers must be mentioned for their data journalism efforts, though they do not call it by that name: Mint, a business newspaper, and The Times of India, India’s largest-selling English newspaper. Mint was the first newspaper to start visualizing stories, much before the excitement about data journalism started. It also does a number of data analysis stories, but they are mostly restricted to macroeconomics, which is not of immense interest to lay readers. The Times of India has started a regular section in its print version, called STATOITICS (TOI is a shorter version of its full name), where it presents interesting data through simple visualization.

The trend is new but is surely catching on. One challenge, though, is that number crunchers who can write some English are posing as data journalists, taking advantage of the absence of real journalists, many of whom are intimidated by numbers. So, instead of being the hot new area within journalism, data journalism has ended up becoming a poor cousin of data science and analytics.

DJ Showcase: Scroll.in (8 July 2014)

Four charts that explain why we don’t need a separate rail budget

In yet another great example of data journalism, where data and charts (and not so beautiful, eye-catching ones at that) have been used to argue a point, Scroll.in shows why we do not need a separate rail budget. The underlying logic builds on the fact that Railways is neither a big expenditure head nor the most dominant force in its area: surface transport.

 

Is opinion superior?

It is difficult to understand the fascination for the words “column” and “opinion” among Indian journalists, as compared to “stories” and “reports”.  Many think if you get an “opportunity” to write an opinion column, you have arrived as a journalist.

This perceived sense of superiority of “opinion” often makes media pass off good reporting and even data analysis as opinion. This piece in Mint, Why India’s sanitation crisis is a public health emergency, is a fairly good example of data journalism, which tries to correlate India’s widespread practice of open defecation with malnutrition. The accompanying map too is a fairly good, if not extraordinary, visualization.

But why the hell should it be labeled as opinion? Is it to give it that supposed importance, or are there no other sections that the editors can fit it into?

In fact, data journalism is not as new or rare as we think it is in India. Stories like these are actually data journalism pieces. Just that many publications do not realize it.

 

 

India has second-highest number of shadow entrepreneurs in the world: ToI

“India has the second highest number of shadow entrepreneurs in the world. For every business that is legally registered in India, there are 127 shadow businesses that are not,” says a report in Times of India, quoting a study of 68 countries led by Professor Erkko Autio and Dr Kun Fu from Imperial College Business School, UK.

Shadow entrepreneurs are individuals who manage a business that sells legitimate goods and services but do not register it. This means that they do not pay tax, operating in a shadow economy where business activities are performed outside the reach of government authorities.

In short, it is a problem of law enforcement.

Indonesia has 131 unregistered businesses for every legally registered business, while India has 127. The UK has just one unregistered business for every 30 legally registered businesses.


Chart: DataJourno
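The ratios quoted above are stated in two different directions (unregistered per registered for Indonesia and India, registered per unregistered for the UK), which makes them hard to compare at a glance. The short sketch below puts them on one scale. The function name and the conversion to a share of all businesses are illustrative assumptions, not part of the study or of the Times of India report. Only the three ratios come from the post.

```python
# Ratios as quoted in the post, expressed as
# unregistered businesses per one legally registered business.
shadow_per_registered = {
    "Indonesia": 131 / 1,
    "India": 127 / 1,
    "UK": 1 / 30,  # one unregistered business for every 30 registered
}


def shadow_share(ratio):
    """Share of all businesses that are unregistered, given the
    number of unregistered businesses per registered business."""
    return ratio / (ratio + 1)


for country, ratio in shadow_per_registered.items():
    print(f"{country}: {ratio:.3f} unregistered per registered, "
          f"{shadow_share(ratio):.1%} of all businesses unregistered")
```

On this scale, the gap is stark: by these ratios, roughly 99 in 100 businesses in India and Indonesia are unregistered, against about 3 in 100 in the UK.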