Saturday, April 17, 2021

Open Data-Based Investigative Journalism

- Wahyu Dhyatmika

Investigative journalism aims to expose mistakes, scandals, and offenses. A piece of journalism may explain something new in great depth, but if the journalist cannot reveal what went wrong, the coverage is not investigative work. The standard for investigative work is therefore the disclosure of wrongdoing that was originally covered up, similar to what detectives, investigators, and prosecutors do.
However, journalists are not law enforcers, so they cannot compel people to hand over the data they need. They must master other techniques to find it. Alongside established investigative techniques (going undercover, following the money, the people trail, the paper trail, and so on), a newer model of investigative journalism has emerged: data-driven investigative stories. In America it is sometimes referred to as Computer Assisted Reporting (CAR).

History at a Glance

Data-driven investigative journalism began in the United States when Philip Meyer, a journalist at the Detroit Free Press in Michigan, used a computer to analyze the composition of Detroit residents in order to explain a series of riots there in 1967. The coverage won a Pulitzer Prize. Meyer also wrote the book Precision Journalism in 1973 and a program to analyze data.

Five years later, Donald Barlett and James Steele of the Philadelphia Inquirer set out to test allegations that a judge in Philadelphia was racially biased when sentencing Black defendants. No hard documents or data could prove the accusation; there were only stories circulating in the community.

The two then reviewed all the court cases of the previous 10 years in the state. They summarized 1,034 cases in a table with 42 columns. The table was then converted into 9,618 coded cards so that it could be analyzed by Philip Meyer's computer program, 'Data Text'. Using an IBM 7090 computer, they succeeded in showing that the judge's bias when trying Black defendants was real.

In 1980, Providence Journal-Bulletin journalist Elliot Jaspin received an award for stories that relied heavily on data analysis. These included investigative stories on bad housing loans and on traffic accidents involving school buses. He then collaborated with Daniel Wood to create a computer program, Nine Track Express, to help journalists analyze data. Jaspin also founded the National Institute for Computer Assisted Reporting (NICAR) at the Missouri School of Journalism in 1989.


Data-based investigations differ from conventional investigations, which usually start with information obtained from whistleblowers. The data referred to in this context is a set of figures collected over a certain period of time. It takes the form of a spreadsheet and must contain numbers. It is this form of data that serves as the starting point for investigative coverage.

The development of digital technology over the last ten years has favored the data-based investigative reporting model. As computers increasingly dominate daily life, a trend known as Big Data has emerged. Machines can now collect all kinds of information about individual activity, for example how many times a day we use social media. Macro data, such as the number of children that most families in a country have, is also easier for the public to obtain and access.
In addition to technological developments, the emergence of open government initiatives has also made access to data easier. Governments in various countries are encouraged to publish data related to the public interest and governance on the Internet. In Indonesia there is a dedicated portal, initiated under UKP4 (the Presidential Working Unit for Development Supervision and Control) led by Kuntoro Mangkusubroto. The site contains data from ministries, government agencies, local governments, and all other agencies in Indonesia.

These data are important because they give journalists an overview of a problem. Until now, journalists have often produced news only from interviews or instinct. For example, when the idea of writing a feature on poverty arises, journalists go to a village that is known as a poor area. Another option is to ask a government agency or an expert observer to suggest a suitable location for reporting.
In data-driven stories this is not done, because journalists choose the location themselves based on data, for example poverty data for a district or city. Data-based coverage is more realistic than interview-based coverage because an interviewee may not answer questions completely and comprehensively. Public officials may interpret data according to their own interests before presenting it to journalists.

When reporting is based on data, this layer of interpretation is removed, so we can hope to get a more objective picture. Data-based reporting is also more accurate in describing a problem. Tempo weekly news magazine, for example, once published a report on stealth ships in the Nusantara Sea. It investigated how businessmen circumvented the regulation that requires ships operating in Indonesia to be owned by Indonesian companies. The Tempo team had complete data on the matter from the Directorate of Sea Transportation. However, the sheer volume of data led Tempo to interview several NGO activists concerned with the issue instead, among them KIARA.


The interviews indicated that a route often used by these stealth ships was the waters off Tual, Maluku, and that lead was followed up. The complete data was ultimately never processed. Out of a very large problem, a small one was chosen: the investigation focused on Tual and Merauke. Tual is an operating area for stealth ships from the Philippines, while Merauke lies on the route of ships from China and Taiwan. The investigation did succeed in uncovering violations, but it could not show the big picture. Tempo used the data only as a starting point and then concentrated on two areas, so the full scale of the problem was never established.


Coverage Stage

In a data-based investigation, there are several steps that journalists must take. First, find data relevant to the theme (data scraping). This stage is needed when the data we are looking for is not openly available, that is, when it is not open data. Scraping can be done with code (Python or Ruby) or with ready-made scraping applications.
Second, clean the data that has been collected (data cleaning). Not all of the data we hold is useful or relevant to the theme of the coverage, so this stage cannot be skipped. To clean the data in a datasheet, journalists can use tools such as OpenRefine and SQL.
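
The first two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow of any newsroom mentioned here: the HTML table, the district names, and the figures are invented, the helper names `scrape_table` and `clean_rows` are hypothetical, and the cleaning rule assumes that counts use "." or "," as a thousands separator. Only the Python standard library is used.

```python
from html.parser import HTMLParser

# Invented sample page standing in for a site that publishes a table
# but offers no download (i.e. the data is not open data).
SAMPLE_HTML = """
<table>
  <tr><th>District</th><th>Poor households</th></tr>
  <tr><td> district a </td><td>1.250</td></tr>
  <tr><td>District B</td><td>n/a</td></tr>
  <tr><td>District  A</td><td>1.250</td></tr>
  <tr><td>District C</td><td>980</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, None outside <tr>
        self._cell = None   # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Step 1 (scraping): turn an HTML table into a list of rows."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def clean_rows(rows):
    """Step 2 (cleaning): drop the header, unusable rows and duplicates."""
    seen, cleaned = set(), []
    for row in rows[1:]:                      # rows[0] is the header
        if len(row) != 2:
            continue                          # malformed row
        name = " ".join(row[0].split()).title()   # normalize spacing and case
        digits = row[1].replace(".", "").replace(",", "")
        if not name or not digits.isdigit():
            continue                          # missing or non-numeric value
        if name in seen:
            continue                          # duplicate entry
        seen.add(name)
        cleaned.append((name, int(digits)))
    return cleaned


if __name__ == "__main__":
    for name, count in clean_rows(scrape_table(SAMPLE_HTML)):
        print(name, count)
```

In practice the cleaning rules depend entirely on the dataset; the point is that each rule is written down and repeatable, which is also what tools like OpenRefine provide.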

After the data is cleaned, journalists must find patterns in it so that it can be analyzed. Software commonly used at this stage includes Microsoft Excel, Microsoft Access, and statistical packages such as SPSS.
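
As a rough illustration of what finding a pattern can mean, the sketch below groups records and compares a rate across the groups, which is the same kind of comparison a pivot table in Excel produces. The records, the field names `region` and `violation`, and the helper `rate_by_group` are all invented for the example.

```python
from collections import defaultdict


def rate_by_group(records, group_key, flag_key):
    """Return, for each group, the share of records where flag_key is true."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[flag_key]:
            flagged[group] += 1
    return {group: flagged[group] / totals[group] for group in totals}


# Invented inspection records, for illustration only.
RECORDS = [
    {"region": "North", "violation": True},
    {"region": "North", "violation": False},
    {"region": "North", "violation": True},
    {"region": "North", "violation": True},
    {"region": "South", "violation": False},
    {"region": "South", "violation": True},
    {"region": "South", "violation": False},
    {"region": "South", "violation": False},
]

if __name__ == "__main__":
    rates = rate_by_group(RECORDS, "region", "violation")
    # Sort so the group with the highest rate comes first.
    for region, rate in sorted(rates.items(), key=lambda item: -item[1]):
        print(f"{region}: {rate:.0%}")
```

A large gap between groups is not yet a finding, only a lead that tells the journalist where to start reporting.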

The last stage is the publication of the results of the investigation. The data that has been found cannot, of course, be displayed in full. Data visualization software is needed to make it attractive, simple, and easy to understand. Journalists can use data visualization tools such as ArcGIS.

In its homeland, the United States, data-based investigative coverage is commonplace among journalists and the mass media. Some examples include:

  1. Innocents Lost. In March 2014, the Miami Herald published an investigative report on children who died as a result of neglect and domestic violence. Its reporters combed data from the Department of Children and Families in Miami for the period 2008-2014 and found 534 cases of children who died within their families. The investigation found that the deaths of these children, all under five, could have been avoided if government officials had acted more quickly, because members of the community had already reported the risk of violence against them to the relevant agencies.
  2. Echo Chamber. In December 2014, Reuters examined 10,300 petitions filed with the United States Supreme Court over the previous 9 years. The investigation found that 66 lawyers were 6 times more likely than the other 17 thousand lawyers to have a case accepted by the Supreme Court, and that 31 law firms had a much greater chance than the other 8 thousand firms. The 66 attorneys represent well-known law firms, and more than half of them have worked for one of the 9 Supreme Court justices. 51 of the 66 lawyers came from law firms that more often represent the interests of large corporations and investors. Although they made up less than 1 percent of the lawyers who filed cases, they handled 43 percent of the cases heard by the Supreme Court.
  3. Medicare Unmasked. In June 2015, the Wall Street Journal published its investigation into the doctors who receive the largest payments from the US health insurance program, Medicare. It found that 1 percent of the doctors who received Medicare funds received 17.5 percent of total payments in 2013, and 16.6 percent in 2012. A total of 950 thousand doctors were on Medicare's list, with total payments of US$ 90 billion. This means that 9,500 doctors received roughly US$ 15.7 billion in Medicare funds. The data was opened after the WSJ in 2011 sued the Association of American Doctors, which insisted that doctors' Medicare payment data was confidential. In May 2013, the court ruled that the data must be open.
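
The arithmetic behind the Medicare figures can be checked in a few lines. The sketch below uses only the numbers quoted in the third example (950 thousand doctors, US$ 90 billion, a 17.5 percent share for the top 1 percent); the function name `top_share_summary` is made up for the illustration, and the per-doctor averages are derived values that do not appear in the original report as summarized here.

```python
def top_share_summary(doctors_total, payments_total, top_percent, share_per_mille):
    """Work out how much money a top group of recipients received.

    top_percent      -- size of the top group, in percent of all doctors
    share_per_mille  -- their share of payments, in tenths of a percent
                        (175 means 17.5 percent); integers avoid rounding noise
    """
    top_doctors = doctors_total * top_percent // 100
    top_payments = payments_total * share_per_mille // 1000
    rest_doctors = doctors_total - top_doctors
    rest_payments = payments_total - top_payments
    return {
        "top_doctors": top_doctors,
        "top_payments": top_payments,
        "avg_top": top_payments / top_doctors,
        "avg_rest": rest_payments / rest_doctors,
    }


if __name__ == "__main__":
    summary = top_share_summary(
        doctors_total=950_000,
        payments_total=90_000_000_000,  # US$ 90 billion
        top_percent=1,
        share_per_mille=175,            # 17.5 percent in 2013
    )
    print(f"Doctors in the top group: {summary['top_doctors']:,}")
    print(f"Payments to the top group: US$ {summary['top_payments']:,}")
    print(f"Average per top doctor: US$ {summary['avg_top']:,.0f}")
    print(f"Average per other doctor: US$ {summary['avg_rest']:,.0f}")
```

With these inputs the top group comes to 9,500 doctors and US$ 15.75 billion, which the text above gives as 15.7 billion. Dividing the two averages shows that a doctor in the top group received about 21 times as much as a doctor outside it, the kind of comparison that turns a table of payments into a story.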

Closing: Indonesian Context

In Indonesia, data-driven investigative stories have not been widely used by journalists. Investigative coverage still relies on whistleblowers. However, Tempo weekly news magazine's coverage of Ratu Atut, the former Governor of Banten, can serve as an example of how data-based investigative coverage is carried out in Indonesia. In its November 6, 2013 edition, Tempo used various open data, such as the list of companies on the portal of the Indonesian Chamber of Commerce (Kadin), the audit results of the Supreme Audit Agency (BPK), and the State Officials' Wealth Report (LHKPN) available on the website of the Corruption Eradication Commission (KPK).

Learning from these cases, a number of conditions must be met for data-based investigative coverage to keep developing in Indonesia. First, there must be an open data initiative from the government so that journalists can access a variety of data according to their reporting needs.
Second, the media and journalists must be diligent in using the Freedom of Public Information Law, through which access to data can be obtained legally. Third, there must be digital applications that help integrate scattered data. Fourth, editorial teams must be able to read statistics, clean datasheets, analyze numbers, and build applications (data journalism).

