Rebuttal paper reveals misuse of Holocaust datasets – Eurasia Review

Aerospace engineering faculty member Melkior Ornik is also a mathematician, a history buff, and a strong believer in integrity when it comes to using hard science in public debate. So when a story appeared in his news feed about a pair of researchers who had developed a statistical method to analyze datasets and used it to purportedly refute the number of Holocaust victims at a concentration camp in Croatia, it naturally caught his attention.

Ornik is a professor in the Department of Aerospace Engineering at the University of Illinois at Urbana-Champaign. He studied the research in depth and used the method to reanalyze the same data from the United States Holocaust Memorial Museum. He then wrote a rebuttal article debunking the researchers’ findings.

Ornik’s rebuttal is published in the same journal as the original article. He said the editor asked him to include a list of answers to some of the questions other scientists might have when reading his article. Weeks later, the journal placed a note on the original article stating that it neither endorses nor shares the views of the authors, and recommending Ornik’s article.

“As scientists, as engineers, I think it’s our job to correct flawed and faulty science,” Ornik said. “There is so much effort to get the public and policymakers to believe in science, that when an expert in mathematics says he has proof, it gives credence to the argument. But when their claims are blatantly wrong, it is not good for science and it is not good for society. This is why it is especially important for scientists to challenge false findings when we find them.”

According to Ornik, some people argue that concentration camps did not exist or were not used to kill people, or that the widely accepted death toll has been dramatically inflated. Most historians do not take these claims seriously in light of the vast data and evidence available.

“For the authors of the original article, claiming they found mathematical proof that this camp’s casualty list was fabricated has obvious historical implications,” Ornik said. “I think to some extent the damage has already been done, but I felt the need to record the assumptions, inaccuracies and misuse of raw museum data that I found in the original research.”

The article Ornik replied to presents a new method for identifying anomalies in a set of histograms. Ornik said he was not contesting the merits of the method itself, only its application to the Jasenovac concentration camp.

Ornik was suspicious of the article’s findings because the researchers suggested in one case that a smaller list naturally had a smaller outlier score, yet they compared scores across victim lists of all sizes to assert that the one related to Jasenovac, one of the largest, was problematic.

“I started researching whether there was some sort of size bias, whether they were actually more likely to flag a larger list as problematic. And it turns out that, despite the authors’ claims, they were,” Ornik said. “Larger lists are more likely to be computed to be problematic than small lists when their method is applied to data.”
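One way a size bias of this kind can arise is sketched below. The article does not describe the original paper's outlier score, so this example swaps in a plain Pearson chi-square statistic purely as an illustration; the reference histogram, the counts, and the threshold are all invented for the sketch. Two lists deviate from the reference by exactly the same proportions, yet only the larger one crosses the threshold, because the statistic grows in proportion to list size.

```python
def chi_square_score(counts, reference_props):
    """Pearson chi-square statistic of observed counts against reference proportions."""
    n = sum(counts)
    return sum(
        (obs - n * p) ** 2 / (n * p)
        for obs, p in zip(counts, reference_props)
    )

# Hypothetical reference histogram over four bins (an assumption, not museum data).
reference = [0.25, 0.25, 0.25, 0.25]

# Two invented lists with the SAME shape (30% / 20% / 25% / 25%) but different sizes.
small_list = [30, 20, 25, 25]            # 100 entries
large_list = [3000, 2000, 2500, 2500]    # 10,000 entries

# 5% critical value of the chi-square distribution with 3 degrees of freedom.
THRESHOLD = 7.815

small_score = chi_square_score(small_list, reference)
large_score = chi_square_score(large_list, reference)

print(f"small list: score={small_score:.1f}, flagged={small_score > THRESHOLD}")
print(f"large list: score={large_score:.1f}, flagged={large_score > THRESHOLD}")
```

Comparing raw scores of this type across lists of very different sizes would therefore single out the largest lists even when their shape is no more unusual than that of the small ones.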

Ornik, who routinely uses similar statistical analyses in aerospace applications, explained another reason why the researchers’ statistical argument doesn’t work.

“When you look at data, a collection of anything, and you want to discover an outlier – something different – you have to assume that all the data is from the same source, from the same distribution. List the victims by year of birth. This would give a graph of the age of each person. Let’s say 10 percent are over 70 years old. However, this distribution would not hold for a list of deported children, for example, because this list, by definition, is structurally different. It is also different from a list of everyone who has an ID card. Identity cards are only issued to people who are not children. Yet the lists these researchers worked with come from a multitude of sources and include lists of children, lists of people who married, lists of prisoners of war, things that by definition cannot come from the same distribution.”
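The point about mixed sources can be made concrete with a small sketch. The counts below are invented for illustration and are not museum data, and total variation distance from the pooled histogram is an assumed stand-in for an outlier score, not the original paper's method. A list of children looks wildly anomalous next to the pooled age distribution even though nothing about it is fabricated; it simply comes from a different population.

```python
def proportions(counts):
    """Convert raw bin counts into a normalized histogram."""
    total = sum(counts)
    return [c / total for c in counts]

def total_variation(p, q):
    """Total variation distance between two histograms over the same bins."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# Invented age-bracket counts: [under 18, 18 to 69, 70 and over].
lists = {
    "general_register": [200, 700, 100],   # 10 percent over 70, as in the quote
    "id_card_holders": [0, 880, 120],      # no children by construction
    "deported_children": [300, 0, 0],      # only children by construction
}

# Pool every list into one reference histogram, as if all shared one distribution.
pooled = proportions([sum(column) for column in zip(*lists.values())])

scores = {
    name: total_variation(proportions(counts), pooled)
    for name, counts in lists.items()
}
most_anomalous = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: distance from pooled histogram = {score:.3f}")
print("highest score:", most_anomalous)
```

In this toy setup the children's list gets the highest score by a wide margin, which reflects how the list was compiled rather than any problem with its contents.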

Another major mistake in the original article, Ornik said, is that some duplicate lists were treated as two separate lists. This meant that about 67 percent of their entire database was in fact made up of sublists of the larger list.

“The more than 7,000 lists published online by the Holocaust Museum are unorganized,” Ornik said. “For example, there are two lists that contain exactly the same data; one is in Cyrillic and the other uses the Latin alphabet. But they treated them as two separate lists. There are other lists that contain the same name, but there is no way to know if it is the same person or two different people born on the same day with identical names. They could have removed the very glaring errors in which a list is clearly duplicated, but for the rest you would need access to the original historical data.”
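A minimal sketch of how the most glaring case, the same list recorded once in Cyrillic and once in Latin script, could be caught before analysis. It assumes Serbian Cyrillic mapped to the standard Latin alphabet and a hypothetical record layout of `(name, birth_year)`; the list identifiers and names are invented placeholders, not museum records, and the function names are illustrative only.

```python
import unicodedata

# Serbian Cyrillic to Latin mapping (lowercase letters only).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}

def to_latin(text):
    """Lowercase, NFC-normalize, and transliterate Cyrillic letters to Latin."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)

def normalize_record(record):
    """Map a (name, birth_year) record to a script-independent form."""
    name, birth_year = record
    return (" ".join(to_latin(name).split()), birth_year)

def find_duplicate_lists(lists):
    """Return pairs of list ids whose normalized contents are identical."""
    seen = {}
    duplicates = []
    for list_id, records in lists.items():
        key = tuple(sorted(normalize_record(r) for r in records))
        if key in seen:
            duplicates.append((seen[key], list_id))
        else:
            seen[key] = list_id
    return duplicates

# Invented placeholder data: list_B is list_A written in Cyrillic.
example = {
    "list_A": [("Petar Petrović", 1901), ("Ana Jovanović", 1910)],
    "list_B": [("Ана Јовановић", 1910), ("Петар Петровић", 1901)],
    "list_C": [("Petar Petrović", 1901)],
}
duplicate_pairs = find_duplicate_lists(example)
print(duplicate_pairs)
```

This only catches whole lists that match exactly after transliteration. As the quote notes, it cannot tell whether two identical entries in different lists refer to the same person, which would require the original historical records.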
