What are the implications of using hacked data for research?

A short thread inspired by the fact that, before AWS took it down, #Parler was extensively hacked and user data was leaked.

The #Parler dataset seems crazy interesting for doing research, and my first reaction after the breach was to share it with other #CompSocSci ppl.

However, I started having second thoughts, so what follows is an attempt to organize my ideas and have them somewhere I can look back to.

2/n
Generally speaking, as far as research ethics goes, good advice would be to handle hacked data with caution.

First of all, there's an issue of quality. Data might be altered or incomplete, and the source cannot be held accountable (assuming the source is anonymous).

3/n
Secondly and more importantly, a researcher using the data would probably be violating users’ consent and acting against the data collector's will.

Finally, users’ privacy is at stake, since researchers could see material that users never agreed to let other people see.

4/n
Sharing private information without consent might put people at risk of harm.

This is all the more true in cases such as the #ParlerHack, where the leaked information is of a particularly sensitive nature, and there’s a high risk of unintended consequences.

5/n
However, it can be argued that in many cases the milk is already spilled.

After all, the data is out there, users are already exposed, and using the leaked information for research (with some precautions) might not cause any additional harm.

Does this mean it's a free-for-all, then?

6/n
Short answer: I am not sure.

On practical grounds, there might be legal boundaries in place (depending on the context).

But more generally, from a deontological perspective, I think that (as long as the researcher is not responsible for the hack) the picture is blurred.

7/n
Sure, the issue of privacy becomes secondary once data is out in the open. Plus, data can be anonymized by the researcher, so that private information is not further disseminated.

On the other hand, I think the problem of users’ consent should not be bypassed so easily.

8/n
There's also another issue.

In fact, it can be argued that using illegally obtained data for research purposes might legitimize (or even encourage) illegal or unethical behavior.

9/n
Ultimately, the fact that data is publicly available doesn't necessarily mean that it is available for research, and some of the arguments against its use are hard to dismiss.

Do you know of any explicit guidelines in poli/soc sciences that address this issue?

n/n
cc @therriaultphd @ylelkes @conjugateprior @cjw_phd

https://t.co/IvcTXARoga
