Tackling Misinformation: What Researchers Could Do With Social Media Data

Social media platforms rarely provide data to misinformation researchers. This is problematic because platforms play a major role in the diffusion and amplification of mis- and disinformation narratives. Scientists are often left working with partial or biased data and must rush to archive relevant data as soon as it appears on the platforms, before it is suddenly and permanently removed by deplatforming operations. Alternatively, scientists have conducted off-platform laboratory research that approximates social media use. While this approach can provide useful insights, its external validity can be severely limited (though see Munger, 2017; Pennycook et al., 2020). For researchers in the field of misinformation, emphasizing the necessity of establishing better collaborations with social media platforms has become routine. In-lab studies and off-platform investigations can only take us so far. Increased data access would enable researchers to perform studies on a broader scale, allow for improved characterization of misinformation in real-world contexts, and facilitate the testing of interventions to prevent the spread of misinformation. The current paper highlights 15 opinions from researchers detailing these possibilities and describes research that could hypothetically be conducted if social media data were more readily available. As scientists, our findings are only as good as the datasets at our disposal, and with the current misinformation crisis, it is urgent that we have access to real-world data from the places where misinformation is wreaking the most havoc.

While new collaborative efforts are gradually emerging (e.g., Clegg, 2020; Mervis, 2020), they remain scarce and unevenly distributed across research communities and disciplines. Platforms periodically fund research initiatives on mis- and disinformation, but these rarely include increased access to data and algorithmic models. Most importantly, in these kinds of collaborations, intellectual freedom is easily limited by the fact that the overarching scope of the research is defined not by the researchers but by the platforms themselves. In the rare cases where data sharing is a possibility, negotiations have been slow for several reasons, including platforms’ concerns over protecting their brands and reputation, and ethical and legal issues of privacy and data security on a grand scale (Bechmann & Kim, 2020; Olteanu et al., 2019). However, these barriers are not insurmountable (Moreno et al., 2013; Lazer et al., 2020). For instance, establishing a mechanism by which users can actively consent to various research studies, and potentially offering to make the data available to the participants themselves, would be a significant step forward (Donovan, 2020).

We invited misinformation researchers to write a 250-word commentary about the research that they would hypothetically conduct if they had access to consenting participants’ social media data. The excerpts below provide concrete examples of studies that misinformation researchers could conduct if the community had better access to platforms’ data and processes. Based on the contents of the submissions, we have grouped these brief excerpts into five areas that could be improved, and we conclude with an excerpt regarding the importance of data sharing:

measurement and design,
who engages with misinformation and why,
unique datasets with increased validity,
disinformation campaigns,
interventions, and
the importance of data sharing.

While these excerpts are not comprehensive and may not be representative of the field as a whole, our hope is that this multi-authored piece will further the conversation regarding the establishment of more evenly distributed collaborations between researchers and platforms. Despite the challenges, on the other side of these negotiations is a vast array of potential discoveries that are needed by both the nascent field of misinformation and society.
