Ethical issues in digital data archiving and sharing

by Natasha Mauthner and Odette Parry on 15 October 2010

The development of a manifesto on ethics of e-research assumes that e-research differs from other research approaches, therefore requiring its own set of ethical guidelines. In considering the distinctiveness of e-research, it is helpful to differentiate between e-research as tool or method and e-research as site or ‘field’. Our own discussion focuses on e-research as method, though we recognise that the distinction is blurred given that e-research methods also re-constitute ‘the field’ in the form of e-data. We also restrict our contribution to one form of e-research, digital data archiving and sharing (see Mauthner and Parry under review). Data sharing has a long tradition in the social and natural sciences, and researchers recognise the scientific, moral, cultural and other benefits of preserving and sharing data. But the practice of data sharing is changing in significant ways. First, whereas data sharing was once a matter of individual choice, researchers are increasingly expected, and in some cases required, to share their data. Data sharing policies implemented by research funding agencies and scientific journals are making funding and publication, respectively, conditional on compliance with these policies. Second, the model of data sharing that is being promoted, and in some cases prescribed, is Open Access whereby data are expected to be made openly available to the maximum extent possible. Third, data sharing used to take place through personal exchange. Today, online databases mean that data can be instantly shared with the wider public and scientific communities. These combined changes mean that researchers are facing increasingly complex ethical dilemmas arising from their multiple, sometimes conflicting, responsibilities and obligations towards research respondents, co-researchers, the institutions that support research, science, and the general public.

One set of ethical issues concerns how data can be shared in the interests of science and society, whilst also protecting the interests of research participants. In the case of research involving human data, researchers typically feel a moral imperative to honour relationships of trust they have developed with respondents who have entrusted them with personal information. Researchers may use the data for the benefit of science and society, and to further their own careers, but many will do so only in a context where they feel they can safeguard respondents’ moral interests. Online open access data sharing can be perceived as threatening, or indeed violating, these personal and trust-based relationships and moral responsibilities because it entails a loss of control over users and usage of data, and therefore over respondent protection. The normative ethical framework implied by open access data sharing encourages researchers to seek respondents’ blanket informed consent to universal and unconditional use of their data as a preferential option. Researchers have some discretion to impose access restrictions, but policies and guidelines often frame these as secondary or last-resort options. This means that, in practice, researchers can provide only limited information and reassurances to their respondents about potential future uses and users of shared data. This can leave researchers with ethical concerns over exposing respondents to risk and uncertainty, and over the marginalisation of respondents’ moral and political rights to retain on-going involvement and decision-making powers in how their data will be used in the future. While anonymization is often put forward as a way of addressing issues of privacy and protection, in practice absolute anonymity may be impossible to achieve; and moreover, many see it as compromising the scientific integrity and value of the data.

A further set of ethical issues relates to the moral rights and interests of the researchers who generate the data, and how data generation, a skilled activity that requires time and career investments, will be recognised and rewarded in its own right. Within research teams, data collection efforts are usually, though not necessarily, recognised and rewarded through, for example, joint publications. However, there are currently limited guidelines or protocols for recognition and reward within the context of open access data sharing. This is of ethical concern particularly given that divisions of labour within research teams tend to be marked by power differentials (see Mauthner and Doucet, 2008; Mauthner and Edwards, 2010). For example, data tend to be produced by junior researchers, PhD students and/or technicians. Their structural positions and/or career stage may leave them vulnerable to lack of recognition and a limited say in data preservation and reuse. In international projects, data sharing may become a form of scientific neo-colonialism. While open access potentially provides postcolonial contexts with easy and cheap access to data generated elsewhere, researchers in these contexts may lack the necessary scientific, technical, digital or cultural resources to make effective use of the data. In practice, data users (e.g. from Western nations) may stand to gain more than data producers (e.g. from non-Western nations). Discourses equate open access data sharing with global empowerment of researchers and the democratization of knowledge. However, this can obscure the politics of collaborative knowledge production, and the ethical issues this raises.

Of equal ethical concern is the erosion of researchers’ moral and intellectual property rights that is accompanying the institutionalisation of open access data sharing. Data sharing policies and guidelines are redefining data as public rather than personal goods, and the moral case for data sharing is that it enables publicly-funded researchers to fulfil their moral duty and obligations to the public, by making available data that have been collected using public funds. This shift in definition allows research institutions (or the state) to assume authority over research data. This was evident in the recent case of Professor Mike Baillie, an ecologist from Queen’s University Belfast. In April 2010, Baillie was forced to release tree-ring data under the Freedom of Information Act. The Information Commissioner’s Office ruled that Queen’s University Belfast must release the data to the public because Baillie did all the work while employed at a public university. Baillie, however, claimed that the tree-ring data he had collected over a 40-year period were his own personal intellectual property. This raises questions over the ethics of mandating researchers to release data, all forms of which are inherently personal by virtue of their human production. In the social sciences, data are seen to be co-constructed by researchers and respondents, and they often contain much personal information relating to the researchers. Yet funding agencies are increasingly requiring researchers to share their personal data, rather than seeking their informed consent, as they do with research respondents. This raises ethical questions concerning the erosion of researchers’ autonomy and discretion to decide whether to share their data, with whom, when and how.

The ethical recommendations that emerge from the case of digital data archiving and sharing can usefully be considered in terms of three sets of relations: researchers’ relations with the public, with respondents, and with fellow researchers.

Relations between researchers and the public: One issue concerns our ethical responsibilities towards the public, which includes the state and public as funders of research as well as science communities. Data sharing policies and discourses suggest that where scientists are publicly-funded they have moral responsibilities and obligations to make their data and science publicly accessible and available to the maximum extent possible. This puts many researchers in a difficult ethical position, as they also feel that they have moral responsibilities towards their respondents and co-researchers. The ethical dilemma arises because data sharing policies and discourses are rooted within universalist (utilitarian and deontological) moral theories that privilege researchers’ ethical obligations to a universal and general public rather than towards specific individuals. Researchers have a responsibility to be reflexive about the moral (and scientific and political) norms and values embedded within e-research policies and practices, and their ethical consequences.

Relations between researchers and respondents: The ethical issues that arise in the context of digital data archiving and sharing stem in part from the way in which technology obscures the human practices, and severs the human relations, of knowledge production. This partly explains the economic appeal of e-research; its perceived efficiency and cost-effectiveness derive from the fact that researchers are working with data taken out of the human contexts of their production. Yet this instrumental approach to data (data as commodity), and their production and use, is precisely the source of researchers’ ethical concerns because many see the human context and care of data generation and knowledge production as critical to securing the epistemic, ethical and political integrity of knowledge. This explains why, for example, many researchers remain sceptical about using large scale data repositories and favour consultative arrangements, and direct contact and discussions with potential users.

Relations amongst researchers: Ethical frameworks and guidelines have traditionally focused on the protection of research participants, and the relationships that researchers develop with these respondents. The rights and interests of the researchers conducting the research, including collaborative relations amongst researchers, have largely remained invisible. We would recommend that a manifesto on ethics of e-research accord equal importance to the moral rights and interests of researchers, as it does to those of research participants.

