Critical Data Scientists at Work: Summary report of the ICWSM-2019 Workshop on Critical Data Science

This document summarizes the activities and outcomes of the Workshop on Critical Data Science at ICWSM-2019 in Munich, Germany, and points to future directions for work in critical data science. It is open to all of your comments and suggestions, so please send us your feedback and additions! Email criticaldatasci2019@gmail.com, or fork this page at https://github.com/critical-data-science/critical-data-science.github.io/edit/master/index.html.

–Katja Mayer and Momin M. Malik, 01 September 2019

Overview

In an early suggestion of the term “critical data science,” Jo Bates (2016) writes:

“New data science techniques offer immense potential for scientific advancement and human development – there may even be a role for data science in advancing the democratic project. However, in order to ensure that these advances benefit all, rather than empower the few, it is crucial that data scientists work collaboratively with others to incorporate an analysis of power into their practice.”

Like any other human endeavor, data science will be what we make it. ICWSM, having brought together social sciences and computer science over the past 13 years, is a prime space for scholars from different disciplines to engage one another around frameworks for responsibly carrying out data science on social phenomena.

We define critical data science as our vision of the practice of working with and modeling data (the “data science”), combined with identifying and questioning the core assumptions that drive that practice (the “critical”)—not just looking at the world, but “back[ing] up and look[ing] at the framework of concepts and assumptions and practices through which [we] look at the world” (Agre, 2000). It can be seen as the intersection (or perhaps the union) of data science and critical data/algorithm studies, and an example of a “critical technical practice” (Agre 1997).

The workshop arose from the premise that only through combining cultures of critique with those of practice can we create responsible and sustainable ways of interdisciplinary collaboration. The workshop aimed to create a space for such a combination, and to explore what it might contain: what will it look like to do data science with awareness of power relations? Who might make up critical practitioners of data science, or how might more emerge? What sorts of communities, coalitions, and collaborations need to exist between technical practitioners and nontechnical analysts, or between those working in academic research, industry, government, or in NGOs?

About this document

This serves as an overview and summary of the workshop, but we are also keeping this open as a living document. We invite workshop participants and others to add to this, especially in adding concrete actions to carry out in our professional and scholarly practices and by which to engage at our institutions/organizations.

Presentations

The workshop included short presentations by participants to support reflection on their own and neighboring scientific practices, and to create opportunities for further cooperation. Participants covered a broad range of backgrounds: industry data science and engineering, computer science, computational social science, linguistics, classics, environmental and human rights activism, social work, digital democracy, and arts.

Philippe Saner discussed ideas of data science in education. He observed that “the cultural framing of the ‘sexiness’ of data science… by industry transforms [claimed] neutrality into a prospective vision enabling students to see themselves as future ‘societal leaders’, thus in specific positions of power.” As data serves as “space between fields”, he argued for the possibilities of exploring these spaces rather than ignoring them, or leaving them to corporate logic and engineering tradition.

Jared Moore critiqued the use of the “social good” label of many current AI initiatives, noting that it serves to distract from intrinsic ills that come along with the mass deployment of resources (in energy, education, and labor) while generally failing to have a coherent standard of what constitutes “social good” to even assess whether it has been met. He proposed “AI for not bad” as a more honest label for those computer scientists who want to distinguish their work from the amoral mainstream but do not want to commit to the political stances necessary for coherently working towards positive social change.

Publication: Moore, Jared. (2019). “AI for not bad.” [Research Topic: Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media.] Frontiers in Big Data. doi: 10.3389/fdata.2019.00018

Parvathi Subbiah discussed her UK-based research on support for the Chavismo movement in her native Venezuela, focusing on the barriers she has faced: lack of access to training, disciplinary skepticism and an inability to offer guidance, and transnational controls (the UK Foreign Office banning travel to Venezuela). She discussed how online articles of support for Chavismo among non-Venezuelans opened up areas of inquiry in place of the field work she was unable to do, but at the same time lack critical context in themselves. One sentiment analysis tool developed for the US context linked terms such as “state” or “worker” to socialism, which led to over-estimates of the prevalence of discussions of socialism and missed prominent themes of anti-imperialist sentiment. By using mixed-methods research, in particular interviews with non-Venezuelan supporters of Chavismo and Venezuelans living abroad, she has been able to identify and overcome the problems of relying on online records alone.

Jaclyn Sawyer discussed her experience of data science in social work, the discipline that perhaps has the greatest claim to have systematically thought about and been devoted to the practice of “social good” (D’Ignazio, 2018; Patton, 2019). She reflected on her experience “working at the intersection of social welfare, data, and technology, a space that gives voice to experts of the human domain in the digital realm.” Jaclyn herself, like many other workshop participants, builds on rich inter- and transdisciplinary experiences, with her own background stretching across public policy, social work, and data science. With this, she described her work on the front lines of using data science for providing social services in the Data Services and Program Analytics department at Breaking Ground, a non-profit providing homeless street outreach and affordable housing opportunity in New York City. She described what could perhaps be an exemplar of responsible modeling, design, and implementation practice: starting not opportunistically from (patchy, inconsistent, and low-quality) available data, but starting with stakeholders and the perspectives of those they serve, and working with a team to laboriously build a whole pipeline of data collection, modeling, and application around homelessness and housing insecurity. Some particularly admirable parts of the program are that data collection only happens after trust-building with the program’s clients, that data cleaning is a major part of the pipeline, and that the project has gone through multiple development cycles to improve and be responsive.

Tea Brasanac reported on her project of “Visual anonymity and data privacy.” As she pointed out, there is a long journalistic tradition of hiding people’s faces to allow them to maintain anonymity while speaking for themselves on video. When computational tools became able to reliably detect faces, an obvious and relatively easy next step would have been automated, real-time blurring of faces while recording video; yet there is not a single available tool that does this. Tellingly, all development past facial detection has been towards facial recognition. Existing tools for blurring require identifiable video to be uploaded for processing, which, as she learned from interviewing refugees, is not good enough: her interviewees did not want identifiable recordings of themselves to ever exist. Identifying and theorizing this lack, she has also set about addressing it, demoing a tool for real-time facial blurring at the workshop.

Laura Schelenz presented questions on how to make data collection in the project “Internet of Us” more inclusive, diversity-oriented, and aligned with data protection as well as ethical principles. This was met with challenges to the setup of the project, as the methods of pursuing inclusion and diversity had not been developed in consultation with the Global South communities who were the intended end-users of the project outputs. This led to a robust discussion of how to effectively distribute the resources which we as researchers have access to within the constraints that come along with those resources, and how to best fight structural inequality with projects that are enabled by that very inequality.

Helena Mihaljević presented reflections on ongoing work with Christian Steinfeldt and colleagues on studying gender representation in mathematical publications. On the one hand, we know there are gendered asymmetries in professional opportunities and advancement in academic mathematics (as in every field and profession), and it is worthwhile and important to study these at scale. On the other hand, studying this at scale makes it infeasible to ask individual authors for their gender identification; and name-based automated gender recognition both denies the lived experience of trans and gender non-binary individuals and systematically fails on names of Chinese and Eastern European origin, challenges that are seldom discussed in the literature. The resulting discussion closely paralleled the postcolonial theory idea of “strategic essentialism”: the extent to which it is possible to strategically deploy essentializing categories, with known limits and injustices, for the purposes of fighting other injustices. [We thank Os Keyes for making this connection of strategic essentialism to uses of data and modeling.]

Publication: Mihaljević, Helena, Marco Tullney, Lucía Santamaría, and Christian Steinfeldt. (2019). “Reflections on gender analyses of bibliographic corpora.” [Research Topic: Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media.] Frontiers in Big Data. doi: 10.3389/fdata.2019.00029

Gabriel Pereira presented joint work with Annette Markham, experimenting with “algorithmic memory making.” In their art project, the Museum of Random Memory, they use models to deliberately distort recordings, drawing attention to how digital media are mediated in ways usually meant to be invisible and in so doing asserting control over the transmission of traces, stories, and narratives. Pereira connected this to the need for future-oriented ethics, an ethics of using data now while knowing that future consequences are unforeseen and unforeseeable, while questioning the ethical role of data collectors in presenting data as lived experience.

Publication: Markham, Annette and Gabriel Pereira. (2019). “Experimenting with algorithmic memory-making: Lived experience and future-oriented ethics in critical data science.” [Research Topic: Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media.] Frontiers in Big Data.

Non-attendee submissions

We had some submissions from participants who were not able to attend. We thank them for their participation nonetheless.

Eugene T. Richardson (Department of Global Health and Social Medicine, Harvard Medical School; Department of Medicine, Brigham and Women’s Hospital; and Partners In Health, Sierra Leone) submitted a draft chapter, “Immodest Causal Inference” from a forthcoming book. The formal representation of causality via directed acyclic graphs has been a topic of growing interest and excitement within data analysis, and indeed presents a counterpoint to the “correlation-only” mainstream of machine learning and its many dangers. But in this chapter, Richardson critiques some of the implications of this formal language. Namely, it leads us away from understanding structural factors (as, within the language of causal graphs, they can be ignored if they are anything but a direct ancestor within a causal graph), thereby leading away from solidarity and radical interventions. We look forward to the finished work as a powerful example of critically examining the technical limitations and psychological implications of a burgeoning paradigm.

Íñigo Martínez de Rituerto de Troya (Data Science for Social Good Europe, Universidade Nova de Lisboa) submitted a statement describing his work “working with the Portuguese national Institute for Employment and Professional Development to help unemployed individuals find work or undergo professional, vocational, or personal training.” He describes trying to balance the use of modeling with skepticism about attempts to use technical means to address social problems: his current practice includes trying to find ways to co-create predictive systems with those whom the systems will affect, and reflecting on possible negative impacts.

Michael Castelle (Centre for Interdisciplinary Methodologies, University of Warwick) submitted a statement, “Towards a 21st-Century Critical AI: Methods for a Reflexive Deep Learning Practice”, describing his research into epistemic transformations around convolutional and recurrent neural networks. These methods will “likely pose a conceptual threat to traditional disciplinary methods (as well as a financial threat for competitive grants)”, even more than computational social science or digital humanities thus far; how might we address this? He anticipates, and encourages, the creation of a “trading zone” by which techniques other than just computer science, statistics, and mathematics be brought to bear to understand neural networks. Namely: history and sociology of science, knowledge, and technology, as well as the study of semiotics, and the “anthropological study of cultural and linguistic ideologies”. He described a proof-of-concept contribution to the 2018 Empirical Methods in Natural Language Processing conference’s 2nd Workshop on Abusive Language Online, showing the downstream modeling effects of annotation by context-aware domain experts versus context-unaware non-experts.

Discussion

Workshop presentations and discussions both delved into how we can change our socio-technical practices very much in line with Agre’s (1997) call for a critical technical practice. In our account, as in his, three things are central.

First, critical technical practice requires deeply personal involvement with our scientific routines. Reflection and personal knowledge are a deep part of scientific practice, but scientific disciplines frequently justify themselves by claiming objectivity, neutrality, and universality, leaving little room for reflection (Polanyi, 1962). One of the most enlightening parts of the workshop discussion was asking the computer scientists in the room what biographical aspects or experiences led them to be open to non-technical perspectives, unlike many technical practitioners; for many, it was personal connections or commitments to political projects.

Second, Agre identifies an experience that we suspect is increasingly common: technical practitioners, those whose day-to-day work or research involves activities like math or coding for doing quantitative analysis and building systems, find not only fundamental limitations in their disciplines but also find themselves lost for how to seek answers. Agre describes his own process of looking to the humanities and social sciences, and experiencing a sense of vertigo when he finally learned to read other disciplines in their own terms rather than trying to translate them into the specification of a technical mechanism or a formal procedure.

Lastly, Agre envisions technical practitioners not abandoning their practices after discovering profound limits, but starting to carry out the practices from a fundamentally different foundation: a critical technical one. There is normally a dichotomy between the social scientific “analysts” who produce critical accounts of the ideas and practices of science, and the scientific “actors” who produce those ideas and practices (Collins, 2008); Agre’s call suggests the possibility of a hybrid identity between these two. This theorizes some current practices and future possibilities within communities like the FAT* conference, or niches of critical data and algorithm studies, surveillance studies, and data activism; and it perfectly captures a number of projects presented at the workshop that remain technical products or analyses, but are grounded by something far deeper than the under-theorized pragmatism that drives so much of software engineering and data analysis.

Outputs and future activities

During the workshop we compiled two blocks of questions that could guide our own personal agenda setting.

Politics: What are our experiences of paradigmatic politics? Who are the insiders, and who are the outsiders for effecting change? Do we feel capable of intervening in curricular decision-making, and can we disrupt dominant narratives of big data hegemony, efficiency and objectivity? What does it mean to do data science for good, for whom? What would be my personal priorities: short term, and long term?
Practice: What concrete actions can we take? How can we create spaces and time for collaboration besides always-hectic, project-based logics? Which incentive and reward structures would we need for that? Which skills do we want to establish in the training of the next generation? How can I/we collaborate? With whom? For what tasks?

Workshop participants developed first ideas on how to design critical technical practice in data science:

Systematic reflection. An important component is systematic reflection on the issues we face, and the development of best practices for future projects. Potential task: create a template for such systematic reflection (e.g., an application of such a template might be a set of principles of good data science scholarship).
Participatory Action Research. Philippe Saner considers participatory action research (PAR) projects “as a possibility that brings together the different cultures of critique involved to investigate the ‘practices’ (modeling techniques, methods, tools etc.), discourses (framings, imaginations, visions etc.), and structural conditions of data science as a contemporary knowledge formation.” Multiple workshop participants agreed with PAR as an extremely promising framework, as indeed is also being recognized and brought to bear in the world of technology design (Costanza-Chock, 2018; Costanza-Chock et al., 2018).
Building teams. Jaclyn Sawyer’s observations on interdisciplinary practice could serve as a model for building data teams across sectors.
Venues for publishing. We need more white papers for practitioners, and more publishing / exchange formats for transdisciplinary understanding.
Linking sectors. How do we link sectors—non-profits, academia and corporate interests? There are tons of opportunities for data scientists to lend their ambition to nonprofits, such as in deep data dives. But this is not sustainable, as volunteering data scientists move on. What are the incentives?
Education. We have to change the education and training. How do we do this in our ecosystems?
Institution-building. Instead of starting our own efforts, we can acknowledge those who are already doing relevant work in this space. Can we attach our efforts to organizations of critical researchers, who are already advocating change? How do we identify such organizations and choose which ones to join?
Documentation. Making workflows more open and better documented. Which would be the right tools for this?
Ethical principles. We could adopt FAIR data principles (https://www.go-fair.org/fair-principles/), as well as reflect ethical concerns more openly, while understanding that fairness is not something universal that could simply be built into technology.
Funding. One topic that came up, especially in the US case, is the role of military funding for computational research. Even if people were not actively changing their research and tailoring it to military priorities to attract funding (which almost certainly happens), there would still be a selection effect: research comporting with military goals gets disproportionately supported. There was also little disagreement that critical data science will not fit with military priorities. Can funding instruments better mandate ethical regulation? How can scholars, whether in the US or elsewhere, gain funding for work that challenges structures of power, not just from the military, but from corporations, or foundations with no public accountability? Or if not, what is the alternative: are there levels and types of compromise we should accept?

The slides and discussion cards can be found here: https://critical-data-science.github.io/wcds2019slides.pdf.

And we have also started a reading list via Zotero: https://www.zotero.org/groups/2282959/critical_data_science/items

Acknowledgements

There are a number of scholars at the forefront of combining practice and critique, and we were fortunate to have the guidance and input of several of them, who served as our reviewers. Alphabetically by last name, thanks to:

Doris Allhutter, political scientist and STS scholar with a focus on software development;
Catherine D’Ignazio, Feminist Human-Computer Interaction scholar at MIT and co-author of the forthcoming Data Feminism;
Claire Donovan, cross-disciplinary scholar in research evaluation and policy;
Mary Gray, Senior Researcher at Microsoft Research New England, and co-author of Ghost Work;
Nick Seaver, ethnographer at Tufts University and co-compiler of “Critical Algorithm Studies: A reading list”; and
Luke Stark, media studies scholar at Microsoft Research Montreal.

Thanks to Katie Shilton and Casey Fiesler (respectively, information scientist at the University of Maryland, College Park and social computing researcher at University of Colorado Boulder, and co-organizers of the 2018 ICWSM workshop “Exploring Ethical Trade-Offs in Social Media Research”) for their guidance when proposing the workshop.

For additional input, we also thank Ben Green (PhD candidate at Harvard University, visiting researcher at AI Now, and author of “Data Science as Political Action” and The Smart Enough City), Jonnie Penn (historian of science at the University of Cambridge and co-organizer of the History of AI conference series and community), Amy Johnson (digital STS scholar and linguistic anthropologist), and members of the Ethical Tech Working Group at the Berkman Klein Center for Internet & Society at Harvard University.

A special thanks to ICWSM-19 Local Chair Mirco Schönfeld for his organizational efforts on the workshop day, as well as to General Chair Jürgen Pfeffer for his enormous efforts to make the workshops accessible. His success at lowering workshop costs and defraying costs for attendees made it possible for scholars outside of computer science to attend the ICWSM workshops, without which this workshop would not have been nearly the success it was.

We further thank our colleagues Claudia Müller-Birn and Hemank Lamba for their valuable inputs, reviews, and technical assistance, and regret that they were prevented from attending and participating due to external circumstances.

What science becomes in any historical era depends on what we make of it.

– Sandra Harding, 1991

References

Agre, Philip E. (1997). “Towards a critical technical practice: Lessons learned from trying to reform AI.” Social science, technical systems, and cooperative work: Beyond the great divide. Ed. by Geoffrey C. Bowker, Susan Leigh Star, Will Turner, and Les Gasser. Mahwah, NJ: Lawrence Erlbaum Associates, pp. 131–158.

Agre, Philip E. (2000, July 12). “Notes on critical thinking, Microsoft, and eBay, along with a bunch of recommendations and some URL’s.” Red Rock Eater News Service. https://pages.gseis.ucla.edu/faculty/agre/notes/00-7-12.html.

Bates, Jo. (2016, January 12). “Towards a critical data science – the complicated relationship between data and the democratic project.” LSE Impact Blog. https://blogs.lse.ac.uk/impactofsocialsciences/2016/01/12/towards-a-critical-data-science-data-and-the-democratic-project/.

Collins, Harry. (2008). “Actors’ and analysts’ categories in the social analysis of science.” Clashes of knowledge: Orthodoxies and heterodoxies in science and religion. Ed. by Peter Meusburger, Michael Welker, and Edgar Wunder. Springer, pp. 101–110.

Costanza-Chock, Sasha. (2018, July 16). Design justice, A.I., and escape from the matrix of domination. Journal of Design and Science, 3 (5). https://doi.org/10.21428/96c8d426.

Costanza-Chock, Sasha, Maya Wagoner, Berhan Taye, Caroline Rivas, Chris Schweidler, Georgia Bullen, and the Tech for Social Justice Project. (2018). #MoreThanCode: Practitioners reimagine the landscape of technology for justice and equity. Technical Report. Research Action Design & Open Technology Institute. https://morethancode.cc.

D’Ignazio, Catherine. (2018, September 2). “How might ethical data principles borrow from social work?” Medium. https://medium.com/@kanarinka/how-might-ethical-data-principles-borrow-from-social-work-3162f08f0353.

D’Ignazio, Catherine, and Lauren Klein. (2019). Data feminism. MIT Press. https://bookbook.pubpub.org/data-feminism.

Gray, Mary L., and Siddharth Suri. (2019). Ghost work: How to stop Silicon Valley from building a new global underclass. Houghton Mifflin Harcourt.

Green, Ben. (2018). “Data science as political action: Grounding data science in a politics of justice.” https://arxiv.org/abs/1811.03435.

Green, Ben. (2019). The smart enough city: Putting technology in its place to reclaim our urban future. MIT Press.

Harding, Sandra. (1991). Whose science? Whose knowledge? Thinking from women’s lives. Cornell University Press.

Patton, Desmond U. (2019, March 24). “Why AI needs social workers and ‘non-tech’ folks.” Noteworthy – The Journal Blog. https://blog.usejournal.com/why-ai-needs-social-workers-and-non-tech-folks-2b04ec458481.

Polanyi, Michael. (1966). The tacit dimension. Doubleday.

Overview

We invite participation in the ICWSM Workshop on Critical Data Science, held on 11 June 2019 in Munich, Germany at the Thirteenth International AAAI Conference on Web and Social Media (ICWSM-2019).

What is “critical data science”?
We define this as our vision of the practice of working with and modeling data (the “data science”), combined with identifying and questioning the core assumptions that drive that practice (the “critical”). It can be seen as the intersection (or perhaps the union) of data science and critical data/algorithm studies, and an example of a “critical technical practice” (see workshop proposal).

What is this workshop about?
We seek to bring together data scientists and scholars from computer science and the social sciences to engage one another around frameworks for responsibly carrying out data science on social phenomena. By combining cultures of critique with those of practice, we seek to create critical and sustainable ways of interdisciplinary collaboration. The workshop will involve short reflective presentations by participants, combined with a creative group-based activity to further support reflection on their own and neighboring scientific practices, and to create opportunities for further cooperation.

Who is this workshop for?
Our main audience is data scientists attending ICWSM who are discontent about the state of data science practice, and suspect the problems are deep-seated and can only be addressed by a major change in how we see the world. We also welcome social scientists (e.g., those in sociology or STS), including those who may not be attending the main ICWSM conference, whose work involves working with data scientists.

Who is organizing this workshop?
Four scholars working across computer science and social science. See the organizers page for biographies.

What kind of submissions are accepted?
We are looking specifically for personal reflections and position pieces from data science practitioners or those who interact with or study practices of data science. Reflection and position-taking are integral parts of the scientific process and the development of disciplines, but there are often no venues in which researchers can get credit for doing them. This workshop seeks to be such a venue, where participants can present their own thoughts and experiences as well as work with others to deepen their own reflective practice. We accepted submissions of up to 4,000 words with the option of publication, and are still accepting shorter pieces for participation, with optional inclusion on the workshop website. See the submissions page for details.

Where will the work be published?
Workshop attendees will have the option of having accepted archival submissions published in Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media, a special collection of the journal Frontiers in Big Data.

How much does this cost?
While full ICWSM registration is expensive (registration fees are USD $724 for non-students and USD $384 for students), it is possible to register for only the workshop. The workshop cost is USD $75 for non-students and USD $50 for students. For those registering for the full conference, workshop registration is included.

Frontiers in Big Data is Open Access and has costs for publishing, but for any submitting authors who do not already have institutional support to cover this, the fees are being waived. See publishing costs.

What is ICWSM?
The International AAAI Conference on Web and Social Media, now in its 13th year and hosted by the Association for the Advancement of Artificial Intelligence (AAAI), is a computer science conference that has integrated social science and involved social scientists from the start. It is a venue that provides presentation opportunities for social scientists, and the work there is increasingly rigorous from both modeling perspectives and social science perspectives. As such, it is an excellent venue for modelers with critical views to find one another and organize.

Important Dates

April 3, 2019: Workshop Submission Deadline
April 15, 2019: Open Access Waiver Deadline
April 15, 2019: ICWSM Early Registration Deadline
May 12, 2019: Notification with comments
May 17, 2019: ICWSM Late Registration Deadline
June 11, 2019: Workshop (2-6 PM)
June 11, 2019: Main ICWSM Conference

Call For Papers

We invite participation in the Workshop on Critical Data Science, taking place on June 11, 2019 at the 13th International AAAI Conference on Web and Social Media (ICWSM-2019) in Munich, Germany.

The social world is far messier than technical training prepares one for. Among data scientists trained in fields like computer science and statistics are those experiencing a sense of vertigo: we start to realize both the ways in which modeling breaks down on human beings, requiring different notions of rigor, and the potentially negative social impacts of modeling, requiring responsible engagement and activity.

We define “critical data science” as our vision of the practice of working with and modeling data (the “data science”), combined with identifying and questioning the core assumptions commonly underlying that practice (the “critical”). The workshop seeks to combine cultures of critique with those of practice, bringing together data scientists and scholars from computer science and the social sciences around responsibly carrying out data science on social phenomena, and creating sustainable frameworks for interdisciplinary collaboration.

The workshop will involve short reflective presentations by participants, combined with a creative group-based activity to further support reflection on their own and neighboring scientific practices and to create opportunities for further cooperation. The workshop will conclude with a wrap-up for collecting resources, discussing future outcomes, and producing a draft compilation of best practices and a list of priorities for further engagement.

We are still accepting non-archival 2-page statements of interest or motivation. Accepted archival papers will be published in ICWSM Workshop Proceedings, a special issue of the journal Frontiers in Big Data. Open Access publishing costs are being waived for authors without institutional support for covering these fees. See the submissions page for details of submission types and instructions.

Relevant topics include:

What should be the standards and practices, both of methodological rigor and of respect for subjects, when carrying out computational research on social systems?
What role can discussions of methods and instruments play in larger critiques of the limitations of data science?
What are points of fundamental disagreement or diverging orientations/priorities between disciplines?
What can we learn from the long tradition of critical scrutiny in statistics?
What combinations of experiences and/or readings have led data scientists to recognize, and perhaps even adopt, ‘non-technical’ ways of framing the world? How do and can these ways of knowing interact with a modeling approach?
What philosophical commitments or normative orientations, if adopted by data scientists, would produce a principled data science? How can those be realized in interdisciplinary teams?
What might it look like to use modeling critically and reflexively, or to contextualize what we can or cannot know from modeling from within the modeling process?
What can we learn from works looking at the social impact of implemented model-based systems?
What sorts of practices, coalitions, and collaborations can include marginalized voices in data science rather than exclude them?
Beyond a space for critical reflection, what can be the positive project of a critical data science?
How can we design collaborations in critical data science?

For more details, please see the workshop proposal.

Submissions

Submissions should take a critical, reflexive stance, and also may provide an outlook on how to tackle a topic, such as a particular ethical or methodological challenge in an interdisciplinary setting.

Submission types

Submissions may be either non-archival 2-page statements of interest or motivation, or archival papers up to 4,000 words. Accepted archival papers will be published in ICWSM Workshop Proceedings, a special issue of the journal Frontiers in Big Data. Open Access publishing fees will be waived for authors without institutional support for covering these fees.

All submissions must be in English.

Statement of interest or motivation (up to 2 pages)

Statements of interest or motivation should not exceed 2 pages and may be submitted in any format. They will be non-archival (not included in the ICWSM Workshop Proceedings), but if accepted, the author(s) have the option of publishing here on the workshop website. Topics chosen should resonate with the relevant topics listed in the Call for Papers. The statement should include an explanation of interest in the topics and why participation in the workshop is desired.

Full paper (up to 4,000 words)

Full papers should be 500 to 4,000 words, according to Frontiers guidelines, detailed at https://www.frontiersin.org/about/author-guidelines. Authors should submit as a “Perspective” article type. If accepted, these submissions have the option of being included in the Workshop Proceedings of the 13th International AAAI Conference on Web and Social Media published by Frontiers in Big Data.

For the specific layout, typesetting, and format, if you would like a template, here are quick links:

LaTeX template: http://www.frontiersin.org/design/zip/Frontiers_LaTeX_Templates.zip
Word template: http://www.frontiersin.org/Design/zip/Frontiers_Word_Templates.zip

According to the Frontiers site, “These templates are meant as a guide, you are of course welcome to use any style or formatting and Frontiers journal style will be applied during typesetting”; accordingly, while we request using the templates above, we will accept other styles/formatting.

In addition to addressing the relevant topics listed in the Call for Papers, papers may be:

Reports of best practices in regard to responsible data science
Descriptions of novel interdisciplinary settings and methodologies (e.g. participatory or citizen science settings), supported by the authors' own prior work or a short description of the state of the art
Case studies of social, ethical or legal challenges faced
Frameworks and principles for active and responsible engagement with stakeholders potentially affected by applications of data science

Please consult the full workshop proposal for more details, including examples of reflections/position pieces.

Evaluation and selection

Submissions will be evaluated on the basis of their fit to the workshop theme and will be reviewed by the workshop organizers, involving external reviewers when necessary. Selections will be made on the basis of the number of submissions, with priority given to inclusion.

Submission instructions

Submit via the "Submit your manuscript" link at the top of the page at https://www.frontiersin.org/research-topics/9706 (registration for Frontiers required).
When submitting, please select one of the workshop organizers as the “preferred editor” (Momin M. Malik, Katja Mayer, Hemank Lamba, or Claudia Müller-Birn).
If a statement of interest or motivation, please submit as a “General Commentary.”
If a full paper, please submit as either a “Perspective” (length limit 3,000 words) or a “Brief Research Report” (length limit 4,000 words). Note that you are welcome to make submissions substantially shorter than this.
If you would like the submission to be non-archival, please indicate this somewhere in the paper.
Note that the submission site only allows submission of images as TIFF. If you have EPS, PDF, PNG, or other format images, please submit them as supplementary material.
NO AUTHOR SHOULD HAVE TO PAY PUBLISHING FEES. Under “Payment and funding information,” if your institution is not a Frontiers institutional member, you may temporarily enter your own information for an “Individual payer.” Please ignore any emailed invoices, and please check “I and my fellow co-authors are fully aware of and agree with the payment of the listed article processing fee should the manuscript be accepted for publication.”
If you are not comfortable with temporarily entering your information or checking the agreement for payment, or if you have ANY other issues with the submission site, you may alternatively email your paper, for full archival consideration (unless you indicate otherwise), to by the deadline.

Dates

Submissions due April 3rd, 2019 (extended from March 25th, 2019) at 23:59:59 Anywhere on Earth (UTC-12).
Acceptances and comments will be given by April 12th, 2019.
Final edits (camera-ready submission) are due April 22nd, 2019.
The workshop is held in Munich, Germany on June 11th, 2019. Please consult the ICWSM site for information about travel, the venue (note: this workshop is in a different location than the main conference!), accommodation, and registration. Note that workshop registration is independent from the main conference.

Publishing costs

Submissions are by standard procedures, which indicate costs. However, workshop paper authors will NOT have to pay for publishing.

If your institution has an institutional agreement with Frontiers, publishing costs are covered.
If your institution has general or grant support for Open Access fees, but does not have an institutional agreement with Frontiers, you will pay the normal publication fee and grants/your institution will cover the cost.
If you do not have institutional support for Open Access:

By April 15, 2019, please complete the Waiver Application form online.
While submitting this form, also contact the editorial office at and note your participation in ICWSM 2019.
Frontiers will review and reply to the request within two weeks. Information about waiver applications is not disclosed to the editors or reviewers.

Registration

Online registration is now open. Please see https://icwsm.org/2019/attending/registration/ for details, and register at https://aaaiconf.cvent.com/icwsm19. ICWSM Workshop registration is separate from registration for the main ICWSM conference, so attendees do not need to pay the conference costs if only attending the workshop. The following fees apply for workshop registration:

Regular: USD $75
Student: USD $50

We are exploring sponsorship to help cover costs for attendees outside of computer science.

Program

Date: June 11, 2019
Time: 14:00 - 18:00 (2pm - 6pm)
Location: https://goo.gl/maps/pSmfAq8qYTE2

Bavarian School of Public Policy @ Technical University of Munich
Richard-Wagner-Straße 1
80333 Munich
Tel.: +49 89 907 793 0
http://www.hfp.tu.de/en/home

Note 1: in German, “ß” = “ss”, so “straße” = “strasse” (street). It will also work to search for “Richard-Wagner-Strasse 1, Munich”.

Note 2: the German name is “Hochschule für Politik München”, which would be translated as “Munich College of Political Science” (and which is what appears on Google Maps), but the official English name is instead the “Bavarian School of Public Policy”. If in doubt, search for Richard-Wagner-Strasse 1, Munich.

Final Schedule (PDF) available at https://projects.iq.harvard.edu/files/critical-data-science/files/wcds2019_schedule.pdf

Programme	Time	Description
Introduction	14:00-14:15	Framing, objectives, and format
Participant Introductions	14:15-14:30	One-sentence intro from participants about what brings them here
Politics	14:30-14:40	Reimagining Data Science as Political Space. Philippe Saner (Department of Sociology, University of Lucerne, Switzerland)
	14:40-14:50	AI for Not Bad. Jared Moore (Wadhwani Institute for Artificial Intelligence, India)
	14:50-15:00	Researching Chavismo in a Convoluted Venezuela. Parvati Subbiah (University of Chicago, US)
Discussion	15:00-15:15
Practice	15:15 - 15:25	Towards a More Holistic Data Practice. Jaclyn Sawyer (Breaking Ground, New York City and Columbia University, US)
	15:25 - 15:35	Visual Anonymity and Data Privacy. Tea Brašnac (Etkimo, Slovenia)
Discussion	15:35 - 15:45	How do we forge cross-sector communities and support one another?
Break	15:45 - 16:15
Action	16:15 - 16:25	Data for Diversity-Aware and Non-Discriminatory Technology: Ethical Questions from the Project WeNet – the Internet of Us. Laura Schelenz (International Center for Ethics in the Sciences and Humanities, University of Tübingen, Germany)
	16:25 - 16:35	Reflections on Gender Analyses of Bibliographic Corpora. Helena Mihaljević (University of Applied Sciences, Germany), Marco Tullney (German National Library of Science and Technology), Lucía Santamaría (Amazon Development Center, Germany), Christian Steinfeldt (University of Applied Sciences, Germany)
	16:35 - 16:50	Experimenting with Algorithmic Memory-Making: Lived Experience and Future-Oriented Ethics in Critical Data Science. Annette Markham and Gabriel Pereira (Aarhus University, Denmark)
Discussion	16:50 - 17:10	Where do we take action, and what action should we take?
Mini-Break	17:10 - 17:15
Conclusion	17:15 - 17:45	Incentives, structures, and organizing
Wrap-up	17:45 - 18:00	Summary and future directions

Organizers

To email all the organizers about the workshop, please use .

Dr. Momin M. Malik is the Data Science Postdoctoral Fellow at the Berkman Klein Center for Internet & Society at Harvard University. He holds a PhD in Societal Computing and a Master’s in Machine Learning from the School of Computer Science, Carnegie Mellon University, an MSc in Social Science of the Internet from the Oxford Internet Institute, and an undergraduate degree in history of science from Harvard University. He was a 2017 Data Science for Social Good fellow. His dissertation work attempts to lay out a research agenda for using modeling in critical and reflexive ways, as well as connect this agenda to relevant precedents and parallel projects. During his PhD, he was a facilitator for Bias Buster @ CMU, a program for inclusivity workshops based on materials from and made in collaboration with Google Pittsburgh. He helped manage logistics to run a “Train the Trainers” one-day workshop at Google Pittsburgh with an attendance of about 50, and also was part of 4-facilitator teams running Bias Busters Train the Trainers sessions at the Tapia and WEPAN conferences.
Website: https://www.mominmalik.com
Frontiers Profile: https://loop.frontiersin.org/people/672765/
Twitter: @MominMMalik

Dr. Katja Mayer trained as a sociologist and works at the intersection of science, technology, and society. She studies the interactions of social scientific method and its publics. Currently she is investigating open practices in Computational Social Science and Big Data for her habilitation project at the Department of Social Studies of Science and Technology at the University of Vienna. Until 2019, she was a postdoc at the School of Governance, Technical University Munich. She also works as a senior scientist at the Centre for Social Innovation in Vienna, serves as an expert for the European Commission, and is an associated researcher for the Responsible Research and Innovation in Academic Practice platform at the University of Vienna. Furthermore, she has been teaching sociology of knowledge, STS, and critical data studies since 2008 at various universities, and was a visiting fellow at Carnegie Mellon University’s School of Computer Science. She is a core member of OANA (Open Access Network Austria) and co-leads the working group on defining a national strategy for the transition to Open Science. From 2011 to 2013, she was scientific advisor to the president of the European Research Council (ERC). She is co-editor of a forthcoming special issue on Critical Data Studies in Frontiers in Big Data.
Website: https://homepage.univie.ac.at/katja.mayer/
Frontiers Profile: https://loop.frontiersin.org/people/660454/
Twitter: @katja_mat

Hemank Lamba is a PhD student in Societal Computing and a Master’s student in Machine Learning at the School of Computer Science, Carnegie Mellon University. Previously, he was a Research Engineer at IBM Research Labs, New Delhi. His research focuses on understanding and modeling user behavior on social media, specifically characterizing deviant user behavior on these platforms and understanding the effects of such behavior on society. He has also been a fellow with multiple Data Science for Social Good initiatives (University of Chicago and IBM Research), where he has tackled problems related to food insecurity in the U.S. and understanding the ecospace of philanthropic projects. During his time in Pittsburgh, he was a board member for the student organization Students for Urban Data Systems (SUDS), facilitating student projects on non-profit organizations and the city’s open data. Hemank holds a B.Tech in Computer Science from IIIT-Delhi, India.
Website: https://sites.google.com/site/hemanklamba/
Frontiers Profile: https://loop.frontiersin.org/people/689744/
Twitter: @hemanklamba

Prof. Dr. Claudia Müller-Birn is the head of the research group Human-Centered Computing (HCC.lab) at the Institute of Computer Science at the Freie Universität Berlin. Before her appointment at FU Berlin, she undertook a post-doc at Carnegie Mellon University, based on a Feodor Lynen Research Fellowship of the Alexander von Humboldt-Foundation. Her interdisciplinary research advances the fields of Computer-Supported Cooperative Work (CSCW) and Social Computing. Her research entails both an empirical and an engineering dimension. One objective is to contribute to a value-based socio-technical systems design that fulfills the specific needs of an application area, such as ideation and visualization. In addition, Claudia advocates the use and development of open source software, the principles of open science in her research work, and open access to scholarly knowledge. She has served as (co)chair of a number of conferences such as ACM OpenSym.
Website: https://www.clmb.de/
Frontiers Profile: https://loop.frontiersin.org/people/687337/
Twitter: @clmbirn
