The Augmented Social Scientist: Tutorial at IC2S2

Étienne Ollion and Rubing Shen will offer an introduction to supervised deep learning models at the next IC2S2 in Copenhagen, on July 17, 2023. [Link to all tutorials, check them out]

Based on a paper recently published in Sociological Methods & Research, which demonstrated experimentally that an expert can train an efficient automatic classifier in a limited amount of time, we will organize this tutorial around three parts:

  1. The first part makes a case for the wide use of sequential transfer learning in the humanities and social sciences. Reviewing the literature on the topic, from classic work to recent developments, we show the promise of these approaches. Not only can a social scientist train an algorithm that correctly annotates hundreds of thousands of texts in a limited amount of time, but the algorithm often does so better than a human (who can get tired, bored, or inattentive).
  1. The second part will be fully hands-on. We will demonstrate in practice how to use a BERT model on text data. To do so, we will rely on an updated version of the Google Colab notebook we created to accompany the release of the paper, which will help us walk participants through each step of the analysis. We will replicate some analyses from the paper, and we will show participants how to work with their own data.
  1. The third part of the tutorial will discuss practical questions that emerge while carrying out annotation: What to annotate (sentences, paragraphs, articles)? How to create a well-defined indicator? When and how to use active learning? How to tune one's model? What are the classic mistakes, and how can one avoid them? We will conclude with a few words on when not to use transfer learning, briefly discussing the downsides of the approach.
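To give a flavor of the hands-on session, the workflow can be sketched as follows: an expert annotates a small set of texts, the labels are encoded, and a pretrained BERT-style model is fine-tuned on them with the Hugging Face `transformers` library. Everything below is illustrative: the toy texts, the `political`/`non-political` labels, and the `fine_tune` hyperparameters are placeholders, not the tutorial's actual data or settings.

```python
import random

# A toy annotated corpus: (text, label) pairs produced by an expert coder.
annotations = [
    ("The senator criticized the new budget proposal.", "political"),
    ("The recipe calls for two cups of flour.", "non-political"),
    ("Parliament voted on the immigration bill.", "political"),
    ("The team won the championship last night.", "non-political"),
]

# Map string labels to integer ids, as sequence-classification models expect.
labels = sorted({label for _, label in annotations})
label2id = {label: i for i, label in enumerate(labels)}
id2label = {i: label for label, i in label2id.items()}

# Hold out a share of the annotations to evaluate the trained classifier.
random.seed(42)
shuffled = annotations[:]
random.shuffle(shuffled)
split = int(0.75 * len(shuffled))
train_set, eval_set = shuffled[:split], shuffled[split:]


def fine_tune(train_set, eval_set, model_name="bert-base-uncased"):
    """Fine-tune a pretrained model on the expert annotations.

    Requires `pip install transformers datasets torch` and downloads
    pretrained weights, so it is kept out of the module-level code.
    """
    from datasets import Dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=len(label2id),
        id2label=id2label, label2id=label2id,
    )

    def to_dataset(pairs):
        # Tokenize the raw texts into the fixed-length inputs BERT expects.
        ds = Dataset.from_dict({
            "text": [text for text, _ in pairs],
            "label": [label2id[label] for _, label in pairs],
        })
        return ds.map(lambda ex: tokenizer(
            ex["text"], truncation=True, padding="max_length", max_length=64))

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="bert-annotator", num_train_epochs=3),
        train_dataset=to_dataset(train_set),
        eval_dataset=to_dataset(eval_set),
    )
    trainer.train()
    return trainer
```

In practice one would annotate far more than a handful of examples, but the structure stays the same: the held-out evaluation set is what lets the researcher check that the classifier matches (or beats) human coding before scaling it to the full corpus.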

Format: Each of the three segments will take about 50 minutes, followed by a 10-minute break.