AI & Social Sciences Seminar

The AI & Social Sciences Seminar meets regularly throughout the year. The sessions where we welcome external speakers are open to the public via Zoom; for a link, please send an email to arnault.chatelain[at].


We are currently curating next year’s seminar. Stay tuned!


The seminars are at 5:15pm (CET) both at CREST and on Zoom.

26 June 2024 – internal seminar

12 June 2024 – Hannah Waight (NYU): “Propaganda Bias and Large Language Models”

Artificial Intelligence (AI) systems have been shown to display various social biases. While many such biases arise from content and data produced by individual internet users, we uncover a more insidious, centralized form of bias in AI – political biases that likely stem from government propaganda in the training data. Leveraging two unique datasets of Chinese propaganda news articles, we quantify the amount of propaganda in open-source training datasets for large language models (LLMs). We find large footprints of propaganda in the Chinese portions of open-source training datasets, especially for political topics. Using audit experiments with both human and machine evaluations, we document systematic differences in the output of LLMs in response to political questions – Chinese-language queries consistently generate more positive responses on Chinese political institutions and figures than the same queries in English. We further show evidence that the most widely used LLM systems to date memorize common propaganda phrases. In future versions of this paper, we will report on our pre-training experiments, demonstrating that the introduction of additional documents from the propaganda apparatus to pre-training can shape open-source LLMs to be more favorable to the Chinese government. While our evidence is primarily drawn from the Chinese case, our paper broadly introduces the possibility of propaganda bias – the potential for strategic manipulation of and unintended influence on LLMs through training data by existing political institutions.

29 May 2024 – Fabrizio Gilardi (U. Zurich): “Open-Source LLMs for Text Annotation: A Practical Guide for Model Setting and Fine-Tuning”

This paper studies the performance of open-source Large Language Models (LLMs) in text classification tasks typical for political science research. By examining tasks like stance, topic, and relevance classification, we aim to guide scholars in making informed decisions about their use of LLMs for text analysis. Specifically, we conduct an assessment of both zero-shot and fine-tuned LLMs across a range of text annotation tasks using datasets of news articles and tweets. Our analysis shows that fine-tuning improves the performance of open-source LLMs, allowing them to match or even surpass zero-shot GPT-3.5 and GPT-4, though still lagging behind fine-tuned GPT-3.5. We further establish that fine-tuning is preferable to few-shot training with a relatively modest quantity of annotated text. Our findings show that fine-tuned open-source LLMs can be effectively deployed in a broad spectrum of text annotation applications. We provide a Python notebook facilitating the application of LLMs in text annotation for other researchers.
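To illustrate the kind of zero-shot annotation task discussed above, here is a minimal sketch of prompting an LLM for stance classification. This is not the authors' code: the prompt template is invented for illustration, and `query_llm` is a hypothetical stand-in for any chat-completion call (open-source model or API), stubbed here so the example is self-contained.

```python
# Sketch of zero-shot stance annotation with an LLM.
# The prompt template and `query_llm` stub are illustrative assumptions,
# not the setup used in the paper.

PROMPT = (
    "Classify the stance of the following tweet toward '{target}'.\n"
    "Answer with exactly one word: FAVOR, AGAINST, or NONE.\n\n"
    "Tweet: {text}\nStance:"
)

def query_llm(prompt: str) -> str:
    # Hypothetical stub: a real implementation would call an LLM here.
    return " FAVOR"

def annotate(text: str, target: str) -> str:
    # Build the prompt, query the model, and normalize its free-text answer
    # to one of the three allowed labels.
    raw = query_llm(PROMPT.format(target=target, text=text))
    label = raw.strip().upper().split()[0]
    return label if label in {"FAVOR", "AGAINST", "NONE"} else "NONE"

print(annotate("Vaccines save lives.", "vaccination"))  # FAVOR (with this stub)
```

Fine-tuning, by contrast, would update the model's weights on a few thousand such labeled examples rather than relying on the prompt alone.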

16 May 2024 – internal seminar

10 April 2024 – Petter Törnberg (U. Amsterdam): “Simulating Agents with LLMs”

Social media is often criticized for amplifying toxic discourse and discouraging constructive conversations. But designing social media platforms to promote better conversations is inherently challenging. This paper asks whether simulating social media through a combination of Large Language Models (LLMs) and Agent-Based Modeling can help researchers study how different news feed algorithms shape the quality of online conversations. We create realistic personas using data from the American National Election Study to populate simulated social media platforms. Next, we prompt the agents to read and share news articles – and like or comment upon each other’s messages – within three platforms that use different news feed algorithms. In the first platform, users see the most liked and commented posts from users whom they follow. In the second, they see posts from all users – even those outside their own network. The third platform employs a novel “bridging” algorithm that highlights posts that are liked by people with opposing political views. We find this bridging algorithm promotes more constructive, non-toxic conversation across political divides than the other two models. Though further research is needed to evaluate these findings, we argue that LLMs hold considerable potential to improve simulation research on social media and many other complex social settings.
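The core idea of the bridging feed described above – surfacing posts liked by people with opposing political views – can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation; the data structures and the binary `left`/`right` leaning are invented for the example.

```python
# Illustrative sketch of a "bridging" news feed ranker: posts are ordered by
# how many likes they receive from users whose political leaning differs
# from the post author's. All data structures are assumptions for this demo.

def bridging_rank(posts, users):
    """Rank posts by cross-partisan likes, descending.

    posts: list of dicts with 'id', 'author', and 'liked_by' (user ids)
    users: dict mapping user id -> leaning ('left' or 'right')
    """
    def cross_likes(post):
        author_side = users[post["author"]]
        # Count likes from users on the opposite side of the author.
        return sum(1 for u in post["liked_by"] if users[u] != author_side)
    return sorted(posts, key=cross_likes, reverse=True)

users = {"a": "left", "b": "right", "c": "right", "d": "left"}
posts = [
    {"id": 1, "author": "a", "liked_by": ["d"]},       # 0 cross-partisan likes
    {"id": 2, "author": "b", "liked_by": ["a", "d"]},  # 2 cross-partisan likes
]
print([p["id"] for p in bridging_rank(posts, users)])  # [2, 1]
```

In the simulation, a ranker like this would decide which agent-generated posts each persona sees next, in place of the engagement-based feeds of the other two platforms.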

20 Mar 2024 – internal seminar

31 Jan 2024 – internal seminar

17 Jan 2024 – Alexander Kindel (Sciences Po médialab): “A multivariate perspective on word embedding association tests”

Word embedding association tests are a popular family of linear models for measuring conceptual associations observable in text corpora (e.g., biases, stereotypes, schemas) using word embeddings. The key quantity in such measurement models is the arithmetic mean cosine similarity (MCS) between pairs of word vectors with labels drawn from keyword lists that relate to the targeted concepts. This quantity is always distorted by the choice of keyword lists whenever the number of words in each list is greater than two. Model-based linear adjustments (e.g. controlling for word frequency) do not fix the distortion. I describe the degree of distortion in several exemplary MCS models published in computational social science, and I show how to obtain a valid metric using results from the literature on multivariate correlation. An important implication is that MCS is a valid metric for conceptual association problems only under a contradictory assumption about the relevance of the keyword lists to their target concepts.
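The key quantity in the abstract, the mean cosine similarity between two keyword lists, can be written down directly. The sketch below uses toy 3-d vectors in place of real word embeddings; it illustrates the MCS quantity itself, not the multivariate correction the talk proposes.

```python
import numpy as np

# Sketch of the mean cosine similarity (MCS) quantity used in word embedding
# association tests: the average cosine similarity over all cross-list pairs
# of word vectors. Toy vectors stand in for real embeddings.

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mcs(X, Y):
    """Mean cosine similarity over all pairs (x, y), x in X, y in Y."""
    return float(np.mean([cosine(x, y) for x in X for y in Y]))

X = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]  # one keyword list
Y = [np.array([1.0, 0.0, 0.0])]                              # the other list
print(mcs(X, Y))  # 0.5: one pair has similarity 1.0, the other 0.0
```

The distortion discussed in the talk arises because this average over pairs is sensitive to the composition of the keyword lists themselves once each list contains more than two words.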

10 Jan 2024 – internal seminar

20 Dec 2023 – internal seminar

6 Dec 2023 – Antonin Descampe & Louis Escouflaire (UC Louvain): “Analyzing Subjectivity in Journalism: A Multidisciplinary Discourse Analysis Using Linguistics, Machine Learning, and Human Evaluation”

We present the results of three experiments on subjectivity detection in French press articles. Our research lies at the crossroads of journalism studies and linguistics and aims to uncover the mechanisms of objective writing in journalistic discourse. First, we evaluated a range of linguistic features for a text classification task of news articles and opinion pieces. Then, we fine-tuned a transformer model (CamemBERT) on the same task and compared it with the feature-based model in terms of accuracy, computational cost and explainability. We used model explanation methods to extract linguistic patterns from the transformer model in order to build a more accurate and more transparent hybrid classification model. Finally, we conducted an annotation experiment in which 36 participants were tasked with highlighting “subjective elements” in 150 press articles. This allowed us to compare human-based and machine-derived insights on subjectivity, and to confront these results with journalistic guidelines on objective writing.

22 Nov 2023 – Isabelle Augenstein (University of Copenhagen): “Transparent Cross-Domain Stance Detection”

Understanding attitudes expressed in text is an important task for content moderation, market research, or detecting false information online. Stance detection has been framed in many different ways: targets can be explicit or implicit, and contexts can range from short tweets to entire articles. Moreover, datasets differ by domain, use varying label inventories and annotation protocols, and cover different languages. This requires novel methods that can bridge domains as well as languages. Moreover, for content moderation applications, a model that can provide a reason for a predicted stance is useful.
In this talk, I will present our research on cross-domain as well as cross-lingual stance detection, and on methods for creating transparent predictions by additionally providing explanations.

8 Nov 2023 – Yiwei Luo (Stanford University): “Othering and low prestige framing of immigrant cuisines in US restaurant reviews and large language models” (with Kristina Gligorić and Dan Jurafsky)

Identifying and understanding implicit attitudes toward food can help efforts to mitigate social prejudice due to food’s pervasive role as a marker of cultural and ethnic identity. Stereotypes about food are a form of microaggression that contribute to harmful public discourse that may in turn perpetuate prejudice toward ethnic groups and negatively impact economic outcomes for restaurants. Through careful linguistic analyses, we evaluate social theories about attitudes toward immigrant cuisine in a large-scale study of framing differences in 2.1M English language Yelp reviews of restaurants in 14 US states. Controlling for factors such as restaurant price and neighborhood racial diversity, we find that immigrant cuisines are more likely to be framed in objectifying and othering terms of authenticity (e.g., authentic, traditional), exoticism (e.g., exotic, different), and prototypicality (e.g., typical, usual), but that non-Western immigrant cuisines (e.g., Indian, Mexican) receive more othering than European cuisines (e.g., French, Italian). We further find that non-Western immigrant cuisines are framed less positively and as lower status, being evaluated in terms of affordability and hygiene. Finally, we show that reviews generated by large language models (LLMs) reproduce many of the same framing tendencies. Our results empirically corroborate social theories of taste and gastronomic stereotyping, and reveal linguistic processes by which such attitudes are reified.

25 Oct 2023 – Pierre-Carl Langlais (Head of Research, OpSic): “From operationalization to fine-tuning: building LLMs for corpus analysis in the social sciences” (in French)

In recent months, ChatGPT has faced competition from a new generation of open LLMs. Llama, Mistral, Falcon: these more compact models can be adapted to a wide variety of tasks, provided they are trained beforehand. This presentation describes initial fine-tuning experiments for annotating large corpora in the social sciences and humanities: literary texts, social media posts, and exchanges with public services. The widening of the context window (up to 3,000 words for Llama) and the growing sophistication of LLMs now make it possible to operationalize complex analytical categories (sarcasm, conspiracism, intertextuality, diegetic time). Alongside these first results, we will also discuss the methodological issues surrounding LLM training today, including the increasingly frequent use of synthetic data.