We have built a novel dataset linking the Twitter profiles of 100,000 academics to their respective academic profiles, encompassing data from 12,000 institutions across 174 countries and 19 disciplines. The dataset captures all posts, comments, shares, metadata, and social following networks of these accounts from 2016 to 2022. We also collected data of similar scale for a random set of users in the U.S.
Using scalable large language model (LLM)-based classification techniques, we categorize and label the content, focusing on both the substantive content (the "what") and the language and tone of expression (the "how"). Our analysis covers topics such as the climate crisis, economic policy, and cultural dimensions, as well as the tone of language used.
Visit the Zenodo Repository to download data and code: https://zenodo.org/records/15115397
This work is licensed under an Apache License 2.0 [LICENSE]. The following paper must be cited in all publications that make use of, or refer in any way to, the files provided:
Garg, P. & Fetzer, T. (2025). Political Expression of Academics on Social Media. Nature Human Behaviour. Available Open Access 🔓.
We also kindly ask researchers using this dataset to send us their paper or a link to it. We will continuously upload or add links to these papers to inform the research community about ongoing progress and related work.
The replication data contains aggregate measures at the individual-by-time level. Individuals include academics and members of the general public. Individual IDs are replaced with a random set of characters. This is to ensure anonymity when making the data public.
This data should be enough for most exploratory analysis and replication purposes. However, due to additional ethical and legal restrictions, we do not share the raw underlying tweets with external users.
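As an illustration of the ID replacement described above, here is a minimal sketch in Python. It is not the procedure actually used to build the replication data; the function name `build_pseudonym_map`, the token length, and the sample rows are all hypothetical and chosen only to show the idea of mapping each individual to a stable random string in an individual-by-time table.

```python
import secrets


def build_pseudonym_map(ids, n_chars=12):
    """Map each distinct ID to a unique random hexadecimal token.

    Hypothetical helper for illustration only; n_chars should be even
    because each random byte yields two hex characters.
    """
    mapping = {}
    used = set()
    for raw_id in ids:
        if raw_id in mapping:
            continue  # same individual keeps the same token across periods
        token = secrets.token_hex(n_chars // 2)
        while token in used:  # regenerate in the unlikely event of a collision
            token = secrets.token_hex(n_chars // 2)
        used.add(token)
        mapping[raw_id] = token
    return mapping


# Hypothetical individual-by-time records: (user id, month, number of posts)
rows = [
    ("user_a", "2016-01", 3),
    ("user_b", "2016-01", 5),
    ("user_a", "2016-02", 1),
]

mapping = build_pseudonym_map(row[0] for row in rows)
anonymized = [(mapping[uid], month, n_posts) for uid, month, n_posts in rows]
```

In a scheme like this, the same individual keeps the same token in every time period, so panel analyses still work, while the mapping itself has to stay private or be discarded for the anonymization to hold.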
The full dataset is large and stored on cloud servers. Due to its complexity and the legal restrictions that apply to it, the raw data cannot be accessed publicly. However, we are very open to collaboration ideas for academic research. If you are interested in accessing the raw data or have research collaboration ideas, please feel free to contact us.
Fill out the form below or at this link: https://forms.gle/3LjVbm4e3wprBfko7. Alternatively, shoot us an email.