Hands-on Transformers: Fine-tune your own BERT and GPT

The workshop is tentatively scheduled for August 22, 2023, and will be held on Zoom and livestreamed on YouTube.

About The Data Science Summer School

The Data Science Summer School is a series of theoretical and practical workshops on the exciting methods and technologies currently employed by industry, government, and civil society to address the world's most complex problems today. It is organized by the Hertie School Data Science Lab with funding and support from the Hertie School and the Dieter Schwarz Foundation.

Workshop Details

While Transformer models like BERT and GPT are becoming more popular, there is a persistent misconception that they are very complicated to use. This workshop will demonstrate that this is no longer the case: open-source packages like Hugging Face Transformers enable anyone with some programming knowledge to use, train, and evaluate Transformers.

We will start with an intuitive introduction to transfer learning and discuss its added value for social science use cases as well as its limitations. We will then look at the open-source ecosystem and free hardware options for training Transformers. Building upon a high-level explanation of the main components of Transformers in Hugging Face’s implementation, we will then fine-tune different BERT and GPT models and discuss important aspects of fine-tuning and evaluation.

The code demonstrations will be in Python, but participants without prior knowledge of Python or Transformers are explicitly invited to participate. You will leave the workshop with Jupyter notebooks that enable you to train your own Transformer with your own data for your future research projects.

Syllabus

To be updated

Content Licensing

All workshop materials and recordings are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 license. You are free to share (copy and redistribute the material in any medium or format) and to adapt (remix, transform, and build upon the material). However, you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not use the material for commercial purposes. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.



Workshop Materials

Instructor

Moritz Laurer

Moritz Laurer is an Associate Researcher in the Global Governance, Regulation, Innovation, Digital Economy (GRID) unit at CEPS. He co-organises the European Policy Data Science Network, a network of data-driven researchers in leading think tanks and civil society organisations. Besides his work for CEPS, Moritz is currently pursuing a PhD in supervised machine learning for political text analyses at Free University of Amsterdam. On a more personal level, Moritz is passionate about using data science to analyse European politics and co-founded the NGO DataMine Europe ASBL. He is currently a visiting fellow at the Hertie School Data Science Lab.


Schedule (Central European Summer Time - CEST)

Session Starts

Hands-on Transformers: Fine-tune your own BERT and GPT (Part I)

Short Break

Session Continues

Hands-on Transformers: Fine-tune your own BERT and GPT (Part II)

Session Ends


Watch recording