Karolina Korgul

AI Safety & Alignment Researcher

PhD Student at the University of Oxford, dedicated to ensuring artificial intelligence systems are developed safely and remain aligned with human values. Exploring the intersection of machine learning, ethics, and existential risk mitigation.

now

PhD Researcher @ University of Oxford
Tutor @ Stanford University
Technical Account Manager @ Google
Mentor @ Arcadia Impact IRG

#multi-agent-systems #evals #humanoids #agi #alignment #human-AI-interaction #human-AI-persuasion

Research & Motivations

What drives my work in AI Safety and Alignment

🔬 Current Research Focus

Broadly speaking, I'm exploring alignment techniques and safety measures for AI systems. My current focus is persuasion at the human-AI boundary: how AI persuades its users to change their values and opinions, and how malicious attackers can persuade and hijack AI agents. In the long(er) run, I'd like to focus on the safety of humanoids and other embodied AI systems, which I see as both the most exciting and the scariest thing to come.

🎯 Why This Matters

As AI systems become more sophisticated, understanding the dynamics of persuasion becomes critical. We need to ensure that AI systems don't manipulate users while also protecting them from malicious actors who might try to exploit these systems.

🚀 Future Directions

The transition to embodied AI systems (humanoids, robots) represents both the most exciting opportunity and the most significant safety challenge ahead. My work aims to lay the groundwork for safe deployment of these systems.

Teaching

Sharing knowledge and inspiring the next generation of AI researchers

AI Safety Workshops

Conducting workshops on AI safety principles and alignment challenges for students and researchers.

Guest Lectures

Speaking at universities and conferences about the importance of AI safety research.

Industry Work

Bridging academic research with real-world applications

Consulting

Advising organizations on AI safety best practices and risk assessment.

Collaborations

Working with industry partners to implement safety measures in production AI systems.

Mentoring

Supporting emerging researchers in AI safety

Hi, welcome to my page :)

Thank you for getting in touch – I really appreciate you taking the time to reach out and go through the extra step of finding this tab. Apologies that it's not just an email address here, to which I would personally reply in all cases. If anybody invents an AI clone of me that solves this (and can generate factual responses reliably), let me know.

Many great things I've experienced in life were thanks to connecting with kind people who supported me through mentoring and good advice in making the right decisions.

I want to keep the cycle going and take half a day each week to support those who could use my advice now. However, it seems that timeframe is rarely enough for the demand I receive – I feel flattered and try to help as well as I can, but that means I have to prioritise among the requests.

I try to manage it in three ways:

FAQ

If your request is covered by the information below, please do not expect a reply from me:

Applying for the MSc in Social Data Science? – Both the programme and the application process have changed significantly since I attended it, and I am not the best person to give an up-to-date overview. This website https://www.oii.ox.ac.uk/study/msc-in-social-data-science/ covers the most important information, and this page https://www.oii.ox.ac.uk/people/msc-students/ has the list and contact details of current students – I highly recommend looking into their backgrounds. I also recommend my good friend Jenny's account https://www.instagram.com/oxforddatagirl, which is an incredible source of useful information on the MSc at OII experience and application process.

Funding? – Unfortunately, I cannot help here, especially if you are a non-Polish applicant. I am more familiar with Polish funding schemes, so I can help more directly and share my experience if you are planning to apply for Bona Fide. I am, however, not a good source of information regarding Clarendon or Fulbright.

Being hired by me? – Please know that I no longer act as a hiring manager. If I take part in hiring processes, unfortunately, I cannot help you prepare for them. The best I can do is support you in preparing for processes I am not involved in or help with a referral.

Date? – I don't know if it's the power of my Google Scholar profile, but it turned out some of the responses were asking for a coffee date rather than a coffee chat. While I certainly found these ways of asking out quite unique, please note that I don't do coffee dates, nor am I really interested in any dates other than the date of my next arXiv release.

Posts

I've noticed that some topics mentees describe as very lonely experiences are in fact recurring themes that come up with many different mentees. From November onwards, I will post more about them in the Thoughts & Reflections section, and hopefully you'll find them useful.

Contact Form

If you'd like to have a 25-minute virtual coffee chat or ask for recurring mentoring sessions, please fill out this form: Coffee Chat / Mentoring Form. I will do my best to respond to you.

If some parts of the text above look oddly familiar, it may be because they were inspired by the automatic email response of the incredible Professor Rosalind W. Picard from the MIT Media Lab :) I was very impressed by how she navigates getting over 1,000 emails a day and decided to apply some of her practices.

My Thoughts

Reflections on AI safety, research, and the future

🚧 Section In Progress

This section is currently being developed and will be available mid-November.

I'm working on curating thoughtful content about AI safety, research insights, and philosophical reflections that go beyond what you typically see in academic papers.

Contact

Let's connect and collaborate

Collaboration Opportunities

I'm always interested in collaborating on AI safety research, speaking at events, or discussing the future of artificial intelligence. Feel free to reach out if you'd like to connect!