Call for Supporters: Research Thesis in NLP
- Author: Minh N. Ta
💡 This blog post is a call for up to 5 friends to join me in working on the detection of machine-generated content, a topic that I will extend into my graduation research thesis by the end of next June. The research will be carried out at the Foundation Models Lab, BKAI Research Center.
Research topics
The ease of access to large language models (LLMs) has led to a rapid spread of machine-generated text, and it is now often hard to tell whether a piece of text was written by a human or generated by a machine. This raises concerns about potential misuse, particularly in educational and academic settings. It is therefore important to develop practical systems that can automate this detection.
In academia, and at HUST in particular, students may overuse LLMs for their coursework, which can weaken their own abilities. Hence, my aim is to create a system that can detect machine-generated content in two domains:
- Coding exercises of IT courses (Introduction to Programming, Data Structures and Algorithms, Applied Algorithms, etc.).
- Student project reports, especially theses and capstone projects.
Small guidance: I expect to use a small classifier to build a black-box detector (with acceptable quality), and a GAN-style architecture to build an explainable detector.
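To give a feel for what "a small classifier as a black-box detector" can mean, here is a minimal sketch in plain Python. It is only an illustration, not the method planned for the thesis: the choice of a Naive Bayes model over word counts, the `TinyNaiveBayes` class, and the four toy training sentences are all assumptions made up for this example. It does not cover the GAN-based explainable detector.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace tokens; a real detector would use richer features.
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def log_scores(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokenize(text):
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[c][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data, far too small for any real use.
TRAIN_TEXTS = [
    "i fixed the bug after class lol",
    "my loop kept crashing so i rewrote it",
    "certainly here is a comprehensive solution",
    "in conclusion this approach is robust and efficient",
]
TRAIN_LABELS = ["human", "human", "machine", "machine"]

detector = TinyNaiveBayes().fit(TRAIN_TEXTS, TRAIN_LABELS)
print(detector.predict("here is a comprehensive and robust solution"))
print(detector.predict("i rewrote my loop after class"))
```

The detector is "black-box" in the sense that it only returns a label; it does not say which parts of the text triggered the decision, which is the gap the explainable detector is meant to fill.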
Some Related Publications
From me
- Mervat Abassy*, Kareem Elozeiri*, Alexander Aziz*, Minh Ngoc Ta*, Raj Vardhan Tomar*, Bimarsha Adhikari*, Saad El Dine Ahmed*, et al. "LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection". In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 336–343, Miami, Florida, USA. Association for Computational Linguistics.
- Yuxia Wang, et al. "GenAI Content Detection Task 1: English and Multilingual Machine-generated Text Detection: AI vs. Human". In Proceedings of the 31st International Conference on Computational Linguistics (COLING). Abu Dhabi, UAE: Association for Computational Linguistics, Jan. 2025. (to appear).
- and two other publications, which are expected to be published at ACL/NAACL 2025.
From others
- Koike, R., Kaneko, M. and Okazaki, N. 2024. "OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples". In Proceedings of the AAAI Conference on Artificial Intelligence. 38, 19 (Mar. 2024), pp 21258-21266.
- Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, and Xipeng Qiu. 2023. "SeqXGPT: Sentence-Level AI-Generated Text Detection". In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1144–1156, Singapore. Association for Computational Linguistics.
- Huo, Mingjia, et al. "Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models". arXiv preprint arXiv:2402.18059 (2024).
What I offer and what I need
What I need
- A strong interest in NLP and LLM research.
- A good background in math and programming (especially in Python).
- Experience implementing machine learning and deep learning algorithms.
- A beautiful soul and an eagerness to learn.
- No need for prior knowledge of LLM or NLP.
What I have and offer
- Access to GPUs for research purposes.
- Support for your own related projects.
- A guide on how to do research from A to Z.
- Potentially, your name on a publication or in a scientific research competition with my supervisor.
- No salary but unlimited funds for iced tea, bubble tea, etc. 😆
Supervisors and Extended Research Team
These people may not join our projects directly, but they and I will be carrying out the same line of research:
My thesis supervisor:
- Dr. Đinh Viết Sang - Vice Dean of School of ICT and Director of BKAI Research Center, HUST.
The extended research team:
- Professor Preslav Nakov - Department Chair and Professor in Natural Language Processing, MBZUAI.
- Professor Iryna Gurevych - Professor in NLP, Technical University of Darmstadt.
- Dr. Yuxia Wang - Postdoctoral Research Fellow, MBZUAI; PhD from the University of Melbourne.
- Dr. Artem Shelmanov - Senior Research Scientist, MBZUAI.
- and some students supervised by these professors.
If you are interested in...
Please contact me by email at minh@tnminh.com if you want to work with me or have further questions about this topic.
You can also comment on this blog post for further discussion.
❗️ Deadline for Application: 23:59 12/12/2024.