Yuchen Shen

Carnegie Mellon University, Pittsburgh, PA

I am a master’s student at the Language Technologies Institute, Carnegie Mellon University. My research aims to address the efficiency & accuracy challenges in science via AI, developing systematic methodologies at both data & model levels. Guided by this principle, I am interested in AI + drug discovery, particularly:

  1. from the data perspective: building general-purpose models from large-scale DNA/RNA data for target identification;
  2. from the model perspective: developing multimodal conditional generative models for lead optimization;
  3. to advance ML: designing efficient & interpretable biology-informed ML algorithms by studying the computational patterns of the brain and cells.

Currently, I am working with Professor Aran Nayebi on brain-inspired ML. I am fortunate to work with Professor Barnabás Póczos on generative models for molecules and with Professor Leman Akoglu on outlier detection for efficient discovery from scientific data. Additionally, I am interested in optimization and have the privilege of working with Professor Xiaorui Liu on decentralized algorithms.

During my undergraduate years, I concentrated on Natural Language Processing (NLP) and worked on summarization, few-shot sentiment analysis, and chatbots.

Email is the best way to reach me, so please feel free to write to discuss research! I try to read every email carefully, but don’t hesitate to send another one if you haven’t received a reply after one week!

news

Jan 23, 2025 One paper (ChemGuide) accepted by ICLR 2025
Sep 26, 2024 One paper (ProTransformer) accepted by NeurIPS 2024
Jun 17, 2024 Two papers (GraphBPE, Non-differentiable Guidance) accepted by ICML 2024 AI for Science Workshop
Jan 11, 2024 New website is now live
Aug 29, 2023 Began studying at CMU

selected publications

  1. EMNLP
    Label-Driven Denoising Framework for Multi-Label Few-Shot Aspect Category Detection
    Fei Zhao*, Yuchen Shen*, Zhen Wu, and Xinyu Dai
    In Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
  2. arXiv
    Chemistry-Inspired Diffusion with Non-Differentiable Guidance
    Yuchen Shen*, Chenhao Zhang*, Sijie Fu*, Chenghui Zhou, Newell Washburn, and 1 more author
    2024
  3. ICML AI4Sci
    GraphBPE: Molecular Graphs Meet Byte-Pair Encoding
    Yuchen Shen and Barnabas Poczos
    In ICML 2024 AI for Science Workshop, 2024