About me (Curriculum Vitae)

I am a self-driven explorer of Natural Language Processing (NLP) and related fields, motivated by a strong passion for Machine Learning (ML), Deep Learning (DL), and software development. Although I did not start out with a background in computer science, I am dedicated to continuously building my skills, and I am particularly drawn to techniques that provide immediate feedback and a sense of accomplishment.

Currently, I serve as an ML/AI Engineer at Rippey.ai, specializing in automating systems that process document data and unstructured email text for the logistics industry. This role primarily involves classification and extraction tasks, as well as the whole workflow from data collection to deployment. I am seeking my next role. If you are hiring, please contact me. Thank you!

Before this, I earned a master’s degree in Computational Linguistics, Analytics, Search, and Informatics (CLASIC) at The University of Colorado-Boulder. During my studies, I collaborated with Prof. Martha Palmer (ACL 2023 Lifetime Achievement Award Recipient) on the Knowledge-directed Artificial Intelligence Reasoning Over Schemas (KAIROS) and Uniform Meaning Representation (UMR) projects. My contributions included computing event similarity based on Wikidata for KAIROS and developing the UMR writer, an annotation tool that supports cross-linguistic Uniform Meaning Representation, Abstract Meaning Representation, and THYME clinical data with temporal relations, among others.

My academic journey began with a B.A. degree in Chinese Language and Literature from Shanxi University, followed by an M.A. degree in Linguistics and Applied Linguistics from Nanjing Normal University, with a focus on Chinese Information Processing.

With over five years of experience in Python programming and more than three years of research experience in NLP, my previous work has primarily focused on Information Retrieval, including tasks such as named entity recognition, word segmentation, and sentence segmentation. I have also been actively involved in dataset construction, with an emphasis on high-quality annotation of text datasets, particularly for semantic role labeling. Notable projects in this area include the creation of the Chinese Abstract Meaning Representation corpus and Chinese FrameNet. Furthermore, as part of my engagement in digital humanities, I have explored ways to structurally represent the meaning of literary works such as ancient Chinese poems.

Prior to my studies at CU-Boulder, I gained industry experience as a product manager intern at Beijing Lingosail in China. In this role, I was responsible for designing and upgrading the ‘Termbox’ product for term extraction.

My main technical stack:

  • Programming languages: Python, Java
  • ML/DL: Hugging Face, Keras, PyTorch, TensorFlow, scikit-learn
  • Web development: Django, Flask, HTML5, JavaScript, React, jQuery, CSS3
  • Cloud service: AWS, Google Cloud, Heroku
  • Large Language Models & Prompt Engineering: LangChain

Applications & Products

  • UMR-writer: An annotation tool for Semantic Role Labeling with Uniform Meaning Representation.

  • Termbox: A term-extraction and translation tool for Computer-Aided Translation.