Research Experience
Undergraduate Researcher | Sky Computing Lab, UC Berkeley
08/2022 - Present
Research focus: Distributed systems and networking; Mentor: Dr. Joseph E. Gonzalez
Implemented and optimized the data verification process of Skyplane, the state-of-the-art inter- and intra-cloud bulk data transfer software.
Implemented and optimized the multi-instance, multi-threaded HTTP download function in Skyplane, achieving a 30x speedup over existing recipes. This result was presented in a talk hosted by NASA.
Managed Globus endpoints to run large data transfer experiments benchmarking Skyplane against Globus.
Undergraduate Researcher | ReDAS Lab, University of Houston
06/2021 - Present
Research focus: Deception Detection, NLP; Mentor: Dr. Rakesh Verma
Implemented and evaluated feature engineering algorithms and ML and DL models for NLP.
Compiled a cross-domain deception dataset with fake reviews, fake opinions, phishing emails, and fake news.
Trained BERT models on the cross-domain dataset to demonstrate the feasibility of domain-independent deception detection; results were published at CODASPY 2022.
Application Engineering Intern | Oak Ridge National Lab
06/2022 - 08/2022
Research focus: Productivity and Sustainability in Software Engineering; Mentor: Dr. Gregory R. Watson
Developed a website, rateyourproject.org, to help users improve software engineering practices while collecting research data.
Summarized my findings in an internally reviewed poster and report.
Research Apprentice | Oski Lab, UC Berkeley
09/2020 - 06/2021
Project: NLP for Cannabis Text Data; Mentor: Dr. Cyrus Dioun
- Used Beautiful Soup to scrape product data from websites; pre-processed and merged data.
- Applied NLP methods to find trends and patterns in cannabis product descriptions, aiming to uncover the political and cultural factors that affect market competition in the US cannabis industry.
Research Apprentice | Berkeley Law School
03/2020 - 09/2020
Project: Why Companies Rebrand; Mentor: Dr. Sonia Katyal
- Extracted useful information from almost 1 PB of USPTO patent applications filed by 4,000 companies over 100 years and 10-K filings from the SEC, using bag-of-words and sentiment analysis.