My name is Jiaming "Arthur" Qu (瞿佳明) and I am a first-year Ph.D. student majoring in Information Science at the School of Information and Library Science, UNC Chapel Hill. I am very lucky to be advised by Dr. Yue Wang and Dr. Jaime Arguello.
My research interests include, but are not limited to, Information Retrieval, Text Data Mining, and Interpretable Machine Learning. My work draws on my extensive knowledge of Mathematics and Statistics and on my programming skills.
Ph.D. in Information Science • Aug 2019 - Present
Academic advisor: Dr. Yue Wang
M.S. in Information Science • Aug 2017 - May 2019
Academic advisor: Dr. Robert Capra
Master's Thesis advisor: Dr. Yue Wang
Awards: Dean's Award for the best Master's Thesis (2019)
B.S. in Management Information System • Sep 2013 - Jun 2017
Academic advisor & Bachelor's Thesis advisor: Dr. Lei Wang
Awards: Distinguished Bachelor Thesis (2017)
Graduate Research Project; Advisors: Dr. Yue Wang, Dr. Jaime Arguello • Aug 2019 - Present
-Research on explaining search result relevance by learning from human experts’ internal decision-making processes
-Propose and compare three approaches that aim to balance accuracy and interpretability: a tree model with specific features to replicate the decision process, a generalized additive model with general features to simulate the decision process, and a combination of the two
TREC 2019 News Track; Advisor: Dr. Yue Wang • Jul 2019 - Aug 2019
-Research on accurately finding relevant news articles in a large corpus given a target article
-Implemented the initial retrieval in Lucene with query terms generated from the news text
-Trained a learning-to-rank model to re-rank the initial result with text features and context features
-Best run achieved an NDCG@5 of 0.59 (the best 2018 run achieved 0.46)
Master's Thesis; Advisor: Dr. Yue Wang • Sep 2018 - Apr 2019
-Research on incorporating domain knowledge to enhance PubMed medical literature retrieval
-Proposed a framework of resource-based query expansion and a learning-to-rank approach
-Achieved results comparable to the top teams on the TREC 2018 PM track leaderboard
-Recipient of Dean’s Achievement Award (2 out of 95)
Graduate Research Project; Advisor: Dr. Jaime Arguello • Jan 2018 - Jun 2018
-Research on predicting the future success of restaurants on Yelp after a one-year period
-Generated multi-level features from text and numeric data and selected the informative ones
-Conducted sentiment analysis of 4,736,897 Yelp reviews with the NLTK and Word2vec toolkits
-Corresponding paper was published in PEARC '18 and was invited for presentation
Zhang, L., Qu, J., Sheng, H., Yang, J., Wu, H., & Yuan, Z. (2019). Urban mining potentials of university: In-use and hibernating stocks of personal electronics and students’ disposal behaviors. Resources, Conservation and Recycling, 143, 210-217. doi:10.1016/j.resconrec.2019.01.007 PDF
Wang, L., Zhao, Q., Wen, Z., & Qu, J. (2018). RAFFIA: Short-term Forest Fire Danger Rating Prediction via Multiclass Logistic Regression. Sustainability, 10(12), 4620. doi:10.3390/su10124620 PDF
Wang, L., & Qu, J. (2018). Web service reliability prediction via collaborative filtering and Slope One algorithm. Computer Engineering & Science, 40(08), 16-24. (In Chinese)
Lu, X., Qu, J., Jiang, Y., & Zhao, Y. (2018). Should I Invest it?: Predicting Future Success of Yelp Restaurants. In Proceedings of the Practice and Experience on Advanced Research Computing (p. 64). ACM. doi:10.1145/3219104.3229287 PDF
Qu, J., & Wang, Y. (2019). A Medical Literature Search System for Identifying Effective Treatments in Precision Medicine. arXiv preprint arXiv:1904.07428. PDF
Xiaopeng Lu, Jiaming Qu. (2018). Should I Invest it? Predicting Future Success of Yelp Restaurants. PEARC'18: Practice and Experience in Advanced Research Computing, July 22-26, Pittsburgh, PA PDF
Research Assistant • Aug 2019 - Present
-Carry out research projects independently or with Dr. Yue Wang and Dr. Jaime Arguello on Machine Learning, Natural Language Processing and Data Mining in various domains
-Propose & implement state-of-the-art algorithms, design & conduct experiments, and write research papers
NLP Specialist Intern • Jan 2019 - May 2019
-Implemented web crawlers to automatically scrape data from stock exchange websites such as Nasdaq, NYSE and HKEX to detect suspended stocks
-Retrieved relevant news for each suspended stock from various financial news APIs, summarized it, and cleaned the text for human reading
-Built sentiment analysis models for financial news with both bag-of-words representations and word vectors
-Built models to analyze how sentiment in a stock's news affected its price trend
Course: INLS 690 - Data Mining • Jan 2019 - May 2019
-Independent course project on predicting a book's genres from its reviews
-Cleaned the text of book reviews and built multi-label classifiers with the scikit-learn library
-Implemented a bag-of-words model and an RNN with word vectors for classification
Course: INLS 620 - Web Information Organization • Sep 2018 - Dec 2018
-Independent course project on scraping, cleaning and publishing web data in RDF
-Cleaned data in R and OpenRefine, enriched it with the Wikidata and DBpedia knowledge bases, converted it to machine-readable RDF and Turtle formats, and published visualized linked data to the web
-Implemented in OpenRefine, R and Python
-Code is here.
Course: INLS 672 - Web Development II • May 2018 - Jun 2018
-Independent course project on developing a database-backed website for sharing book reviews
-Developed basic functions for adding, deleting, and modifying records and for running queries
-Code is here.
Course: INLS 719 - Usability Testing and Evaluation • Aug 2017 - Dec 2017
-Collaborative course project on a usability study of the student store website
-Designed and conducted usability tests with four participants
Skills: Text Data Mining; Applied Machine Learning and Deep Learning; Information Retrieval, Learning-to-rank, Interpretable Machine Learning
-My favorite programming languages are Python and Java.
-I use Python for most of my research, with the powerful scikit-learn, gensim, nltk and other libraries. For Deep Learning, I am into TensorFlow but I am also trying PyTorch.
-I enjoy using R for statistical computing and I prefer the ggplot library to matplot.
-I used to do a lot of Java programming for development; now I mainly use Java for research in Information Retrieval, with the amazing Apache Lucene library.
I am a fan of sports. My favorite basketball team is the Dallas Mavericks and my favorite soccer team is Manchester United. I like traveling (especially road trips) and photography. My favorite destination so far is Fairbanks, AK. Here is a photo of the northern lights I took there. I am also into music and I was a guitarist in my band.