My research focuses on Biomedical Text Mining (BioNLP), Computational Biology, Bioinformatics, Machine Learning and Graph Learning. My current projects focus on phenotype annotation and embedding. Here you can find materials relevant to my published papers.

Below, I also list the titles of some working papers and projects in progress. Full links are attached to these papers once I believe they are ready for peer review. Feel free to contact me if you are interested in any of this work.

Selected Publications

Shankai Yan and Ka-Chun WONG✉. (2019). Context awareness and embedding for biomedical event extraction. Oxford Bioinformatics 36(2): 637-643. Code Data

Ling Luo, Shankai Yan, Po-Ting Lai, Daniel Veltri, Andrew Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N Robinson, and Zhiyong Lu✉. (2021). PhenoTagger: a hybrid method for phenotype concept recognition using human phenotype ontology. Oxford Bioinformatics 37(13): 1884-1890. Code Data

Qingyu Chen, Robert Leaman, Alexis Allot, Ling Luo, Chih-Hsuan Wei, Shankai Yan, and Zhiyong Lu✉. (2021). Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review of Biomedical Data Science 4: 313-339.

Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, and Zhiyong Lu✉. (2020). BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale. PLOS Computational Biology 16(4): e1007617. Code Data

Yifan Peng, Shankai Yan and Zhiyong Lu✉. (2019). Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. ACL BioNLP Workshop. Code Data

Ka-Chun WONG✉, Junyi Chen, Jiao Zhang, Jiecong Lin, Shankai Yan, Shixiong Zhang, Xiangtao Li, Cheng Liang, Chengbin Peng, Qiuzhen Lin, Sam Kwong, and Jun Yu. (2019). Early Cancer Detection from Multianalyte Blood Test Results. CellPress iScience 15: 332-341.

Shankai Yan and Ka-Chun WONG✉. (2019). GESgnExt: Gene Expression Signature Extraction and Meta-analysis on Gene Expression Omnibus. IEEE Journal of Biomedical and Health Informatics 24(1): 311-318. Code Data

Shankai Yan and Ka-Chun WONG✉. (2017). Elucidating high-dimensional cancer hallmark annotation via enriched ontology. Journal of Biomedical Informatics 73: 84-94. Code Data

Other Publications

Shankai Yan and Ka-Chun WONG✉. (2021). Future DNA computing device and accompanied tool stack: Towards high-throughput computation. Future Generation Computer Systems 117: 111-124.

Ka-Chun WONG✉, Jiao Zhang, Shankai Yan, Xiangtao Li, Qiuzhen Lin, Sam KWONG and Cheng Liang. (2019). DNA Sequencing Technologies: Sequencing Data Protocols and Bioinformatics Tools. ACM Computing Surveys 52(5): 1-30.

Ka-Chun WONG✉, Shankai Yan, Qiuzhen Lin, Xiangtao Li and Chengbin Peng. (2018). Deleterious Non-Synonymous Single Nucleotide Polymorphism Predictions on Human Transcription Factors. IEEE/ACM Transactions on Computational Biology and Bioinformatics 17(1): 327-333.

Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2018). Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Computing and Applications 32: 10809-10818.

Ka-Chun WONG✉, Chengbin Peng, Shankai Yan and Cheng Liang. (2017). Probabilistic Inference on Multiple Normalized Genome-Wide Signal Profiles With Model Regularization. IEEE Transactions on NanoBioscience 16(1): 43-50.

Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2017). Aggressivity Detection on Social Network Comments. Proceedings of the 2017 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence 103-107.

Working Papers and Projects in Progress

Phenotype Concept Recognition & Phenotype Embedding