My research interests focus on Computational Healthcare, Computational Biology, Biomedical Text Mining (BioNLP), Bioinformatics, Machine Learning, and Graph Learning. My current projects focus on phenotype annotation and embedding. Here you can find materials relevant to my published papers.
I also list the titles of some working papers and projects in progress below. Full links are attached to these papers once I believe they are ready for peer review. Feel free to contact me if you are interested in any of this work.
Shankai Yan, Ling Luo, Li Fang, Daniel Veltri, Andrew J. Oler, Rajarshi Ghosh, Chih-Hsuan Wei, Morgan Similuk, Kai Wang, and Zhiyong Lu✉. (2022). PhenoGene: Disease-gene prioritization using graph embedding on patient phenotypic profiles. AMIA 2022.
Shankai Yan, Ling Luo, Po-Ting Lai, Daniel Veltri, Andrew J. Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N Robinson, and Zhiyong Lu✉. (2022). PhenoRerank: A re-ranking model for phenotypic concept recognition pre-trained on human phenotype ontology. Journal of Biomedical Informatics 129: 104059. Code Data
Shankai Yan and Ka-Chun WONG✉. (2019). Context awareness and embedding for biomedical event extraction. Oxford Bioinformatics 36(2): 637-643. Code Data
Ling Luo, Shankai Yan, Po-Ting Lai, Daniel Veltri, Andrew Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N Robinson, and Zhiyong Lu✉. (2021). PhenoTagger: a hybrid method for phenotype concept recognition using human phenotype ontology. Oxford Bioinformatics 37(13): 1884-1890. Code Data
Qingyu Chen, Robert Leaman, Alexis Allot, Ling Luo, Chih-Hsuan Wei, Shankai Yan, and Zhiyong Lu✉. (2021). Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review of Biomedical Data Science 4: 313-339.
Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, and Zhiyong Lu✉. (2020). BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale. PLOS Computational Biology 16(4): e1007617. Code Data
Yifan Peng, Shankai Yan and Zhiyong Lu✉. (2019). Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. ACL BioNLP Workshop. Code Data
Ka-Chun WONG✉, Junyi Chen, Jiao Zhang, Jiecong Lin, Shankai Yan, Shixiong Zhang, Xiangtao Li, Cheng Liang, Chengbin Peng, Qiuzhen Lin, Sam Kwong, and Jun Yu. (2019). Early Cancer Detection from Multianalyte Blood Test Results. CellPress iScience 15: 332-341.
Shankai Yan and Ka-Chun WONG✉. (2019). GESgnExt: Gene Expression Signature Extraction and Meta-analysis on Gene Expression Omnibus. IEEE Journal of Biomedical and Health Informatics 24(1): 311-318. Code Data
Shankai Yan and Ka-Chun WONG✉. (2017). Elucidating high-dimensional cancer hallmark annotation via enriched ontology. Journal of Biomedical Informatics 73: 84-94. Code Data
Shankai Yan and Ka-Chun WONG✉. (2021). Future DNA computing device and accompanied tool stack: Towards high-throughput computation. Future Generation Computer Systems 117: 111-124.
Ka-Chun WONG✉, Jiao Zhang, Shankai Yan, Xiangtao Li, Qiuzhen Lin, Sam KWONG and Cheng Liang. (2019). DNA Sequencing Technologies: Sequencing Data Protocols and Bioinformatics Tools. ACM Computing Surveys 52(5): 1-30.
Ka-Chun WONG✉, Shankai Yan, Qiuzhen Lin, Xiangtao Li and Chengbin Peng. (2018). Deleterious Non-Synonymous Single Nucleotide Polymorphism Predictions on Human Transcription Factors. IEEE/ACM Transactions on Computational Biology and Bioinformatics 17(1): 327-333.
Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2018). Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Computing and Applications 32: 10809-10818.
Ka-Chun WONG✉, Chengbin Peng, Shankai Yan and Cheng Liang. (2017). Probabilistic Inference on Multiple Normalized Genome-Wide Signal Profiles With Model Regularization. IEEE Transactions on NanoBioscience 16(1): 43-50.
Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2017). Aggressivity Detection on Social Network Comments. Proceedings of the 2017 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence 103-107.
Shankai Yan, Kathleen Steinhofel, Paul Bates, and Mariam Molokhia. (2018). Novel HLA Subclass Clustering methods to characterize Liver Toxicity Phenotype. ACPE 2018. (talk)
Working Papers and Projects in Progress
Fine-grained Phenotype Concept Recognition
Phenotype Embedding
Phenotype-driven Gene/Disease Prioritization
Graph-embedding-based Linkage Analysis and Association Study