Research
My research focuses on Computational Healthcare, Computational Biology, Biomedical Text Mining (BioNLP), Bioinformatics, Machine Learning and Graph Learning.
Selected Publications
Jiangbo Zhang, Feifei Cui, Zilong Zhang, Qingchen Zhang, and Shankai Yan✉. (2025). DBODL: Combined dung beetle optimizer deep learning model for predicting RNA-protein binding sites. Under Review Code Data
Xin Yang, Dongmei He, Buchao Zhan, Zilong Zhang, Feifei Cui, Qingchen Zhang, and Shankai Yan✉. (2025). Het2Gene: a phenotype-driven model for gene prioritization by Heterogeneous graph embedding. Under Review Code Data
Dongmei He, Buchao Zhan, Xin Yang, Zilong Zhang, and Shankai Yan✉. (2025). FNatPred: a data-driven approach for distinguishing between NAT and Tumor on the fungal microbiome. Under Review Code Data
Buchao Zhan, Anqi Li, Xin Yang, Dongmei He, Yucong Duan, and Shankai Yan✉. (2024). RARoK: Retrieval-Augmented Reasoning on Knowledge for Medical Question Answering. BIBM2024 (Accepted) Code Data
Dongmei He, Xin Yang, Buchao Zhan, Zilong Zhang, Qingchen Zhang, and Shankai Yan✉. (2024). Augmented Mycobiome-Based Cancer Detection by an Interpretable Large Model. BIBM2024 (Accepted) Code Data
Shankai Yan, Ling Luo, Po-Ting Lai, Daniel Veltri, Andrew J. Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N Robinson, and Zhiyong Lu✉. (2022). PhenoRerank: A re-ranking model for phenotypic concept recognition pre-trained on human phenotype ontology. Journal of Biomedical Informatics 129: 104059. Code Data
Shankai Yan and Ka-Chun WONG✉. (2019). Context awareness and embedding for biomedical event extraction. Oxford Bioinformatics 36(2): 637-643. Code Data
Ling Luo, Shankai Yan, Po-Ting Lai, Daniel Veltri, Andrew Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N Robinson, and Zhiyong Lu✉. (2021). PhenoTagger: a hybrid method for phenotype concept recognition using human phenotype ontology. Oxford Bioinformatics 37(13): 1884-1890. Code Data
Qingyu Chen, Robert Leaman, Alexis Allot, Ling Luo, Chih-Hsuan Wei, Shankai Yan, and Zhiyong Lu✉. (2021). Artificial Intelligence in Action: Addressing the COVID-19 Pandemic with Natural Language Processing. Annual Review of Biomedical Data Science 4: 313-339.
Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, and Zhiyong Lu✉. (2020). BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale. PLOS Computational Biology 16(4): e1007617. Code Data
Yifan Peng, Shankai Yan and Zhiyong Lu✉. (2019). Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. ACL BioNLP Workshop. Code Data
Ka-Chun WONG✉, Junyi Chen, Jiao Zhang, Jiecong Lin, Shankai Yan, Shixiong Zhang, Xiangtao Li, Cheng Liang, Chengbin Peng, Qiuzhen Lin, Sam Kwong, and Jun Yu. (2019). Early Cancer Detection from Multianalyte Blood Test Results. CellPress iScience 15: 332-341.
Shankai Yan and Ka-Chun WONG✉. (2019). GESgnExt: Gene Expression Signature Extraction and Meta-analysis on Gene Expression Omnibus. IEEE Journal of Biomedical and Health Informatics 24(1): 311-318. Code Data
Shankai Yan and Ka-Chun WONG✉. (2017). Elucidating high-dimensional cancer hallmark annotation via enriched ontology. Journal of Biomedical Informatics 73: 84-94. Code Data
Other Publications
Xin Yang, Dongmei He, Buchao Zhan, Zilong Zhang, Feifei Cui, Qingchen Zhang, and Shankai Yan✉. (2024). Attention-aware rare disease diagnosis via Graph Attention Neural Network. ICCBB2024 (Accepted)
Dongmei He, Xin Yang, Zilong Zhang, Feifei Cui, Qingchen Zhang, and Shankai Yan✉. (2024). BMPCD: A pan-cancer detection method base on learning cross-domain features. ICCBB2024 (Accepted)
Siqi Dong, Buchao Zhan and Shankai Yan✉. (2024). Food Named Entity Recognition with BERT and Adversarial Training. MLNLP2024 (Accepted)
Yuchen Ma, Buchao Zhan, Jianhua Yu and Shankai Yan✉. (2024). SACMR: Sentiment Analysis in Chinese Language using Modified RoBERTa. Proceedings of the 2024 IEEE 9th International Conference on Computational Intelligence and Applications 84-88.
Buchao Zhan, Yucong Duan and Shankai Yan✉. (2024). IC-BERT: An Instruction Classifier Model Alleviates the Hallucination of Large Language Models in Traditional Chinese Medicine. Proceedings of the 2024 IEEE 9th International Conference on Computational Intelligence and Applications 221-225.
Yanbo Han, Buchao Zhan, Bin Zhang, Chao Zhao and Shankai Yan✉. (2024). BiCalBERT: An Efficient Transformer-based Model for Chinese Question Answering. Proceedings of the 2024 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence 100-104.
Zhe Liu, Hiu-Man Wong, Xingjian Chen, Jiecong Lin, Shixiong Zhang, Shankai Yan, Fuzhou Wang, Xiangtao Li, and Ka-Chun Wong✉. (2023). MotifHub: Detection of trans-acting DNA motif group with probabilistic modeling algorithm. Computers in Biology and Medicine 168: 107753.
Ruiqi Liu, Xiuhao Fu, Shankai Yan, Zilong Zhang✉, and Feifei Cui. (2023). AIPPT: Predicts anti-inflammatory peptides using the most characteristic subset of bases and sequences by stacking ensemble learning strategies. Proceedings of the 2023 IEEE International Conference on Bioinformatics and Biomedicine 23-29.
Hiu-Man Wong, Xingjian Chen, Hiu-Hin Tam, Jiecong Lin, Shixiong Zhang, Shankai Yan, Xiangtao Li, and Ka-Chun Wong✉. (2021). Feature Selection and Feature Extraction: Highlights. Proceedings of the 2021 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence 49-53.
Shankai Yan and Ka-Chun WONG✉. (2021). Future DNA computing device and accompanied tool stack: Towards high-throughput computation. Future Generation Computer Systems 117: 111-124.
Ka-Chun WONG✉, Jiao Zhang, Shankai Yan, Xiangtao Li, Qiuzhen Lin, Sam KWONG and Cheng Liang. (2019). DNA Sequencing Technologies: Sequencing Data Protocols and Bioinformatics Tools. ACM Computing Surveys 52(5): 1-30.
Ka-Chun WONG✉, Shankai Yan, Qiuzhen Lin, Xiangtao Li and Chengbin Peng. (2018). Deleterious Non-Synonymous Single Nucleotide Polymorphism Predictions on Human Transcription Factors. IEEE/ACM Transactions on Computational Biology and Bioinformatics 17(1): 327-333.
Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2018). Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Computing and Applications 32: 10809-10818.
Ka-Chun WONG✉, Chengbin Peng, Shankai Yan and Cheng Liang. (2017). Probabilistic Inference on Multiple Normalized Genome-Wide Signal Profiles With Model Regularization. IEEE Transactions on NanoBioscience 16(1): 43-50.
Junyi Chen, Shankai Yan and Ka-Chun WONG✉. (2017). Aggressivity Detection on Social Network Comments. Proceedings of the 2017 International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence 103-107.
Abstracts
Shankai Yan, Kathleen Steinhofel✉, Paul Bates, and Mariam Molokhia. (2018). Novel HLA Subclass Clustering methods to characterize Liver Toxicity Phenotype. ACPE 2018. (Talk)
Shankai Yan, Ling Luo, Li Fang, Daniel Veltri, Andrew J. Oler, Rajarshi Ghosh, Chih-Hsuan Wei, Morgan Similuk, Kai Wang, and Zhiyong Lu✉. (2022). PhenoGene: Disease-gene prioritization using graph embedding on patient phenotypic profiles. AMIA2022. (Podium Abstract)