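Research Interests
Machine learning and big-data analysis in biology and medicine
Interpreting gene expression and regulation with next-generation sequencing data
Understanding the information composition of the human microbiome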
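Education
March 1994: Ph.D. in Engineering (Pattern Recognition and Intelligent Systems), Tsinghua University
July 1989: B.Eng. (Industrial Automation), Tsinghua University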
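Work Experience
2012 Chief Scientist of a National 973 Program project
2007.3-4 Visiting Scholar, Department of Molecular and Computational Biology, University of Southern California
2006.2-3 Visiting Scientist, Harvard School of Public Health
2003 – present Director, Bioinformatics Division, Tsinghua National Laboratory for Information Science and Technology (in preparation)
2002 – present Deputy Director, Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University
2002 – present Professor of Pattern Recognition and Bioinformatics, Department of Automation, Tsinghua University
2001 – 2002 Senior Visiting Scholar, Department of Biostatistics, Harvard School of Public Health
1999 – 2007 Director, Institute of Information Processing, Department of Automation, Tsinghua University
1996 – 2002 Associate Professor of Pattern Recognition Theory and Applications, Department of Automation, Tsinghua University
1994 – 1996 Lecturer, Department of Automation, Tsinghua University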
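Courses Taught
Introduction to Computational Molecular Biology (graduate, fall 2002-2009)
Introduction to Statistical Learning Theory (graduate, fall 2000, fall 2002-2009)
Scientific Spirit, Ethics, and Communication (graduate, summer 2005-2007)
Fundamentals of Pattern Recognition (undergraduate, fall 1998-2009; National Excellent Course)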
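Awards and Honors
2009 National Teaching Achievement Award, Second Prize
2008 Beijing Teaching Achievement Award, First Prize
2006 National Science Fund for Distinguished Young Scholars
2004 Program for New Century Excellent Talents, Ministry of Education
2002 National Science and Technology Progress Award, Second Prize
2001 China National Offshore Oil Corporation Science and Technology Progress Award, First Prize
1995 State Education Commission Science and Technology Progress Award, Second Prize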
Academic Achievements
Selected Publications
2010
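Xueya Zhou, Xuegong Zhang, Genetic association studies based on copy number variation (in Chinese), Chinese Science Bulletin (Chinese edition), in press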
Zhengpeng Wu, Xi Wang, Xuegong Zhang, Using non-uniform read distribution models to improve isoform expression inference in RNA-Seq, Bioinformatics, in press, 2010
Xi Wang, Zhengpeng Wu, Xuegong Zhang, Isoform abundance inference provides a more accurate estimation of gene expression levels in RNA-seq, Journal of Bioinformatics and Computational Biology, 8(Suppl.1): 177-192, 2010
Ting Zhang, Xuegong Zhang, Zhirong Sun, Identifying changed protein-protein interactions in biological processes by gene coexpression analysis, Chinese Science Bulletin, 55(14): 1396-1402, 2010
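Xi Wang, Xiaowo Wang, Likun Wang, Zhixing Feng, Xuegong Zhang, Processing and analysis of next-generation high-throughput RNA sequencing data (in Chinese), Progress in Biochemistry and Biophysics, 37(8): 834-846, 2010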
The MAQC Consortium, The MAQC-II project: a comprehensive study of common practices for the development and validation of microarray-based predictive models, Nature Biotechnology, 28(8): 827-841, 2010
Likun Wang, Zhixing Feng, Xi Wang, Xiaowo Wang, Xuegong Zhang, DEGseq: an R package for identifying differentially expressed genes from RNA-seq data, Bioinformatics, 26(1): 136-138, 2010
PEI YunFei, WANG ZhiMin, Fei Fei, SHAO ZhiMing, HUANG Wei, ZHANG XueGong. Bioinformatics study indicates possible microRNA-regulated pathways in the differentiation of breast cancer, Chinese Science Bulletin, 55(10): 927-93, 2010
Tingting Li, Bingbing Wan, Jian Huang, Xuegong Zhang, Comparison of gene expression in hepatocellular carcinoma, liver development and liver regeneration, Mol Genet Genomics, 283: 485-492, 2010
2009
Ying Liu, Bo Jiang, Xuegong Zhang, Gene set analysis identifies master transcription factors in developmental courses, Genomics, 94: 1-10, 2009 (cover story)
Tingting Li, Jian Huang, Ying Jiang, Yan Zeng, Fuchu He, Michael Q. Zhang, Zeguang Han, Xuegong Zhang, Multi-stage analysis of gene expression and transcription regulation in C57/B6 mouse liver development, Genomics, 93: 235-242, 2009
Shicai Fan, Xuegong Zhang, CpG island methylation pattern in different human tissues and its correlation with gene expression, Biochemical and Biophysical Research Communications, 383: 421-425, 2009
Yunfei Pei, Ting Zhang, Victor Renault, Xuegong Zhang, An overview of hepatocellular carcinoma study by omics-based methods, Acta Biochimica et Biophysica Sinica, 41(1): 1-15, 2009
Yunfei Pei, Xi Wang, Xuegong Zhang, Predicting the fate of microRNA target genes based on sequence features, Journal of Theoretical Biology, 261: 17-22, 2009
Shicai Fan, Xuegong Zhang, Advances in bioinformatics research on DNA methylation (in Chinese), Progress in Biochemistry and Biophysics, 36(2): 143-150, 2009
Michael Q. Zhang, Michael S. Waterman, Xuegong Zhang, Introduction: the seventh Asia Pacific Bioinformatics Conference (APBC2009), BMC Bioinformatics, 10(Suppl 1): S1, 2009
Li Zhu, Wanwan Tang, Guisen Li, Jicheng Lv, Jiaxiang Ding, Lei Yu, Minghui Zhao, Yanda Li, Xuegong Zhang, Yan Shen, Hong Zhang, Haiyan Wang, Interaction between variants of two glycosyltransferase genes in IgA nephropathy, Kidney International, 76: 190-198, 2009
2008
Bo Jiang, Xuegong Zhang, Tianxi Cai, Estimating the confidence interval for prediction errors of support vector machine classifiers, Journal of Machine Learning Research, 9(March): 521-540, 2008
Xuesong Lu, Xin Lu, Zhigang C. Wang, J. Dirk Iglehart, Xuegong Zhang and Andrea L. Richardson, Predicting features of breast cancer with gene expression patterns, Breast Cancer Research and Treatment, 108(2): 191-201, March 2008 (published online: May, 2007) (4.671)
Tingting Li, Fei Li, Xuegong Zhang, Prediction of kinase-specific phosphorylation sites with sequence features by a log-odds ratio approach, Proteins: Structure, Function, and Bioinformatics, 70: 404-414, 2008
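Tingting Li, Bo Jiang, Xiaowo Wang, Xuegong Zhang, Computational methods for analyzing transcription factor binding sites (in Chinese), Acta Biophysica Sinica, 24(5): 334-346, 2008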
Ujjwal Maulik, Anirban Mukhopadhyay, Sanghamitra Bandyopadhyay, Xuegong Zhang, Michael Zhang, Multiobjective fuzzy biclustering in microarray data: method and a new performance measure, IEEE Congress on Evolutionary Computation 2008 (CEC2008), pp. 1536-1543, June 1-6, 2008
Shicai Fan, Michael Q. Zhang, Xuegong Zhang, Histone methylation marks play important roles in predicting the methylation status of CpG islands, Biochemical and Biophysical Research Communications, 374: 559-564, 2008
Tao Peng, Chenghai Xue, Jianning Bi, Tingting Li, Xiaowo Wang, Xuegong Zhang and Yanda Li, Functional importance of different patterns of correlation between adjacent cassette exons in human and mouse, BMC Genomics, 9: 191, 2008
Xiaowo Wang, Xuegong Zhang, Yanda Li, Complicated evolutionary patterns of microRNAs in Vertebrates, Science in China, 51(6):552-9, 2008
2007
Yonghong Peng, Xuegong Zhang, Guest Editorial: Integrative data mining in systems biology: from text to network mining, Artificial Intelligence in Medicine, 41(2): 83-86, 2007
Tingting Li, Hu Fu, and Xuegong Zhang, Prediction of kinase-specific phosphorylation sites by one-class SVMs, Proceedings of 2007 IEEE International Conference on Bioinformatics and Biomedicine (BIBM2007), pp. 217-222, 2007
Xi Wang, Sanghamitra Bandyopadhyay, Zhenyu Xuan, Xiaoyue Zhao, Michael Q. Zhang, Xuegong Zhang, Prediction of transcription start site based on feature selection using AMOSA, CSB2007 Conference Proceedings, volume 6, pp.183-193, San Diego, Aug 13-17, 2007
Bo Jiang, Michael Q. Zhang, Xuegong Zhang, OSCAR: one-class SVM for accurate recognition of cis-elements, Bioinformatics, 23(5): 531-537, 2007
Shicai Fan, Fang Fang, Xuegong Zhang, Michael Q. Zhang, Putative zinc finger protein binding sites are enriched in the boundaries of methylation-resistant CpG islands in the human genome, PLoS ONE, 2(11): e1184, 2007
Jin Gu, Hu Fu, Xuegong Zhang, Yanda Li, Identifications of conserved 7-mers in the 3’-UTRs and microRNAs in Drosophila, BMC Bioinformatics, 8:432, 2007
S Li, ZQ Zhang, LJ Wu, XG Zhang, YD Li, YY Wang, Understanding ZHENG in Traditional Chinese Medicine in the context of neuro-endocrine-immune network, IET Systems Biology, 1(1): 51-60, 2007
Jing Zhang, Bo Jiang, Ming Li, John Tromp, Xuegong Zhang and Michael Q. Zhang, Computing exact P-values for DNA motifs, Bioinformatics, 23(5): 531-537, 2007
Jian Huang, Pei Hao, Yun-Li Zhang, Fu-Xing Deng, Qing Deng, Yi Hong, Xiao-Wo Wang, Yun Wang, Ting-Ting Li, Xue-Gong Zhang, Yi-Xue Li, Pen-Yuan Yang, Hong-Yang Wang, Ze-Guang Han, Discovering multiple transcripts of human hepatocytes using massively parallel signature sequencing (MPSS), BMC Genomics, 8: 207, 2007
Chaolin Zhang, Xuegong Zhang, Michael Q. Zhang, Yanda Li, Neighbor number, valley seeking and clustering, Pattern Recognition Letters, 28: 173-180, 2007
2006
Jun Li, Michael Q. Zhang, Xuegong Zhang, A new method for detecting human recombination hotspots and its applications to the HapMap ENCODE data, American Journal of Human Genetics, 79: 628-639, Oct 2006
Shao Li, Ruiqin Wang, Yulong Zhang, Xuegong Zhang, A. Joseph Layon, Yanda Li and Mingzhe Chen, Symptom combinations associated with outcome and therapeutic effects in a cohort of cases with SARS, The American Journal of Chinese Medicine, 34(6): 937-947, 2006
Jin Gu, Tao He, Yunfei Pei, Fei Li, Xiaowo Wang, Jing Zhang, Xuegong Zhang, Yanda Li, Primary transcripts and expressions of mammal intergenic microRNAs detected by mapping ESTs to their flanking sequences, Mammalian Genome, 17: 1033-1041, 2006
Fang Fang, Shicai Fan, Xuegong Zhang and Michael Q. Zhang, Predicting methylation status of CpG islands in the human brain, Bioinformatics, 22(18): 2204-2209, 2006
Xuesong Lu, Xuegong Zhang, The effect of GeneChip gene definitions on the microarray study of cancers, BioEssays, 28(7): 739-746, 2006
Chaolin Zhang, Xuesong Lu, Xuegong Zhang, Significance of gene ranking for classification of microarray samples, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 3(3): 312-320, 2006
Xuegong Zhang, Xin Lu, Qian Shi, Xiu-qin Xu, Hon-chiu E Leung, Lyndsay N Harris, James D Iglehart, Alexander Miron, Jun S Liu and Wing H Wong, Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data, BMC Bioinformatics, 7:197, 2006 (10Apr2006)
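Shuhua Liu, Xuegong Zhang, Qun Zhou, Suqin Sun, Identifying the geographical origins of Chinese medicinal herbs with near-infrared diffuse reflectance spectroscopy and pattern recognition (in Chinese), Spectroscopy and Spectral Analysis, 26(4): 629-632, Apr. 2006
Jianhua Xu, Xuegong Zhang, Nonlinear kernel forms of classical linear algorithms (in Chinese), Control and Decision, vol. 21, no. 1, pp. 1-12, 2006
Xu Jian-hua, Zhang Xue-gong, Li Yan-da, Regularized kernel forms of minimum squared error method, Front. Electr. Electron. Eng. China, (2006)1: 1-7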
Jianhua XU, Xuegong Zhang, Suqin Sun. Tuning SVM Parameters for Classifying Geographical Origins of Chinese Medical Herbs. International Journal of Wavelet, Multimedia and Information Processing, 2006, 4(3)
2005
Shicai Fan & Xuegong Zhang, Characterizing the microenvironment surrounding phosphorylated protein sites, Genomics, Proteomics & Bioinformatics, 3(4): 213-217, 2005
S. Weng, C. Zhang, Z. Liu, and X. Zhang, Mining the structural knowledge of high-dimensional medical data using Isomap, Medical & Biological Engineering & Computing, 43(3): 410-412, 2005
Jianhua Xu, Xuegong Zhang. A Multiclass Kernel Perceptron Algorithm. In: Proceedings of International Conference on Neural Networks and Brain (Mingsheng Zhao and Zhongzhi Shi, editors). Vol. 2, pp. 717-721, Oct. 13-15, 2005, Beijing, China. New York: IEEE Press
Chenghai Xue, Fei Li, Tao He, Guoping Liu, Yanda Li, Xuegong Zhang, Classification of real and pseudo microRNA precursors using local structure-sequence features and support vector machine, BMC Bioinformatics, 6: 310, 2005
Xiangqing Sun, Zhongqi Zhang, Yulong Zhang, Xuegong Zhang, Yanda Li, Multi-locus penetrance variance analysis method for association study in complex diseases, Human Heredity, 60(3): 143-149, 2005
Xiaowo Wang, Jing Zhang, Fei Li, Jin Gu, Tao He, Xuegong Zhang, Yanda Li, MicroRNA identification based on sequence and structure alignment, Bioinformatics, 21(18): 3610-3614, 2005
Jianning Bi, Huiyu Xia, Fei Li, Xuegong Zhang, Yanda Li, The effect of U1 snRNA binding free energy on the selection of 5' splice sites, Biochemical and Biophysical Research Communications, 333: 64-69, 2005
Shuhua Liu, Xuegong Zhang, Qun Zhou, Suqin Sun, Use of FTIR and pattern recognition to determine geographical origins of Chinese medical herbs (in Chinese), Spectroscopy and Spectral Analysis, 25(6): 878-881, 2005
Shuhua Liu, Xuegong Zhang, Suqin Sun, Discrimination and feature selection of geographic origins of traditional Chinese medicine herbs with NIR spectroscopy, Chinese Science Bulletin, 50(2): 179-184, 2005
Keyue Ding, Jing Zhang, Kaixin Zhou, Yan Shen, Xuegong Zhang, htSNPer1.0: software for haplotype block partition and htSNPs selection, BMC Bioinformatics, 6:38, 2005 (1 March 2005)
Keyue Ding, Kaixin Zhou, Jing Zhang, Joanne Knight, Xuegong Zhang, Yan Shen, The effect of haplotype block definitions on inference of haplotype block structure and htSNPs selection, Molecular Biology and Evolution, 22(1): 148-159, 2005
2004
Jing Zhang, Fei Li, Jun Li, Michael Q. Zhang, Xuegong Zhang, Evidence and characteristics of putative human alpha recombination hotspots, Human Molecular Genetics, 13(22): 2823-2828, 2004
Xi Ma, Jun Cai, Wei Hu, Yimin Zhang, Yanda Li, Xuegong Zhang, Discovering possible context dependences around SNP sites in human genes with Bayesian network learning, ICARCV 2004, pp.1315-1319, Dec.2004
Xuesong Lu, Yanda Li, Xuegong Zhang, A simple strategy for detecting outlier samples in microarray data, ICARCV 2004, pp.1331-1335, Dec.2004
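Xiangqing Sun, Yanbin Jia, Xuegong Zhang, Qi Xu, Yan Shen, Yanda Li, A multi-locus association study of dopamine pathway genes and schizophrenia risk (in Chinese), Science in China (Series C), 34(5): 465-470, 2004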
X-Q. Xu, C.K. Leow, X. Lu, X. Zhang, J.S. Liu, W.H. Wong, A. Asperger, S. Deininger, H.E. Leung, Molecular classification of liver cirrhosis in a rat model by proteomics and bioinformatics, Proteomics, 4: 3235-3245, 2004
Jianhua Xu, Xuegong Zhang, A learning algorithm with Gaussian regularizer for kernel neuron, Advances in Neural Networks – ISNN 2004, part I, pp.252-257, Dalian, Aug., 2004
Jianhua Xu, Xuegong Zhang, Kernels based on weighted Levenshtein distance, IJCNN2004, pp.3015-3018, Budapest, July 2004
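Jianhua Xu, Xuegong Zhang, Yanda Li, New developments in support vector machines (in Chinese), Control and Decision, vol.19, no.5, pp.481-484, May 2004
Yanda Li, Xuegong Zhang, Fei Li, Frontier topics in life information technology: information mining of small RNA genes and non-coding genomic regions (in Chinese), in Chinese Academy of Sciences, 2004 High Technology Development Report, Science Press, March 2004, pp. 124-131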
Fang Wen, Fei Li, Huiyu Xia, Xin Lu, Xuegong Zhang (corresponding author), Yanda Li, The impact of very short alternative splicing on protein structures and functions in the human genome, Trends in Genetics, vol.20, no.5, May 2004, pp.232-236
Xuesong Lu, Xing Wang, Ying Huang, Wei Hu, Guang R. Gao, Yanda Li, Xuegong Zhang, On some choices in Bayesian network learning for reconstructing regulatory networks, Proceedings of RECOMB04, March 2004, pp. 126-127
Chaolin Zhang, Yanda Li, Xuegong Zhang, gMap: extracting and interactively visualizing nonlinear relationships of genes from expression, Proceedings of RECOMB04, March 2004, pp. 228-229
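Jianhua Xu, Xuegong Zhang, Yanda Li, Regularized kernel forms of the minimum squared error algorithm (in Chinese), Acta Automatica Sinica, vol.30, no.1, Jan. 2004, pp.27-36
Invited Talks and Lectures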
1. Xuegong Zhang, Computational prediction of miRNA-regulated pathways in the differentiation of histological grades in breast cancer, ACM-HK Bioinformatics Symposium, March 27, HK, 2010
2. Xuegong Zhang, Estimating variances of pattern classifiers’ performances with given samples, ISciDE (Sino-foreign-interchange Workshop on Intelligence Science and Intelligent Data Engineering), June 3-5, Harbin, 2010
3. Xuegong Zhang, Estimating gene expression values from RNA-seq data, IEEE International Conference on Bioinformatics & Biomedicine (BIBM2010), HK, Dec.18-21, 2010
4. Xuegong Zhang, Estimating gene expression values from RNA-seq data, The 8th ICSA International Conference: Frontiers of Interdisciplinary and Methodological Statistical Research, Guangzhou, Dec.19-22, 2010
5. Xuegong Zhang, Computational prediction of microRNA-regulated pathways in the differentiation of histological grades in breast cancer, Opportunities for Integrative Bionetworks to Enable Dynamic Models of Diseases – Beijing Summit, Beijing, Nov.20, 2009
6. Xuegong Zhang, Putting more biology in learning machines, Plenary Keynote Speech at APBC2008, Jan 14-17, 2008, Kyoto
7. Xuegong Zhang, Studying molecular features of breast cancer with learning machines, invited talk at CAS International Symposium on Developmental Systems Biology, May 18-20, 2008, Beijing
8. Xuegong Zhang, Bioinformatics Study of the Molecular Features of Breast Cancer, invited keynote talk at the 5th International Conference on Information Technology and Applications in Biomedicine (ITAB’08), May 30-31, 2008, Shenzhen
9. Xuegong Zhang, Learning Biology with Machines: examples from alternative splicing and DNA methylation, invited talk at 2008 International Bioinformatics Workshop, June 7-9, 2008, Kunming
10. Xuegong Zhang, Understanding lymph node metastasis in breast cancers: a case study of microarray data analysis, invited talk, NSF Sponsored International Conference on Bioinformatics, June 10-14, 2007, Hangzhou
11. Xuegong Zhang, A bioinformatics study on lymph node metastasis of breast cancers, invited talk, International Symposium on Biochip Technology and Molecular Classification of Disease, May 6-8, Shanghai, 2007
12. Xuegong Zhang, Some new challenges for pattern recognition on high-throughput genomics/proteomics data, ICCTA2007, Kolkata, India, Mar 3-7, 2007
13. Xuegong Zhang, Effects of re-mapping the oligo probes onto the updated genome on high-level analyses of microarray data, BNI&IFBT2006, Beijing, Oct. 10, 2006
14. Xuegong Zhang, Machine learning in high-throughput genomics and proteomics, Tutorial at ICONIP2006 (http://iconip2006.cse.cuhk.edu.hk/program/Tutorial-3), Hong Kong, Oct.3, 2006
15. Fang Fang, Xuegong Zhang and Michael Q. Zhang, Computational studies in epigenetics, The First International Conference on Computational Systems Biology, Shanghai, July 20-23, 2006
16. Xuegong Zhang, Building gene networks by fusing literature and microarray data, Transcriptome 2005, Shanghai, Nov.5-9, 2005
17. Xuegong Zhang, Computational Analysis of Haplotype Blocks and Human Recombination Hotspots, Changchun International Bioinformatics Workshop, Changchun, July 5-7, 2005
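18. Xuegong Zhang, Revisiting machine learning analysis of high-throughput expression data (in Chinese), Eastern Forum of Science and Technology: Recent Advances in Computational Biology, Shanghai, July 2, 2005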
19. Xuegong Zhang, Computational Analysis of Human Recombination Hotspots, 1st International Workshop on Computational and Systems Biology, Beijing, May 23, 2005
20. Xuegong Zhang, Some computational problems in bioinformatics (in Chinese), China Computer Federation Young Computer Scientists and Engineers Forum, April 22, 2005
21. Xuegong Zhang, Learning Specific Gene Relation Networks from Literatures, 2005 Sino-German Workshop on Networks: from Biology to Theory, Beijing, Apr 4-8, 2005
22. Xuegong Zhang, SVM and Its Application Examples in Computational Biology, CSHL bioinformatics seminar, Feb 9, 2005
23. Xuegong Zhang, Computational Analysis of Haplotype Blocks and Recombination Hotspots, Dr. Jun Liu’s lab seminar at Harvard University, Feb 1, 2005
24. Xuegong Zhang, Significance of Gene Ranking for Classification of Microarray Samples, SRCCS 2004 International Workshop for Statistics, Seoul National University, Korea, June 2004
25. Xuegong Zhang, Considerations on Sample Classification and Gene Selection with Microarray Data using Machine Learning Approaches, Statistical Method in Microarray Analysis Workshop at NUS, Singapore, Jan 2004
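26. Xuegong Zhang, Some characteristics of pattern recognition problems in gene expression data (in Chinese), 81st Young Scientists Forum of the China Association for Science and Technology, "Discussion of Some Frontier Problems in Bioinformatics", Nov. 28-29, 2003
Textbooks, Monographs, and Book Chapters
1. Xuegong Zhang, Pattern Recognition (3rd edition, in Chinese), Tsinghua University Press, 2010.8
2. V. Vapnik, Statistical Learning Theory, Chinese translation by Jianhua Xu and Xuegong Zhang, Publishing House of Electronics Industry, 2004.6
3. Zhaoqi Bian, Xuegong Zhang, et al., Pattern Recognition (2nd edition, in Chinese), Tsinghua University Press, 2000.1
4. V. Vapnik, The Nature of Statistical Learning Theory, Chinese translation by Xuegong Zhang, Tsinghua University Press, 2000.8
5. Xuegong Zhang, Yexin Liu, A Quick Guide to X Window/Motif Programming (in Chinese), Tsinghua University Press, 1998.3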
6. Xuegong Zhang and Yanda Li, The application of artificial neural networks in editing noisy seismic data, Chapter 7, in F. Aminzadeh & M. Jamshidi eds., Soft Computing: Fuzzy Logic, Neural Networks, and Distributed Artificial Intelligence, PTR Prentice Hall, NY, 1994