Yarowsky, David

Professor
Computer Science

Hackerman Hall 324G
(410) 516-5372
yarowsky@jhu.edu

About

Education
  • Ph.D. 1996, Univ Pennsylvania
  • Ph.D. 1995, Univ Pennsylvania
Experience
  • 2012 - 2013:  General Chair, Conference on Empirical Methods in Natural Language Processing
  • 2011 - 2011:  Program Chair, International Joint Conference on Natural Language Processing
  • 2010 - 2010:  Co-program chair, 2010 ACM SIGIR Web N-Gram Workshop
  • 2010 - 2010:  Program Chair, International Joint Conference on Natural Language Processing 2011
  • 2009 - 2009:  Chair, 2009 Conference on Empirical Methods in Natural Language Processing
  • 2009 - 2010:  Chair, 2010 Conference on Empirical Methods in Natural Language Processing
  • 2008 - 2008:  Chair, Ad-hoc promotion committee
  • 2004 - 2013:  Chair, Curriculum Committee, Department of Computer Science
  • 2003 - 2003:  Chair, ACL-2003 Conference
  • 2002 - 2002:  Program Co-Chair, Conference on Human Language Technologies
  • 2001 - 2001:  Chair, Conference on Empirical Methods in NLP
Research Areas
  • Characterizing communicants
  • Computational morphology
  • Machine learning
  • Machine translation
  • Natural Language Processing
  • Speech processing
  • Very large text databases
Awards
  • 2013:  Fellow of the Association for Computational Linguistics
  • 2013:  Won the bid to host the ACL 2014 Conference in Baltimore (1000+ attendees), the flagship annual conference in the NLP field, held in North America every 3 years (wrote and defended the successful competitive proposal; am the 2014 host chair)
  • 2012:  Named General Chair of major EMNLP 2013 Conference (400+ attendees)
  • 2009:  Cited again in 2009 as ranking in the top 10 for most overall citations of one's publications in the cumulative 46-year collection of all ACL journals and conference proceedings, with the 3rd-highest H score (excluding self-citations)
  • 2008:  Announced as ranking in the top 10 for most overall citations of one's publications in the cumulative 45-year collection of all ACL journals and conference proceedings, and tied for the 3rd-highest H score (excluding self-citations)
  • 2000:  NSF CAREER Award
  • 1994:  CIS Graduate Award for Excellence in Teaching - University of Pennsylvania
  • 1992:  NDSEG Graduate Fellowship in Computer Science
  • 1987:  Michael C. Rockefeller Memorial Fellowship
  • 1986:  Member - Phi Beta Kappa
Presentations
  • "Very Large Scale Computational Morphology for the World's Languages", Invited Talk.  Emerson, MD.  October 1, 2014
  • "Morphology and Language Modeling for Keyword Search", IARPA Babel Program PI Meeting.  Linthicum, Maryland.  February 1, 2014
  • "Learning to Characterize Communicants on Multiple Dimensions via their Names", HLTCOE Technical Exchange.  Baltimore, MD.  February 1, 2014

Publications

Journal Articles
  • Guo J, Che W, Yarowsky D, Wang H, Liu T (2016).  A distributed representation-based framework for cross-lingual transfer parsing.  Journal of Artificial Intelligence Research.  55.
  • Kirov C, Sylak-Glassman J, Que R, Yarowsky D (2016).  Very-large scale parsing and normalization of Wiktionary morphological paradigms.  Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016.
  • Sylak-Glassman J, Kirov C, Yarowsky D (2016).  Remote elicitation of inflectional paradigms to seed morphological analysis in low-resource languages.  Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016.
  • Guo J, Che W, Yarowsky D, Wang H, Liu T (2016).  A representation learning framework for multi-source transfer parsing.  30th AAAI Conference on Artificial Intelligence, AAAI 2016.
  • Sylak-Glassman J, Kirov C, Post M, Que R, Yarowsky D (2015).  A universal feature schema for rich morphological annotation and fine-grained cross-lingual part-of-speech tagging.  Communications in Computer and Information Science.  537.
  • Sylak-Glassman J, Kirov C, Yarowsky D, Que R (2015).  A language-independent feature schema for inflectional morphology.  ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference.  2.
  • Guo J, Che W, Yarowsky D, Wang H, Liu T (2015).  Cross-lingual dependency parsing based on distributed representations.  ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference.  1.
  • Trmal J, Chen G, Povey D, Khudanpur S, Ghahremani P, Zhang X, Manohar V, Liu C, Jansen A, Klakow D, Yarowsky D, Metze F (2014).  A keyword search system using open source software.  2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings.
  • Bergsma S, Yarowsky D (2013).  Learning domain-specific, L1-specific measures of word readability.  TAL Traitement Automatique des Langues.  54(1).
  • Chen G, Khudanpur S, Povey D, Trmal J, Yarowsky D, Yilmaz O (2013).  Quantifying the value of pronunciation lexicons for keyword search in low-resource languages.  ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings.
  • Volkova S, Wilson T, Yarowsky D (2013).  Exploring demographic language variations to improve multilingual sentiment analysis in social media.  EMNLP 2013 - 2013 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference.
  • Bergsma S, Dredze M, Van Durme B, Wilson T, Yarowsky D (2013).  Broadly improving user classification via communication-based name and location clustering on Twitter.  NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference.
  • Volkova S, Wilson T, Yarowsky D (2013).  Exploring sentiment in social media: Bootstrapping subjectivity clues from multilingual Twitter streams.  ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference.  2.
  • Bergsma S, Post M, Yarowsky D (2012).  Stylometric analysis of scientific articles.  NAACL HLT 2012 - 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference.
  • Klementiev A, Irvine A, Callison-Burch C, Yarowsky D (2012).  Toward statistical machine translation without parallel corpora.  EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings.
  • Bergsma S, Yarowsky D (2011).  NADA: A robust system for non-referential pronoun detection.  Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics).  7099 LNAI.
  • Rao D, Yarowsky D (2011).  Typed graph models for semi-supervised learning of name ethnicity.  ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.  2.
  • Bergsma S, Yarowsky D, Church K (2011).  Using large monolingual and bilingual corpora to improve coordination disambiguation.  ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.  1.
  • Rao D, Yarowsky D, Shreevats A, Gupta M (2010).  Classifying latent user attributes in Twitter.  International Conference on Information and Knowledge Management, Proceedings.
  • Lin D, Church K, Ji H, Sekine S, Yarowsky D, Bergsma S, Patil K, Pitler E, Lathbury R, Rao V, Dalwani K, Narsale S (2010).  New tools for web-scale N-grams.  Proceedings of the 7th International Conference on Language Resources and Evaluation, LREC 2010.
  • Rao D, Yarowsky D (2009).  Ranking and semi-supervised classification on large scale graphs using map-reduce.  ACL-IJCNLP 2009 - TextGraphs 2009: 2009 Workshop on Graph-Based Methods for Natural Language Processing, Proceedings of the Workshop.
  • Garera N, Callison-Burch C, Yarowsky D (2009).  Improving translation lexicon induction from monolingual corpora via dependency contexts and part-of-speech equivalences.  CoNLL 2009 - Proceedings of the Thirteenth Conference on Computational Natural Language Learning.
  • Sayeed A, Elsayed T, Garera N, Alexander D, Xu T, Oard DW, Yarowsky D, Piatko C (2009).  Arabic cross-document coreference detection.  ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf.
  • Garera N, Yarowsky D (2009).  Structural, transitive and latent models for biographic fact extraction.  EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings.
  • Mayfield J, Alexander D, Dorr B, Eisner J, Elsayed T, Finin T, Fink C, Freedman M, Garera N, Mcnamee P, Mohammad S, Oard D, Piatko C, Sayeed A, Syed Z, Weischedel R, Xu T, Yarowsky D (2009).  Cross-document coreference resolution: A key technology for learning by reading.  AAAI Spring Symposium - Technical Report.  SS-09-07.
  • Li Z, Yarowsky D (2008).  Mining and modeling relations between formal and informal Chinese phrases from web corpora.  EMNLP 2008 - 2008 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference: A Meeting of SIGDAT, a Special Interest Group of the ACL.
  • Li Z, Yarowsky D (2008).  Unsupervised translation induction for Chinese abbreviations using monolingual corpora.  ACL-08: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference.
  • Rao D, Garera N, Yarowsky D (2007).  JHU1: An unsupervised approach to person name disambiguation using web snippets.  ACL 2007 - SemEval 2007 - Proceedings of the 4th International Workshop on Semantic Evaluations.
  • Garera N, Yarowsky D (2006).  Resolving and generating definite anaphora by modeling hypernymy using unlabeled corpora.  Proceedings of the Tenth Conference on Computational Natural Language Learning, CoNLL-X.
  • Riesa J, Yarowsky D (2006).  Minimally supervised morphological segmentation with applications to machine translation.  AMTA 2006 - Proceedings of the 7th Conference of the Association for Machine Translation of the Americas: Visions for the Future of Machine Translation.
  • Pytlik B, Yarowsky D (2006).  Machine translation for languages lacking Bitext via multilingual gloss transduction.  AMTA 2006 - Proceedings of the 7th Conference of the Association for Machine Translation of the Americas: Visions for the Future of Machine Translation.
  • Mann GS, Yarowsky D (2005).  Multi-field information extraction and cross-document fusion.  ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference.
  • Yarowsky D, Florian R (2002).  Evaluating sense disambiguation across diverse parameter spaces.  Natural Language Engineering.  8(4).
  • Florian R, Cucerzan S, Schafer C, Yarowsky D (2002).  Combining classifiers for word sense disambiguation.  Natural Language Engineering.  8(4).
  • Yarowsky D (2000).  Hierarchical decision lists for word sense disambiguation.  Language Resources and Evaluation.  34(1-2).
  • Yarowsky D (2000).  Hierarchical decision lists for word sense disambiguation.  Computers and the Humanities.  34(1-2).
  • Resnik P, Yarowsky D (1999).  Distinguishing systems and distinguishing senses: New evaluation methods for Word Sense Disambiguation.  Natural Language Engineering.  5(2).
  • Gale WA, Church KW, Yarowsky D (1995).  Discrimination decisions for 100,000-dimensional spaces.  Annals of Operations Research.  55(2).
  • Gale WA, Church KW, Yarowsky D (1992).  A method for disambiguating word senses in a large corpus.  Computers and the Humanities.  26(5-6).
Book Chapters
  • Yarowsky D (2010).  Word sense disambiguation.  Handbook of Natural Language Processing, Second Edition.
Conference Proceedings
  • Yarowsky D, Trmal J, et al. (2014).  A Keyword Search System Using Open Source Software.  WSLT 2014.  IEEE.  6 pages.
  • Yarowsky D, Volkova S (2014).  Improving Gender Prediction of Social Media Users via Weighted Annotator Rationales.  Personalization: Methods and Applications (NIPS 2014).  NIPS.  8 pages.
  • Yarowsky D, Mayfield J, et al. (2014).  KELVIN: Extracting Knowledge from Large Text Collections.  AAAI Fall Symposium on Natural Language Access to Big Data.  AAAI.  1121.  555-570 (16 pages).
  • Mayfield J, Alexander D, Dorr B, Eisner J, Elsayed T, Finin T, Fink C, Freedman M, Garera N, McNamee P, Mohammad S, Oard D, Piatko C, Sayeed A, Syed Z, Weischedel R, Xu T, Yarowsky D (2009).  Cross-Document Coreference Resolution: A Key Technology for Learning by Reading.  Proceedings of the AAAI 2009 Spring Symposium on Learning by Reading and Learning to Read.