books and such
- Debates in the Digital Humanities, a book series. All of the books are available online and have open annotations.
- The Data-Sitters Club. A fun, colloquial guide to computational text analysis in the digital humanities.
- The Data Notebook, an online suite of open interactive resources that provides instructional materials for introductory data analytics and data visualization approaches relevant to a wide range of subjects and disciplines. Specifically, this book focuses on principles related to data storytelling, provides tangible research steps, and includes case studies, mini-lessons, and interactive instructional components. By Kenton Rambsy and Peace Ossom-Williamson.
- Speech and Language Processing, 3rd edition draft, by Dan Jurafsky and James H. Martin
- Introduction to Earth Data Science Textbook by the Earth Lab at the University of Colorado Boulder. A good introduction to Python.
- Use Data for Earth and Environmental Science in Open Source Python Textbook, also from the Earth Lab at CU Boulder. Includes a chapter on APIs and working with Twitter data in Python.
software and data
- A collection of digital humanities data sets, curated by .txtlab, a laboratory for cultural analytics at McGill University.
- GloVe: Global Vectors for Word Representation
- spaCy, a Python library for NLP. Fast and efficient.
- TensorFlow's projector tool
- The NRC Valence, Arousal, and Dominance (NRC-VAD) Lexicon by Saif M. Mohammad.
- Code for the US History Textbook project
courses
- Introduction to Computational Literary Analysis, Summer 2022, a course by Jonathan Reeve. Highly recommended.
- Introduction to Digital Humanities, a course at the CUNY Graduate Center by Matthew K. Gold.
- CSC321 Neural Networks and Machine Learning, a course at the University of Toronto. Not specific to text, but has many text examples. Includes some lecture notes, slides, and Python notebooks.
- Meaningful Text Analysis with Word Embeddings by Jonathan Reeve. Videos for the course can be found here.
tutorials, etc
- Creating visuals with NLTK's FreqDist by Darius Fuller. Shows how to make some better-looking frequency distribution charts.
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning, a helpful, short video by Jay Alammar. He has lots of other explainer videos on his YouTube channel. Jay's article, The Illustrated Word2vec, is also excellent and highly recommended.