To prevent leaks of information about on-campus online classes, URLs, account details, and classroom information have been removed.
Last updated: April 22, 2024

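The course schedule and classroom are subject to change, so be sure to check UTAS for the latest information.
If you cannot access UTAS, please contact the instructor or the academic affairs office of the relevant department.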

Text Analysis for Economics and Area Studies

This course sits at the intersection of economics and China studies and draws on textual data analysis, a field that has advanced rapidly over the past decade. Textual sources are increasingly available in a variety of formats, many of them digital. Examples include newspaper articles, social media posts, annual reports, prospectuses, earnings presentations, patents, academic articles, policy documents, central bank meeting records, legal transcripts, and job listings. Combining textual data with well-established quantitative economic variables has led to many novel scholarly contributions. Throughout the course, students will become familiar with the literature, methodologies, and tools of quantitative text analysis. At the same time, emphasis will be placed on cultivating a qualitative (domain-specific) understanding of particular languages and text generation processes, which is why the course title includes Area Studies.

Each class is divided into two segments: (1) lectures and discussions (approximately 50 minutes) and (2) coding practice (approximately 50 minutes). Lectures and discussions cover the economics literature along with practical examples tied to China studies. The coding exercises begin with foundational operations, mainly in R, followed by specific techniques for processing textual data and basic machine learning implementations. The overall objective of these exercises is for participants to grasp current research trends in text-as-data applications and to develop basic proficiency in practical text manipulation. Attendance, active engagement in discussions, and timely submission of assignments and end-of-term reports are required of all enrolled students. Details of the evaluation policy will be announced at the first meeting.

All participants are expected to bring their own laptop with RStudio installed. The following materials are helpful for setting up RStudio.
-Harvard University, Institute for Quantitative Social Science (IQSS), Data Science Workshops Materials (See Installation part): https://*****/*****
-Peng, Roger D. R Programming for Data Science (See Installation part): https://bookdown.org/*****
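-Matsumura, Y., Yutani, H., Kinosada, Y., & Maeda, K. (2021). 『改訂2版 RユーザのためのRStudio[実践]入門〜tidyverseによるモダンな分析フローの世界』 [A Practical Introduction to RStudio for R Users, Revised 2nd Edition: The World of Modern Analysis Workflows with tidyverse]. Gijutsu-Hyohron (in Japanese).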
Timetable code / Common course code: 291116-02 / GEC-EC6910S3
Course title: Text Analysis for Economics and Area Studies
Instructor: Asei Ito (伊藤 亜聖)
Semester: A1 A2
Period: Monday, 4th period
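Language of instruction: English
Credits: 2
Course taught by an instructor with practical experience: No
Enrollment by students from other faculties:
Offering department: Graduate School of Economics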
Course Schedule
Schedule with core readings

Part 1. Text as data approach

Week 1. Introduction (Oct 7th)
-Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535-574.
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 2.

Week 2. Economics (Oct 14th)
-Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535-574.

Week 3. China Studies (Oct 21st)
-King, G., Pan, J., & Roberts, M. E. (2013). How censorship in China allows government criticism but silences collective expression. American Political Science Review, 107(2), 326-343.

Part 2. Dictionary methods

Week 4. Keywords Search (Oct 28th)
-Baker, S. R., Bloom, N., & Davis, S. J. (2016). Measuring economic policy uncertainty. The Quarterly Journal of Economics, 131(4), 1593-1636.
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 16.

Week 5. Sentiment Analysis (Nov 4th)
-Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance, 66(1), 35-65.

Week 6. Document Similarity (Nov 11th)
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 7.
-Kelly, B., Papanikolaou, D., Seru, A., & Taddy, M. (2021). Measuring technological innovation over the long run. American Economic Review: Insights, 3(3), 303-320.

Week 7. Mid-term presentation and feedback (or guest lecture) (Nov 13th, Wed)

Part 3. Machine learning

Week 8. Text Regression: A supervised learning (Dec 2nd)
-Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535-574.
-James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning with applications in R. Springer, Chapter 6. (Japanese translation: 『Rによる統計的学習入門』, Asakura Shoten)

Week 9. Topic Model: An unsupervised learning (Dec 9th)
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 13.

Week 10. Latent Semantic Scaling: A semi-supervised learning (Dec 16th)
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 20.
-Watanabe, K. (2021). Latent Semantic Scaling: A semisupervised text analysis technique for new domains and languages. Communication Methods and Measures, 15(2), 81-102.

Part 4. Deep Learning

Week 11. Word Embeddings (Dec 23rd)
-Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press, Chapter 8.
-Rodriguez, P. L., & Spirling, A. (2022). Word embeddings: What works, what doesn’t, and how to tell the difference for applied research. The Journal of Politics, 84(1), 101-115.

Week 12. Large Language Models (Jan 6th)
-Hansen, S., Lambert, P. J., Bloom, N., Davis, S. J., Sadun, R., & Taska, B. (2023). Remote work across jobs, companies, and space (No. w31007). National Bureau of Economic Research.
-Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2023). Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 55(9), 1-35.
-Yang, J., Jin, H., Tang, R., Han, X., Feng, Q., Jiang, H., ... & Hu, X. (2023). Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond. arXiv preprint arXiv:2304.13712.

Week 13. Final presentation (Jan 20th)
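As a rough illustration of the kind of dictionary methods listed in Part 2 of the schedule (keyword search and document similarity), the following is a minimal Python sketch. It is not course material: the three example sentences, the keyword set, and the function names are all hypothetical, and Python is used here only because it is one of the Colab options the course may use alongside R. The sketch computes the share of tokens in each document that match a keyword dictionary, then compares documents by the cosine similarity of their raw term counts.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; not taken from any course material.
DOCS = [
    "Policy uncertainty rose as the central bank delayed its decision.",
    "The central bank raised rates, and uncertainty about trade policy fell.",
    "The firm filed three new patents on battery technology.",
]

# Hypothetical keyword dictionary chosen only for this illustration.
KEYWORDS = {"uncertainty", "uncertain", "policy"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_share(tokens, keywords):
    """Share of tokens that appear in the keyword dictionary."""
    if not tokens:
        return 0.0
    return sum(tok in keywords for tok in tokens) / len(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


token_lists = [tokenize(d) for d in DOCS]
shares = [keyword_share(toks, KEYWORDS) for toks in token_lists]
counts = [Counter(toks) for toks in token_lists]
sim_01 = cosine_similarity(counts[0], counts[1])
sim_02 = cosine_similarity(counts[0], counts[2])

for i, share in enumerate(shares):
    print(f"doc {i}: keyword share = {share:.3f}")
print(f"similarity(doc 0, doc 1) = {sim_01:.3f}")
print(f"similarity(doc 0, doc 2) = {sim_02:.3f}")
```

Raw counts keep the sketch short; applied work typically also removes stop words and reweights terms (for example with TF-IDF) before computing similarity, so that common words such as "the" do not drive the result.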
Teaching Methods
Lectures and coding exercises.
Grading
Participation, problem sets, and the final report.
Textbook
Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press.
Reference Books
-Ash, E., & Hansen, S. (2023). Text algorithms in economics. Annual Review of Economics, 15.
-Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535-574.
-Catalinac, A., & Watanabe, K. (2019). 「日本語の量的テキスト分析」 [Quantitative text analysis of Japanese]. 『早稲田大学高等研究所紀要』 [Waseda Institute for Advanced Study Research Bulletin], 11, 133-143 (in Japanese).
Notes on Enrollment
All participants are expected to bring their own laptop with RStudio installed. In addition to RStudio, we may use other coding tools, including Google Colab (Python and R).