DDA3600 Factor Investing
I make the course content public to honor my academic idol, John Cochrane, who made his classic course on asset pricing available to everyone. This is also a tribute to the spirit of sharing exemplified by MIT OpenCourseWare over the years, and to the belief that everyone should have access to high-quality educational content.
Course Description
This course offers a thorough exploration of factor investing, blending econometrics and machine learning. Key topics include foundational concepts of factor investing, portfolio sorting, cross-sectional and time-series regression methods, and multiple hypothesis testing. The course also covers advanced machine learning models, integrating alternative data sources, and addressing practical challenges such as factor timing and allocation. Drawing on the instructor’s extensive experience in the quantitative hedge fund industry, the course emphasizes real-world practice and application. By course end, students will be prepared to conduct empirical research, develop and backtest factor-based strategies, and apply factor investing principles in professional settings.
Prerequisites
- MAT2040 Linear Algebra
- STA2001 Probability and Statistics I
- Basic knowledge of finance and investment is beneficial.
- Familiarity with Python (assignments require coding in Python).
Course Syllabus
Assessment Scheme
Component | Weight | Instruction |
---|---|---|
Assignments | 20% | There will be 3 to 4 assignments, which will allow students to work on large-scale empirical data of the Chinese stock market. |
Quiz 1 | 25% | This quiz covers topics up to lecture 5. |
Quiz 2 | 25% | This quiz covers topics from lecture 7 to lecture 9. |
Final Project | 30% | Students will work on an advanced topic related to factor investing. |
Recommended Reading
- Lecture 1
- Chapter 1 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Chapter 1 of Zhang, M., T. Lu, and C. Shi (2025). Navigating the Factor Zoo: The Science of Quantitative Investing. Routledge.
- Lecture 2
- Merton, R. C. (1973). An intertemporal capital asset pricing model. Econometrica 41(5), 867–887.
- Pedersen, L. H. (2022). A primer on asset pricing (Big Data Asset Pricing Lecture 1). (link)
- Ross, S. A. (1976). The arbitrage theory of capital asset pricing. Journal of Economic Theory 13(3), 341–360.
- Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance 19(3), 425–442.
- Chapters 5 and 6 of Cochrane, J. H. (2005). Asset Pricing (2nd Ed.). Princeton University Press.
- Chapter 1 of Zhang, M., T. Lu, and C. Shi (2025). Navigating the Factor Zoo: The Science of Quantitative Investing. Routledge.
- Lecture 3
- Pedersen, L. H. (2022). A primer on empirical asset pricing (Big Data Asset Pricing Lecture 2). (link)
- Chapter 5 of Bali, T. G., R. F. Engle, and S. Murray (2016). Empirical Asset Pricing: The Cross Section of Stock Returns. Wiley.
- Chapter 2.1 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 4
- Gibbons, M. R., S. Ross, and J. Shanken (1989). A test of the efficiency of a given portfolio. Econometrica 57(5), 1121–1152.
- Fama, E. F. and J. D. MacBeth (1973). Risk, return, and equilibrium: Empirical tests. Journal of Political Economy 81(3), 607–636.
- Fama, E. F. and K. R. French (2020). Comparing cross-section and time-series factor models. Review of Financial Studies 33(5), 1891–1926.
- Pedersen, L. H. (2022). A primer on empirical asset pricing (Big Data Asset Pricing Lecture 2). (link)
- Chapter 12 of Cochrane, J. H. (2005). Asset Pricing (2nd Ed.). Princeton University Press.
- Chapter 2.2 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 5
- Chen, A. Y. (2021). The limits of p-hacking: Some thought experiments. Journal of Finance 76(5), 2447–2480.
- Chordia, T., A. Goyal, and A. Saretto (2020). Anomalies and false rejections. Review of Financial Studies 33(5), 2134–2179.
- Harvey, C. R. (2017). Presidential address: The scientific outlook in financial economics. Journal of Finance 72(4), 1399–1440.
- Harvey, C. R. and Y. Liu (2018). Detecting repeatable performance. Review of Financial Studies 31(7), 2499–2552.
- Harvey, C. R. and Y. Liu (2020). False (and missed) discoveries in financial economics. Journal of Finance 75(5), 2503–2553.
- Harvey, C. R. and Y. Liu (2021). Lucky factors. Journal of Financial Economics 141(2), 413–435.
- Harvey, C. R., Y. Liu, and A. Saretto (2020). An evaluation of alternative multiple testing methods for finance applications. Review of Asset Pricing Studies 10(2), 199–248.
- Harvey, C. R., Y. Liu, and H. Zhu (2016). … and the cross-section of expected returns. Review of Financial Studies 29(1), 5–68.
- Jensen, T. I., B. T. Kelly, and L. H. Pedersen (2023). Is there a replication crisis in finance? Journal of Finance 78(5), 2465–2518.
- Pedersen, L. H. (2022). The factor zoo and replication (Big Data Asset Pricing Lecture 4). (link)
- Shi, C. (2024). Multiple hypothesis testing, empirical asset pricing, and factor investing. Working paper. (link)
- Chapter 6.1 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 6
- Fama, E. F. and K. R. French (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics 33(1), 3–56.
- Fama, E. F. and K. R. French (2015). A five-factor asset pricing model. Journal of Financial Economics 116(1), 1–22.
- Hou, K., C. Xue, and L. Zhang (2015). Digesting anomalies: An investment approach. Review of Financial Studies 28(3), 650–705.
- Hou, K., H. Mo, C. Xue, and L. Zhang (2019). Which factors? Review of Finance 23(1), 1–35.
- Liu, J., R. F. Stambaugh, and Y. Yuan (2019). Size and value in China. Journal of Financial Economics 134(1), 48–69.
- Shi, C. (2024). Piotroski’s F-score in the Chinese stock market. Working paper. (link)
- Chapters 3-5 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 7
- Bryzgalova, S., V. DeMiguel, S. Li, and M. Pelger (2023). Asset-pricing factors with economic targets. Working paper.
- Bryzgalova, S., S. Lerner, M. Lettau, and M. Pelger (2025). Missing financial data. Review of Financial Studies 38(3), 803–882.
- Giglio, S. and D. Xiu (2021). Asset pricing with omitted factors. Journal of Political Economy 129(7), 1947–1990.
- Kelly, B. T., S. Pruitt, and Y. Su (2019). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics 134(3), 501–524.
- Kozak, S., S. Nagel, and S. Santosh (2018). Interpreting factor models. Journal of Finance 73(3), 1183–1223.
- Lettau, M. and M. Pelger (2020). Factors that fit the time series and cross-section of stock returns. Review of Financial Studies 33(5), 2274–2325.
- Lecture 8
- Athey, S. and G. W. Imbens (2019). Machine learning methods that economists should know about. Annual Review of Economics 11, 685–725.
- Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science 16(3), 199–231.
- Bryzgalova, S., J. Huang, and C. Julliard (2023). Bayesian solutions for the factor zoo: We just ran two quadrillion models. Journal of Finance 78(1), 487–557.
- Hastie, T., A. Montanari, S. Rosset, and R. J. Tibshirani (2022). Surprises in high-dimensional ridgeless least squares interpolation. Annals of Statistics 50(2), 949–986.
- Martin, I. W. R. and S. Nagel (2022). Market efficiency in the age of big data. Journal of Financial Economics 145(1), 154–177.
- Mullainathan, S. and J. Spiess (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives 31(2), 87–106.
- Nagel, S. (2021). Machine Learning in Asset Pricing. Princeton University Press.
- Pelger, M. (2023). Asset pricing and investment with big data. In A. Capponi and C.-A. Lehalle (Eds.), Machine Learning and Data Sciences for Financial Markets: A Guide to Contemporary Practices, Chapter 16. Cambridge University Press.
- Shi, C. (2025). From econometrics to machine learning: Transforming empirical asset pricing. Working paper. (link)
- Chapter 12 of Zhang, M., T. Lu, and C. Shi (2025). Navigating the Factor Zoo: The Science of Quantitative Investing. Routledge.
- Lecture 9
- Avramov, D., S. Cheng, and L. Metzker (2023). Machine learning vs. economic restrictions: Evidence from stock return predictability. Management Science 69(5), 2587–2619.
- Bryzgalova, S., M. Pelger, and J. Zhu (forthcoming). Forest through the trees: Building cross-sections of asset returns. Journal of Finance.
- Chen, L., M. Pelger, and J. Zhu (2024). Deep learning in asset pricing. Management Science 70(2), 714–750.
- Coqueret, G. and T. Guida (2020). Machine Learning for Factor Investing. Chapman and Hall/CRC. (link)
- Gu, S., B. T. Kelly, and D. Xiu (2020). Empirical asset pricing via machine learning. Review of Financial Studies 33(5), 2223–2273.
- Gu, S., B. T. Kelly, and D. Xiu (2021). Autoencoder asset pricing models. Journal of Econometrics 222(1), 429–450.
- Kelly, B. T., S. Pruitt, and Y. Su (2019). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics 134(3), 501–524.
- Kelly, B. T. and D. Xiu (2023). Financial machine learning. Foundations and Trends in Finance 13(3-4), 205–363.
- Pedersen, L. H. (2022). Machine learning in asset pricing (Big Data Asset Pricing Lecture 5). (link)
- Chapter 6.8 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 10
- Asness, C. S., S. Chandra, A. Ilmanen, and R. Israel (2017). Contrarian factor timing is deceptively difficult. Journal of Portfolio Management 43(5), 72–87.
- Bender, J., X. Sun, R. Thomas, and V. Zdorovtsov (2018). The promises and pitfalls of factor timing. Journal of Portfolio Management 44(4), 79–92.
- Blitz, D. (2025). Caveats of simple factor timing strategies. Working paper. (link)
- Haddad, V., S. Kozak, and S. Santosh (2020). Factor timing. Review of Financial Studies 33(5), 1980–2018.
- He, W., Z. Su, and J. Yu (2024). Macroeconomic perceptions, financial constraints, and anomalies. Journal of Financial Economics 162, 103952.
- Chapter 7.5 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 11
- Barberis, N. and R. Thaler (2003). Chapter 18 A survey of behavioral finance. In Financial Markets and Asset Pricing, Volume 1 of Handbook of the Economics of Finance, pp. 1053–1128. Elsevier.
- Barberis, N., L. J. Jin, and B. Wang (2021). Prospect theory and stock market anomalies. Journal of Finance 76(5), 2639–2687.
- Barberis, N., A. Mukherjee, and B. Wang (2016). Prospect theory and stock returns: An empirical test. Review of Financial Studies 29(11), 3068–3107.
- Kahneman, D. and A. Tversky (1979). Prospect theory: An analysis of decision under risk. Econometrica 47(2), 263–292.
- Lian, X. and C. Shi (2021). A composite four-factor model in China. Working paper. (link)
- Shi, C. (2025). Behavioral finance and empirical asset pricing. Working paper. (link)
- Chapters 6.3-6.5 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 12
- Bybee, L., B. T. Kelly, A. Manela, and D. Xiu (2024). Business news and business cycles. Journal of Finance 79(5), 3105–3147.
- Bybee, L., B. T. Kelly, and Y. Su (2023). Narrative asset pricing: Interpretable systematic risk factors from news text. Review of Financial Studies 36(12), 4759–4787.
- Dessaint, O., T. Foucault, and L. Frésard (2024). Does alternative data improve financial forecasting? The horizon effect. Journal of Finance 79(3), 2237–2287.
- Lee, C. M. C., S. T. Sun, R. Wang, and R. Zhang (2019). Technological links and predictable returns. Journal of Financial Economics 132(3), 76–96.
- Loughran, T. and B. McDonald (2020). Textual analysis in finance. Annual Review of Financial Economics 12, 357–375.
- Luo, R., C. Shi, S. Zhao, Q. Wu, and Q. Geng (2025). Technological momentum in China: Large language model meets simple classifications. Working paper. (link)
- 王闻, 孙佰清 (2022). 另类数据:理论与实践. 世界图书出版公司.
- 孙佰清, 王闻 (2022). 另类数据:投资新动力. 世界图书出版公司.
- Chapter 7.8 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 13
- De Nard, G., O. Ledoit, and M. Wolf (2021). Factor models for portfolio selection in large dimensions: The good, the better and the ugly. Journal of Financial Econometrics 19(2), 236–257.
- James, W. and C. Stein (1961). Estimation with quadratic loss. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1. Oakland, CA, USA: University of California Press, pp. 361–380.
- Ledoit, O. and M. Wolf (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. Journal of Empirical Finance 10(5), 603–621.
- Ledoit, O. and M. Wolf (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis 88(2), 365–411.
- Ledoit, O. and M. Wolf (2015). Spectrum estimation: A unified framework for covariance matrix estimation and PCA in large dimensions. Journal of Multivariate Analysis 139, 360–384.
- Ledoit, O. and M. Wolf (2017). Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets Goldilocks. Review of Financial Studies 30(12), 4349–4388.
- Ledoit, O. and M. Wolf (2020). Analytical nonlinear shrinkage of large-dimensional covariance matrices. Annals of Statistics 48(5), 3043–3065.
- Ledoit, O. and M. Wolf (2022). The power of (non-)linear shrinkage: A review and guide to covariance matrix estimation. Journal of Financial Econometrics 20(1), 187–218.
- Chapter 7.2 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
- Lecture 14
- Arnott, R. D., C. R. Harvey, V. Kalesnik, and J. T. Linnainmaa (2021). Reports of value’s death may be greatly exaggerated. Financial Analysts Journal 77(1), 44–67.
- Blitz, D. (2021). The quant crisis of 2018–2020: Cornered by big growth. Journal of Portfolio Management 47(6), 8–21.
- Israel, R., K. Laursen, and S. Richardson (2021). Is (systematic) value investing dead? Journal of Portfolio Management 47(2), 38–62.
- Chapter 6.6 of 石川, 刘洋溢, 连详斌 (2020). 因子投资:方法与实践. 电子工业出版社.
Recommended Talks / Lectures
- Machine Learning and Asset Pricing:
- Bryzgalova, S. Bayesian solutions for the factor zoo: We just ran two quadrillion models. 2020 Virtual Finance Workshop (Discussant: Bryan Kelly). (link)
- Bryzgalova, S. Factor selection and aggregation in asset pricing: When off-the-shelf machine learning is not enough. 2023 Machine Learning e Inteligência Artificial no Mercado Financeiro Workshop. (link)
- Bryzgalova, S. Missing financial data. 2022 ABFR Webinar (Discussant: Guofu Zhou). (link)
- Fan, J. Structural deep learning in conditional asset pricing. 2022 ABFR Webinar (Discussant: Andrew Patton). (link)
- Kelly, B. T. Characteristics are covariances: A unified model of risk and return. 2018 Utah Winter Finance Conference (Discussant: Kent Daniel). (link), (Discussant link)
- Kelly, B. T. Business news and business cycles. 2022 Hoover Institution Workshop On Using Text As Data In Policy Analysis. (link)
- Kelly, B. T. The virtue of complexity in return prediction. 2022 Jacobs Levy Center Frontiers in Quantitative Finance Conference. (link)
- Mullainathan, S. Machine learning and prediction in economics and finance. 2017 AFA Lecture. (link)
- Nagel, S. Market efficiency in the age of big data. 2021 ABFR Seminar (Discussant: Kent Daniel). (link)
- Pelger, M. Deep learning in asset pricing. 2020 Utah Winter Finance Conference (Discussant: Bryan Kelly). (link), (Discussant link)
- Xiu, D. Expected returns and large language models. 2023 GSU-RFS Fintech Conference (Discussant: Serhiy Kozak). (link)
- Econometrics:
- Nagel, S. When do cross-sectional asset pricing factors span the stochastic discount factor? 2022 SoFiE Seminar (Discussant: Stefano Giglio). (link)
- Multiple Hypothesis Testing: