Beyond the Black Box: A Data Science Reading List for Economists
Data science is reshaping how we understand markets, organizations, and economic behavior. As analytical methods evolve—from causal inference to generative modeling and explainable machine learning—so does the need for readings that bridge theory, empirical evidence, and cutting‑edge computational techniques. This curated list brings together six thought‑provoking pieces that span foundational concepts, global risk perspectives, long‑horizon financial insights, and advanced applications of machine learning in economics and corporate strategy.
1. On Causality: A History of How Economics Learned to Think About Cause and Effect (Carlos Chavez, Substack).
2. The future of risk: How global trends are reshaping risk management. A rapidly shifting, interconnected risk landscape, together with technology and AI, is transforming what good risk management looks like. Financial institutions must embrace new operating models and best practices (McKinsey).
3. What Earnings Explain, and What They Don’t: Insights from 150 Years of Market Data (Enterprising Investor).
4. Teaching Economics to the Machines. Structural economic models, while parsimonious and interpretable, often exhibit poor data fit and limited forecasting performance. Machine learning models, by contrast, offer substantial flexibility but are prone to overfitting and weak out-of-distribution generalization. We propose a theory-guided transfer learning framework that integrates structural restrictions from economic theory into machine learning models. The approach pre-trains a neural network on synthetic data generated by a structural model and then fine-tunes it using empirical data, allowing potentially misspecified economic restrictions to inform and regularize learning on empirical data. Applied to option pricing, our model substantially outperforms both structural and purely data-driven benchmarks, with especially large gains in small samples, under unstable market conditions, and when model misspecification is limited. Beyond performance, the framework provides diagnostics for improving structural models and introduces a new model-comparison metric based on data-model complementarity (NBER).
5. Generative economic modeling. We introduce a novel approach for solving quantitative economic models: generative economic modeling. Our method combines neural networks with conventional solution techniques. Specifically, we train neural networks on simplified versions of the economic model to approximate the complete model's dynamic behavior. Relying on these less complex submodels circumvents the curse of dimensionality, allowing the use of well-established numerical methods. We demonstrate our approach across settings with analytical characterizations, nonlinear dynamics, and heterogeneous agents, employing asset pricing and business cycle models. Finally, we solve a high-dimensional HANK model with an occasionally binding financial friction to highlight how aggregate risk amplifies the precautionary motive (BIS).
6. Is CSR a leading indicator of corporate restructuring performance? Evidence from explainable machine learning. Corporate social responsibility (CSR), associated with corporate reputation, has attracted considerable attention from both scholars and practitioners. However, empirical evidence concerning CSR's capacity to predict performance following corporate restructuring remains limited and inconclusive. This paper theoretically examines CSR's influence on post-restructuring performance through a reward-punishment framework and empirically assesses its predictive power using explainable machine learning. Our findings demonstrate that CSR serves as a significant predictor of post-restructuring performance. The analysis reveals a negative relationship, indicating that CSR functions as a restructuring punishment rather than a performance reward. Furthermore, both the predictive strength and the directional nature of this effect depend on the underlying motivations driving CSR engagement. These findings provide valuable insights for managers seeking to strategically align CSR initiatives with restructuring objectives and enhance the governance of restructuring processes (Journal of Management Science and Engineering).
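The pre-train and fine-tune idea described in item 4 can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the paper's implementation: a linear model on polynomial features stands in for the neural network, and `structural_model`, `true_process`, the sample sizes, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x):
    # Polynomial features [1, x, x^2, x^3]; a simple stand-in for a neural network.
    return np.vander(x, 4, increasing=True)

def structural_model(x):
    # Hypothetical "theory": a convex, payoff-like curve (assumed, possibly misspecified).
    return np.maximum(x, 0.0) ** 2

def true_process(x):
    # Hypothetical data-generating process: the theory plus a term the theory misses.
    return structural_model(x) + 0.3 * x

def mse(w, x, y):
    # Mean squared error of the linear-in-features model with weights w.
    return float(np.mean((features(x) @ w - y) ** 2))

# Step 1: pre-train on plentiful synthetic data generated by the structural model.
x_syn = rng.uniform(-1.0, 1.0, 5000)
w_pre, *_ = np.linalg.lstsq(features(x_syn), structural_model(x_syn), rcond=None)

# Step 2: fine-tune on a small, noisy "empirical" sample, starting from w_pre.
x_emp = rng.uniform(-1.0, 1.0, 20)
y_emp = true_process(x_emp) + rng.normal(0.0, 0.05, 20)

def fine_tune(w0, x, y, steps=200):
    # Plain gradient descent on the MSE, warm-started at the pre-trained weights.
    X = features(x)
    hessian = 2.0 * X.T @ X / len(y)
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()  # step size that guarantees descent
    w = w0.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

w_ft = fine_tune(w_pre, x_emp, y_emp)

print("empirical MSE, pre-trained only:", mse(w_pre, x_emp, y_emp))
print("empirical MSE, after fine-tuning:", mse(w_ft, x_emp, y_emp))
```

Starting the second stage from `w_pre` rather than from zero is what lets the theory act as a regularizer: with a limited number of gradient steps, the fine-tuned weights stay close to the structural solution unless the empirical data pull them away.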
Taken together, these readings show that modern data science is not just about better prediction, but about building models that can reason about cause and effect, adapt to uncertainty, and inform high-stakes decisions in markets and organizations. Whether you are curious about how economists think about causality, how risk management is evolving, how neural networks can be combined with structural models, or how explainable machine learning can be applied to CSR data, this list offers a set of starting points to explore the future of data-driven economics.


