Accelerating Economic Research with Julia

Abstract:

Economic research depends heavily on the economist’s ability to efficiently identify the information that matters for causal inference and forecast accuracy. We address this goal in ParallelGSReg, our Julia project, by developing several packages that combine econometrics and machine learning. At JuliaCon 2023, we will present an improved version of our dimensionality reduction package (now including non-linear algorithms) and a new "research acceleration package" with automated LaTeX code generation and AI-assisted bibliographic features.

Description:

In their recent volume “Econometrics with Machine Learning”, Chan & Mátyás (2022) remind us of the well-established distinction in which Econometrics and Machine Learning are perceived as alternative methodological cultures: the latter focused on prediction (model selection, sampling properties, accuracy metrics) and the former on explanation (causal inference, hypothesis testing, coefficient robustness). Moving away from this false dichotomy, we introduce ParallelGSReg (https://github.com/ParallelGSReg): a Julia research project with several packages (GlobalSearchRegression.jl, GlobalSearchRegressionGUI.jl and ModelSelection.jl) aimed at 1) building bridges between these complementary cultures; and 2) encouraging economic researchers to use Julia to improve computational efficiency in model selection tasks (particularly those that combine dimensionality reduction techniques with causal-inference requirements).

At JuliaCon 2018, the focus was on “efficiency”. We presented the world’s fastest all-subset-regression command (GlobalSearchRegression.jl, which runs up to 3165 times faster than the original Stata code and up to 197 times faster than well-known R alternatives; see https://github.com/ParallelGSReg/JuliaCon2019/blob/master/GlobalSearchRegression.jl-paper.pdf). In 2019, the goal was “ease of use”, for which we improved our graphical user interface (GlobalSearchRegressionGUI.jl) and developed a basic package (ModelSelection.jl) to automate the Julia-to-LaTeX migration of dimensionality reduction results (it also includes all GlobalSearchRegression.jl functions and additional features such as regularization and cross-validation).

For JuliaCon 2023 the target is “scope and integration”, for which we are:

  1. updating all packages (removing compatibility issues with the newest Julia versions);
  2. improving ModelSelection.jl with: 2.a) new classification algorithms (logistic, probit, etc.) for the regularization and all-subset-regression functions; 2.b) additional tests for causal inference (unit root tests); 2.c) extended cross-validation capabilities (to handle the re-sampling requirements of panel and time-series data); and 2.d) higher computational efficiency, reducing the Time-to-First-Result (TTFR) by keeping only statistical functions in the package (the Julia-to-LaTeX migration capabilities move to a complementary package);
  3. developing ResearchAccelerator.jl, a new package with: 3.a) extended Julia-to-LaTeX migration functions that work as an “automatic research assistant”: using ModelSelection.jl results, it generates a LaTeX document with the relevant tables, graphics, and metrics; and 3.b) AI integration for references and literature review. Using user-provided keywords or phrases, ResearchAccelerator.jl will query Google Scholar to obtain a potentially relevant bibliography. The subset of entries with available abstracts, references, and keywords will then be used to build citation networks and keyword and citation statistics. Finally, a machine learning system based on modern NLP models will use the articles’ abstracts to generate a similarity network, giving users additional information for a deeper search of the related literature. This network will be exported to the LaTeX document as a table and a figure, and to a standard output file that can be viewed with graph plotting and analysis tools such as Gephi;
  4. including a JuliaCall for Stata integration, which allows all packages in our ParallelGSReg project to be used in batch mode through Stata’s gsreg.ado package. This feature is meant to give change-averse economic researchers the simplest way to verify the substantial runtime reduction they can obtain by progressively switching to Julia.

We will introduce all these contributions (including some new benchmark figures) in the first 5 minutes of our Lightning talk. A live hands-on example will then take 3 minutes, leaving the last 2 minutes for audience questions.
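To make the all-subset-regression idea behind these packages concrete, the following is a minimal sketch in Python (standard library only). It is not the GlobalSearchRegression.jl implementation or its API: the function names (`solve_linear_system`, `ols_fit`, `all_subset_regression`), the toy data, and the choice of a Gaussian AIC as the ranking criterion are all illustrative assumptions. The sketch fits ordinary least squares on every non-empty subset of candidate covariates and ranks the resulting models.

```python
import itertools
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no collinear covariates).
    """
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols_fit(columns, y):
    """Regress y on an intercept plus the given columns; return (coefs, rss)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)]
           for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(design, y)) for p in range(k)]
    coefs = solve_linear_system(xtx, xty)
    rss = sum((yi - sum(c * v for c, v in zip(coefs, row))) ** 2
              for row, yi in zip(design, y))
    return coefs, rss


def all_subset_regression(data, y):
    """Fit every non-empty subset of covariates and rank the fits by AIC."""
    n = len(y)
    names = sorted(data)
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            coefs, rss = ols_fit([data[name] for name in subset], y)
            # Gaussian AIC up to an additive constant; the floor avoids log(0).
            aic = n * math.log(max(rss, 1e-12) / n) + 2 * len(coefs)
            results.append({"covariates": subset, "coefs": coefs,
                            "rss": rss, "aic": aic})
    return sorted(results, key=lambda r: r["aic"])


# Hypothetical toy data: y depends on x1 and x2 only; x3 is irrelevant.
n_obs = 12
x1 = [float(i) for i in range(n_obs)]
x2 = [float((i * i) % 7) for i in range(n_obs)]
x3 = [float((3 * i) % 5) for i in range(n_obs)]
y = [1.0 + 2.0 * a - 3.0 * b + 0.1 * (-1) ** i
     for i, (a, b) in enumerate(zip(x1, x2))]

data = {"x1": x1, "x2": x2, "x3": x3}
ranked = all_subset_regression(data, y)
best = ranked[0]
print(best["covariates"], round(best["aic"], 2))
```

With p candidate covariates the search fits 2^p - 1 models, so the workload doubles with each added covariate. This is why runtime and parallel execution matter so much for this kind of model selection.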
