Demian Panigo, Adán Mauri Ungaro, Nicolás Monzón, Alexis Tcach, Juan Menduiña, Pablo Gluzmann, Alejo, Nahuel Panigo

07/28/2023, 4:00 PM - 4:10 PM UTC

32-124

Economic research depends heavily on the economist’s ability to efficiently identify the information relevant to causal inference and forecast accuracy. We address this goal in our Julia-based ParallelGSReg project, which develops several econometrics and machine learning packages. At JuliaCon 2023, we will present an improved version of our dimensionality reduction package (including non-linear algorithms) and a new "research acceleration package" with automated LaTeX code generation and AI-based bibliographic features.

In their recent volume “Econometrics with Machine Learning”, Chan & Mátyás (2022) remind us of the well-established distinction in which Econometrics and Machine Learning are perceived as alternative methodological cultures: the former focused on explanation (causal inference, hypothesis testing, coefficient robustness) and the latter on prediction (model selection, sampling properties, accuracy metrics). Moving away from this false dichotomy, we introduce ParallelGSReg (https://github.com/ParallelGSReg): a Julia research project with several packages (GlobalSearchRegression.jl, GlobalSearchRegressionGUI.jl and ModelSelection.jl) aimed at 1) building bridges between those complementary cultures; and 2) encouraging economic researchers to use Julia to improve computational efficiency in model selection tasks (particularly those using dimensionality reduction techniques with causal-inference requirements).

At JuliaCon 2018, the focus was on “efficiency”. We presented the world’s fastest all-subset-regression command (GlobalSearchRegression.jl, which runs up to 3165 times faster than the original Stata code and up to 197 times faster than well-known R alternatives; see https://github.com/ParallelGSReg/JuliaCon2019/blob/master/GlobalSearchRegression.jl-paper.pdf). In 2019, the goal was “ease of use”, for which we improved our graphical user interface (GlobalSearchRegressionGUI.jl) and developed a basic package (ModelSelection.jl) to automate the Julia-to-LaTeX migration of dimensionality reduction results (it also includes all GlobalSearchRegression.jl functions and additional features such as regularization and cross-validation). For JuliaCon 2023 the target is “scope and integration”, for which we are:

- updating all packages (removing compatibility issues with the newest Julia versions);
- improving ModelSelection.jl with: (a) new classification algorithms (logistic, probit, etc.) for the regularization and all-subset-regression functions; (b) additional tests for causal inference (unit root tests); (c) extended cross-validation capabilities (to handle the re-sampling requirements of panel data and time-series databases); and (d) higher computational efficiency, reducing the Time-to-First-Result (TTFR) by focusing on statistical functions (moving the Julia-to-LaTeX migration capabilities to a complementary package);
- developing ResearchAccelerator.jl, a new package with: (a) extended Julia-to-LaTeX migration functions that work as an “automatic research assistant”: using ModelSelection.jl results, it generates a LaTeX document with relevant tables, graphics, and metrics; and (b) AI integration for references and literature review. Using user-provided keywords or phrases, ResearchAccelerator.jl will query Google Scholar to obtain a potentially relevant bibliography. The subset of entries with available abstracts, references, and keywords will then be used to build citation networks and keyword/citation statistics. Finally, a machine learning system based on modern NLP models will use the articles’ abstracts to generate a similarity network, giving users additional information for a deeper search of the related literature. This network will be exported to the LaTeX document as a table and a figure, and to a standard output file that can be viewed with graph plotting and analysis tools such as Gephi.
- including a JuliaCall for Stata integration, which allows all packages in the ParallelGSReg project to be used in batch mode through Stata’s gsreg.ado package. This feature is intended to give change-averse economic researchers the simplest way to verify the substantial runtime reduction they can obtain by progressively switching to Julia.

We will introduce all these contributions (including some new benchmark figures) in the first 5 minutes of our lightning talk. A live hands-on example will then be developed in 3 minutes, leaving the last 2 minutes for audience questions.
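
For readers unfamiliar with all-subset regression, the technique behind GlobalSearchRegression.jl, the following is a minimal plain-Julia sketch of the idea. It is not the package's API or implementation: the function names `ols_rss` and `best_subset`, the use of BIC as the selection criterion, and the synthetic data are all assumptions made for illustration.

```julia
using LinearAlgebra

# Fit OLS with an intercept on the chosen columns; return the residual sum of squares.
function ols_rss(X::AbstractMatrix, y::AbstractVector, cols::Vector{Int})
    Z = hcat(ones(length(y)), X[:, cols])   # design matrix: intercept + selected covariates
    beta = Z \ y                            # least-squares solution
    resid = y - Z * beta
    return dot(resid, resid)
end

# Enumerate every non-empty subset of the k candidate covariates with bit masks
# and keep the subset with the lowest BIC (hypothetical criterion for this sketch).
function best_subset(X::AbstractMatrix, y::AbstractVector)
    n, k = size(X)
    best_cols, best_bic = Int[], Inf
    for mask in 1:(2^k - 1)
        cols = [j for j in 1:k if (mask >> (j - 1)) & 1 == 1]
        rss = ols_rss(X, y, cols)
        p = length(cols) + 1                # estimated coefficients, intercept included
        bic = n * log(rss / n) + p * log(n)
        if bic < best_bic
            best_cols, best_bic = cols, bic
        end
    end
    return best_cols, best_bic
end

# Synthetic, deterministic example: y depends on the first two covariates only.
n = 50
t = collect(1:n)
X = hcat(sin.(t), cos.(0.5 .* t), (t .% 7) ./ 7)
y = 2 .+ 3 .* X[:, 1] .- 1.5 .* X[:, 2] .+ 0.1 .* sin.(17 .* t)

cols, bic = best_subset(X, y)
println("selected covariates: ", cols, "  BIC: ", round(bic, digits = 2))
```

The loop fits 2^k - 1 models, so the cost doubles with each added covariate. That exponential growth is why runtime and parallelism dominate in this class of algorithms.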