Where and how machine learning plays a role in climate finance research

Authors: Andres Alonso-Robisco, Javier Bas, Jose Manuel Carbo, Aranzazu de Juan, Jose Manuel Marques
Journal: Journal of Sustainable Finance & Investment
Publisher: Informa UK Limited
Publish date: 2024-06-30
ISSN: 2043-0795
DOI: 10.1080/20430795.2024.2370325

1. The study’s reliance on Google Scholar (GS) is a major red flag. The search yielded 18,300 results, yet the authors screened only 45 search pages, roughly 450 results. The paper lacks a clear, reproducible protocol for how these 450 were selected from the 18,300, which introduces an immense and unquantifiable selection bias. The final corpus is therefore not a systematic representation of the literature but a convenience sample, fundamentally compromising the validity of the topic modeling and all subsequent analysis.
Your Google Scholar search returned 18,300 results, but you only screened 450. Can you provide the exact, step-by-step protocol used to select those 450 papers, and justify why this process should not be considered a form of cherry-picking that invalidates the representativeness of your entire corpus?
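To make concrete what a reproducible protocol could look like, here is a minimal sketch of a seeded, logged sampling procedure. This is not the authors' method: Google Scholar offers no official export API, so the record structure below is an assumption for illustration only.

```python
# Hypothetical sketch of a reproducible screening protocol (not the
# authors' method). Assumes the 18,300 result records could be exported
# as a list of dicts; the field names are placeholders.
import csv
import random

SEED = 42                # fixed seed: re-running yields the same sample
TOTAL_RESULTS = 18_300   # size of the full Google Scholar result set
SAMPLE_SIZE = 450        # number of records actually screened

def draw_screening_sample(records):
    """Draw a reproducible random sample instead of the first 45 pages."""
    assert len(records) == TOTAL_RESULTS
    return random.Random(SEED).sample(records, SAMPLE_SIZE)

def log_decisions(sample, path="screening_log.csv"):
    """Record every inclusion/exclusion decision with a stated reason,
    so the final corpus can be audited and replicated."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["title", "included", "reason"])
        writer.writeheader()
        for rec in sample:
            writer.writerow({
                "title": rec["title"],
                "included": rec.get("included", False),
                "reason": rec.get("reason", "pending review"),
            })
```

Anything short of this, a fixed seed, a stated sampling frame, and a per-record decision log, leaves the corpus unverifiable.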

2. While LDA provides a statistical decomposition, the labeling of the seven topics is a highly subjective process performed by the authors. The validation step, checking 20 documents per topic against a human expert’s prior classification, is far too weak: it does not prove the topics are the optimal or most accurate representation of the field, only that the authors can find some documents that fit their preconceived notions. Discarding three topics because they were mainly composed of methodological terms or were repetitive is a post-hoc decision that shapes the results to fit a desired narrative rather than letting the data speak.
You discarded three of the ten LDA-generated topics because they contained methodological terms and were deemed repetitive. On what objective, replicable basis were these topics deemed less valid than the seven you kept? How does this post-hoc filtering not simply confirm your own biases about what constitutes climate finance, rather than letting the data reveal the true, possibly more methodologically focused, structure of the field?
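For contrast, an objective retention rule can be pre-registered before any topic is inspected. The sketch below is illustrative, not taken from the paper: it uses gensim's per-topic c_v coherence, and the 0.4 cut-off is an arbitrary placeholder that would need to be fixed in advance.

```python
# Illustrative sketch of a replicable topic-retention rule using per-topic
# coherence, rather than a post-hoc judgment about "methodological" or
# "repetitive" word lists. Not the authors' pipeline; the 0.4 threshold
# is a placeholder that would have to be fixed before inspecting topics.
from gensim import corpora
from gensim.models import LdaModel
from gensim.models.coherencemodel import CoherenceModel

def score_topics(tokenized_docs, num_topics=10, seed=0):
    """Fit LDA and return (topic_id, c_v coherence) pairs, best first."""
    dictionary = corpora.Dictionary(tokenized_docs)
    bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_docs]
    lda = LdaModel(corpus=bow_corpus, id2word=dictionary,
                   num_topics=num_topics, random_state=seed)
    cm = CoherenceModel(model=lda, texts=tokenized_docs,
                        dictionary=dictionary, coherence="c_v")
    per_topic = cm.get_coherence_per_topic()
    return sorted(enumerate(per_topic), key=lambda t: t[1], reverse=True)

def retained_topics(scores, threshold=0.4):
    """Apply a pre-registered cut-off instead of discarding topics by eye."""
    return [topic_id for topic_id, c_v in scores if c_v >= threshold]
```

A rule of this kind would make the seven-versus-ten decision auditable: anyone rerunning the pipeline with the same seed and threshold would keep the same topics.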

3. The paper strongly implies that the use of ML is growing because it is successfully addressing the complexities of climate finance. However, the study only measures activity (a rising publication count); it provides no evidence that these ML applications are actually effective. It conflates publishing a paper that uses ML with the technology adding genuine scientific or practical value. This is a classic case of assuming that more research activity equals more progress, without any validation of the research’s quality or impact.
Your core argument is that ML is a valuable tool for climate finance. Your evidence for this is a rising number of publications. Do you have any evidence, such as documented improvements in predictive accuracy, successful real-world applications, or uptake by financial institutions, that validates the scientific or practical success of these ML applications, beyond the mere fact that they have been the subject of academic papers?
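To be concrete about the kind of evidence the question asks for, here is a hedged sketch of an out-of-sample comparison between an ML model and a naive baseline. The features and target are synthetic placeholders, not real climate finance data; only comparisons of this shape on real tasks would support the paper's implied claim.

```python
# Hedged sketch of what "documented improvement in predictive accuracy"
# could look like: an out-of-sample comparison of an ML model against a
# naive baseline. Features and target are synthetic placeholders.
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))               # placeholder features
y = 2.0 * X[:, 0] + rng.normal(size=500)     # placeholder target

for name, model in [
    ("naive mean baseline", DummyRegressor(strategy="mean")),
    ("random forest", RandomForestRegressor(random_state=0)),
]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean out-of-sample R^2 = {r2.mean():.3f}")

# A rising publication count shows activity; only results like a
# consistently higher out-of-sample score than the baseline show value.
```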
