This story about backlash to an earth-science-specific chatbot at EGU details a lot of insider politicking that seems only obliquely related to the concerns over the use of Large Language Models for scientific research and writing. I don’t understand the context of all this politicking well enough to comment, beyond saying that if the questionable use of large amounts of copyrighted material to train an LLM is not the most ethically questionable thing being discussed, you’re probably not doing well.
Unfortunately, all the drama does seem to have muddied the serious discussions that need to be had about whether we really want ChatGPT and its cousins inserted into the research pipeline. My specific issues with using chatbot-style interfaces in research are three-fold:
- Utility: I am not particularly excited by summarization because, regardless of whether the information provided is cited or uncited, I could never trust the summary to be accurate (this is a fundamental consequence of how LLMs work, and don’t let anyone tell you otherwise). I would need to check everything in the summary, just as I would for a student paper or thesis. In that case the annoying extra work is justified because you are training someone; in this case, it would just be annoying.
- Misuse: of course, if you don’t care about accuracy (or are driven by the perverse incentives of the modern academy to deprioritize it), having readily available tools that can write large parts of your papers for you is an invitation to flood the world with a tide of pointless sludge. Writing scientific papers is a creative activity, because you are trying to say something new about the world. This is a difficult and frustrating (if ultimately very satisfying) process for scientists; it is impossible for an LLM. Generative AI can reproduce the form of a scientific paper, but it cannot generate the new insights that are supposed to be at its heart.¹
- Environmental Impact: Earth scientists in particular need to consider the disproportionate consumption of energy and water required to train and deploy the LLMs that return this sludge of questionable accuracy.
To be clear, I don’t think all machine learning/AI is useless. It’s the generic “one chatbot to rule them all” approach that I think is flawed, and a corrupting influence besides. It seems to me that AI tools that help us interpret data, or find relevant literature, should be geared more toward giving us prods and signposts than toward just doing our jobs for us.
¹ The misuse of generative AI by professors is far more problematic than misuse by students, despite much more hand-wringing over the latter. Students have been rapidly throwing together random phrases without understanding what any of it means – and earning mediocre grades for it – since time immemorial.
Nice plan for content warnings on Mastodon and the Fediverse. Now you need a Mastodon/Fediverse button on this blog.