Could LLM AI Technology Be Leveraged in Corpus Linguistic Analysis?
In our previous four posts we’ve argued that LLM AIs should not be in the driver’s seat of ordinary meaning inquiries. In saying so, we don’t deny that AI tools have certain advantages over most current corpus tools: their front-end interfaces are more intuitive to use, and they can process data faster than human coders.
These advantages are double-edged swords, for reasons we discussed yesterday. Without further refinements, the user-friendliness of the interface and the speed of the outputs could cut against the utility of LLM AIs in the empirical inquiry into ordinary meaning, by luring the user into thinking that a sensible-sounding answer generated by an accessible, tech-driven tool must be rooted in empiricism.
That said, we see two means of leveraging LLM AIs’ advantages while minimizing these risks. One is for corpus linguists to learn from the AI world and build the above advantages into the tools of corpus linguistics. Another is for developers of LLM AIs to learn from corpus linguists by building tools that open the door to truly empirical analysis of ordinary language.
Corpus linguistics could take a page from the LLM AI playbook
Corpus linguists could learn from the chatbot interface. The front-end interface of widely used corpora has a number of limitations, chief among them that it is not intuitive, especially for non-linguists. The software requires users to work with search methods and terms that do not come naturally: drop-down buttons that presuppose a technical understanding of how the interface operates, and terminology like collocate, KWIC (keyword in context), and association measure. In sharp contrast, AIs like ChatGPT produce a response to a simple query written in conversational English.
Maybe chatbot technology could be incorporated into corpus software—allowing the use of conversational language in place of buttons and dropdown menus. A step in that direction has been taken in at least one widely used corpus software tool that now allows users to prompt ChatGPT (or another LLM) to perform post-processing on corpus results.
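To make the idea concrete, here is a minimal sketch in Python of what a conversational front end to a corpus search might look like. It is an illustration under stated assumptions, not a description of any existing corpus tool: the three-sentence corpus is a toy, and the names `parse_request` and `kwic` are hypothetical. The `parse_request` function stands in for the chatbot layer (a real system might hand that step to an LLM), while the keyword-in-context lines themselves come straight from the corpus, so the output remains transparent and replicable.

```python
import re

# Toy corpus for illustration only; a real tool would query a large,
# curated corpus rather than three hand-written sentences.
CORPUS = [
    "The bank approved the loan on Monday.",
    "We walked along the river bank at dusk.",
    "She opened an account yesterday.",
]


def parse_request(request):
    """Hypothetical stand-in for the chatbot layer.

    Pulls the quoted search term out of a conversational request.
    A real system might ask an LLM to perform this translation.
    """
    match = re.search(r'"([^"]+)"', request)
    if match is None:
        raise ValueError("no quoted search term found in request")
    return match.group(1).lower()


def kwic(term, corpus, window=3):
    """Return keyword-in-context rows: (left context, keyword, right context)."""
    rows = []
    for sentence in corpus:
        tokens = re.findall(r"\w+", sentence.lower())
        for i, token in enumerate(tokens):
            if token == term:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                rows.append((left, token, right))
    return rows


# The user types plain English; the search itself is ordinary corpus work.
term = parse_request('Show me how "bank" is used')
rows = kwic(term, CORPUS)
for left, keyword, right in rows:
    print(f"{left:>20} | {keyword} | {right}")
```

The design choice worth noting is the division of labor: the conversational layer only translates the request into a search, and every line of output can be traced back to a sentence in the corpus.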