Can AI shortcut your scientific literature search? We asked ChatGPT

The rapid development of AI is shaking up how we think about the future of scientific research. From predictions that it will push the boundaries of human intelligence to concerns it will increase plagiarism and dilute research skills, there’s little doubt that AI is emerging as one of the most powerful agents of change in higher education.

In the context of research skills, we are now bombarded with messages that AI tools can make the process faster, easier, effortless even. People are increasingly asking whether AI can indeed deliver on this promise and ease the burden of discovery in their scientific literature searches.

Here at IFIS Publishing, we wanted to know too… So we did what everyone else is doing. We asked ChatGPT.

For those of you who have been living under a rock, ChatGPT is the chatbot released by OpenAI in November 2022. Built on OpenAI’s foundational large language models (LLMs), ChatGPT has redefined the standards of artificial intelligence. It is an AI that speaks our language and can ‘learn’ the complexities of human interaction.

We asked ChatGPT: “Can AI tools like yourself deliver the same quality scientific literature search result as subject-specific abstract and indexing databases such as our Food Science and Technology Abstracts (FSTA)?”

ChatGPT explains: “AI programs like myself can be a valuable tool in assisting with scientific literature searches, including those in food science.”

However, it continues: “It's important to understand that AI and subject-specific databases serve different purposes and have distinct strengths and limitations.”

The chatbot then generated a comparison between using AI and a subject-specific database like FSTA.

FSTA versus AI: Will you get reliable results?

In AI’s favour, ChatGPT noted that artificial intelligence algorithms have access to ‘a vast amount of general knowledge’. But does that really help you when you are conducting research in a specialism like food science?

An algorithm is only as good as the data it has been trained on. Biases in the data will result in biases in the results returned. If AI tools scrape publicly available information from the internet, for instance, the results they return risk including predatory content.

This matters because predatory publications operate to generate revenue rather than disseminate credible and quality research. Prioritising profit over academic integrity, predatory publishers engage in deceptive practices such as misleading peer review processes, fast-track publication without proper scrutiny, and minimal editorial oversight. And while not everything published in a predatory journal will be junk science, some of it may be.


It is difficult to know exactly how many predatory journals are active. One study concluded that 8,000 predatory journals collectively publish 420,000 papers every year, nearly a fifth of the scientific community's annual output of 2.5 million papers. This is a big – and growing – problem.
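For readers who want to check the "nearly a fifth" figure, here is a quick sketch of the arithmetic. The two inputs are the numbers quoted from the study above; the variable names are ours, for illustration only.

```python
# Figures quoted above from the study; variable names are illustrative.
predatory_papers_per_year = 420_000
total_papers_per_year = 2_500_000

# Share of annual output published in predatory journals.
share = predatory_papers_per_year / total_papers_per_year

print(f"Predatory share of annual output: {share:.1%}")
```

The share works out at a little under 17%, which is what "nearly a fifth" refers to. Note that this is the share of papers appearing in predatory journals, not the share of papers that are necessarily poor science.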

As the chatbot notes: “The quality and credibility of the information provided by AI models depend on the data they have been trained on. If the training data includes content from predatory journals or unreliable sources, there is a risk that the AI program may produce responses that include such information.”

Do you really want to risk building your research plans on literature search results in which close to a fifth of the papers could come from predatory journals?

In contrast, every one of the scientific journals indexed in FSTA has been vetted to exclude predatory content. Our experts use a comprehensive 60-point checklist to ensure you can trust the results returned to you.

On top of this, every one of the 1,840,000+ articles indexed in FSTA has been relevancy checked and tagged by our team of food scientists to boost discoverability and ensure the search results you get back are relevant and comprehensive.

Our highly popular Literature Searching Best Practice Guide now includes a new chapter on how to effectively and ethically use AI tools for academic research and writing. Take a look!

Natural language processing versus FSTA’s controlled vocabulary

There’s no doubt that natural language processing is a massive leap in AI capabilities. AI programs can process natural language queries, making it easier for users to search for information in everyday language. This makes it possible for inexperienced searchers to generate results using regular words and phrases.

But easier does not always equate to better. As ChatGPT explains, while natural language processing ‘has its merits’, controlled vocabulary indexing ‘offers several advantages over natural language’.

FSTA is built on controlled vocabulary indexing and we have developed the largest and most comprehensive food science thesaurus in the world – one that is regularly updated to reflect the latest developments in the field.

This means our indexers use pre-defined and organised terms to categorise and describe the content of research in a consistent manner. Consistency and precision eliminate ambiguity and make it easier to retrieve relevant information, boosting your discovery process.

While natural language indexing may be more flexible, it can lead to different ways of describing the same concept. In contrast, controlled vocabulary improves recall, ensuring that all relevant documents are retrieved under a specific subject heading.
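To make the recall point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the three records, the tiny synonym table and the function names are invented for illustration and are not FSTA data or FSTA's indexing method. In a real database, indexing terms are assigned by human indexers rather than by the simple text matching used here as a stand-in.

```python
# Hypothetical records: titles and wording are invented for illustration.
RECORDS = [
    {"title": "Paper A", "text": "Vitamin C retention in stored orange juice"},
    {"title": "Paper B", "text": "Degradation of ascorbic acid during pasteurisation"},
    {"title": "Paper C", "text": "Citric acid as an acidulant in soft drinks"},
]

# Hypothetical mini-thesaurus: each wording variant maps to one preferred term.
PREFERRED_TERM = {
    "vitamin c": "ascorbic acid",
    "ascorbic acid": "ascorbic acid",
    "citric acid": "citric acid",
}


def assign_index_terms(text):
    """Stand-in for an indexer: tag a record with preferred terms."""
    lowered = text.lower()
    return {
        preferred
        for variant, preferred in PREFERRED_TERM.items()
        if variant in lowered
    }


def free_text_search(query):
    """Match only records that use the searcher's exact wording."""
    q = query.lower()
    return [r["title"] for r in RECORDS if q in r["text"].lower()]


def controlled_search(query):
    """Map the query to its preferred term, then match on index terms."""
    heading = PREFERRED_TERM.get(query.lower(), query.lower())
    return [
        r["title"]
        for r in RECORDS
        if heading in assign_index_terms(r["text"])
    ]


print(free_text_search("vitamin C"))   # ['Paper A']
print(controlled_search("vitamin C"))  # ['Paper A', 'Paper B']
```

In this toy example, the free-text search misses Paper B because its authors wrote "ascorbic acid" rather than "vitamin C", while the controlled search retrieves both papers under a single subject heading.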

Our model also allows researchers to discover related concepts and explore the broader context of their searches. FSTA’s extensive thesaurus provides hierarchical and associative relationships between terms. This empowers users to browse and navigate information more effectively and improves discoverability.
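As a rough illustration of hierarchical and associative relationships, the sketch below models a thesaurus as a Python dictionary. The terms and their relationships are made up for the example and are not taken from the FSTA thesaurus; the function names are likewise our own.

```python
# Hypothetical thesaurus entries: invented for illustration only.
THESAURUS = {
    "dairy products": {"narrower": ["cheese", "yoghurt"], "related": ["milk"]},
    "cheese": {"narrower": ["cheddar cheese"], "related": ["cheesemaking"]},
    "yoghurt": {"narrower": [], "related": ["fermented milk"]},
    "cheddar cheese": {"narrower": [], "related": []},
}


def expand_narrower(term):
    """Return the term plus every narrower term beneath it (depth-first)."""
    found = [term]
    for child in THESAURUS.get(term, {}).get("narrower", []):
        found.extend(expand_narrower(child))
    return found


def related_terms(term):
    """Return associatively related terms, if the thesaurus lists any."""
    return THESAURUS.get(term, {}).get("related", [])


print(expand_narrower("dairy products"))
# ['dairy products', 'cheese', 'cheddar cheese', 'yoghurt']
print(related_terms("cheese"))
# ['cheesemaking']
```

A search on a broad heading can be widened to include everything filed beneath it, while related terms suggest neighbouring concepts a researcher might not have thought to search for.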

Here at IFIS Publishing, our team of food science and search experts are confident that our databases offer content and search quality that AI – or generalist databases like Google Scholar for that matter – simply can’t match. But we would say that. What does ChatGPT conclude?

“Despite the benefits, it's important to note that controlled vocabulary indexing also has some limitations.”

Oh?

“Creating and maintaining controlled vocabularies can be time-consuming and may not cover all emerging or specialised topics. Additionally, new terms may emerge that are not yet included in the controlled vocabulary, potentially limiting the discoverability of cutting-edge research.”

Luckily for our database users, staying on top of the specialisms of food and health sciences is our raison d'être. We regularly update our lexicon and review new sources of content to ensure the science going into our databases is both comprehensive and discoverable. When you use FSTA or one of our other food science databases, we take the work out of getting comprehensive results you can trust, giving you confidence in your literature discovery process.
