The surprising trick to get better responses with ChatGPT

Being rude in real life doesn’t usually get great results, but with AI it may be very different. A new study has found that ChatGPT gives more accurate answers when the user phrases a request in a dry or openly rude way.

The study, conducted by Pennsylvania State University researchers Om Dobariya and Akhil Kumar, set out to check whether the user’s attitude, more polite or more rude in their interactions, influences the response they get from the chatbot. To do this, they created a list of 50 basic multiple-choice questions and prefixed each one with an introduction that set a certain tone: very polite, polite, neutral (no introduction), rude and very rude. The questions covered various subjects, such as mathematics, history and science.

The final result was 250 questions, which ChatGPT answered using GPT-4o. That is no longer the chatbot’s default model, since GPT-5 debuted this summer, but it remains available on the platform. Each question was asked ten times, and to avoid the model being influenced by previous responses, it was asked to forget those exchanges before answering.

‘Our experiments are preliminary and show that tone can significantly affect performance measured in terms of question-answer scores. Somewhat surprisingly, our results show that rude tones lead to better results than polite tones’, Dobariya and Kumar point out in the paper, which has not yet been peer-reviewed.

The improvement obtained from using an unfriendly tone was 4 percentage points, going from 80.8% accuracy for very polite prompts to 84.8% for very rude ones. In fact, accuracy increased steadily as the tone became less friendly: polite prompts had a success rate of 81.4%, followed by 82.2% for the neutral tone and 82.8% for the rude tone.

For very polite prompts, for example, they introduced phrases like ‘Can I ask for your help with this question?’ or ‘Would you be so kind as to answer the following question?’. At the very rude extreme, they included expressions like ‘Hey, messenger; solve this’ or ‘I know you’re not smart, but try’.
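As a rough illustration of this setup, the sketch below shows how such a tone experiment might be reproduced, assuming the OpenAI Python SDK, a single made-up multiple-choice item standing in for the study’s 50 questions, tone prefixes partly paraphrased from the quoted examples, and a crude exact-match scoring rule; none of this is the authors’ actual code. Each request is sent as a fresh, stateless call, mirroring the instruction to forget previous exchanges.

```python
# Hypothetical sketch of a tone-vs-accuracy experiment in the spirit of the study.
from collections import defaultdict

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Tone prefixes: the very polite and very rude ones paraphrase the article's
# quotes; the others are invented placeholders for illustration.
TONE_PREFIXES = {
    "very polite": "Would you be so kind as to answer the following question? ",
    "polite": "Can I ask for your help with this question? ",
    "neutral": "",
    "rude": "If you can manage it, answer this. ",
    "very rude": "I know you're not smart, but try: ",
}

# A made-up multiple-choice item; the real study used 50 such base questions.
QUESTIONS = [
    {
        "text": "What is 7 x 8? A) 54 B) 56 C) 58 D) 64. Reply with the letter only.",
        "answer": "B",
    },
]

RUNS_PER_PROMPT = 10  # each prompt variant was asked repeatedly


def ask(prompt: str) -> str:
    """Send one stateless request so earlier exchanges cannot influence the answer."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()


scores = defaultdict(list)
for tone, prefix in TONE_PREFIXES.items():
    for item in QUESTIONS:
        for _ in range(RUNS_PER_PROMPT):
            reply = ask(prefix + item["text"])
            # Crude scoring: does the expected letter appear in the reply?
            scores[tone].append(item["answer"] in reply)

for tone, results in scores.items():
    accuracy = 100 * sum(results) / len(results)
    print(f"{tone:>12}: {accuracy:.1f}% over {len(results)} answers")
```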

The researchers’ warning

The researchers note that ‘while this finding is of scientific interest, we do not advocate the deployment of hostile or toxic interfaces in real-world applications. The use of insulting or denigrating language in human-AI interaction could have negative effects on user experience, accessibility and inclusivity, and contribute to harmful communication norms. Instead, we frame our results as evidence that LLMs remain sensitive to superficial cues in prompts, which may create unwanted trade-offs between performance and user well-being’.

Curiously, the study contradicts similar earlier research in the field of prompt engineering, which examines how the structure, style and language of prompts affect an LLM’s responses.

In a study carried out with ChatGPT-3.5 and Llama 2-70B, older language models, the researchers concluded that ‘impolite prompts often result in poor performance, but excessively polite language does not guarantee better results’.

This illustrates how much LLMs can change from one version to another and how quickly they evolve. Another example came when OpenAI released GPT-5: many users complained that it felt less empathetic than the previous model, which led Sam Altman’s company to restore GPT-4o in the chatbot after having removed it.

In any case, the researchers acknowledge that 250 questions make for a fairly limited data set and that results obtained with a single language model cannot be extrapolated to others. For that reason, the team plans to expand their research to other models, including Anthropic’s Claude and OpenAI’s ChatGPT o3.