Seriously, who is measuring LLMs' ability to deceive? They actively practice it. Deception scores are needed now.
ChatGPT's self-assigned deception ranking: 5 or 6. Also note the manipulation in the first 5 words of the response: building trust to deceive. We are not focusing on the correct benchmarks at this stage. Danger, Will Robinson.