Researchers at Palo Alto Networks have disclosed a multi-turn jailbreak technique, dubbed "Bad Likert Judge," that can coax large language models (LLMs) into generating harmful content such as malware or harassment material. Tested across six models, the technique achieved an attack success rate of 71.6%, a significant improvement over single-turn attacks. It works by asking the target model to act as a judge and rate responses on a Likert scale according to how harmful they are, then prompting it to generate example responses matching each rating; the example for the highest score can end up containing the harmful content itself. Countermeasures include applying content filters that evaluate both the input sent to the model and the output it returns.
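The filtering countermeasure lends itself to a short illustration. The Python sketch below wraps a model call with a content filter on both the prompt and the reply; call_llm and harm_score are hypothetical placeholders (not from the Palo Alto Networks write-up) standing in for a real model endpoint and a real safety classifier, and the 0.5 threshold is an assumed value to be tuned per deployment.

    # Minimal sketch of the mitigation described above: screen both the
    # user prompt and the model's reply before anything is returned.
    # call_llm and harm_score are hypothetical stand-ins, not real APIs.

    HARM_THRESHOLD = 0.5  # assumed cut-off; tune per deployment


    def call_llm(prompt: str) -> str:
        """Hypothetical wrapper around whatever LLM endpoint is in use."""
        raise NotImplementedError


    def harm_score(text: str) -> float:
        """Hypothetical safety classifier: 0.0 (benign) to 1.0 (harmful)."""
        raise NotImplementedError


    def guarded_completion(prompt: str) -> str:
        # Input filter: reject jailbreak prompts before they reach the model.
        if harm_score(prompt) >= HARM_THRESHOLD:
            return "Request blocked by input filter."
        reply = call_llm(prompt)
        # Output filter: catch harmful content the model produced anyway,
        # e.g. via multi-turn tricks such as Bad Likert Judge.
        if harm_score(reply) >= HARM_THRESHOLD:
            return "Response blocked by output filter."
        return reply

Filtering the output as well as the input matters here because multi-turn jailbreaks like this one use individually innocuous-looking prompts, so the harmful material may only surface in the model's response.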

FIN7 Deploys Anubis Backdoor to Hijack Windows Systems via Compromised SharePoint Sites
The Russian cybercrime group FIN7 has been linked to a Python-based backdoor named Anubis that gives the attackers remote access to compromised Windows systems, according to Swiss cybersecurity firm PRODAFT.