Microsoft Rolls Out New Safety Features Including LLM-Powered Tools For Vulnerability Detection

Microsoft seems to be pulling out all the stops to make sure it indulges in responsible use of AI technology.

The company mentioned how the team was busy working on a series of safety features that AI would power and hence would help in detecting vulnerabilities.

The goal is to ensure the team has its fair share of safety features that keep Azure clients protected at all times. Moreover, Microsoft mentioned how such tools could better monitor things like hallucinations that remain unsupported as we speak.

There will also be attempts to prevent any kind of malicious prompts for AI clients that work with the company’s respective models.

Microsoft says it’s well aware of how clients do not have the right technical skills to carry out injection attacks via prompts or those related to hate content seen online. Therefore, the evaluation system is now able to generate those to see how such attacks can be simulated.

In this manner, customers would be able to benefit from scores that assist in seeing what the outcome could appear like.

This would assist in preventing generative AI controversies that arise due to responses that are undesirable and unintended. Such examples would entail recent ones like the promotion of fake celebs and misleading images of prominent historical figures and beyond.

So far, we’re hearing discussions surrounding three leading features. That entails Prompt Shields that prevent the spread of prompt injections or fake prompts arising from outside documents. Secondly, there’s Groundedness Detection that searches for and bars hallucinations. Lastly, there are safety evaluations that look for the model’s vulnerabilities and therefore can now be viewed across the Azure AI platform.

We’re hearing about more features coming soon including those who direct models to a safer output and end up tracking prompts that flag users as problematic which could come soon.

For now, we’re not quite sure when and if users are typing prompts or if any kind of processing is involved as far as third parties are concerned.

But with the help of this latest technology, we’ll witness how monitoring systems can better evaluate all sorts of triggers and banned terminology that is provided to see if the model could provide a response.

Once this key step is done, the system would look at replies produced by models and would ensure that models ended up hallucinating data not found in the prompt. In such cases where Google’s Gemini is present, you get to see filters that reduce bias which might have unintended responses.

This is a certain area where the company feels Azure’s AI features would assist with curated control. Shortly, there was some concern about how perhaps Microsoft would be judging and allowing customized control to take place in terms of what’s right or wrong for the range of AI models.

But that’s now taken care of well. Azure users would have greater control by taking advantage of filters that rid hate speech which models directly witness and end up blocking. Also, plans include Azure clients getting reports about those involved in causing nothing but mischief online such as triggering unsafe outputs.

Bird has mentioned time and time again how the goal of the team seems to be helping users navigate their way through AI safer and free from all sorts of malicious content seen online.

The latest safety features are proof of just that and more, not to mention how they continue to be linked to ChatGPT-4 and other leading AI models popular today like Llama 2. But the fact that we’re seeing more popularity linked to this domain rise with time, makes sense as to why experts are worried right now and therefore are pulling out all the stops to have built-in safety features.

And this is not something new. The software giant has been working long and hard regarding beefing up both security as well as safety of software. And that’s very true as more and more clients show interest in the Azure program.

Image: DIW-Aigen

Read next: Meta And Google Accused Of Limiting Reproductive Health Ads In Certain Regions To Promote Misinformation
Previous Post Next Post