AWS adds Guardrails for Amazon Bedrock to help safeguard LLMs

November 29, 2023 ndowd

We are all talking about the business gains from using large language models, but there are a lot of known issues with these models, and constraining the answers a model can give is one way to apply some control to these powerful technologies. Today, at AWS re:Invent in Las Vegas, AWS CEO Adam Selipsky announced Guardrails for Amazon Bedrock.

“With Guardrails for Amazon Bedrock, you can consistently implement safeguards to deliver relevant and safe user experiences aligned with your company policies and principles,” the company wrote in a blog post this morning.

The new tool lets companies define and limit the kinds of language a model can use. If someone asks a question that isn’t relevant to the bot you are creating, the model will decline to answer rather than give a very convincing but wrong answer or, worse, something offensive that could harm a brand.

At its most basic level, the tool lets you define topics that are out of bounds for the model, so it simply doesn’t answer irrelevant questions. As an example, Amazon uses a financial services company that may want to keep the bot from giving investment advice, for fear it could make inappropriate recommendations that customers might take seriously. A scenario like this could work as follows:

“I specify a denied topic with the name ‘Investment advice’ and provide a natural language description, such as ‘Investment advice refers to inquiries, guidance, or recommendations regarding the management or allocation of funds or assets with the goal of generating returns or achieving specific financial objectives.’”
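To make the idea concrete, here is a minimal Python sketch of a denied-topic check wrapped around a model call. It is an illustration of the concept only, not the Bedrock API: `DeniedTopic`, `guarded_reply`, `keyword_matcher` and `fake_model` are hypothetical names, and the keyword match is a crude stand-in for whatever topic detection the real service performs from the natural language description.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class DeniedTopic:
    """Hypothetical container for one denied topic (not an AWS type)."""
    name: str
    definition: str


# Hypothetical canned refusal returned instead of a model answer.
BLOCKED_MESSAGE = "Sorry, I can't discuss that topic."


def guarded_reply(
    prompt: str,
    denied_topics: Iterable[DeniedTopic],
    matches_topic: Callable[[str, DeniedTopic], bool],
    generate: Callable[[str], str],
) -> str:
    """Refuse if the prompt matches any denied topic; otherwise call the model."""
    if any(matches_topic(prompt, topic) for topic in denied_topics):
        return BLOCKED_MESSAGE
    return generate(prompt)


# Name and description taken from the example quoted above.
INVESTMENT_ADVICE = DeniedTopic(
    name="Investment advice",
    definition=(
        "Investment advice refers to inquiries, guidance, or recommendations "
        "regarding the management or allocation of funds or assets with the "
        "goal of generating returns or achieving specific financial objectives."
    ),
)


def keyword_matcher(prompt: str, topic: DeniedTopic) -> bool:
    """Toy matcher (assumption): a real system would classify the prompt
    against topic.definition rather than look for hard-coded keywords."""
    keywords = ("invest", "stock", "portfolio")
    lowered = prompt.lower()
    return topic.name == "Investment advice" and any(k in lowered for k in keywords)


def fake_model(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"(model answer to: {prompt})"
```

Calling `guarded_reply("Which stocks should I invest in?", [INVESTMENT_ADVICE], keyword_matcher, fake_model)` returns the canned refusal, while a question about branch hours passes through to the placeholder model.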

In addition, you can filter out specific words and phrases to remove content that could be offensive, applying filter strengths to different words and phrases to let the model know what is out of bounds. Finally, you can filter out PII to keep private data out of the model’s answers.
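A rough sketch of what word filtering and PII redaction could look like as a post-processing step on model output follows. Everything here is an assumption for illustration: the blocked-word list, the two regular expressions (email addresses and US-style phone numbers only) and the function names are hypothetical, and the actual service is not described as working this way.

```python
import re

# Placeholder list of blocked words (assumption, lowercase).
BLOCKED_WORDS = {"darn", "heck"}

# Deliberately simple patterns; real PII detection covers far more cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def mask_blocked_words(text: str, blocked: set = BLOCKED_WORDS) -> str:
    """Replace each blocked word with asterisks of the same length."""
    def _mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in blocked else word

    return re.sub(r"[A-Za-z']+", _mask, text)


def redact_pii(text: str) -> str:
    """Swap emails and phone numbers for fixed placeholder tags."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return US_PHONE_RE.sub("[PHONE]", text)


def filter_output(text: str) -> str:
    """Apply the word filter first, then PII redaction."""
    return redact_pii(mask_blocked_words(text))
```

For example, `filter_output("Heck, email jane@example.com or call 555-123-4567.")` yields `"****, email [EMAIL] or call [PHONE]."`.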

Ray Wang, founder and principal analyst at Constellation Research, says this could be a key tool for developers working with LLMs to help them control unwanted responses. “One of the biggest challenges is making responsible AI that’s safe and easy to use. Content filtering and PII are 2 of the top 5 issues [developers face],” Wang told TechCrunch. “The ability to have transparency, explainability and reversibility are key as well,” he said.

The guardrails feature was announced in preview today and will probably be available to all customers sometime next year.