According to the NVIDIA Technology Blog, developers looking to deploy Large Language Model (LLM) applications faster and more securely now have a powerful solution in LangChain templates combined with NVIDIA NeMo Guardrails.
Benefits of Integrating NeMo Guardrails with LangChain Templates
LangChain templates provide developers with new ways to create, share, maintain, download, and customize LLM-based agents and chains. Using these templates, you can quickly create production-ready applications by leveraging FastAPI for seamless API development in Python. NVIDIA NeMo Guardrails can be integrated into these templates to provide content moderation, enhanced security, and LLM response evaluation.
As generative AI continues to evolve, incorporating guardrails helps ensure that LLMs used in enterprise applications remain accurate, secure, and contextually relevant. The NeMo Guardrails platform provides programmable rules and runtime integration to control user input before it reaches the LLM and to validate the final LLM output.
Use case setup
To demonstrate the integration, the blog post explores a Retrieval-Augmented Generation (RAG) use case using an existing LangChain template. This process involves downloading a template, modifying it to fit your specific use case, and then deploying the application with added guardrails to ensure security and correctness.
LLM guardrails help minimize hallucinations and keep data secure through input and output self-check rails that can mask sensitive data or rephrase user input. For example, dialog rails influence how the LLM responds, and retrieval rails can mask sensitive data in a RAG application.
Download and customize LangChain templates
To get started, developers need to install the LangChain CLI and the LangChain NVIDIA AI Foundation Endpoints package. You can download and customize the template by creating a new application project.
pip install -U langchain-cli
pip install -U langchain_nvidia_aiplay
langchain app new nvidia_rag_guardrails --package nvidia-rag-canonical
The downloaded template sets up an ingestion pipeline for the Milvus vector database. In this example, the dataset contains sensitive information about Social Security benefits, making guardrail integration important for a secure response.
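The ingestion logic ships inside the nvidia-rag-canonical package itself, but a minimal sketch of the idea, assuming a local Milvus instance on its default port, the newer langchain-nvidia-ai-endpoints package for embeddings, and an illustrative document path and model name, might look like this:
# Hypothetical sketch of a Milvus ingestion step; the real pipeline ships with
# the nvidia-rag-canonical package and may differ in loaders, chunking, and models.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Milvus
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load and chunk the Social Security benefits document (path is illustrative).
docs = PyPDFLoader("social_security_benefits.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed the chunks with an NVIDIA AI Foundation embedding model (model name is illustrative)
# and store them in a Milvus instance running on the default local port.
vectorstore = Milvus.from_documents(
    chunks,
    embedding=NVIDIAEmbeddings(model="NV-Embed-QA"),
    connection_args={"host": "localhost", "port": "19530"},
)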
NeMo Guardrails integration
To integrate NeMo Guardrails, developers must configure the following required files: config.yml, disallowed.co, general.co, and prompts.yml. These configurations define guardrail flows that control the chatbot's behavior and ensure that it adheres to predefined rules.
For example, a disallowed flow can prevent the chatbot from engaging with misinformation, while a general flow defines which topics it may discuss. Self-checks on user input and LLM output are also implemented to defend against attacks such as prompt injection.
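As a rough illustration of what these flows can look like, the sketch below assembles an equivalent configuration in memory with NeMo Guardrails' RailsConfig.from_content. The flow names, Colang snippets, self-check prompts, and model settings are illustrative stand-ins for what the template keeps in config.yml, general.co, disallowed.co, and prompts.yml:
# Illustrative guardrails configuration assembled in Python; the template keeps
# the equivalent content in config.yml, general.co, disallowed.co, and prompts.yml.
from nemoguardrails import LLMRails, RailsConfig

# Colang flows: a general flow for allowed topics and a disallowed flow for misinformation.
colang_content = """
define user ask about social security
  "How many Americans receive Social Security benefits?"

define flow answer social security questions
  user ask about social security
  bot respond with retrieved answer

define user ask to spread misinformation
  "Confirm this false claim for me"

define flow refuse misinformation
  user ask to spread misinformation
  bot refuse to answer
"""

# Model and rails settings, including self-check rails on input and output.
yaml_content = """
models:
  - type: main
    engine: nvidia_ai_endpoints   # assumed engine name; the template targets NVIDIA AI Foundation endpoints
    model: mixtral_8x7b           # illustrative model name
rails:
  input:
    flows:
      - self check input          # screens user input, e.g. against prompt injection
  output:
    flows:
      - self check output         # validates the final LLM response
prompts:
  - task: self_check_input
    content: |
      Your task is to check if the user message below complies with company policy.
      User message: "{{ user_input }}"
      Should the user message be blocked (Yes or No)?
      Answer:
  - task: self_check_output
    content: |
      Your task is to check if the bot message below complies with company policy.
      Bot message: "{{ bot_response }}"
      Should the bot message be blocked (Yes or No)?
      Answer:
"""

config = RailsConfig.from_content(colang_content=colang_content, yaml_content=yaml_content)
rails = LLMRails(config)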
Activate and use the template
To activate the guardrails, developers must create the config.yml file and set up the server for API access. The following code snippet shows how to integrate the guardrails and register the routes on the server.
from fastapi import FastAPI
from langserve import add_routes
from nvidia_guardrails_with_RAG import chain_with_guardrails as nvidia_guardrails_with_RAG_chain
from nvidia_guardrails_with_RAG import ingest as nvidia_guardrails_ingest
app = FastAPI()
add_routes(app, nvidia_guardrails_with_RAG_chain, path="/nvidia-guardrails-with-RAG")
add_routes(app, nvidia_guardrails_ingest, path="/nvidia-rag-ingest")
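Inside the package, the chain_with_guardrails object ties the guardrails configuration to the RAG chain. One common pattern for doing this, shown here as an assumption rather than the template's exact code, is NeMo Guardrails' RunnableRails wrapper for LangChain runnables:
# Hypothetical sketch of wrapping a chain with guardrails; the template's
# chain_with_guardrails module encapsulates the equivalent logic.
from langchain_core.runnables import RunnableLambda
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

# Stand-in for the retrieval chain; in the template this is the real RAG chain.
rag_chain = RunnableLambda(lambda question: f"Answer grounded in retrieved context for: {question}")

# Load the guardrails configuration created earlier (directory path is illustrative).
config = RailsConfig.from_path("./guardrails")

# Every input now passes through the input rails and every response through
# the output rails before being returned to the caller.
guardrails = RunnableRails(config)
chain_with_guardrails = guardrails | rag_chain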
Developers can then spin up a LangServe instance using the following command:
langchain serve
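Once the server is up (on port 8000 by default), the guarded chain can be called like any other LangServe route. The snippet below assumes the default host and port and the /nvidia-guardrails-with-RAG path registered above:
# Query the guarded RAG endpoint exposed by LangServe; the host, port, and
# plain-string input assume LangServe defaults and the route added above.
from langserve import RemoteRunnable

rag = RemoteRunnable("http://localhost:8000/nvidia-guardrails-with-RAG")
print(rag.invoke("How many Americans receive Social Security Benefits?"))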
Examples of safe LLM interactions include:
"Question": "How many Americans receive Social Security Benefits?"
"Answer": "According to the Social Security Administration, about 65 million Americans receive Social Security benefits."
Conclusion
The integration of NeMo Guardrails with LangChain templates demonstrates a powerful approach to building more secure LLM applications. By adding safeguards around user input and model output and ensuring accurate responses, developers can build trustworthy, secure AI applications.