Post History

Q&A How shall I refer to the documents and the context in the prompt when using the Azure RAG-QA framework?

0 answers  ·  posted 8d ago by Franck Dernoncourt  ·  edited 7d ago by Franck Dernoncourt

#2: Post edited by Franck Dernoncourt · 2025-03-21T23:27:32Z (7 days ago)
I use [Azure OpenAI RAG-QA](https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython-new&pivots=programming-language-python) (aka "bring your own data"):

![Image_alt_text](https://software.codidact.com/uploads/0qykzvpi46ez9bf3puy2npc35wc1)

which I call via e.g.:

```
import os
import pprint

from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider

endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")

# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": f"{search_endpoint}",
                "index_name": "[redacted]",
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": f"{search_key}"
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
pprint.pprint(completion)
```

It retrieves 10 documents (let's call that the context), then uses them to answer the question in the prompt (`"content": "How can I sort a Python list?"` in the example), following the usual RAG-QA pattern. I'd like the prompt to refer to the context, e.g.:

- "don't add any info not explicitly written in the context"
- "don't use more than 2 documents from the context"
- "copy-paste as much as possible from the context and write as few new words as possible"

But how am I supposed to refer to the documents and the context in the prompt? What's the proper term that the LLM understands (which partly/mostly depends on how the context is given to the LLM by that Azure OpenAI RAG-QA framework)?

----

Crossposted at:

- https://genai.stackexchange.com/q/1952/109
- https://redd.it/1e6vxqf
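
Independently of how the prompt is worded, the "don't use more than 2 documents" constraint could also be checked on the client side after the answer comes back. The sketch below assumes the answer text cites its sources with inline markers such as `[doc1]` and `[doc2]`; that marker format is an assumption here, not something the question confirms, and the helper names `cited_documents` and `within_document_budget` are hypothetical.

```python
import re

# Assumption: the answer cites retrieved documents inline as [doc1], [doc2], ...
DOC_MARKER = re.compile(r"\[doc(\d+)\]")


def cited_documents(answer_text):
    """Return the sorted, de-duplicated document numbers cited in the answer."""
    return sorted({int(number) for number in DOC_MARKER.findall(answer_text)})


def within_document_budget(answer_text, max_documents=2):
    """Check whether the answer cites at most `max_documents` distinct documents."""
    return len(cited_documents(answer_text)) <= max_documents


# Hypothetical answer text, written by hand for illustration only.
sample_answer = (
    "Use list.sort() to sort in place [doc1]. "
    "sorted() returns a new list [doc3][doc1]."
)
print(cited_documents(sample_answer))         # [1, 3]
print(within_document_budget(sample_answer))  # True
```

A check like this only detects violations; it does not make the model follow the instruction, so it complements the prompt wording rather than replacing it.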
#1: Initial revision by Franck Dernoncourt · 2025-03-21T03:51:20Z (8 days ago)
How shall I refer to the documents and the context in the prompt when using the Azure RAG-QA framework?
I use [Azure OpenAI RAG-QA](https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython-new&pivots=programming-language-python) (aka "bring your own data"):

[![enter image description here][1]][1]

  [1]: https://i.sstatic.net/Z4vVUxXm.png

which I call via e.g.:

```
import os
import pprint

from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider

endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")

# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": f"{search_endpoint}",
                "index_name": "[redacted]",
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": f"{search_key}"
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
pprint.pprint(completion)
```

It retrieves 10 documents (let's call that the context), then uses them to answer the question in the prompt (`"content": "How can I sort a Python list?"` in the example), following the usual RAG-QA pattern. I'd like the prompt to refer to the context, e.g.:

- "don't add any info not explicitly written in the context"
- "don't use more than 2 documents from the context"
- "copy-paste as much as possible from the context and write as few new words as possible"

But how am I supposed to refer to the documents and the context in the prompt? What's the proper term that the LLM understands (which partly/mostly depends on how the context is given to the LLM by that Azure OpenAI RAG-QA framework)?
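
To make the question concrete, here is a minimal sketch of one place such instructions could go: the `role_information` field already present in the request above. The term "retrieved documents" is only a guess at wording the model might recognize, which is exactly what this question asks about, so the term is kept as a swappable parameter. The names `RULE_TEMPLATES`, `build_role_information` and `with_role_information` are hypothetical helpers, not part of any Azure or OpenAI library.

```python
# Hypothetical rule templates; {ctx} is replaced by whichever term is being tried.
RULE_TEMPLATES = (
    "Do not add any information that is not explicitly written in the {ctx}.",
    "Use at most 2 of the {ctx}.",
    "Copy from the {ctx} wherever possible and write as few new words as possible.",
)


def build_role_information(base_role, context_term="retrieved documents"):
    """Append the grounding rules to the base role text, using one term for the context."""
    rules = [template.format(ctx=context_term) for template in RULE_TEMPLATES]
    return base_role + "\n" + "\n".join("- " + rule for rule in rules)


def with_role_information(parameters, role_information):
    """Return a copy of an `azure_search` parameters dict with `role_information` replaced."""
    updated = dict(parameters)
    updated["role_information"] = role_information
    return updated


base = "You are an AI assistant that helps people find information."
role = build_role_information(base)
# Only two of the original parameters are shown; the rest would be passed unchanged.
params = with_role_information({"in_scope": True, "top_n_documents": 10}, role)
print(params["role_information"])
```

Keeping the term as a parameter makes it easy to try alternatives such as "context" or "sources" and compare the answers, since the wording the framework itself uses is not known here.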