Post History

Q&A How shall I refer to the documents and the context in the prompt when using the Azure RAG-QA framework?

0 answers · posted 2mo ago by Franck Dernoncourt · edited 2mo ago by Franck Dernoncourt

Question azure azure-cognitive-services

#2: Post edited by

Franck Dernoncourt · 2025-03-21T23:27:32Z (2 months ago)

I use [Azure OpenAI RAG-QA](https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython-new&pivots=programming-language-python) (aka "bring your own data"):
![Image_alt_text](https://software.codidact.com/uploads/0qykzvpi46ez9bf3puy2npc35wc1)
which I call via e.g.:
```
import os
import pprint

from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider

endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")

# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": f"{search_endpoint}",
                "index_name": "[redacted]",
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": f"{search_key}"
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
pprint.pprint(completion)
```
It retrieves 10 documents (let's call that the context), then uses them to answer the question in the prompt (`"content": "How can I sort a Python list?"` in the example), following the usual RAG-QA pattern. I'd like the prompt to refer to the context, e.g.:
- "don't add any info not explicitly written in the context"
- "don't use more than 2 documents from the context"
- "copy-paste as much as possible from the context and write as few new words as possible"
But how am I supposed to refer to the documents and the context in the prompt? What's the proper term that the LLM understands (which partly/mostly depends on how the context is given to the LLM by that Azure OpenAI RAG-QA framework)?
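
For illustration only, here is one way such instructions could be wired into the request. It is a sketch, not a confirmed answer: it reuses the `role_information` field that already appears in the call above, and it assumes the phrase "retrieved documents" is a term the model will map to the context, which has not been verified. The helper name `build_data_source` and the placeholder endpoint, key, index, and deployment values are hypothetical.

```python
BASE_ROLE = "You are an AI assistant that helps people find information."


def build_data_source(search_endpoint, search_key, index_name,
                      embedding_deployment, instructions, top_n_documents=10):
    """Return one entry for extra_body["data_sources"] (hypothetical helper).

    `instructions` is a list of sentences appended to the base role text.
    """
    role_information = " ".join([BASE_ROLE, *instructions])
    return {
        "type": "azure_search",
        "parameters": {
            "endpoint": search_endpoint,
            "index_name": index_name,
            "semantic_configuration": "default",
            "query_type": "vector_semantic_hybrid",
            "fields_mapping": {},
            "in_scope": True,
            "role_information": role_information,
            "filter": None,
            "strictness": 5,
            "top_n_documents": top_n_documents,
            "authentication": {"type": "api_key", "key": search_key},
            "embedding_dependency": {
                "type": "deployment_name",
                "deployment_name": embedding_deployment,
            },
        },
    }


# Assumed wording: "retrieved documents" is a guess at a term the model understands.
instructions = [
    "Do not add any information that is not explicitly written in the retrieved documents.",
    "Do not use more than 2 of the retrieved documents.",
    "Copy as much wording as possible from the retrieved documents.",
]

data_source = build_data_source(
    search_endpoint="https://example.search.windows.net",  # placeholder
    search_key="example-key",                               # placeholder
    index_name="example-index",                             # placeholder
    embedding_deployment="example-embedding",               # placeholder
    instructions=instructions,
)
print(data_source["parameters"]["role_information"])
```

The result would then be passed as `extra_body={"data_sources": [data_source]}` in the `client.chat.completions.create(...)` call shown earlier. Comparing answers with and without the extra sentences is one way to check whether the chosen wording has any effect.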
----
Crossposted at:
- https://genai.stackexchange.com/q/1952/109
- https://redd.it/1e6vxqf

#1: Initial revision by

Franck Dernoncourt · 2025-03-21T03:51:20Z (2 months ago)

How shall I refer to the documents and the context in the prompt when using the Azure RAG-QA framework?

I use [Azure OpenAI RAG-QA](https://learn.microsoft.com/en-us/azure/ai-services/openai/use-your-data-quickstart?tabs=command-line%2Cpython-new&pivots=programming-language-python) (aka "bring your own data"):

[![enter image description here][1]][1]

  [1]: https://i.sstatic.net/Z4vVUxXm.png

which I call via e.g.:

```
import os
import pprint

from openai import AzureOpenAI
#from azure.identity import DefaultAzureCredential, get_bearer_token_provider

endpoint = os.getenv("ENDPOINT_URL", "https://[redacted].openai.azure.com/")
deployment = os.getenv("DEPLOYMENT_NAME", "[redacted GPT engine name]")
search_endpoint = os.getenv("SEARCH_ENDPOINT", "https://[redacted].search.windows.net")
search_key = os.getenv("SEARCH_KEY", "[redacted key]")
search_index = os.getenv("SEARCH_INDEX_NAME", "[redacted]")

# token_provider = get_bearer_token_provider(
#     DefaultAzureCredential(),
#     "https://cognitiveservices.azure.com/.default")

client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version="2024-05-01-preview",
    api_key='[redacted key]'
)
# azure_ad_token_provider=token_provider,

completion = client.chat.completions.create(
    model=deployment,
    messages=[
        {
            "role": "user",
            "content": "How can I sort a Python list?"
        }],
    max_tokens=800,
    temperature=0,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    stream=False,
    extra_body={
        "data_sources": [{
            "type": "azure_search",
            "parameters": {
                "endpoint": f"{search_endpoint}",
                "index_name": "[redacted]",
                "semantic_configuration": "default",
                "query_type": "vector_semantic_hybrid",
                "fields_mapping": {},
                "in_scope": True,
                "role_information": "You are an AI assistant that helps people find information.",
                "filter": None,
                "strictness": 5,
                "top_n_documents": 10,
                "authentication": {
                    "type": "api_key",
                    "key": f"{search_key}"
                },
                "embedding_dependency": {
                    "type": "deployment_name",
                    "deployment_name": "[redacted]"
                }
            }
        }]
    }
)
pprint.pprint(completion)
```

It retrieves 10 documents (let's call that the context), then uses them to answer the question in the prompt (`"content": "How can I sort a Python list?"` in the example), following the usual RAG-QA pattern. I'd like the prompt to refer to the context, e.g.:

- "don't add any info not explicitly written in the context"
- "don't use more than 2 documents from the context"
- "copy-paste as much as possible from the context and write as few new words as possible"

But how am I supposed to refer to the documents and the context in the prompt? What's the proper term that the LLM understands (which partly/mostly depends on how the context is given to the LLM by that Azure OpenAI RAG-QA framework)?
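
Since the right term depends on how the framework hands the context to the LLM, it may help to look at what comes back with the answer. The sketch below assumes that the response, once converted to a plain dict (for example with `completion.model_dump()`), exposes the retrieved chunks under `choices[0].message.context.citations`, each with `title` and `content` fields. That layout is an assumption about this API version and should be checked against a real response; the helper name `list_citations` and the sample dict are hypothetical.

```python
def list_citations(response: dict) -> list:
    """Return the citation dicts attached to the first choice, or [] if none.

    Assumes the layout choices[0].message.context.citations (unverified).
    """
    message = response["choices"][0]["message"]
    context = message.get("context") or {}
    return context.get("citations") or []


# Hand-written stand-in for a real response, used only to exercise the helper.
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "Use list.sort() or sorted().",
            "context": {
                "citations": [
                    {"title": "Sorting HOW TO", "content": "Python lists have a built-in list.sort() method."},
                    {"title": "Built-in functions", "content": "sorted() returns a new sorted list."},
                ]
            },
        }
    }]
}

for number, citation in enumerate(list_citations(sample_response), start=1):
    # Print a short preview of each retrieved chunk.
    print(number, citation.get("title"), "|", citation.get("content", "")[:60])
```

If the real response has this shape, counting the returned citations would also show whether an instruction such as "don't use more than 2 documents" changed anything.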

azure azure-cognitive-services