Monday, June 8, 2026

How I Connected Oracle OCI LLM Endpoints to Postman and VS Code

I recently spent some time wiring up Oracle Cloud Infrastructure Generative AI endpoints so I could call models directly with an OCI Generative AI API key, validate everything in Postman, and then take the same setup into Visual Studio Code. The process ended up being a great example of how OCI can fit into modern AI workflows without forcing teams to abandon the tools they already use every day. Oracle’s Generative AI service supports service specific API keys, OpenAI compatible endpoints, and both Chat Completions and Responses APIs, which makes it much easier to integrate enterprise hosted models into familiar developer tooling.

What stood out to me most was how practical the setup became once the moving parts were understood. OCI Generative AI API keys are not the same as standard OCI IAM API keys. They are service generated secrets created specifically for OCI Generative AI, and Oracle documents that either of the two generated secrets can be used interchangeably in the Authorization header when calling supported model endpoints in the same region where the key was created.

Step 1: Creating the OCI Generative AI API Key

The first step was creating an API key inside the OCI Generative AI area of the Oracle Cloud console. Oracle documents these as service specific credentials for model access, and that distinction matters because they are meant for LLM endpoint authentication, not for general tenancy administration. Oracle also notes that the key must be used in the same OCI region as the model endpoint, which is important when you are testing against a region specific inference URL.

Another helpful detail is that OCI gives you two interchangeable secrets per API key. That makes rotation easier because one secret can be regenerated while the other remains active. Oracle explicitly recommends using one of those secrets as the bearer token when calling a supported model endpoint.
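On the client side, having two secrets means a caller can ride through a rotation by trying one and falling back to the other. Here is a minimal sketch of that idea. Everything in it is my own illustration rather than anything Oracle ships: `first_accepted_secret` is a hypothetical helper, `send` is a placeholder for whatever function performs the real request and returns the HTTP status code, and I am assuming a rejected secret comes back as a 401.

```python
from typing import Callable, Sequence, Tuple


def first_accepted_secret(
    secrets: Sequence[str],
    send: Callable[[str], int],
) -> Tuple[str, int]:
    """Try each secret in order and return the first one not rejected.

    `send` is a hypothetical callable that performs the request with the
    given secret as the bearer token and returns the HTTP status code.
    Assumption: a secret that is no longer valid is answered with 401.
    """
    for secret in secrets:
        status = send(secret)
        if status != 401:
            # Any other status (success or a different error) is handed
            # back to the caller; trying another secret would not help.
            return secret, status
    raise RuntimeError("every secret was rejected with 401")
```

Any status other than 401 is returned as-is on purpose, since a missing policy or a wrong region is not something a second secret would fix.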

Step 2: Adding the IAM Policy That Makes the Key Work

Creating the API key was only part of the story. The key also needed permission to call the OCI Generative AI service. Oracle’s documentation explains that a separate IAM policy is required for principals of type `generativeaiapikey`, and this is where a lot of integration attempts can stall if the policy step is skipped.

In my case, I created the policy in the OCI Console under *Identity & Security > Policies* and used the manual editor in the root compartment. For testing, the policy that finally unlocked the flow was:

```text
allow any-user to use generative-ai-family in tenancy where ALL {request.principal.type='generativeaiapikey'}
```

Oracle documents this pattern for broad testing access and also explains that policies can be narrowed later by compartment, model, operation type, or even a specific API key OCID. That is a good way to start wide enough to validate the integration and then tighten the scope after the path is proven.

Step 3: Calling the OCI Endpoint from Postman

Once the key and policy were in place, I moved into Postman. Oracle documents a REST endpoint pattern for Chat Completions using OCI Generative AI API keys:

```text
https://inference.generativeai.<region>.oci.oraclecloud.com/20231130/actions/v1/chat/completions
```

Oracle’s API key documentation also shows that the request should send the API key secret in the `Authorization: Bearer ...` header and include a valid supported model in the request body. Supported models for this API-key-based REST path include xAI Grok and OpenAI GPT OSS, which are the models I focused on, but more models are available in OCI, such as Llama, Cohere, and Gemini.

My working request in Postman ended up looking like this:

```http
POST https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/v1/chat/completions
Authorization: Bearer sk-<your-secret>
Content-Type: application/json
Accept: application/json
```

With a body like this:

```json
{
  "model": "openai.gpt-oss-20b",
  "messages": [
    {
      "role": "user",
      "content": "Say hello in one sentence."
    }
  ]
}
```

After the policy was added, that request returned a successful response. That was the turning point because it confirmed the key, the policy, the endpoint, and the region were all aligned correctly. Oracle’s documentation supports this exact pattern, including the Chat Completions endpoint and the use of a bearer token generated by OCI Generative AI.
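For anyone who would rather script the call than click through Postman, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official Oracle client: the function names `chat_completions_url`, `build_chat_request`, and `send_chat_request` are my own, and the region, model, and headers are simply the ones from the Postman request above.

```python
import json
import urllib.request


def chat_completions_url(region: str) -> str:
    """Fill a region into the Chat Completions endpoint pattern."""
    return (
        f"https://inference.generativeai.{region}.oci.oraclecloud.com"
        "/20231130/actions/v1/chat/completions"
    )


def build_chat_request(
    region: str, secret: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request used in Postman."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        chat_completions_url(region),
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The sk-... secret goes here, not the API key OCID.
            "Authorization": f"Bearer {secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before anything goes over the wire, which is roughly what Postman gives you visually. To run it for real, pass the result of `build_chat_request("us-chicago-1", secret, "openai.gpt-oss-20b", "Say hello in one sentence.")` to `send_chat_request`, with `secret` loaded from somewhere safer than source code.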



What I Learned From the Postman Troubleshooting

The troubleshooting was actually a useful part of the exercise. Early failures came down to a few very specific issues. First, the bearer token has to be the actual `sk-...` secret, not the API key OCID or a console URL. Second, the key has to be used in the same region where it was created. Third, the IAM policy really is required or the request will not succeed even if the secret is correct. Oracle’s docs are clear on all three of those points, and once those pieces clicked, the flow became much more predictable.
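Two of those three failure modes can be caught before a request is ever sent. The sketch below is a hypothetical preflight check I would write for my own scripts, not something from Oracle's tooling. It assumes secrets start with `sk-`, that OCIDs start with `ocid1.`, and that the endpoint host follows the `inference.generativeai.<region>.oci.oraclecloud.com` pattern shown earlier. The IAM policy cannot be verified locally, so it is not covered here.

```python
import re
from typing import List
from urllib.parse import urlparse

# Host pattern taken from the endpoint URL used in this post.
HOST_PATTERN = re.compile(
    r"^inference\.generativeai\.([a-z0-9-]+)\.oci\.oraclecloud\.com$"
)


def preflight(secret: str, endpoint_url: str, key_region: str) -> List[str]:
    """Return a list of likely configuration problems (empty if none found)."""
    problems = []

    # Check 1: the bearer token must be the sk-... secret itself.
    if secret.startswith("ocid1."):
        problems.append("bearer token looks like an OCID, not the sk-... secret")
    elif secret.startswith(("http://", "https://")):
        problems.append("bearer token looks like a URL, not the sk-... secret")
    elif not secret.startswith("sk-"):
        problems.append("bearer token does not start with sk-")

    # Check 2: the endpoint region must match the region of the key.
    host = urlparse(endpoint_url).hostname or ""
    match = HOST_PATTERN.match(host)
    if match is None:
        problems.append(f"unexpected endpoint host: {host!r}")
    elif match.group(1) != key_region:
        problems.append(
            f"endpoint region {match.group(1)!r} does not match "
            f"key region {key_region!r}"
        )

    return problems
```

If this returns an empty list and the request still fails, the missing IAM policy is the next thing I would check.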

That experience also reinforced something broader. OCI is not just exposing models in its own console. Oracle is making the service available in ways that work with the tools many engineering teams already use, which lowers the friction to experiment and integrate. Oracle’s newer OpenAI compatible endpoint documentation makes that intent even more obvious by describing a base endpoint that works with familiar OpenAI style request patterns while still keeping authentication, execution, and resource management inside OCI.

Step 4: Taking the Same OCI Endpoint into VS Code

After confirming the endpoint in Postman, I wanted the same model access inside Visual Studio Code, given its broad use in our organization. That turned out to be possible using VS Code’s bring-your-own-key model support. Microsoft’s documentation explains that VS Code lets you add language models through the model picker and that custom providers can expose one provider with many models. The model metadata includes fields such as `id`, `name`, `maxInputTokens`, `maxOutputTokens`, and capability flags like tool calling and image input.

Using the custom endpoint option in the VS Code model picker opened a configuration file where I defined the OCI model settings. I used a Chat Completions style configuration because I had already validated that path in Postman. The key pieces were the OCI endpoint URL, the OCI API key, and the exact model ID. For example, `openai.gpt-oss-20b` worked well for a first pass because it had already succeeded in Postman, and Oracle lists GPT OSS as a supported model family for API-key-based Chat Completions.

The result was that I could select the OCI backed model in VS Code’s chat experience and use it like any other language model available through the model picker. Microsoft documents that the model picker is how users switch chat models, and Oracle documents that OCI supports both native and OpenAI compatible inference patterns. That combination is what made this work so well.


Why This Matters

To me, the bigger takeaway is not just that I got Postman and VS Code working. It is that OCI can participate in the same AI development workflows people already use for testing, coding, prototyping, and experimentation. Oracle’s documentation highlights support for OpenAI compatible endpoints, supported SDKs, and familiar APIs like Chat Completions and Responses. Microsoft’s documentation shows that VS Code is increasingly flexible about model choice through bring your own key and custom providers. Put those together and you have a path for teams that want enterprise hosted AI models without giving up developer ergonomics.

That opens up a lot of possibilities. It means OCI hosted models can be validated in Postman, consumed by scripts and SDKs, and surfaced directly in the editor where developers work. It also means organizations that are already invested in Oracle can extend that platform into modern AI workflows instead of treating it as something separate. OCI Generative AI is positioned by Oracle as an enterprise scale AI platform that supports hosted models, OpenAI compatible APIs, governance, and agent related features. This kind of integration work shows what that can look like in practice.

Final Thoughts

What started as a simple attempt to call a model with an OCI API key turned into a good exercise in understanding how Oracle has structured access, authorization, and compatibility in its recent product enhancements. The final setup was straightforward once the pieces were in place: create the API key, grant the IAM policy, validate the endpoint in Postman, and then carry the same working endpoint into VS Code. Oracle’s API key model, policy framework, and OpenAI-compatible options make that path realistic, and it is a strong example of OCI being useful well beyond the console itself.

If you are working in OCI and want to make enterprise hosted LLMs available in tools your team already trusts, this is absolutely worth trying.