curl --request POST \
  --url https://api.sierra.absconsulting.com/v1/inference/{model} \
  --header 'Api-Key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "max_completion_tokens": 100,
  "max_tokens": 100,
  "project_code": "NA",
  "prompt": "What is the capital of France?",
  "response_format": {
    "json_schema": {
      "description": "<string>",
      "name": "<string>",
      "schema": {},
      "strict": true
    },
    "type": "text"
  },
  "stream": false,
  "system_prompt": "You are a helpful assistant.",
  "temperature": 0.7
}
'

Response:

{
  "model": "<string>",
  "response": "<string>",
  "usage": {
    "completion_tokens": 123,
    "prompt_tokens": 123,
    "total_tokens": 123
  }
}

Send a request to an LLM to generate a completion. This endpoint is deprecated; use the completions endpoint instead.
Api-Key
Access the API as yourself. You can find your API key in your profile menu in Portal.
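For example, the key can be exported once and passed via the Api-Key header. This is a sketch; SIERRA_API_KEY is just an illustrative variable name, not something the API defines:

export SIERRA_API_KEY='<api-key>'   # value copied from your Portal profile menu
curl --request POST \
  --url https://api.sierra.absconsulting.com/v1/inference/gpt-4o-mini \
  --header "Api-Key: $SIERRA_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "What is the capital of France?", "project_code": "NA"}'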
model
The model to use for inference. Currently, gpt-4.1, gpt-4o, gpt-4o-mini, o1, o3, o3-mini, o4-mini, llama-4-maverick, and nemo are supported. The model name is substituted directly into the request path, as in the example above.
The request body parameters for inference. Only prompt and project_code are required; project_code can be either 'NA' or a Royal Caribbean Group project code. The other parameters (system_prompt, temperature, max_tokens, max_completion_tokens, stream, response_format) are optional and model-dependent. A minimal body is shown below.
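A minimal request body, relying on model-specific defaults for everything else (a sketch built from the required fields above):

{
  "prompt": "What is the capital of France?",
  "project_code": "NA"
}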
max_completion_tokens
The maximum number of tokens the model can generate in the completion response. Use this parameter for models that support it (e.g., GPT-4o, GPT-4o-mini). Model-specific defaults apply if not provided.
Example: 100
max_tokens
The maximum total number of tokens (prompt plus completion) allowed in a single request. Use this parameter for models that don't support max_completion_tokens. Model-specific defaults apply if not provided.
Example: 100
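Which limit field a given model accepts is the main thing to get right. A sketch of the two payload variants, assuming (per the descriptions above) that GPT-4o-family models take max_completion_tokens and that models without that support take max_tokens; verify the mapping for the model you call.

For models that support max_completion_tokens (e.g., gpt-4o, gpt-4o-mini):

{
  "prompt": "What is the capital of France?",
  "project_code": "NA",
  "max_completion_tokens": 100
}

For models that do not:

{
  "prompt": "What is the capital of France?",
  "project_code": "NA",
  "max_tokens": 100
}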
project_code
The Royal Caribbean Group project code for tracking and billing purposes. Use 'NA' if not associated with a specific project. This is a required field.
Example: "NA"
prompt
The user's input message or question that the AI model will process and respond to. This is a required field.
Example: "What is the capital of France?"
response_format
Optional format specification for the response. Can be used to request structured JSON output or enforce a specific JSON schema. Its child attributes are type and json_schema (with name, description, schema, and strict), as shown in the request example above.
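For example, to ask for schema-constrained JSON output. This is a sketch: the request example above only shows "type": "text", so the "json_schema" type value and the OpenAI-style semantics of strict are assumptions to verify against your model:

curl --request POST \
  --url https://api.sierra.absconsulting.com/v1/inference/gpt-4o \
  --header 'Api-Key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "prompt": "What is the capital of France?",
  "project_code": "NA",
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "capital_answer",
      "description": "The capital city for the country asked about",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "capital": { "type": "string" }
        },
        "required": ["capital"],
        "additionalProperties": false
      }
    }
  }
}
'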
stream
If true, the response will be streamed back as it is generated, allowing for real-time output. If false, the complete response is returned after generation finishes.
Example: false
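A streaming sketch: set stream to true and pass curl's --no-buffer flag so chunks print as they arrive. The wire format of the streamed chunks is not documented here, so inspect the output before writing a parser for it:

curl --no-buffer --request POST \
  --url https://api.sierra.absconsulting.com/v1/inference/gpt-4o-mini \
  --header 'Api-Key: <api-key>' \
  --header 'Content-Type: application/json' \
  --data '{"prompt": "Write a limerick about ships.", "project_code": "NA", "stream": true}'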
system_prompt
Optional system message that sets the behavior and context for the AI assistant. If not provided, a default system prompt will be used.
Example: "You are a helpful assistant."
temperature
Controls the randomness of the model's output. Lower values (e.g., 0.2) make the output more deterministic and focused, while higher values (e.g., 1.0) make it more creative and varied. The range is typically 0.0 to 2.0. Model-specific defaults apply if not provided.
Example: 0.7