# vLLM

> Integrate vLLM-hosted custom models with Portkey for production observability and reliability.

Portkey lets you observe, govern, and manage custom models that you host **locally** or **privately** with vLLM.

<Info>
  Here's a [list](https://docs.vllm.ai/en/latest/models/supported_models.html) of all model architectures supported on vLLM.
</Info>

## Integration Steps

<Steps>
  <Step title="Expose your vLLM Server">
    Expose your vLLM server through a tunneling service such as [ngrok](https://ngrok.com/), or make it publicly accessible by other means. Skip this step if you're self-hosting the Gateway.

    ```sh theme={null}
    ngrok http 8000 --host-header="localhost:8000"
    ```
  </Step>

  <Step title="Add to Model Catalog">
    1. Go to [**Model Catalog → Add Provider**](https://app.portkey.ai/model-catalog/providers)
    2. Enable the **"Local/Privately hosted provider"** toggle
    3. Select **OpenAI** as the provider type (vLLM follows the OpenAI API schema)
    4. Enter your vLLM server URL in **Custom Host**: `https://your-vllm-server.ngrok-free.app`
    5. Add authentication headers if needed
    6. Name your provider (e.g., `my-vllm`)

    <Card title="Complete Setup Guide" icon="book" href="/product/model-catalog">
      See all setup options
    </Card>
  </Step>

  <Step title="Use in Your Application">
    <CodeGroup>
      ```python Python theme={null}
      from portkey_ai import Portkey

      portkey = Portkey(
          api_key="PORTKEY_API_KEY",
          provider="@my-vllm"
      )

      response = portkey.chat.completions.create(
          model="your-model-name",
          messages=[{"role": "user", "content": "Hello!"}]
      )

      print(response.choices[0].message.content)
      ```

      ```javascript Node.js theme={null}
      import Portkey from 'portkey-ai';

      const portkey = new Portkey({
          apiKey: 'PORTKEY_API_KEY',
          provider: '@my-vllm'
      });

      const response = await portkey.chat.completions.create({
          model: 'your-model-name',
          messages: [{ role: 'user', content: 'Hello!' }]
      });

      console.log(response.choices[0].message.content);
      ```
    </CodeGroup>

    **Or use a custom host directly:**

    <CodeGroup>
      ```python Python theme={null}
      from portkey_ai import Portkey

      portkey = Portkey(
          api_key="PORTKEY_API_KEY",
          provider="openai",
          custom_host="https://your-vllm-server.ngrok-free.app",
          Authorization="AUTH_KEY"  # If needed
      )
      ```

      ```javascript Node.js theme={null}
      import Portkey from 'portkey-ai';

      const portkey = new Portkey({
          apiKey: 'PORTKEY_API_KEY',
          provider: 'openai',
          customHost: 'https://your-vllm-server.ngrok-free.app',
          Authorization: 'AUTH_KEY'  // If needed
      });
      ```
    </CodeGroup>
  </Step>
</Steps>
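
The custom-host snippets above only construct the client. As a minimal sketch of a complete call, the Python below builds the same client arguments from environment variables and adds `Authorization` only when a key is set. The variable names `VLLM_CUSTOM_HOST` and `VLLM_AUTH_KEY` and the helpers `custom_host_kwargs` and `ask` are illustrative assumptions, not part of the Portkey SDK.

```python
import os


def custom_host_kwargs(env=os.environ) -> dict:
    """Collect Portkey client arguments for a custom vLLM host.

    The environment variable names are hypothetical; rename them to
    match your own deployment.
    """
    kwargs = {
        "api_key": env["PORTKEY_API_KEY"],
        "provider": "openai",  # vLLM follows the OpenAI API schema
        "custom_host": env["VLLM_CUSTOM_HOST"],
    }
    auth_key = env.get("VLLM_AUTH_KEY")
    if auth_key:
        # Only pass Authorization when the vLLM server requires a key.
        kwargs["Authorization"] = auth_key
    return kwargs


def ask(prompt: str, model: str, env=os.environ) -> str:
    """Send one chat message through Portkey to the custom host."""
    # Imported here so custom_host_kwargs has no SDK dependency.
    from portkey_ai import Portkey

    portkey = Portkey(**custom_host_kwargs(env))
    response = portkey.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

With the variables exported, `ask("Hello!", "your-model-name")` should return the model's reply as a string, mirroring the request shown for the Model Catalog provider.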

<Note>
  **Important:** vLLM follows the OpenAI API specification, so set the provider to `openai` when using a custom host directly. By default, vLLM runs on `http://localhost:8000/v1`.
</Note>
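
Because vLLM follows the OpenAI API specification, one quick way to confirm that a server is reachable before registering it is to list the models it serves. The sketch below uses only the Python standard library and assumes the server exposes the OpenAI-style `/v1/models` route; the helper names `models_url`, `extract_model_ids`, and `list_model_ids` are illustrative, not part of any SDK.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Return the OpenAI-style models endpoint for a server base URL.

    Accepts the base with or without a trailing slash or "/v1" suffix.
    """
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base + "/models"


def extract_model_ids(payload: dict) -> list[str]:
    """Pull the model IDs out of an OpenAI-style list response."""
    return [item["id"] for item in payload.get("data", [])]


def list_model_ids(base_url: str, timeout: float = 5.0) -> list[str]:
    """Query the server and return the model IDs it reports."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return extract_model_ids(json.load(resp))
```

Calling `list_model_ids("http://localhost:8000")` against a running server should return the IDs you can pass as `model` in the requests above. Pointing the same helper at your tunnel URL is a simple way to check that the server is reachable from outside your machine.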

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Gateway Configs" icon="sliders" href="/product/ai-gateway">
    Add retries, timeouts, and fallbacks
  </Card>

  <Card title="Observability" icon="chart-line" href="/product/observability">
    Monitor your vLLM deployments
  </Card>

  <Card title="Custom Host Guide" icon="server" href="/product/ai-gateway/universal-api#integrating-local-or-private-models">
    Learn more about custom host setup
  </Card>

  <Card title="BYOLLM Guide" icon="book" href="/integrations/llms/byollm">
    Complete guide for private LLMs
  </Card>
</CardGroup>

For complete SDK documentation:

<Card title="SDK Reference" icon="code" href="/api-reference/sdk/list">
  Complete Portkey SDK documentation
</Card>
