Add default model for the nvidia classes in local NIM mode #14652

raspawar · 2024-07-09T09:39:25Z

Description

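Provide a default model for local NIM: if no model is given, the first available model is used, e.g.
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

# No model argument: the first model served by the local NIM becomes the default.
Settings.llm = NVIDIA(base_url="http://localhost:1234/v1")
query_engine = index.as_query_engine(similarity_top_k=20)  # `index` is built elsewhere
response = query_engine.query(
    "How many new housing units were built in San Francisco in 2021?"
)
# llm._client.model => default model => first available model in local NIM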
Fixes # (issue)
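
A minimal sketch of the fallback described above, not the code in this PR. The function name `resolve_default_model` and the `list_models` callable are hypothetical; `list_models` stands in for whatever call returns the model ids served by the local NIM.

```python
import warnings
from typing import Callable, List, Optional


def resolve_default_model(
    model: Optional[str],
    list_models: Callable[[], List[str]],
) -> str:
    """Return `model` if given, otherwise the first model the server reports.

    `list_models` is a hypothetical callable that returns the ids of the
    models available on the local endpoint.
    """
    if model:
        return model
    available = list_models()
    if not available:
        raise ValueError("No model was provided and the endpoint lists none.")
    # Tell the user which model was picked implicitly.
    warnings.warn(f"Default model is set as: {available[0]}")
    return available[0]
```

An explicit `model` always wins; the listing call is made only when no model is given.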

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Added new unit/integration tests
Added new notebook (that tests end-to-end)
I stared at the code and made sure it makes sense

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran make format; make lint to appease the lint gods

@sumitkbh

review-notebook-app · 2024-07-09T09:39:30Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

raspawar · 2024-07-09T09:41:12Z

@mattf please have a look

logan-markewich · 2024-07-12T00:34:20Z

Looks like this may have broken the tests?

mattf

@raspawar Logan is right. i just checked main vs this PR. main passes and PR fails -

FAILED tests/test_api_key.py::test_missing_api_key_error - AssertionError: assert '401' in '404 page not found'
FAILED tests/test_api_key.py::test_bogus_api_key_error - AssertionError: assert '401' in '404 page not found'
FAILED tests/test_embeddings_nvidia.py::test_nvidia_embedding_callback - openai.NotFoundError: 404 page not found
FAILED tests/test_embeddings_nvidia.py::test_nvidia_embedding_throws_with_invalid_key - openai.NotFoundError: 404 page not found

the 404 suggests the model name is being mangled, but in my testing it may be an issue w/ using the openai client.

>>> import os
>>> from openai import OpenAI
>>> client = OpenAI(api_key=os.environ["NVIDIA_API_KEY"], base_url="https://integrate.api.nvidia.com/v1")
>>> x = client.embeddings.create(input=['hello'], model="NV-Embed-QA", extra_body={"input_type": "query"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/matt/.cache/pypoetry/virtualenvs/llama-index-embeddings-nvidia-LEYbhevy-py3.10/lib/python3.10/site-packages/openai/resources/embeddings.py", line 114, in create
    return self._post(
  File "/home/matt/.cache/pypoetry/virtualenvs/llama-index-embeddings-nvidia-LEYbhevy-py3.10/lib/python3.10/site-packages/openai/_base_client.py", line 1266, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/home/matt/.cache/pypoetry/virtualenvs/llama-index-embeddings-nvidia-LEYbhevy-py3.10/lib/python3.10/site-packages/openai/_base_client.py", line 942, in request
    return self._request(
  File "/home/matt/.cache/pypoetry/virtualenvs/llama-index-embeddings-nvidia-LEYbhevy-py3.10/lib/python3.10/site-packages/openai/_base_client.py", line 1046, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.NotFoundError: 404 page not found

mattf · 2024-07-12T13:58:09Z

...-integrations/embeddings/llama-index-embeddings-nvidia/llama_index/embeddings/nvidia/base.py

@@ -146,14 +145,44 @@ def __init__(
        )
        self._aclient._custom_headers = {"User-Agent": "llama-index-embeddings-nvidia"}

+        if not model:
+            self.__set_default_model()


style comment, avoid side-effects: self.model = self.__get_default_model()
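
A small sketch of the pattern suggested in this comment, under stated assumptions: `LocalModelClient` and its `available` argument are hypothetical stand-ins, not the classes in this PR. The helper returns a value and the constructor does the assignment, so nothing is mutated inside the helper. A single leading underscore is used here because a double underscore triggers Python name mangling.

```python
from typing import List, Optional


class LocalModelClient:
    """Hypothetical client; `available` stands in for the server's model list."""

    def __init__(self, available: List[str], model: Optional[str] = None) -> None:
        self._available = available
        # The assignment happens here, in one visible place.
        self.model = model or self._get_default_model()

    def _get_default_model(self) -> str:
        """Return the first available model without modifying any state."""
        if not self._available:
            raise ValueError("no models available")
        return self._available[0]
```

Because the helper has no side effects, it can be called and tested on its own.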

raspawar · 2024-07-15T06:21:39Z

I will investigate this

raspawar · 2024-07-17T09:36:27Z

@logan-markewich I have mocked the failing test case API calls, can you ptal?
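
For illustration only, a sketch of how a remote call can be mocked in a test with the standard library's `unittest.mock`. The `EmbeddingClient` class, its `_post` method, and the canned response are hypothetical and are not the tests added in this PR.

```python
from typing import List
from unittest import mock


class EmbeddingClient:
    """Hypothetical stand-in for a client that calls a remote endpoint."""

    def _post(self, payload: dict) -> dict:
        raise RuntimeError("network access is not available in unit tests")

    def embed(self, text: str) -> List[float]:
        response = self._post({"input": [text]})
        return response["data"][0]["embedding"]


def embed_with_mocked_endpoint(text: str) -> List[float]:
    """Call `embed` while `_post` is replaced by a canned response."""
    client = EmbeddingClient()
    canned = {"data": [{"embedding": [0.1, 0.2, 0.3]}]}
    with mock.patch.object(client, "_post", return_value=canned) as fake_post:
        vector = client.embed(text)
    # Verify the request that would have gone over the network.
    fake_post.assert_called_once_with({"input": [text]})
    return vector
```

With the network call replaced, the test result no longer depends on the hosted endpoint's responses.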

mattf

please add a test for the embedding interface, then ok to merge

mattf · 2024-07-22T13:05:34Z

👍

logan-markewich · 2024-08-22T15:38:40Z

llama-index-integrations/postprocessor/llama-index-postprocessor-nvidia-rerank/poetry.lock

please don't commit the lock files, they just take up space

raspawar · Aug 22, 2024

Yep, will remove it

logan-markewich · 2024-08-28T02:00:30Z

Ok @raspawar fixed one issue with tests, now it's up to you :)

raspawar added 8 commits June 28, 2024 11:36

local default model for embedding

9d5c0eb

default model code for llm

2b1c087

add default model code for reranker

bd5bce4

notebook rerun changes

6397709

test cases update

335079e

remove poetry lock files

c29b429

code cleanup

e5a140b

update nvidia reranker notebook

c07c929

dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jul 9, 2024

mattf reviewed Jul 12, 2024

raspawar added 3 commits July 17, 2024 14:52

mock the integrate api calls

91c271a

mock integrate api calls in test case

459bb79

mock test cases in api_key

5cc064b

mattf reviewed Jul 19, 2024

add test cases for embeddings

2c61bd4

logan-markewich approved these changes Jul 24, 2024

dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 24, 2024

logan-markewich enabled auto-merge (squash) July 24, 2024 15:01

raspawar and others added 2 commits August 6, 2024 18:04

Merge branch 'main' of https://github.com/raspawar/llama_index into r…

73677be

…aspawar/default-model

Merge branch 'main' into raspawar/default-model

da42432

auto-merge was automatically disabled August 6, 2024 12:50
Head branch was pushed to by a user without write access

raspawar added 2 commits August 6, 2024 19:51

fix lint, test cases

7f2fd6c

fix bug in rerank url parsing, test cases

528215d

logan-markewich closed this Aug 22, 2024

logan-markewich reopened this Aug 22, 2024

logan-markewich reviewed Aug 22, 2024

dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Aug 22, 2024

raspawar force-pushed the raspawar/default-model branch 2 times, most recently from e154a39 to 528215d Compare August 22, 2024 17:00

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Aug 22, 2024

raspawar and others added 8 commits August 22, 2024 22:32

remove poetry lock files

293a63d

Merge branch 'main' of https://github.com/raspawar/llama_index into r…

b39b069

…aspawar/default-model

fix test cases, code

ec4d5bd

Merge branch 'main' of https://github.com/raspawar/llama_index into r…

d8241b4

…aspawar/default-model

fix lint issues

50060bd

Merge branch 'run-llama:main' into raspawar/default-model

31f32a1

Merge branch 'main' into raspawar/default-model

7f9cdca

less restrictive dep

0faee71

raspawar and others added 4 commits August 28, 2024 19:38

Merge branch 'main' into raspawar/default-model

b73ce90

fix for the url warning and test cases

42cd380

add masked env to test case

de0afdb

remove test case for now

0e9abe6

logan-markewich merged commit d4c058b into run-llama:main Aug 29, 2024
9 checks passed
