
feat(AI): Create abstractions for generic LLM calls within sentry (#68771)

### Background
For the User Feedback Spam Detection feature
(https://github.com/getsentry/sentry/issues/61372), we intend to use an LLM.
Along with other use cases such as suggested fix and code integration, there is
a need across the Sentry codebase to be able to call LLMs.

Because Sentry is self-hosted, different features may use different LLMs, and we
want to provide modularity, we need to be able to configure different LLM
providers, models, and use cases.

### Solution:

- We define an options-based config for providers and use cases, where you can
specify an LLM provider's options and settings (see the sketch below).
- For each use case, you then define which LLM provider and model it uses.
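
As a rough illustration, configuring a provider and a use case in `sentry.conf.py` could look something like the following. The option names `llm.provider.options` and `llm.usecases.options` are assumptions here; the nested shapes mirror `ProviderConfig` / `UseCaseConfig` used in the diff below.

```python
# Hypothetical sketch only; option names are assumptions, the nested structure
# follows what the provider base class reads (options/api_key, models, model).
SENTRY_OPTIONS: dict[str, object] = {}

# Provider config: credentials plus the models each provider may serve
# (LlmModelBase.validate_model checks the "models" list).
SENTRY_OPTIONS["llm.provider.options"] = {
    "openai": {
        "options": {"api_key": "sk-..."},
        "models": ["gpt-3.5-turbo", "gpt-4"],
    },
}

# Use case config: which provider and model each use case should call.
SENTRY_OPTIONS["llm.usecases.options"] = {
    "spamdetection": {
        "provider": "openai",
        "options": {"model": "gpt-3.5-turbo"},
    },
}
```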


Within `sentry/llm` we define a `providers` module, which consists of a base
class and implementations. To start, we have OpenAI, Google Vertex, and a
preview implementation used for testing. These use the provider options to
initialize a client and connect to the LLM provider. The providers inherit from
a common `LlmModelBase`, built on `sentry.utils.services.Service`.

Also within `sentry/llm`, we define a `usecases` module, which simply consists
of a function `complete_prompt` along with an enum of use cases. Each use case's
options (provider and model) are configured via the option described above and
passed through to the LLM provider.
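
A rough sketch of that interface, with the signature mirroring the provider base class in the diff below; the enum members and default values here are assumptions, not the merged code:

```python
from enum import Enum


class LLMUseCase(Enum):
    # Hypothetical members; the real enum lives in src/sentry/llm/usecases/__init__.py.
    EXAMPLE = "example"
    SPAM_DETECTION = "spam-detection"


def complete_prompt(
    *,
    usecase: LLMUseCase,
    prompt: str,
    message: str,
    temperature: float = 0.0,
    max_output_tokens: int = 1024,
) -> str | None:
    """Look up the provider/model configured for this use case and forward the
    call to that provider's complete_prompt implementation."""
    ...
```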


### Testing
I've added unit tests which mock the LLM calls, and I've tested in my
local environment that calls to the actual services work.


### In practice:
To use an LLM, you do the following:
1. Define your use case in the [usecase enum](https://github.com/getsentry/sentry/blob/a4e7a0e4af8c09a1d4007a3d7c53b71a2d4db5ff/src/sentry/llm/usecases/__init__.py#L14).
2. Call the `complete_prompt` function with your `usecase`, prompt, message,
temperature, and max output tokens. For example:
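
A minimal sketch of such a call, assuming the import path and enum member shown here (they follow the PR description and are not verified against the merged code):

```python
from sentry.llm.usecases import LLMUseCase, complete_prompt  # path per the PR description

# Hypothetical spam-detection call; prompt text and parameter values are illustrative.
result = complete_prompt(
    usecase=LLMUseCase.SPAM_DETECTION,  # step 1: the use case registered in the enum (assumed name)
    prompt="Classify the following user feedback as 'spam' or 'not spam'.",
    message="Buy cheap watches at example.com!!!",
    temperature=0.1,  # normalized range; providers rescale (e.g. OpenAI multiplies by 2)
    max_output_tokens=20,
)
print(result)
```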



### Limitations:

Because each LLM currently has a different interface, this solution does not
support provider-specific features, such as OpenAI's "function calling", where
the output is guaranteed to be in a specific JSON format. Advanced use cases
beyond a simple "prompt" + "text" call with a single output are not currently
supported. It is likely possible to add support for these on a case-by-case
basis.

LLM providers are not quite to the point where they have standardized on a
consistent API, which makes supporting these somewhat difficult. Third parties
have come up with various solutions, such as
[LangChain](https://github.com/langchain-ai/langchain),
[LiteLLM](https://github.com/BerriAI/litellm),
[LocalAI](https://github.com/mudler/LocalAI), and
[OpenRouter](https://openrouter.ai/).

It will probably make sense eventually to adopt one of these tools, or
our own advanced tooling, once our use cases outgrow this solution.

There is also a possible future where we want different use cases to use
different API keys, but for now, each provider has a single set of credentials.



### TODO

- [ ] Create develop docs for how to add a use case or a new LLM provider
- [x] Follow-up PR to replace suggested fix OpenAI calls with the new
abstraction
- [ ] PR in getsentry to set provider / use case values for SaaS
- [ ] Follow-up PR to add telemetry information
- [ ] We'll likely want to support streaming responses.

---------

Co-authored-by: Michelle Zhang <56095982+michellewzhang@users.noreply.github.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Josh Ferge, 10 months ago
parent commit 32f7e6f370

+ 1 - 0
pyproject.toml

@@ -594,6 +594,7 @@ module = [
     "sentry.eventstore.reprocessing.redis",
     "sentry.issues.related.*",
     "sentry.lang.java.processing",
+    "sentry.llm.*",
     "sentry.migrations.*",
     "sentry.nodestore.base",
     "sentry.nodestore.bigtable.backend",

+ 0 - 5
src/sentry/conf/server.py

@@ -2328,11 +2328,6 @@ SENTRY_METRICS_INDEXER_ENABLE_SLICED_PRODUCER = False
 SENTRY_CHART_RENDERER = "sentry.charts.chartcuterie.Chartcuterie"
 SENTRY_CHART_RENDERER_OPTIONS: dict[str, Any] = {}
 
-# User Feedback Spam Detection
-SENTRY_USER_FEEDBACK_SPAM = "sentry.feedback.spam.stub.StubFeedbackSpamDetection"
-SENTRY_USER_FEEDBACK_SPAM_OPTIONS: dict[str, str] = {}
-
-
 # URI Prefixes for generating DSN URLs
 # (Defaults to URL_PREFIX by default)
 SENTRY_ENDPOINT: str | None = None

+ 0 - 13
src/sentry/feedback/spam/__init__.py

@@ -1,13 +0,0 @@
-from django.conf import settings
-
-from sentry.utils.services import LazyServiceWrapper
-
-from .base import FeedbackSpamDetectionBase
-
-backend = LazyServiceWrapper(
-    FeedbackSpamDetectionBase,
-    settings.SENTRY_USER_FEEDBACK_SPAM,
-    settings.SENTRY_USER_FEEDBACK_SPAM_OPTIONS,
-)
-
-backend.expose(locals())

+ 0 - 9
src/sentry/feedback/spam/base.py

@@ -1,9 +0,0 @@
-from sentry.utils.services import Service
-
-
-class FeedbackSpamDetectionBase(Service):
-    def __init__(self, **options):
-        pass
-
-    def spam_detection(self, text: str):
-        raise NotImplementedError

+ 0 - 9
src/sentry/feedback/spam/stub.py

@@ -1,9 +0,0 @@
-from sentry.feedback.spam.base import FeedbackSpamDetectionBase
-
-
-class StubFeedbackSpamDetection(FeedbackSpamDetectionBase):
-    def __init__(self, **options):
-        pass
-
-    def spam_detection(self, text):
-        return False

+ 0 - 0
src/sentry/llm/__init__.py


+ 14 - 0
src/sentry/llm/exceptions.py

@@ -0,0 +1,14 @@
+class InvalidUsecaseError(ValueError):
+    pass
+
+
+class InvalidProviderError(ValueError):
+    pass
+
+
+class InvalidModelError(ValueError):
+    pass
+
+
+class InvalidTemperature(ValueError):
+    pass

+ 0 - 0
src/sentry/llm/providers/__init__.py


+ 45 - 0
src/sentry/llm/providers/base.py

@@ -0,0 +1,45 @@
+from sentry.llm.exceptions import InvalidModelError, InvalidProviderError
+from sentry.llm.types import ProviderConfig, UseCaseConfig
+from sentry.utils.services import Service
+
+
+class LlmModelBase(Service):
+    def __init__(self, provider_config: ProviderConfig) -> None:
+        self.provider_config = provider_config
+
+    def complete_prompt(
+        self,
+        *,
+        usecase_config: UseCaseConfig,
+        prompt: str,
+        message: str,
+        temperature: float,
+        max_output_tokens: int,
+    ) -> str | None:
+        self.validate_model(usecase_config["options"]["model"])
+
+        return self._complete_prompt(
+            usecase_config=usecase_config,
+            prompt=prompt,
+            message=message,
+            temperature=temperature,
+            max_output_tokens=max_output_tokens,
+        )
+
+    def _complete_prompt(
+        self,
+        *,
+        usecase_config: UseCaseConfig,
+        prompt: str,
+        message: str,
+        temperature: float,
+        max_output_tokens: int,
+    ) -> str | None:
+        raise NotImplementedError
+
+    def validate_model(self, model_name: str) -> None:
+        if "models" not in self.provider_config:
+            raise InvalidProviderError(f"No models defined for provider {self.__class__.__name__}")
+
+        if model_name not in self.provider_config["models"]:
+            raise InvalidModelError(f"Invalid model: {model_name}")

+ 60 - 0
src/sentry/llm/providers/openai.py

@@ -0,0 +1,60 @@
+from openai import OpenAI
+
+from sentry.llm.providers.base import LlmModelBase
+from sentry.llm.types import UseCaseConfig
+
+
+class OpenAIProvider(LlmModelBase):
+
+    provider_name = "openai"
+
+    def _complete_prompt(
+        self,
+        *,
+        usecase_config: UseCaseConfig,
+        prompt: str,
+        message: str,
+        temperature: float,
+        max_output_tokens: int,
+    ) -> str | None:
+        model = usecase_config["options"]["model"]
+        client = get_openai_client(self.provider_config["options"]["api_key"])
+
+        response = client.chat.completions.create(
+            model=model,
+            temperature=temperature
+            * 2,  # open AI temp range is [0.0 - 2.0], so we have to multiply by two
+            messages=[
+                {"role": "system", "content": prompt},
+                {
+                    "role": "user",
+                    "content": message,
+                },
+            ],
+            stream=False,
+            max_tokens=max_output_tokens,
+        )
+
+        return response.choices[0].message.content
+
+
+openai_client: OpenAI | None = None
+
+
+class OpenAIClientSingleton:
+    _instance = None
+    client: OpenAI
+
+    def __init__(self) -> None:
+        raise RuntimeError("Call instance() instead")
+
+    @classmethod
+    def instance(cls, api_key: str) -> "OpenAIClientSingleton":
+        if cls._instance is None:
+            cls._instance = cls.__new__(cls)
+            cls._instance.client = OpenAI(api_key=api_key)
+        return cls._instance
+
+
+def get_openai_client(api_key: str) -> OpenAI:
+    return OpenAIClientSingleton.instance(api_key).client

Some files were not shown because too many files changed in this diff