Evaluates instances based on a given metric.
HTTP request
POST https://{service-endpoint}/v1/{location}:evaluateInstances
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
Parameters | |
---|---|
location |
Required. The resource name of the Location to evaluate the instances. Format: |
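As a minimal sketch of how the URL template above is filled in, the helper below substitutes the two placeholders. The function name and the `SERVICE_ENDPOINT` / `LOCATION_RESOURCE_NAME` values are hypothetical stand-ins, not real endpoints or resource names.

```python
def evaluate_instances_url(service_endpoint: str, location: str) -> str:
    """Fill in https://{service-endpoint}/v1/{location}:evaluateInstances."""
    if not service_endpoint or not location:
        raise ValueError("service_endpoint and location are both required")
    # Strip stray slashes so the path does not end up with "//".
    return f"https://{service_endpoint}/v1/{location.strip('/')}:evaluateInstances"


# Placeholder values only; substitute a supported endpoint and a real location name.
url = evaluate_instances_url("SERVICE_ENDPOINT", "LOCATION_RESOURCE_NAME")
print(url)
```

The request itself is a POST to this URL with the JSON body described in the next section.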
Request body
The request body contains data with the following structure:
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field metric_inputs. Instances and specs for evaluation. metric_inputs can be only one of the following: |
|
exactMatchInput |
Auto metric instances. Instances and metric spec for exact match metric. |
bleuInput |
Instances and metric spec for bleu metric. |
rougeInput |
Instances and metric spec for rouge metric. |
fluencyInput |
LLM-based metric instance. General text generation metrics, applicable to other categories. Input for fluency metric. |
coherenceInput |
Input for coherence metric. |
safetyInput |
Input for safety metric. |
groundednessInput |
Input for groundedness metric. |
fulfillmentInput |
Input for fulfillment metric. |
summarizationQualityInput |
Input for summarization quality metric. |
pairwiseSummarizationQualityInput |
Input for pairwise summarization quality metric. |
summarizationHelpfulnessInput |
Input for summarization helpfulness metric. |
summarizationVerbosityInput |
Input for summarization verbosity metric. |
questionAnsweringQualityInput |
Input for question answering quality metric. |
pairwiseQuestionAnsweringQualityInput |
Input for pairwise question answering quality metric. |
questionAnsweringRelevanceInput |
Input for question answering relevance metric. |
questionAnsweringHelpfulnessInput |
Input for question answering helpfulness metric. |
questionAnsweringCorrectnessInput |
Input for question answering correctness metric. |
toolCallValidInput |
Tool call metric instances. Input for tool call valid metric. |
toolNameMatchInput |
Input for tool name match metric. |
toolParameterKeyMatchInput |
Input for tool parameter key match metric. |
toolParameterKvMatchInput |
Input for tool parameter key value match metric. |
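Because `metric_inputs` is a union field, a request body carries exactly one of the fields listed above. The sketch below enforces that on the client side before sending; `build_request_body` is a hypothetical helper, and the sample instance values are made up.

```python
import json

# Field names copied from the metric_inputs table above.
METRIC_INPUT_FIELDS = {
    "exactMatchInput", "bleuInput", "rougeInput",
    "fluencyInput", "coherenceInput", "safetyInput",
    "groundednessInput", "fulfillmentInput",
    "summarizationQualityInput", "pairwiseSummarizationQualityInput",
    "summarizationHelpfulnessInput", "summarizationVerbosityInput",
    "questionAnsweringQualityInput", "pairwiseQuestionAnsweringQualityInput",
    "questionAnsweringRelevanceInput", "questionAnsweringHelpfulnessInput",
    "questionAnsweringCorrectnessInput",
    "toolCallValidInput", "toolNameMatchInput",
    "toolParameterKeyMatchInput", "toolParameterKvMatchInput",
}


def build_request_body(**metric_inputs: dict) -> dict:
    """Return a request body holding exactly one metric_inputs field."""
    unknown = set(metric_inputs) - METRIC_INPUT_FIELDS
    if unknown:
        raise ValueError(f"unknown metric input field(s): {sorted(unknown)}")
    if len(metric_inputs) != 1:
        raise ValueError("metric_inputs can be only one of the listed fields")
    return dict(metric_inputs)


body = build_request_body(
    exactMatchInput={
        "metricSpec": {},  # ExactMatchSpec has no fields
        "instances": [{"prediction": "Paris", "reference": "Paris"}],
    }
)
print(json.dumps(body, indent=2))
```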
Response body
Response message for EvaluationService.EvaluateInstances.
If successful, the response body contains data with the following structure:
JSON representation |
---|
{ // Union field |
Fields | |
---|---|
Union field evaluation_results. Evaluation results will be served in the same order as presented in EvaluationRequest.instances. evaluation_results can be only one of the following: |
|
exactMatchResults |
Auto metric evaluation results. Results for exact match metric. |
bleuResults |
Results for bleu metric. |
rougeResults |
Results for rouge metric. |
fluencyResult |
LLM-based metric evaluation result. General text generation metrics, applicable to other categories. Result for fluency metric. |
coherenceResult |
Result for coherence metric. |
safetyResult |
Result for safety metric. |
groundednessResult |
Result for groundedness metric. |
fulfillmentResult |
Result for fulfillment metric. |
summarizationQualityResult |
Summarization only metrics. Result for summarization quality metric. |
pairwiseSummarizationQualityResult |
Result for pairwise summarization quality metric. |
summarizationHelpfulnessResult |
Result for summarization helpfulness metric. |
summarizationVerbosityResult |
Result for summarization verbosity metric. |
questionAnsweringQualityResult |
Question answering only metrics. Result for question answering quality metric. |
pairwiseQuestionAnsweringQualityResult |
Result for pairwise question answering quality metric. |
questionAnsweringRelevanceResult |
Result for question answering relevance metric. |
questionAnsweringHelpfulnessResult |
Result for question answering helpfulness metric. |
questionAnsweringCorrectnessResult |
Result for question answering correctness metric. |
toolCallValidResults |
Tool call metrics. Results for tool call valid metric. |
toolNameMatchResults |
Results for tool name match metric. |
toolParameterKeyMatchResults |
Results for tool parameter key match metric. |
toolParameterKvMatchResults |
Results for tool parameter key value match metric. |
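Since `evaluation_results` is also a union, a client can detect which result field came back before parsing it. The following is a rough sketch; `split_response` is a hypothetical helper and the sample response is hand-written, not captured service output.

```python
def split_response(response: dict) -> tuple:
    """Return (field_name, payload) for the single evaluation_results field that is set."""
    # Every field in the table above ends in "Result" or "Results".
    present = [key for key in response if key.endswith(("Result", "Results"))]
    if len(present) != 1:
        raise ValueError(f"expected exactly one result field, found {present}")
    name = present[0]
    return name, response[name]


# Hand-written sample shaped like ExactMatchResults.
sample = {"exactMatchResults": {"exactMatchMetricValues": [{"score": 1}, {"score": 0}]}}
name, payload = split_response(sample)
print(name, payload)
```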
Authorization scopes
Requires the following OAuth scope:
https://www.googleapis.com/auth/cloud-platform
For more information, see the Authentication Overview.
ExactMatchInput
Input for exact match metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for exact match metric. |
instances[] |
Required. Repeated exact match instances. |
ExactMatchSpec
This type has no fields.
Spec for exact match metric - returns 1 if the prediction and reference match exactly, otherwise 0.
ExactMatchInstance
Spec for exact match instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
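To make the exact match rule concrete, here is a local sketch of the scoring described above. It assumes plain string equality with no case or whitespace normalization; the reference does not say whether the service normalizes inputs, so treat that as an assumption.

```python
def exact_match_score(prediction: str, reference: str) -> int:
    """Return 1 when prediction and reference are identical strings, otherwise 0."""
    return 1 if prediction == reference else 0


# Instances shaped like ExactMatchInstance; the values are made up.
instances = [
    {"prediction": "Paris", "reference": "Paris"},
    {"prediction": "paris", "reference": "Paris"},
]
scores = [exact_match_score(i["prediction"], i["reference"]) for i in instances]
print(scores)  # [1, 0]
```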
BleuInput
Input for bleu metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for bleu score metric. |
instances[] |
Required. Repeated bleu instances. |
BleuSpec
Spec for bleu score metric - calculates the precision of n-grams in the prediction as compared to reference - returns a score ranging between 0 and 1.
JSON representation |
---|
{ "useEffectiveOrder": boolean } |
Fields | |
---|---|
useEffectiveOrder |
Optional. Whether to use effective order when computing the bleu score. |
BleuInstance
Spec for bleu instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
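The n-gram precision idea behind the bleu metric can be illustrated locally. The sketch below computes clipped n-gram precision for a single order only; it is deliberately not full BLEU (no brevity penalty, no averaging across orders) and is not claimed to match the service's scores. Whitespace tokenization is an assumption.

```python
from collections import Counter


def ngrams(tokens: list, n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(prediction: str, reference: str, n: int = 1) -> float:
    """Fraction of prediction n-grams that also appear in the reference (counts clipped)."""
    pred = ngrams(prediction.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(pred.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in pred.items())
    return overlap / total


p1 = clipped_precision("the cat sat on the mat", "the cat is on the mat", n=1)
p2 = clipped_precision("the cat sat on the mat", "the cat is on the mat", n=2)
print(p1, p2)
```

Here five of the six prediction unigrams and three of the five prediction bigrams occur in the reference.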
RougeInput
Input for rouge metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for rouge score metric. |
instances[] |
Required. Repeated rouge instances. |
RougeSpec
Spec for rouge score metric - calculates the recall of n-grams in prediction as compared to reference - returns a score ranging between 0 and 1.
JSON representation |
---|
{ "rougeType": string, "useStemmer": boolean, "splitSummaries": boolean } |
Fields | |
---|---|
rougeType |
Optional. Supported rouge types are rougen[1-9], rougeL, and rougeLsum. |
useStemmer |
Optional. Whether to use stemmer to compute rouge score. |
splitSummaries |
Optional. Whether to split summaries while using rougeLsum. |
RougeInstance
Spec for rouge instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
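Where bleu looks at precision, the rouge spec above describes recall of n-grams. A minimal local sketch of n-gram recall follows; it assumes whitespace tokenization and no stemming, and it does not cover the rougeL or rougeLsum variants, so it should not be read as the service's implementation.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count whitespace-token n-grams in a string."""
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(prediction: str, reference: str, n: int = 1) -> float:
    """Fraction of reference n-grams that are recovered by the prediction."""
    pred = ngram_counts(prediction, n)
    ref = ngram_counts(reference, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, pred[gram]) for gram, count in ref.items())
    return overlap / total


r1 = ngram_recall("the cat sat", "the cat sat on the mat", n=1)
r2 = ngram_recall("the cat sat", "the cat sat on the mat", n=2)
print(r1, r2)
```

A short prediction that is fully contained in the reference still scores low here, because recall is measured against the reference length.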
FluencyInput
Input for fluency metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for fluency score metric. |
instance |
Required. Fluency instance. |
FluencySpec
Spec for fluency score metric.
JSON representation |
---|
{ "version": integer } |
Fields | |
---|---|
version |
Optional. Which version to use for evaluation. |
FluencyInstance
Spec for fluency instance.
JSON representation |
---|
{ "prediction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
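Note that the LLM-based inputs take a single `instance`, whereas the auto metrics above take a repeated `instances[]`. A small sketch of assembling a fluency request body is shown below; `fluency_request` is a hypothetical helper and the prediction text is made up.

```python
from typing import Optional


def fluency_request(prediction: str, version: Optional[int] = None) -> dict:
    """Build a request body whose metric_inputs union is set to fluencyInput."""
    if not prediction:
        raise ValueError("prediction is required")
    # version is optional, so it is left out of the spec unless given.
    spec = {} if version is None else {"version": version}
    return {
        "fluencyInput": {
            "metricSpec": spec,
            "instance": {"prediction": prediction},
        }
    }


request = fluency_request("The quick brown fox jumps over the lazy dog.")
print(request)
```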
CoherenceInput
Input for coherence metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for coherence score metric. |
instance |
Required. Coherence instance. |
CoherenceSpec
Spec for coherence score metric.
JSON representation |
---|
{ "version": integer } |
Fields | |
---|---|
version |
Optional. Which version to use for evaluation. |
CoherenceInstance
Spec for coherence instance.
JSON representation |
---|
{ "prediction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
SafetyInput
Input for safety metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for safety metric. |
instance |
Required. Safety instance. |
SafetySpec
Spec for safety metric.
JSON representation |
---|
{ "version": integer } |
Fields | |
---|---|
version |
Optional. Which version to use for evaluation. |
SafetyInstance
Spec for safety instance.
JSON representation |
---|
{ "prediction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
GroundednessInput
Input for groundedness metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for groundedness metric. |
instance |
Required. Groundedness instance. |
GroundednessSpec
Spec for groundedness metric.
JSON representation |
---|
{ "version": integer } |
Fields | |
---|---|
version |
Optional. Which version to use for evaluation. |
GroundednessInstance
Spec for groundedness instance.
JSON representation |
---|
{ "prediction": string, "context": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
context |
Required. Background information provided in context used to compare against the prediction. |
FulfillmentInput
Input for fulfillment metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for fulfillment score metric. |
instance |
Required. Fulfillment instance. |
FulfillmentSpec
Spec for fulfillment metric.
JSON representation |
---|
{ "version": integer } |
Fields | |
---|---|
version |
Optional. Which version to use for evaluation. |
FulfillmentInstance
Spec for fulfillment instance.
JSON representation |
---|
{ "prediction": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
instruction |
Required. Inference instruction prompt to compare prediction with. |
SummarizationQualityInput
Input for summarization quality metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for summarization quality score metric. |
instance |
Required. Summarization quality instance. |
SummarizationQualitySpec
Spec for summarization quality score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute summarization quality. |
version |
Optional. Which version to use for evaluation. |
SummarizationQualityInstance
Spec for summarization quality instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to be summarized. |
instruction |
Required. Summarization prompt for LLM. |
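The required and optional fields of a summarization quality instance map naturally onto a small typed container. The sketch below is one way to model it; the class and helper names are hypothetical, and rejecting `useReference=true` without a reference is an assumption about sensible client-side validation, not documented service behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummarizationQualityInstance:
    prediction: str                   # Required: output of the evaluated model
    context: str                      # Required: text to be summarized
    instruction: str                  # Required: summarization prompt
    reference: Optional[str] = None   # Optional: ground truth

    def to_json(self) -> dict:
        body = {
            "prediction": self.prediction,
            "context": self.context,
            "instruction": self.instruction,
        }
        if self.reference is not None:
            body["reference"] = self.reference
        return body


def summarization_quality_input(instance: SummarizationQualityInstance,
                                use_reference: bool = False) -> dict:
    # Assumption: asking the metric to use a reference only makes sense if one is supplied.
    if use_reference and instance.reference is None:
        raise ValueError("useReference is true but the instance has no reference")
    return {
        "summarizationQualityInput": {
            "metricSpec": {"useReference": use_reference},
            "instance": instance.to_json(),
        }
    }


inst = SummarizationQualityInstance(
    prediction="Sales rose in Q2.",
    context="Quarterly report text goes here.",
    instruction="Summarize the report in one sentence.",
)
payload = summarization_quality_input(inst)
print(payload)
```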
PairwiseSummarizationQualityInput
Input for pairwise summarization quality metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for pairwise summarization quality score metric. |
instance |
Required. Pairwise summarization quality instance. |
PairwiseSummarizationQualitySpec
Spec for pairwise summarization quality score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute pairwise summarization quality. |
version |
Optional. Which version to use for evaluation. |
PairwiseSummarizationQualityInstance
Spec for pairwise summarization quality instance.
JSON representation |
---|
{ "prediction": string, "baselinePrediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the candidate model. |
baselinePrediction |
Required. Output of the baseline model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to be summarized. |
instruction |
Required. Summarization prompt for LLM. |
SummarizationHelpfulnessInput
Input for summarization helpfulness metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for summarization helpfulness score metric. |
instance |
Required. Summarization helpfulness instance. |
SummarizationHelpfulnessSpec
Spec for summarization helpfulness score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute summarization helpfulness. |
version |
Optional. Which version to use for evaluation. |
SummarizationHelpfulnessInstance
Spec for summarization helpfulness instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to be summarized. |
instruction |
Optional. Summarization prompt for LLM. |
SummarizationVerbosityInput
Input for summarization verbosity metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for summarization verbosity score metric. |
instance |
Required. Summarization verbosity instance. |
SummarizationVerbositySpec
Spec for summarization verbosity score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute summarization verbosity. |
version |
Optional. Which version to use for evaluation. |
SummarizationVerbosityInstance
Spec for summarization verbosity instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to be summarized. |
instruction |
Optional. Summarization prompt for LLM. |
QuestionAnsweringQualityInput
Input for question answering quality metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for question answering quality score metric. |
instance |
Required. Question answering quality instance. |
QuestionAnsweringQualitySpec
Spec for question answering quality score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute question answering quality. |
version |
Optional. Which version to use for evaluation. |
QuestionAnsweringQualityInstance
Spec for question answering quality instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to answer the question. |
instruction |
Required. Question Answering prompt for LLM. |
PairwiseQuestionAnsweringQualityInput
Input for pairwise question answering quality metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for pairwise question answering quality score metric. |
instance |
Required. Pairwise question answering quality instance. |
PairwiseQuestionAnsweringQualitySpec
Spec for pairwise question answering quality score metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute pairwise question answering quality. |
version |
Optional. Which version to use for evaluation. |
PairwiseQuestionAnsweringQualityInstance
Spec for pairwise question answering quality instance.
JSON representation |
---|
{ "prediction": string, "baselinePrediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the candidate model. |
baselinePrediction |
Required. Output of the baseline model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Required. Text to answer the question. |
instruction |
Required. Question Answering prompt for LLM. |
QuestionAnsweringRelevanceInput
Input for question answering relevance metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for question answering relevance score metric. |
instance |
Required. Question answering relevance instance. |
QuestionAnsweringRelevanceSpec
Spec for question answering relevance metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute question answering relevance. |
version |
Optional. Which version to use for evaluation. |
QuestionAnsweringRelevanceInstance
Spec for question answering relevance instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Optional. Text provided as context to answer the question. |
instruction |
Required. The question asked and other instructions in the inference prompt. |
QuestionAnsweringHelpfulnessInput
Input for question answering helpfulness metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for question answering helpfulness score metric. |
instance |
Required. Question answering helpfulness instance. |
QuestionAnsweringHelpfulnessSpec
Spec for question answering helpfulness metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute question answering helpfulness. |
version |
Optional. Which version to use for evaluation. |
QuestionAnsweringHelpfulnessInstance
Spec for question answering helpfulness instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Optional. Text provided as context to answer the question. |
instruction |
Required. The question asked and other instructions in the inference prompt. |
QuestionAnsweringCorrectnessInput
Input for question answering correctness metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for question answering correctness score metric. |
instance |
Required. Question answering correctness instance. |
QuestionAnsweringCorrectnessSpec
Spec for question answering correctness metric.
JSON representation |
---|
{ "useReference": boolean, "version": integer } |
Fields | |
---|---|
useReference |
Optional. Whether to use instance.reference to compute question answering correctness. |
version |
Optional. Which version to use for evaluation. |
QuestionAnsweringCorrectnessInstance
Spec for question answering correctness instance.
JSON representation |
---|
{ "prediction": string, "reference": string, "context": string, "instruction": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Optional. Ground truth used to compare against the prediction. |
context |
Optional. Text provided as context to answer the question. |
instruction |
Required. The question asked and other instructions in the inference prompt. |
ToolCallValidInput
Input for tool call valid metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for tool call valid metric. |
instances[] |
Required. Repeated tool call valid instances. |
ToolCallValidSpec
This type has no fields.
Spec for tool call valid metric.
ToolCallValidInstance
Spec for tool call valid instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
ToolNameMatchInput
Input for tool name match metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for tool name match metric. |
instances[] |
Required. Repeated tool name match instances. |
ToolNameMatchSpec
This type has no fields.
Spec for tool name match metric.
ToolNameMatchInstance
Spec for tool name match instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
ToolParameterKeyMatchInput
Input for tool parameter key match metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for tool parameter key match metric. |
instances[] |
Required. Repeated tool parameter key match instances. |
ToolParameterKeyMatchSpec
This type has no fields.
Spec for tool parameter key match metric.
ToolParameterKeyMatchInstance
Spec for tool parameter key match instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
ToolParameterKVMatchInput
Input for tool parameter key value match metric.
JSON representation |
---|
{ "metricSpec": { object ( |
Fields | |
---|---|
metricSpec |
Required. Spec for tool parameter key value match metric. |
instances[] |
Required. Repeated tool parameter key value match instances. |
ToolParameterKVMatchSpec
Spec for tool parameter key value match metric.
JSON representation |
---|
{ "useStrictStringMatch": boolean } |
Fields | |
---|---|
useStrictStringMatch |
Optional. Whether to use strict string match on parameter values. |
ToolParameterKVMatchInstance
Spec for tool parameter key value match instance.
JSON representation |
---|
{ "prediction": string, "reference": string } |
Fields | |
---|---|
prediction |
Required. Output of the evaluated model. |
reference |
Required. Ground truth used to compare against the prediction. |
ExactMatchResults
Results for exact match metric.
JSON representation |
---|
{
"exactMatchMetricValues": [
{
object ( |
Fields | |
---|---|
exactMatchMetricValues[] |
Output only. Exact match metric values. |
ExactMatchMetricValue
Exact match metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Exact match score. |
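Because results are served in the same order as the request instances, scores can be matched back to their inputs by position. The sketch below shows this for exact match; `pair_scores` is a hypothetical helper, and both the instances and the results payload are hand-written samples.

```python
def pair_scores(instances: list, results: dict) -> list:
    """Pair each request instance with its score, relying on matching order."""
    values = results["exactMatchMetricValues"]
    if len(values) != len(instances):
        raise ValueError("number of metric values does not match number of instances")
    return [(inst["prediction"], value["score"]) for inst, value in zip(instances, values)]


sent = [
    {"prediction": "Paris", "reference": "Paris"},
    {"prediction": "Lyon", "reference": "Paris"},
]
# Hand-written sample shaped like ExactMatchResults.
received = {"exactMatchMetricValues": [{"score": 1}, {"score": 0}]}
paired = pair_scores(sent, received)
print(paired)  # [('Paris', 1), ('Lyon', 0)]
```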
BleuResults
Results for bleu metric.
JSON representation |
---|
{
"bleuMetricValues": [
{
object ( |
Fields | |
---|---|
bleuMetricValues[] |
Output only. Bleu metric values. |
BleuMetricValue
Bleu metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Bleu score. |
RougeResults
Results for rouge metric.
JSON representation |
---|
{
"rougeMetricValues": [
{
object ( |
Fields | |
---|---|
rougeMetricValues[] |
Output only. Rouge metric values. |
RougeMetricValue
Rouge metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Rouge score. |
FluencyResult
Spec for fluency result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for fluency score. |
score |
Output only. Fluency score. |
confidence |
Output only. Confidence for fluency score. |
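The pointwise LLM-based results share the same three fields, so one small container can hold any of them. The following is a sketch only: `PointwiseResult` is a hypothetical name, the sample values are illustrative (the reference does not state score or confidence ranges), and defaulting absent fields to empty or zero is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointwiseResult:
    explanation: str
    score: float
    confidence: float

    @classmethod
    def from_json(cls, payload: dict) -> "PointwiseResult":
        # Assumption: fields missing from the JSON are treated as empty / zero.
        return cls(
            explanation=payload.get("explanation", ""),
            score=float(payload.get("score", 0.0)),
            confidence=float(payload.get("confidence", 0.0)),
        )


# Hand-written sample shaped like a response carrying fluencyResult.
response = {"fluencyResult": {"explanation": "Reads naturally.", "score": 4, "confidence": 0.9}}
fluency = PointwiseResult.from_json(response["fluencyResult"])
print(fluency)
```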
CoherenceResult
Spec for coherence result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for coherence score. |
score |
Output only. Coherence score. |
confidence |
Output only. Confidence for coherence score. |
SafetyResult
Spec for safety result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for safety score. |
score |
Output only. Safety score. |
confidence |
Output only. Confidence for safety score. |
GroundednessResult
Spec for groundedness result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for groundedness score. |
score |
Output only. Groundedness score. |
confidence |
Output only. Confidence for groundedness score. |
FulfillmentResult
Spec for fulfillment result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for fulfillment score. |
score |
Output only. Fulfillment score. |
confidence |
Output only. Confidence for fulfillment score. |
SummarizationQualityResult
Spec for summarization quality result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for summarization quality score. |
score |
Output only. Summarization quality score. |
confidence |
Output only. Confidence for summarization quality score. |
PairwiseSummarizationQualityResult
Spec for pairwise summarization quality result.
JSON representation |
---|
{
"pairwiseChoice": enum ( |
Fields | |
---|---|
pairwiseChoice |
Output only. Pairwise summarization prediction choice. |
explanation |
Output only. Explanation for summarization quality score. |
confidence |
Output only. Confidence for summarization quality score. |
PairwiseChoice
Pairwise prediction autorater preference.
Enums | |
---|---|
PAIRWISE_CHOICE_UNSPECIFIED |
Unspecified prediction choice. |
BASELINE |
Baseline prediction wins. |
CANDIDATE |
Candidate prediction wins. |
TIE |
Winner cannot be determined. |
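When running many pairwise comparisons, the enum values above can be tallied to see how often the candidate beats the baseline. This sketch assumes the enum arrives as its name string in the JSON response; the `tally` helper and the sample results are hypothetical.

```python
from collections import Counter
from enum import Enum


class PairwiseChoice(str, Enum):
    PAIRWISE_CHOICE_UNSPECIFIED = "PAIRWISE_CHOICE_UNSPECIFIED"
    BASELINE = "BASELINE"
    CANDIDATE = "CANDIDATE"
    TIE = "TIE"


def tally(results: list) -> Counter:
    """Count pairwise outcomes across a list of pairwise result objects."""
    return Counter(
        PairwiseChoice(r.get("pairwiseChoice", "PAIRWISE_CHOICE_UNSPECIFIED"))
        for r in results
    )


# Hand-written samples shaped like PairwiseSummarizationQualityResult.
pairwise_results = [
    {"pairwiseChoice": "CANDIDATE", "explanation": "More complete.", "confidence": 0.8},
    {"pairwiseChoice": "BASELINE", "explanation": "More concise.", "confidence": 0.6},
    {"pairwiseChoice": "CANDIDATE", "explanation": "More accurate.", "confidence": 0.7},
]
counts = tally(pairwise_results)
candidate_win_rate = counts[PairwiseChoice.CANDIDATE] / len(pairwise_results)
print(counts, candidate_win_rate)
```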
SummarizationHelpfulnessResult
Spec for summarization helpfulness result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for summarization helpfulness score. |
score |
Output only. Summarization helpfulness score. |
confidence |
Output only. Confidence for summarization helpfulness score. |
SummarizationVerbosityResult
Spec for summarization verbosity result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for summarization verbosity score. |
score |
Output only. Summarization verbosity score. |
confidence |
Output only. Confidence for summarization verbosity score. |
QuestionAnsweringQualityResult
Spec for question answering quality result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for question answering quality score. |
score |
Output only. Question answering quality score. |
confidence |
Output only. Confidence for question answering quality score. |
PairwiseQuestionAnsweringQualityResult
Spec for pairwise question answering quality result.
JSON representation |
---|
{
"pairwiseChoice": enum ( |
Fields | |
---|---|
pairwiseChoice |
Output only. Pairwise question answering prediction choice. |
explanation |
Output only. Explanation for question answering quality score. |
confidence |
Output only. Confidence for question answering quality score. |
QuestionAnsweringRelevanceResult
Spec for question answering relevance result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for question answering relevance score. |
score |
Output only. Question answering relevance score. |
confidence |
Output only. Confidence for question answering relevance score. |
QuestionAnsweringHelpfulnessResult
Spec for question answering helpfulness result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for question answering helpfulness score. |
score |
Output only. Question answering helpfulness score. |
confidence |
Output only. Confidence for question answering helpfulness score. |
QuestionAnsweringCorrectnessResult
Spec for question answering correctness result.
JSON representation |
---|
{ "explanation": string, "score": number, "confidence": number } |
Fields | |
---|---|
explanation |
Output only. Explanation for question answering correctness score. |
score |
Output only. Question answering correctness score. |
confidence |
Output only. Confidence for question answering correctness score. |
ToolCallValidResults
Results for tool call valid metric.
JSON representation |
---|
{
"toolCallValidMetricValues": [
{
object ( |
Fields | |
---|---|
toolCallValidMetricValues[] |
Output only. Tool call valid metric values. |
ToolCallValidMetricValue
Tool call valid metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Tool call valid score. |
ToolNameMatchResults
Results for tool name match metric.
JSON representation |
---|
{
"toolNameMatchMetricValues": [
{
object ( |
Fields | |
---|---|
toolNameMatchMetricValues[] |
Output only. Tool name match metric values. |
ToolNameMatchMetricValue
Tool name match metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Tool name match score. |
ToolParameterKeyMatchResults
Results for tool parameter key match metric.
JSON representation |
---|
{
"toolParameterKeyMatchMetricValues": [
{
object ( |
Fields | |
---|---|
toolParameterKeyMatchMetricValues[] |
Output only. Tool parameter key match metric values. |
ToolParameterKeyMatchMetricValue
Tool parameter key match metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Tool parameter key match score. |
ToolParameterKVMatchResults
Results for tool parameter key value match metric.
JSON representation |
---|
{
"toolParameterKvMatchMetricValues": [
{
object ( |
Fields | |
---|---|
toolParameterKvMatchMetricValues[] |
Output only. Tool parameter key value match metric values. |
ToolParameterKVMatchMetricValue
Tool parameter key value match metric value for an instance.
JSON representation |
---|
{ "score": number } |
Fields | |
---|---|
score |
Output only. Tool parameter key value match score. |