
Unpredictable response time with query method in LlamaIndex #14680

Answered by dosubot bot
norbertkocon94 asked this question in Q&A

Given your project's requirement for fast, accurate output, the best-suited response mode in LlamaIndex is ResponseMode.COMPACT. The library selects the synthesizer for this mode as follows:

elif response_mode == ResponseMode.COMPACT:
    return CompactAndRefine(
        service_context=service_context,
        text_qa_template=text_qa_template,
        refine_template=refine_template,
        output_cls=output_cls,
        streaming=streaming,
        structured_answer_filtering=structured_answer_filtering,
        program_factory=program_factory,
        verbose=verbose,
    )

The ResponseMode.COMPACT mode uses the CompactAndRefine synthesizer, which is designed to provide a balance between speed and accuracy by refining the responses in a comp…
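To make the speed argument concrete, here is a minimal, self-contained sketch of the compact-then-refine idea: retrieved chunks are packed into as few prompts as fit a size budget, and the answer is refined once per packed block instead of once per chunk. This is an illustration under stated assumptions, not LlamaIndex's implementation. The names `compact_chunks`, `compact_and_refine`, and the `llm` callable are hypothetical, and the budget is measured in characters here, whereas a real synthesizer would count tokens against the model's context window.

```python
from typing import Callable, List


def compact_chunks(chunks: List[str], max_chars: int, sep: str = "\n\n") -> List[str]:
    """Greedily pack chunks into as few blocks as fit within max_chars.

    Assumption: size is measured in characters for simplicity.
    A chunk that alone exceeds max_chars is kept as its own block.
    """
    packed: List[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + sep + chunk
        if not current or len(candidate) <= max_chars:
            current = candidate
        else:
            packed.append(current)
            current = chunk
    if current:
        packed.append(current)
    return packed


def compact_and_refine(
    question: str,
    chunks: List[str],
    llm: Callable[[str], str],  # hypothetical: takes a prompt, returns text
    max_chars: int,
) -> str:
    """Answer from the first packed block, then refine with each later block."""
    answer = None
    for block in compact_chunks(chunks, max_chars):
        if answer is None:
            prompt = f"Context:\n{block}\n\nQuestion: {question}\nAnswer:"
        else:
            prompt = (
                f"Existing answer: {answer}\n\nNew context:\n{block}\n\n"
                f"Question: {question}\nRefined answer:"
            )
        answer = llm(prompt)  # one LLM call per packed block
    return answer or ""
```

With four 10-character chunks and a 25-character budget, the sketch makes two LLM calls rather than four, which is where the latency saving comes from. In LlamaIndex itself the mode is usually selected when building the query engine, for example `index.as_query_engine(response_mode="compact")`.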

Answer selected by norbertkocon94