[Question]: Can I past a list of query strings to query_engine.query() or query_engine.retrieve() i.e., batch query/retrieve? #7218
Comments
Answer generated by a 🤖

Hello,

Based on the current LlamaIndex codebase, the `query` and `retrieve` methods accept a single query and do not support batch queries out of the box.

However, you can modify the retriever to add batch methods yourself:

```python
def batch_retrieve(self, query_bundles: List[QueryBundle]) -> List[List[NodeWithScore]]:
    return [self.retrieve(query_bundle) for query_bundle in query_bundles]

async def batch_aretrieve(self, query_bundles: List[QueryBundle]) -> List[List[NodeWithScore]]:
    return [await self.aretrieve(query_bundle) for query_bundle in query_bundles]
```

These methods will return a list of lists, where each inner list contains the retrieved results for the corresponding query. Please note that this is a workaround and may not be the most efficient solution, especially for large batches of queries. It's also important to note that the async method (`batch_aretrieve`) still awaits each query one at a time as written, so it does not actually run the queries concurrently.

If you feel this is a feature that would benefit the LlamaIndex community, I encourage you to contribute to the project by making a pull request with your proposed changes. Your contribution would be greatly appreciated.

Let me know if you have any other questions or need further clarification.

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
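As a minimal, self-contained sketch of the sequential workaround above (the `FakeRetriever` class below is a stand-in for a real LlamaIndex retriever, not part of the library):

```python
from typing import List


class FakeRetriever:
    """Stand-in for a real retriever; returns canned results per query."""

    def retrieve(self, query: str) -> List[str]:
        return [f"node matching {query!r}"]

    def batch_retrieve(self, queries: List[str]) -> List[List[str]]:
        # One inner list of results per input query, in the same order.
        return [self.retrieve(q) for q in queries]


results = FakeRetriever().batch_retrieve(["apples", "oranges"])
# results is a list of lists: one result list per query, in input order.
```

The shape of `results` matches the bot's description: the outer list has one entry per query, and each entry is that query's list of retrieved nodes.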
The bot is correct :) Just need to be careful about rate limits when running so many queries.
@logan-markewich, wow, that was a great bot response. Appreciate it and your confirmation. I'm actually just using it for semantic search retrieval, so no calls to the LLM.
@logan-markewich Is there a Query engine designed for batch queries? |
@Data-drone nah, and I don't think there's one planned.

Your best bet is async with `aquery()` and `aretrieve()`.

Would welcome a PR for it though.
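A hedged sketch of the async route: assuming a retriever that exposes an `aretrieve` coroutine (stubbed here with `FakeAsyncRetriever`, which is illustrative and not a LlamaIndex class), `asyncio.gather` runs the queries concurrently rather than awaiting them one at a time:

```python
import asyncio
from typing import List


class FakeAsyncRetriever:
    """Stand-in for a real retriever exposing an async aretrieve()."""

    async def aretrieve(self, query: str) -> List[str]:
        await asyncio.sleep(0)  # simulate I/O-bound retrieval work
        return [f"node for {query!r}"]


async def batch_aretrieve(retriever, queries: List[str]) -> List[List[str]]:
    # gather schedules all coroutines at once; results keep input order.
    return await asyncio.gather(*(retriever.aretrieve(q) for q in queries))


results = asyncio.run(batch_aretrieve(FakeAsyncRetriever(), ["a", "b"]))
```

With a real network-backed vector store, the speedup comes from overlapping the I/O of the individual queries; as noted above, rate limits still apply to the total number of requests in flight.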
Question Validation
Question
It's not entirely clear to me if a RetrieverQueryEngine() supports batch queries. I'm not using an LLM and just want to retrieve semantically similar results.
My query engine for my chromadb index is defined as follows:
```python
index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    storage_context=storage_context,
    service_context=service_context,
)
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=cfg.semantic_search.query.similarity_top_k,
    vector_store_query_mode="text_search",
)
response_synthesizer = get_response_synthesizer(
    response_mode="no_text", service_context=service_context
)
rerank = SentenceTransformerRerank(
    model=cfg.semantic_search.query.rerank.cross_encoder.model_name,
    top_n=cfg.semantic_search.query.rerank.cross_encoder.top_n,
)
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[DummyNodePostprocessor(), rerank],
)
```
My hope was that I could pass a list of query strings as follows:
```python
results_q = query_engine.query(queries)
# OR
results_r = query_engine.retrieve(queries)
```
Do async methods apply at all?