
RecursionError: maximum recursion depth exceeded with ElasticVectorSearch during RetrievalQA #8836

Closed
6 of 14 tasks
lauradang opened this issue Aug 6, 2023 · 2 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature Ɑ: embeddings Related to text embedding models module Ɑ: vector store Related to vector store module

Comments


lauradang commented Aug 6, 2023

System Info

Langchain version: 0.0.240
Python version: 3.10.2
Elasticsearch version: 7.17.0
System Version: macOS 13.4 (22F66)
Model Name: MacBook Pro
Model Identifier: Mac14,10
Chip: Apple M2 Pro
Total Number of Cores: 12 (8 performance and 4 efficiency)
Memory: 32 GB

Who can help?

@hwchase17 @agola11

Information

  • The official example notebooks/scripts
  • My own modified scripts

Related Components

  • LLMs/Chat Models
  • Embedding Models
  • Prompts / Prompt Templates / Prompt Selectors
  • Output Parsers
  • Document Loaders
  • Vector Stores / Retrievers
  • Memory
  • Agents / Agent Executors
  • Tools / Toolkits
  • Chains
  • Callbacks/Tracing
  • Async

Reproduction

Run the script below with python3 script.py.

script.py:

import os

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import ElasticVectorSearch


def main():
    text_path = "some-test.txt"
    loader = TextLoader(text_path)
    data = loader.load()

    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=1000, chunk_overlap=0
    ) # I have also tried various chunk sizes, but still have the same error

    documents = text_splitter.split_documents(data)

    api_key = "..."
    embeddings = OpenAIEmbeddings(openai_api_key=api_key)

    os.environ["ELASTICSEARCH_URL"] = "..."
    db = ElasticVectorSearch.from_documents(
        documents,
        embeddings,
        index_name="laurad-test",
    )
    print(db.client.info())

    db = ElasticVectorSearch(
        index_name="laurad-test",
        embedding=embeddings,
        elasticsearch_url="..."
    )

    qa = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(temperature=0, openai_api_key=api_key),
        chain_type="stuff",
        retriever=db.as_retriever(),
    )


if __name__ == "__main__":
    main()

Error traceback:

Traceback (most recent call last):
  File "/Users/laurad/dev/LLM/public_test.py", line 46, in <module>
    main()
  File "/Users/laurad/dev/LLM/public_test.py", line 41, in main
    retriever=db.as_retriever(),
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 458, in as_retriever
    tags.extend(self.__get_retriever_tags())
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/vectorstores/base.py", line 452, in __get_retriever_tags
    if self.embeddings:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/vectorstores/elastic_vector_search.py", line 158, in embeddings
    return self.embeddings
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/vectorstores/elastic_vector_search.py", line 158, in embeddings
    return self.embeddings
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/langchain/vectorstores/elastic_vector_search.py", line 158, in embeddings
    return self.embeddings
  [Previous line repeated 993 more times]
RecursionError: maximum recursion depth exceeded
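The loop in the traceback comes from a property that returns itself: each access to embeddings re-enters the same getter until Python's recursion limit is hit. A minimal, hypothetical sketch of that pattern and its fix (these classes are illustrative stand-ins, not the actual LangChain source):

```python
class VectorStoreBug:
    def __init__(self, embedding):
        self.embedding = embedding

    @property
    def embeddings(self):
        # Bug: the property refers to itself, so every access
        # re-enters this getter until the recursion limit is hit.
        return self.embeddings


class VectorStoreFixed:
    def __init__(self, embedding):
        self.embedding = embedding

    @property
    def embeddings(self):
        # Fix: return the underlying attribute, not the property.
        return self.embedding


try:
    VectorStoreBug("fake-embedder").embeddings
except RecursionError:
    print("RecursionError, as in the traceback above")

print(VectorStoreFixed("fake-embedder").embeddings)  # fake-embedder
```

This is the same shape as the traceback: line 158 of elastic_vector_search.py returns self.embeddings from inside the embeddings property itself.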

Expected behavior

I expect to be able to ask questions and receive answers using the langchain.chains.retrieval_qa.base.RetrievalQA class. Instead, I get a RecursionError while creating the retrieval chain.

Note: I do not get the same error when using ChromaDB or OpenSearch as the retriever.


dosubot bot commented Aug 6, 2023

Answer generated by a 🤖

Answer

Hello,

Thank you for providing detailed information about the issue you're experiencing. It appears that you're encountering a RecursionError when using the ElasticVectorSearch as the retriever in your RetrievalQA chain.

Based on the context provided, there was a similar issue reported in the LangChain repository: ElasticVectorSearch().as_retriever() throwing RecursionError: maximum recursion depth exceeded. This issue was resolved with a commit on July 26, 2023.

The commit message suggests that the RecursionError was due to an incorrect field reference in the embeddings() method of the ElasticVectorSearch class. However, the exact changes made in the commit are not provided in the context.

You mentioned that you're using LangChain version 0.0.240. The fix landed in a release after this version, and the latest version at the time of writing is 0.0.254, so the issue you're experiencing has likely already been fixed.

I recommend upgrading LangChain to the latest version, 0.0.254, to see whether the issue persists. You can do this by running the following command in your terminal:

pip install --upgrade langchain
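After upgrading, you can confirm the installed version from Python. A minimal sketch (version_at_least is a helper written here, not a LangChain API; plain numeric comparison is enough for LangChain's 0.0.x scheme but is not full PEP 440 handling):

```python
from importlib.metadata import PackageNotFoundError, version


def version_at_least(installed: str, minimum: str) -> bool:
    # Compare dotted version strings numerically, e.g. "0.0.254" >= "0.0.241".
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(installed) >= parse(minimum)


try:
    installed = version("langchain")
    print(installed, "contains the fix:", version_at_least(installed, "0.0.241"))
except PackageNotFoundError:
    print("langchain is not installed")
```

The 0.0.241 threshold here is an assumption based on the fix being committed shortly after 0.0.240; check the release notes for the exact release that includes it.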

After upgrading, please try running your script again. If the issue still persists, please provide any additional error messages or unexpected behavior you encounter.

I hope this helps! Let me know if you have any other questions or need further assistance.


This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.


lauradang (Author) commented

Resolved! Thank you bot :)
