[Self-Host] /v1/crawl/ endpoint error when no supabase #1127

Open
daniel5gh opened this issue Feb 3, 2025 · 3 comments


daniel5gh commented Feb 3, 2025

Describe the Issue
I have a docker compose setup as described in SELF_HOST.md. I have not configured supabase or authentication.

When I hit the crawl status endpoint I see the following error:

error [:]: Error occurred in request! (/v1/crawl/6ebe383b-742b-4098-9c41-a1213cdce1df) -- ID 35396e67f7b54070855f1eec0ac31994 -- {"message":"Supabase client is not configured.","name":"Error","stack":"Error: Supabase client is not configured.\n    at Proxy.<anonymous> (/app/dist/src/services/supabase.js:41:23)\n    at crawlStatusController (/app/dist/src/controllers/v1/crawl-status.js:154:14)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)"} {}
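The `Proxy.<anonymous>` frame in the stack suggests the exported service is a Proxy that throws on any method call when no client was created. Below is a minimal sketch of that pattern, not the actual contents of supabase.js; `ClientLike` and `makeService` are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for the small part of the Supabase client used here.
interface ClientLike {
  from(table: string): string;
}

// Assumption: the service object is a Proxy that hands back a throwing
// function for every property when no client was created at startup.
function makeService(client: ClientLike | null): ClientLike {
  return new Proxy({} as ClientLike, {
    get(_target, prop) {
      if (client === null) {
        return () => {
          throw new Error("Supabase client is not configured.");
        };
      }
      const value = client[prop as keyof ClientLike];
      return typeof value === "function" ? value.bind(client) : value;
    },
  });
}

// No SUPABASE_URL means no client, so the first method call throws.
const unconfigured = makeService(null);
let caughtMessage = "";
try {
  unconfigured.from("firecrawl_jobs");
} catch (err) {
  caughtMessage = (err as Error).message;
}
console.log(caughtMessage);
```

If this is roughly how the service behaves, any controller that calls `supabase_service.from(...)` without first checking the configuration will fail the same way.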

To Reproduce
Steps to reproduce the issue:

  1. Leave SUPABASE_URL unset in the environment (or set it to an empty value, as I did)
  2. Start the stack with docker compose up
  3. POST to http://localhost:3002/v1/crawl with a valid job
  4. GET status endpoint with ID from previous step: http://localhost:3002/v1/crawl/6ebe383b-742b-4098-9c41-a1213cdce1df
  5. Observe the error on the api service container: see description above
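Steps 3 and 4 can be scripted roughly as follows. This is a sketch under assumptions: the request body (`{ url: ... }`) and the `id` field in the response are my reading of the v1 API, and `FetchLike` and `reproduce` are hypothetical helper names.

```typescript
// Minimal fetch-compatible signature so the sketch can run against a fake.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Starts a crawl, then requests its status; returns the status endpoint's HTTP code.
async function reproduce(baseUrl: string, fetchImpl: FetchLike): Promise<number> {
  const created = await fetchImpl(`${baseUrl}/v1/crawl`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Assumed request shape: a single start URL.
    body: JSON.stringify({ url: "https://example.com" }),
  });
  // Assumed response shape: the crawl ID is returned as `id`.
  const { id } = (await created.json()) as { id: string };

  const statusResponse = await fetchImpl(`${baseUrl}/v1/crawl/${id}`);
  return statusResponse.status;
}
```

Against a running stack, passing the global `fetch` as `fetchImpl` with `http://localhost:3002` as `baseUrl` should exercise the same path as the manual steps.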

Expected Behavior
I expect a status 200 and a crawl result.

Environment (please complete the following information):

  • OS: docker on WSL
  • Firecrawl Version: 5894076
  • Node.js Version: node:23-slim
  • Docker Version: 27.4.0
  • Database Type and Version: n/a

Logs

error [:]: Error occurred in request! (/v1/crawl/6ebe383b-742b-4098-9c41-a1213cdce1df) -- ID 35396e67f7b54070855f1eec0ac31994 -- {"message":"Supabase client is not configured.","name":"Error","stack":"Error: Supabase client is not configured.\n    at Proxy.<anonymous> (/app/dist/src/services/supabase.js:41:23)\n    at crawlStatusController (/app/dist/src/controllers/v1/crawl-status.js:154:14)\n    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)"} {}

Additional Context
I solved the issue with the following check. It follows what other endpoints already do (crawlCancelController, for example), which seems like the right approach when Supabase is not configured:

  // Only query Supabase when DB authentication is enabled; otherwise the client is not configured.
  const useDbAuthentication = process.env.USE_DB_AUTHENTICATION === "true";

  if (useDbAuthentication && totalCount === 0) {
    const x = await supabase_service
      .from('firecrawl_jobs')
      .select('*', { count: 'exact', head: true })
      .eq("crawl_id", req.params.jobId)
      .eq("success", true);

    totalCount = x.count ?? 0;
  }

in place of the current check:

  if (totalCount === 0) {
    const x = await supabase_service
      .from('firecrawl_jobs')
      .select('*', { count: 'exact', head: true })
      .eq("crawl_id", req.params.jobId)
      .eq("success", true)
    totalCount = x.count ?? 0;
  }
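To show the effect of the guard in isolation, here is a small self-contained sketch. `resolveTotalCount` and `CountQuery` are hypothetical names, and the database query is passed in as a function so the example runs without Supabase.

```typescript
// Hypothetical signature standing in for the Supabase count query.
type CountQuery = (crawlId: string) => Promise<number | null>;

// Falls back to the database count only when DB authentication is enabled
// and the queue reported no results.
async function resolveTotalCount(
  queueCount: number,
  crawlId: string,
  useDbAuthentication: boolean,
  countFromDb: CountQuery,
): Promise<number> {
  if (useDbAuthentication && queueCount === 0) {
    return (await countFromDb(crawlId)) ?? 0;
  }
  return queueCount;
}
```

With `useDbAuthentication` set to `false`, `countFromDb` is never called, so a query function that throws "Supabase client is not configured." can no longer break the status endpoint.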

rothnic commented Feb 5, 2025

Just came across this same issue in relation to the /extract endpoint and was curious what the plan is.

  • Currently, disabling DB authentication also means nothing is stored in Supabase, so the two are coupled
  • Should Firecrawl be modified to support a single-user mode for Supabase, using a self-hosted or free instance?
    • Supabase would need to be initialized, if that isn't handled already
    • The Supabase client would need to be created even when DB auth is disabled, so data can still be stored
    • Or can we create a Supabase user manually for authentication in self-hosted mode?
  • Or do we assume that when DB authentication is disabled, we should just return the results from the Redis Bull queue? (easiest option)

I found that I was able to pull the extract response from redis, rather than supabase, within extract-status.ts. I'm sure we could do the same for crawl. I went ahead and created a draft pull request with this change to review. See the changes to extract-status.ts.
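As a rough illustration of the "easiest option" (this is not the code from the draft pull request), the status lookup could pick its source based on the DB authentication flag. `ResultStore`, `MemoryStore`, and `loadResult` are hypothetical names; the in-memory store merely stands in for the Redis-backed queue.

```typescript
// Hypothetical minimal interface over wherever finished job results live.
interface ResultStore<T> {
  get(jobId: string): Promise<T | null>;
}

// In-memory stand-in for the Redis-backed queue, for illustration only.
class MemoryStore<T> implements ResultStore<T> {
  private readonly items = new Map<string, T>();

  set(jobId: string, value: T): void {
    this.items.set(jobId, value);
  }

  async get(jobId: string): Promise<T | null> {
    return this.items.get(jobId) ?? null;
  }
}

// With DB authentication off, the database store is never touched.
async function loadResult<T>(
  jobId: string,
  useDbAuthentication: boolean,
  queueStore: ResultStore<T>,
  dbStore: ResultStore<T>,
): Promise<T | null> {
  const store = useDbAuthentication ? dbStore : queueStore;
  return store.get(jobId);
}
```

The trade-off is that results read from the queue only live as long as the queue retains them, whereas the Supabase path keeps them until they are deleted.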

@user72356

I have the same issue.

rothnic commented Feb 7, 2025

I made some updates to the pull request, and it should be about good to go. You can try my branch in the meantime if you want.
