Skip to content

Exception of ngx.print and ngx.flush in access phase under boundary conditions #2410

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
bzp2010 opened this issue Apr 22, 2025 · 0 comments · May be fixed by #2411
Open

Exception of ngx.print and ngx.flush in access phase under boundary conditions #2410

bzp2010 opened this issue Apr 22, 2025 · 0 comments · May be fixed by #2411

Comments

@bzp2010
Copy link

bzp2010 commented Apr 22, 2025

Current behavior

On 1.27.1.1 (I think this can be reproduced on the historical version as well), the following error occurs when sending a streaming response using ngx.print and ngx.flush in the access phase:

2025/04/21 10:16:36 [error] 2735#2735: *1 lua entry thread aborted: runtime error: access_by_lua(nginx.conf:18):3: API disabled in the context of body_filter_by_lua*
stack traceback:
coroutine 0:
	[C]: in function 'print'
	access_by_lua(nginx.conf:18):3: in main chunk, client: 127.0.0.1, server: , request: "GET / HTTP/1.1", host: "127.0.0.1:9080"

Expected behavior

There should be no error.

Steps to reproduce

  1. Start openresty with the following nginx.conf
master_process on;
worker_processes 1;
error_log logs/error.log warn;
worker_rlimit_nofile 20480;
events {
    accept_mutex off;
    worker_connections 10620;
}
worker_rlimit_core  16G;
worker_shutdown_timeout 240s;

http {
    server {
        http2 on;
        listen 0.0.0.0:9080 default_server;

        location / {
            access_by_lua_block {
                while true do
                    ngx.print(("hello world"):rep(100000))
                    ngx.flush(true)
                end
            }

            body_filter_by_lua_block {
                ngx.log(ngx.DEBUG, "body_filter_by_lua_block")
            }
        }
    }
}
  1. Access 127.0.0.1:9080

It will spit out some response, after which the connection is closed, and looking at error.log, you can find the error mentioned above.

runtime error: access_by_lua(nginx.conf:18):3: API disabled in the context of body_filter_by_lua

Debug

I use GDB for debugging.

This error is triggered by the yieldable phase checking code in ngx.print or ngx.flush, so the error is due to the phase checking code incorrectly assuming that it is in the body_filter phase instead of the access where it should be.

It uses ctx->context on the lua module ctx to make the determination.

Checking the code in body_filter (https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_bodyfilterby.c#L327-L328), it does set it to body_filter before the Lua code enters that phase, and reverts to entering the phase it is in after the code logic is executed (https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_bodyfilterby.c#L339). Since NGINX is essentially single-threaded, this works and the state is expected to be restored correctly.

I see that there is another branch regarding the input in, where the state is not restored when in is NULL. So I'm setting a breakpoint here to check why the input value is NULL or not and why this is the case.

I found out that in some specific cases it does have in as NULL calls, so the status can't be restored.

Checking the call stack:

ngx_http_lua_wev_handler => https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L1764

ngx_http_lua_flush_pending_output => https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L1878

  if (ctx->busy_bufs) => ngx_http_lua_output_filter(r, NULL) => https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L1898

ngx_http_lua_body_filter(r, NULL) => https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_util.c#L722

This is triggered by ngx_http_lua_wev_handler and causes an output_filter (body_filter) call with an in of NULL to be made because ctx->busy_bufs is not empty in ngx_http_lua_flush_pending_output, which will result in an incorrectly recovered state in ngx_http_lua_body_filter.

Since the call was initiated by ngx_http_lua_wev_handler, I suspected that this condition would occur when sending a large responses with a slow downstream resulting in data accumulating in the kernel TCP buffer or inside NGINX, so I constructed the above mentioned nginx.conf to verify this, and outputting some large data chunks during the access phase does indeed reproduce the issue.

Fix

As a fix, I think that even with the ngx_http_lua_body_filter call triggered by the fd's epoll writable event, the context should be restored rather than retaining a broken context.

i.e. add ctx->context = old_context at https://github.com/openresty/lua-nginx-module/blob/master/src/ngx_http_lua_bodyfilterby.c#L370-L372

I'll launch a PR to fix it. #2411

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant