-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[QST] Adding new parameter to Conv2dFprop in Python #2166
Comments
Thanks for the detailed messages. I think you'll need to modify the cutlass/python/cutlass/backend/c_types.py Lines 397 to 412 in 06e560d
|
@jackkosaian I modified the part of the code you told me in cutlass/python/cutlass/backend/c_types.py Lines 397 to 412 in 06e560d
However I still got the same error. I believe this is because the pointers ptr_mask0s, ptr_maks1s and the flag are not correctly initialised in the host code, for instance, as I did in Host Tensot code:
I think the solution is to create two tensors and a flag in the CPU host code in Python, initialise them and then sync() with the GPU memory. To do so, I decided to modify some parts of the code from different files: In argumments.py:
In ctypes.py:
And last, in conv.py:
I executed the code again with all of these modifications, and now I received another different error related to verify_tensor() function:
Could you please tell me how to proceed and which parts of the Python host code I should modify that I have not mentioned yet? Is every step I have shown correct to achieve my purpose? |
@jackkosaian, sorry, but do you have any update about it, please? I have a little urgency :( |
It's a bit hard for me to tell what differences were made given the pastes of large chunks of code alone. Do you have a fork of CUTLASS containing the changes you've made that you could link here? That would make it easier to help. |
@jackkosaian Sorry, I didn't fork CUTLASS. I am modifying the code locally. I can submit though the four files I modify (conv.py, conv2d_operation.py, arguments.py and ctypes.py) or the diff file between my files and the original ones if this serves to clarify things. |
Yes, seeing the diff of your whole repo (both C++ and Python changes) would be helpful. |
@jackkosaian I paste here a ZIP file including the diff files created from the following files: c++ files (current modifications working): Host Tensot , Tensor_ref , conv kernel and example 16. python files: conv.py, conv2d_operation.py, ctypes.py and argumments.py. Let me know your thoughts when you check the code. |
Thanks. What is the Python code that you are running that calls into your modified CUTLASS Python/C++? |
I got the same error you are mentioning when I ran the following code: plan = cutlass.Conv2dFprop(element=dtype, element_accumulator=torch.float32)
plan.run(input, weight, tensor_C, output, stride, padding, dilation, alpha, beta, print_module=print_module) The issue is that the ptr_mask and device_flag variables were not being passed to plan.run(). Are you calling plan.run() like what follows? plan = cutlass.Conv2dFprop(element=dtype, element_accumulator=torch.float32)
plan.run(input, weight, tensor_C, output, mask0, mask1, device_flag, stride, padding, dilation, alpha, beta, print_module=print_module) The Python issue went away after I changed the code to this. There is a compilation issue after that, which seems to be related to logic added in your modifications. |
@jackkosaian, thank you for your help, this is my host Python code:
As you said, I did not add "masks, masks and device_flag" parameters. However, my goal was to avoid adding this parameter in the user code here. Can these parameters be auto-initialized or similar from the inside conv.py file or run() function? It would be better for what I intend to achieve if the user continues to call the constructor function |
Yes, you should be able to do this. You will need to remove these arguments from the constructor and You could then use |
@jackkosaian, which part inside run() is the best for initializing these parameters, and which is the proper way, in your opinion, to do it? I mean, before cutlass/python/cutlass/op/conv.py Line 852 in 06e560d
Because inside Conv2dArguments, there is already get_arguments() function which performs |
Anywhere above there should be fine, but it might make most sense to put it here, where we get the tensors for A, B, etc. |
What is your question?
Hi, I have implemented a modification to the conv2d kernel in Cutlass. I have added three new attributes (two tensors, mask0s and mask1s, with the same datatype as the filters) and a flag (integer) that can be used in the convolutional kernel:
cutlass/include/cutlass/conv/kernel/implicit_gemm_convolution.h
Line 281 in 24f991e
To do so, I modified the cutlass code into three parts:
I have shown only the most critical functions globally, but inside the class, I modified constructors and reset methods to allocate memory from the host and later with copy_to_devidce() in the GPU.
All of this code is currently working, and it does ok without problems or memory errors while executing example 16.
My question now is how to adapt the host code to Python. The device code works well and takes the same arguments as before; the only difference is that now the tensor_ref class can have three more attributes. However, when defining the convolution on python there is a part in which it is specified the arguments in
cutlass/python/cutlass/backend/conv2d_operation.py
Line 214 in 375e284
I tried to modify this part of the code by adding two ElementB pointers and an integer and using the new constructor of tenser_ref, which I have previously added:
But of course, it gives an error of cuda illegal address when executing sync() to the parameters because I have added two pointers and an integer that has not correctly copied to GPU memory:
So, the question is, where do I need to add these pointers and the integer to properly initialize them on the host CPU memory and copy them to the GPU??
In the example 16 it is used host_tensor class, but in python, there is not such a class, so I am a little bit lost on this.
Any feedback would be appreciated.
Thanks.
The text was updated successfully, but these errors were encountered: