Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Memory issues when using the solver for multiple registration attempts #30

Open
fdila opened this issue Feb 27, 2025 · 26 comments
Open
Labels
bug Something isn't working

Comments

@fdila
Copy link
Contributor

fdila commented Feb 27, 2025

Hi!
first of all: I don't know if this is a bug or if I'm not doing something properly.

I need to use the KISS matcher to align multiple pointclouds, like in a loop, but after a couple hundreds of alignments I get the error

double free or corruption (!prev)
Aborted (core dumped)

This is a snippet to reproduce my error, I took the examples/run_kiss_matcher.py script and instead of solving the registration one time I do it in a loop for a thousand times. After a non-deterministic amout of iterations I get the error copied above.
Snippet:

    params = kiss_matcher.KISSMatcherConfig(args.resolution)
    matcher = kiss_matcher.KISSMatcher(params)
    for i in range(1000):     
        result = matcher.estimate(src, tgt)
        matcher.clear()
        matcher.reset()
        matcher.reset_solver()
        print(i)

As show above, I'm clearing and resetting the matcher object, but maybe it is not enough to completely clear the internal state?

I tried to do the same thing in cpp and apparently there is no memory bug there, so I guess it has something to do with the pybiding. I tried to search in the pybinding repo issues and there are other people complaining about similar problems, but I haven't been able to pinpoint what exactly is making the python script crash.

@fdila fdila added the bug Something isn't working label Feb 27, 2025
@LimHyungTae
Copy link
Member

Oh thanks for reporting!

Hmm...that looks off, but occasionally, I also saw that error. Could you run the code like this?

    params = kiss_matcher.KISSMatcherConfig(args.resolution)
   
    for i in range(1000):     
        matcher = kiss_matcher.KISSMatcher(params)
        result = matcher.estimate(src, tgt)
        matcher.clear()
        matcher.reset()
        matcher.reset_solver()
        print(i)

@LimHyungTae
Copy link
Member

In the case of Python, at which iteration does the error occur?

@LimHyungTae
Copy link
Member

Hmmm...bro I tested python by running KISS-Matcher multiple time; but there are no memory issues. Could you check your data carefully? Because you acquired very noisy data, right? Or if it is possible, you can share your python script and data with me.

@fdila fdila changed the title [BUG] Memory issues with pybindings when using the solver for multiple registration attempts [BUG] Memory issues when using the solver for multiple registration attempts Feb 28, 2025
@fdila
Copy link
Contributor Author

fdila commented Feb 28, 2025

My bad, apparently it's happening also with the "pure" cpp code, at least there is one less layer (pybinding) to worry about.
Attached there is the cpp code.
I'm using the test data provided: ./run_kiss_matcher data/Vel16/src.pcd data/Vel16/tgt.pcd 0.3
It happens pretty randomly, it happened at the 52 iteration, at the 218, at 357, at 673, sometimes it does not happen at all.
I must say I am pretty confused about the randomness of this bug.

Just in case here is a video proving I'm not going mad: https://drive.google.com/file/d/1V_3Un0wdeE0gZWq2ECEtALP-G2xTMjde/view?usp=sharing

run_kiss_matcher.txt

@LimHyungTae
Copy link
Member

LimHyungTae commented Feb 28, 2025

Due to ICCV deadline, currently, I have no bandwidh—could you double-check which function causes the memory issue (using GDB or just putting std::cout line-by-line)?

@fdila
Copy link
Contributor Author

fdila commented Mar 3, 2025

No problem, thanks for the code and help!

It is crashing due something in this parellel for loop in the FasterPFH.cpp according to my debugging:

tbb::parallel_for(tbb::blocked_range<size_t>(0, N), [&](const tbb::blocked_range<size_t> &r) {
    //    tbb::parallel_for(0, N, [&](const int& j) {
    for (size_t j = r.begin(); j != r.end(); ++j) {
      const int p_idx = fpfh_indices_[j];
      std::vector<uint32_t> nn_indices;
      std::vector<double> nn_dists;
      nn_indices.reserve(corrs_fpfh_[p_idx].neighboring_indices.size());
      nn_dists.reserve(corrs_fpfh_[p_idx].neighboring_dists.size());
      //      nn_cardinalities.resize(p_voxel.valid_neighboring_voxels.size());

      const auto &indices = corrs_fpfh_[p_idx].neighboring_indices;
      const auto &dists   = corrs_fpfh_[p_idx].neighboring_dists;

      for (size_t i = 0; i < indices.size(); ++i) {
        if (is_valid_[indices[i]]) {
          nn_indices.emplace_back(spfh_hist_lookup_[indices[i]]);
          nn_dists.emplace_back(dists[i]);
        }
      }

      points[j] = points_[p_idx];
      WeightPointSPFHSignature(hist_f1_, hist_f2_, hist_f3_, nn_indices, nn_dists, descriptors[j]);
    }
  });

I think there is some unlucky race condition that randomly comes up if you have a lot of CPU cores: I am having issues when using a workstation with a 24-cores i9-13900K, but could not reproduce the issue on my laptop with a 10-cores i7-1355U.

After removing the parallel_for and using a simple for loop it looks like the memory issues are fixed.
Obviously the time for feature extraction went up, for scan to scan registration it went from around 0.005 seconds to around 0.015 seconds, but for my use case it's still fast enough.

I'm not really an expert on fixing race conditions, and removing the parallelism, even if not ideal, is a good enough solution for me, so I don't think I will be investigating further this issue.

@LimHyungTae
Copy link
Member

Thanks for your help! My desktop is also an AMD Ryzen™ Threadripper™ 7960X 24-Core, 48-Thread Processor, but the error has not occurred. Let me test it in the future (please just wait for one week!).

@ewak
Copy link
Contributor

ewak commented Mar 4, 2025

@fdila Maybe you could give my PR a try if you like. Assuming you have a ROS2 installation you could try some of what is below.

source /opt/ros/humble/setup.bash
cd $HOME
mkdir -p kiss_ws/data
# Put data you want to test with in here as approprate - ie Vel16/src.pcd and Vel16/tgt.pcd

mkdir -p kiss_ws/src

cd kiss_ws/src
git clone https://github.com/MIT-SPARK/KISS-Matcher.git
git remote add ewak git@github.com:ewak/KISS-Matcher.git
git fetch ewak
git checkout feature/mw/build_and_warnings
cd $HOME/kiss_ws

#Build 
colcon build --packages-select kiss_matcher_ros 
 
#Run with some yaw=6deg, roll=20deg. stress iterations =100
./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher  ./data/Vel16/tgt.pcd ./data/Vel16/src.pcd  1.0 6.0 20.0 1000
#To build with address sanitizer, blow away your build and install dirs
cd $HOME/kiss_ws
rm -fr ./build ./install
colcon build --packages-select kiss_matcher_ros --cmake-args -DCMAKE_BUILD_TYPE=RelWithDebInfo -DUSE_ADDRESS_SANITIZER=On
./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher  ./data/Vel16/tgt.pcd ./data/Vel16/src.pcd  1.0 6.0 20.0 1000
#To build with Debug and run with valgrind and get it to stop in gdb on first error. (note AddressSanitzer and valgrind are incompatible and don't mix)
cd $HOME/kiss_ws
rm -fr ./build ./install
colcon build --packages-select kiss_matcher_ros --cmake-args -DCMAKE_BUILD_TYPE=Debug

valgrind   --vgdb-error=1  ./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher  ./data/Vel16/tgt.pcd ./data/Vel16/src.pcd  1.0 6.0 20.0 1000

# In another terminal once an error has been hit.
gdb -ex 'target remote | /usr/bin/vgdb'

# Get a backtrace and share it here.
bt

# Kill off remote monitor - If you just quit gdb its hard to stop it, you will have to resort to `ps aux | grep valgrind` to find its pid and kill it.    
info inferiors
kill inferiors
OR just k for short.

# Quit gdb
q

@LimHyungTae
Copy link
Member

Thanks, @ewak ! Could you follow the instructions above, @fdila ?

@fdila
Copy link
Contributor Author

fdila commented Mar 10, 2025

Hi, I'll try it tomorrow thanks!

@fdila
Copy link
Contributor Author

fdila commented Mar 11, 2025

The execution with valgrind is taking quite a while, in the meantime I'm attaching the output of running with address sanitizer, which to me seems to confirm it could be a concurrency issue.

It seems like a really niche edge case, I was only able to reproduce with resolution 0.3 and no added yaw and roll angles.

address_sanitizer.txt

@ewak
Copy link
Contributor

ewak commented Mar 11, 2025

Can you show the exact command you used?

@ewak
Copy link
Contributor

ewak commented Mar 11, 2025

Nevermind I just reproduced it(Address sanitzer failure) on iteration 38, 159, 162 with
./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher ./data/Vel16/tgt.pcd ./data/Vel16/src.pcd 0.3 0.0 0.0 1000

@ewak
Copy link
Contributor

ewak commented Mar 11, 2025

Thread sanitizer shows plenty of problems

Here is the first

==================
WARNING: ThreadSanitizer: data race (pid=1948891)
  Read of size 8 at 0x7f5e71b13d88 by main thread:
    #0 bool tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::check_being_stolen<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::execution_data const&) /usr/include/oneapi/tbb/partitioner.h:421 (run_kiss_matcher+0x36fa8)
    #1 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) /usr/include/oneapi/tbb/parallel_for.h:171 (run_kiss_matcher+0x36fa8)
    #2 <null> <null> (libtbb.so.12+0x286c0)
    #3 operator() /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:50 (run_kiss_matcher+0x2e5fd)
    #4 kiss_matcher::KISSMatcher::match(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:57 (run_kiss_matcher+0x2e5fd)
    #5 kiss_matcher::KISSMatcher::estimate(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:119 (run_kiss_matcher+0x2f6f2)
    #6 main /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/ros/src/run_kiss_matcher.cpp:120 (run_kiss_matcher+0x15ab9)

  Previous write of size 8 at 0x7f5e71b13d88 by thread T1:
    #0 tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type>::adaptive_mode(tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type>&, tbb::detail::d0::split) /usr/include/oneapi/tbb/partitioner.h:303 (run_kiss_matcher+0x371fe)
    #1 tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::dynamic_grainsize_mode(tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >&, tbb::detail::d0::split) /usr/include/oneapi/tbb/partitioner.h:414 (run_kiss_matcher+0x371fe)
    #2 tbb::detail::d1::auto_partition_type::auto_partition_type(tbb::detail::d1::auto_partition_type&, tbb::detail::d0::split) /usr/include/oneapi/tbb/partitioner.h:494 (run_kiss_matcher+0x371fe)
    #3 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>::start_for(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d0::split&, tbb::detail::d1::small_object_allocator&) /usr/include/oneapi/tbb/parallel_for.h:89 (run_kiss_matcher+0x371fe)
    #4 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>* tbb::detail::d1::small_object_allocator::new_object<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d0::split&, tbb::detail::d1::small_object_allocator&>(tbb::detail::d1::execution_data&, tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d0::split&, tbb::detail::d1::small_object_allocator&) /usr/include/oneapi/tbb/detail/_small_object_pool.h:55 (run_kiss_matcher+0x371fe)
    #5 void tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>::offer_work_impl<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d0::split&>(tbb::detail::d1::execution_data&, tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d0::split&) /usr/include/oneapi/tbb/parallel_for.h:137 (run_kiss_matcher+0x371fe)
    #6 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>::offer_work(tbb::detail::d0::split&, tbb::detail::d1::execution_data&) /usr/include/oneapi/tbb/parallel_for.h:124 (run_kiss_matcher+0x371fe)
    #7 void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) /usr/include/oneapi/tbb/partitioner.h:284 (run_kiss_matcher+0x371fe)
    #8 tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>::execute(tbb::detail::d1::execution_data&) /usr/include/oneapi/tbb/parallel_for.h:172 (run_kiss_matcher+0x371fe)
    #9 <null> <null> (libtbb.so.12+0x20b3b)

  Thread T1 (tid=1948894, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x605b8)
    #1 <null> <null> (libtbb.so.12+0x2243b)
    #2 <null> <null> (libtbb.so.12+0x286c0)
    #3 operator() /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:50 (run_kiss_matcher+0x2e5fd)
    #4 kiss_matcher::KISSMatcher::match(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:57 (run_kiss_matcher+0x2e5fd)
    #5 kiss_matcher::KISSMatcher::estimate(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/KISSMatcher.cpp:119 (run_kiss_matcher+0x2f6f2)
    #6 main /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/ros/src/run_kiss_matcher.cpp:120 (run_kiss_matcher+0x15ab9)

SUMMARY: ThreadSanitizer: data race /usr/include/oneapi/tbb/partitioner.h:421 in bool tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::check_being_stolen<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, kiss_matcher::VoxelgridSampling(std::vector<Eigen::Matrix<float, 3, 1, 0, 3, 1>, std::allocator<Eigen::Matrix<float, 3, 1, 0, 3, 1> > > const&, double)::{lambda(tbb::detail::d1::blocked_range<unsigned long> const&)#1}, tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::execution_data const&)

@ewak
Copy link
Contributor

ewak commented Mar 11, 2025

diff --git a/cpp/kiss_matcher/CMakeLists.txt b/cpp/kiss_matcher/CMakeLists.txt
index fe45173..87f26c6 100644
--- a/cpp/kiss_matcher/CMakeLists.txt
+++ b/cpp/kiss_matcher/CMakeLists.txt
@@ -6,6 +6,7 @@ option(USE_SYSTEM_EIGEN3 "Use system pre-installed Eigen" ON)
 option(USE_SYSTEM_TBB "Use system pre-installed oneAPI/tbb" ON)
 option(USE_SYSTEM_ROBIN "Use system pre-installed ROBIN from SPARK @ MIT" ON)
 option(USE_ADDRESS_SANITIZER "Use address sanitizer" OFF)
+option(USE_THREAD_SANITIZER "Use thread sanitizer" ON)
 
 # Set a default build type if none is specified
 if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
@@ -22,6 +23,10 @@ if (USE_ADDRESS_SANITIZER)
     add_compile_options(-fsanitize=address -fsanitize-recover=address)
     add_link_options(-fsanitize=address -fsanitize-recover=address)
 endif()
+if (USE_THREAD_SANITIZER)
+    add_compile_options(-fsanitize=thread)
+    add_link_options(-fsanitize=thread)
+endif()
 
 include(GNUInstallDirs)
 include(3rdparty/find_dependencies.cmake)
diff --git a/ros/CMakeLists.txt b/ros/CMakeLists.txt
index bd9b655..8fc13b5 100644
--- a/ros/CMakeLists.txt
+++ b/ros/CMakeLists.txt
@@ -19,6 +19,11 @@ if (USE_ADDRESS_SANITIZER)
     add_compile_options(-fsanitize=address -fsanitize-recover=address)
     add_link_options(-fsanitize=address -fsanitize-recover=address)
 endif()
+option(USE_THREAD_SANITIZER "Use thread sanitizer" ON)
+if (USE_THREAD_SANITIZER)
+    add_compile_options(-fsanitize=thread)
+    add_link_options(-fsanitize=thread)
+endif()
 
 if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/../cpp/kiss_matcher/ AND NOT IS_SYMLINK ${CMAKE_CURRENT_SOURCE_DIR})
   add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/../cpp/kiss_matcher ${CMAKE_CURRENT_BINARY_DIR}/kiss_matcher)

@mattalvarado
Copy link

Haven't dug into this at all, but I have seen similar failures when a dependency is compiled statically with one version of Eigen, and the runtime code is using another version of Eigen. kiss_matcher_core seems to be compiled as a static library , and it might be worth seeing if making that shared has any impact on the double free error

@ewak
Copy link
Contributor

ewak commented Mar 12, 2025

Ok I fixed the build up and made it use its own eigen and a newer tbb and changed the build to use shared libraries. It now requires CMake 3.24+ which I hope is ok.

I think the actual problem is related to growth and rehashing of the robin_map spfh_hist_lookup_.

Reserving space before populating it seems to help.

A ddressSanitizer: heap-use-after-free on address 0x6330001209c4 at pc 0x7fc91afb42a7 bp 0x7fc9067f4880 sp 0x7fc9067f4870                                                                                                                                                                                    
READ of size 4 at 0x6330001209c4 thread T6                                                                                                                                                                                                                                                                                    
    #0 0x7fc91afb42a6 in unsigned long tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::K
eySelect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int
, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::hash_key<unsigned int>(unsigned int const&) const /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:1085                                                                                          
    #1 0x7fc91afb42a6 in tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl:
:robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int
> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::rehash_impl(unsigned long) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:1286
    #2 0x7fc91af5d227 in tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl:
:robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int
> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::rehash_on_extreme_load(short) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:1341                                                                                                                            
    #3 0x7fc91af5d227 in tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl:
:robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int
> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::rehash_on_extreme_load(short) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:1336
    #4 0x7fc91af5d227 in std::pair<tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySe
lect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, un
signed int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::robin_iterator<false>, bool> tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int>
 >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal
_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::insert_impl<unsigned int, std::piecewise_construct_t const&, std::tuple<unsigned int const&>, std::tuple<> >(unsigned int const&, std::piecewise_construct_t const&, std::tuple<unsigned int con
st&>&&, std::tuple<>&&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:1187                                                                                                                                                                                          
    #5 0x7fc91af5f063 in std::pair<tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySe
lect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, un
signed int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::robin_iterator<false>, bool> tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsigned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int>
 >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal
_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::try_emplace<unsigned int const&>(unsigned int const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h:777
    #6 0x7fc91af5f063 in tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect::value_type& tsl::detail_robin_hash::robin_hash<std::pair<unsigned int, unsig
ned int>, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::KeySelect, tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned in
t>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::operator[]<unsigned int const&, t
sl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::ValueSelect, (void*)0>(unsigned int const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/co
re/kiss_matcher/tsl/robin_hash.h:946                                           
    #7 0x7fc91af5f063 in tsl::robin_map<unsigned int, unsigned int, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int, unsigned int> >, false, tsl::rh::power_of_two_growth_policy<2ul> >::operator[](unsigned int const&) /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/k
iss_matcher/core/kiss_matcher/tsl/robin_map.h:450                              
    #8 0x7fc91af5f063 in operator() /home/wakem/contrib_github/kiss_ws/src/KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/FasterPFH.cpp:247                   
    #9 0x7fc91af6116f in __invoke_impl<void, const kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>&, tbb::detail::d1::blocked_range<long unsigned int>&> /usr/include/c++/1
1/bits/invoke.h:61                                                             
    #10 0x7fc91af6116f in __invoke<const kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>&, tbb::detail::d1::blocked_range<long unsigned int>&> /usr/include/c++/11/bits/inv
oke.h:96                                                                       
    #11 0x7fc91af6116f in invoke<const kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>&, tbb::detail::d1::blocked_range<long unsigned int>&> /usr/include/c++/11/functional
:97                                                                            
    #12 0x7fc91af6116f in invoke<const kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>&, tbb::detail::d1::blocked_range<long unsigned int>&> /home/wakem/contrib_github/kis
s_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/../../include/tbb/../oneapi/tbb/detail/_utils.h:356                                                          
    #13 0x7fc91af6116f in run_body /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/../../include/tbb/../oneapi/tbb/parallel_for.h:117
    #14 0x7fc91af6116f in work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<long unsigned int>, kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>, const
 tbb::detail::d1::auto_partitioner>, tbb::detail::d1::blocked_range<long unsigned int> > /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/../../include/tbb/../oneapi/tbb/partitioner.h:450
    #15 0x7fc91af6116f in execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<long unsigned int>, kiss_matcher::FasterPFH::ComputeFeature(std::vector<Eigen::Matrix<float, 3, 1> >&, std::vector<Eigen::Matrix<float, -1, 1> >&)::<lambda(const tbb::detail::d1::blocked_range<long unsigned int>&)>, const tbb:
:detail::d1::auto_partitioner>, tbb::detail::d1::blocked_range<long unsigned int> > /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/../../include/tbb/../oneapi/tbb/partitioner.h:289
    #16 0x7fc91af6116f in execute /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/../../include/tbb/../oneapi/tbb/parallel_for.h:170
    #17 0x7fc911bae472 in tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/task_dispatcher.h:33
4                                                                              
    #18 0x7fc911bae472 in tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/task_dispatcher.h:470
    #19 0x7fc911bae472 in tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/arena.cpp:215
    #20 0x7fc911bae472 in tbb::detail::r1::thread_dispatcher_client::process(tbb::detail::r1::thread_data&) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/thread_dispatcher_client.h:41
    #21 0x7fc911bae472 in tbb::detail::r1::thread_dispatcher::process(rml::job&) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/thread_dispatcher.cpp:195
    #22 0x7fc911b80652 in tbb::detail::r1::rml::private_worker::run() /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/private_server.cpp:271
    #23 0x7fc911b80652 in tbb::detail::r1::rml::private_worker::thread_routine(void*) /home/wakem/contrib_github/kiss_ws/build/kiss_matcher_ros/_deps/tbb-src/src/tbb/private_server.cpp:221
    #24 0x7fc90ee94ac2 in start_thread nptl/pthread_create.c:442
    #25 0x7fc90ef2684f  (/lib/x86_64-linux-gnu/libc.so.6+0x12684f)

@ewak
Copy link
Contributor

ewak commented Mar 12, 2025

colcon build --cmake-args -DCMAKE_BUILD_TYPE=RelWithDebInfo -DUSE_ADDRESS_SANITIZER=ON -DUSE_ASAN=ON --packages-select kiss_matcher_ros
./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher  ./data/Vel16/tgt.pcd ./data/Vel16/src.pcd 0.3 0.0 0.0 100000

Has reached stress iteration: 2000 and is still going.

@LimHyungTae
Copy link
Member

Thanks for the update and the improvements! Upgrading to CMake 3.24+ should be fine.

It's good to hear that reserving space before populating spfh_hist_lookup_ helps mitigate the issue. I'll take a closer look at the changes and test them further. Let me know if you notice any unexpected behavior as the iterations increase.

Appreciate your effort!

Does your PR #33 address this issue, right? I'll take a look within two business days.

@ewak
Copy link
Contributor

ewak commented Mar 12, 2025

Yeah PR #33 is the one.

I was just giving it a thrash building release mode, with the sanitizers off and tried using the same point cloud for src and tgt.

colcon build --cmake-args -DCMAKE_BUILD_TYPE=Release -DUSE_ADDRESS_SANITIZER=OFF -DUSE_ASAN=OFF --packages-select kiss_matcher_ros
/usr/bin/time -v ./install/kiss_matcher_ros/lib/kiss_matcher_ros/run_kiss_matcher ./data/Vel16/src.pcd ./data/Vel16/src.pcd 0.3 0.0 0.0 1000

It seems to run slower and was killed with signal 9 after 129 iterations.
Probably a pathological unexpected test case???

sudo dmesg -T | grep "Out of memory" 
[Sun Mar 16 15:39:10 2025] Out of memory: Killed process 2022734 (run_kiss_matche) total-vm:30275424kB, anon-rss:28482284kB, file-rss:3620kB, shmem-rss:0kB, UID:2003 pgtables:56376kB oom_score_adj:0

@mattalvarado
Copy link

Have you tried building without ROS2? If I build the python bindings directly via cd python && mkdir build && cmake .. && make -j10 the following code works without any issues. Note this needs to run in the build directory where the
kiss_matcher.cpython-38-x86_64-linux-gnu.so library is located.

import kiss_matcher
import numpy as np

params = kiss_matcher.KISSMatcherConfig(0.1)
matcher = kiss_matcher.KISSMatcher(params)
for i in range(1000):
    src = np.random.rand(10000, 3);
    tgt = src + 3
    result = matcher.estimate(src, tgt)
    matcher.clear()
    matcher.reset()
    matcher.reset_solver()
    print(i)

Looking into the asan output above it looks like there is a copy of tsl::robin_map included from KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl which is being used by FasterPFH.hpp instead of the system installed version. Is this intentional?

@LimHyungTae
Copy link
Member

Hi, as you can see here (in https://github.com/MIT-SPARK/KISS-Matcher/blob/main/cpp/kiss_matcher/3rdparty/find_dependencies.cmake):

find_external_dependency("Eigen3" "Eigen3::Eigen" "${CMAKE_CURRENT_LIST_DIR}/eigen/eigen.cmake")
find_external_dependency("TBB" "TBB::tbb" "${CMAKE_CURRENT_LIST_DIR}/tbb/tbb.cmake")
find_external_dependency("robin" "robin::robin" "${CMAKE_CURRENT_LIST_DIR}/robin/robin.cmake")
# ToDos
# find_external_dependency("tsl-robin-map" "tsl::robin_map" "${CMAKE_CURRENT_LIST_DIR}/tsl_robin/tsl_robin.cmake")

Now, the local ones are used. Originally, I wanted to use the tsl from https://github.com/Tessil/robin-map; however, as far as I remember, it leads to an installation issue (I forgot the detail, though).

@ewak
Copy link
Contributor

ewak commented Mar 13, 2025

Have you tried building without ROS2? If I build the python bindings directly via cd python && mkdir build && cmake .. && make -j10 the following code works without any issues.
No I have not. I think that a special set of data tickles the problem right now.

Looking into the asan output above it looks like there is a copy of tsl::robin_map included from KISS-Matcher/cpp/kiss_matcher/core/kiss_matcher/tsl which is being used by FasterPFH.hpp instead of the system installed version. Is this intentional?

It is using the robin map included with the code. It appears to be the same as the latest release apart from code formatting and some backward compatibility for pre c++17 compilers and the following change.
Tessil/robin-map@612c2be

++ b/cpp/kiss_matcher/core/kiss_matcher/tsl/robin_hash.h
@@ -1458,8 +1458,11 @@ class robin_hash : private Hash, private KeyEqual, private GrowthPolicy {
 
       tsl_rh_assert(nb_elements == size());
     } else {
-      m_bucket_count =
-          numeric_cast<size_type>(bucket_count_ds, "Deserialized bucket_count is too big.");
+      m_bucket_count = numeric_cast<size_type>(
+          bucket_count_ds, "Deserialized bucket_count is too big.");
+      // Recompute m_load_threshold, during max_load_factor() the bucket count
+      // was still 0 which would trigger rehash on first insert
+      m_load_threshold = size_type(float(bucket_count()) * m_max_load_factor);

@ewak
Copy link
Contributor

ewak commented Mar 13, 2025

The operator[] on spfh_hist_lookup_ does a try_emplace which may cause a rehash which invalidates any indexes.
This is done inside a tbb::parallel_for(tbb::blocked_range<size_t>(0, N), [&](const tbb::blocked_range<size_t> &r) {

Changing it to .at() as below, it immediately throws.

I do not know the intention of spfh_hist_lookup_. Should it be getting modified in the loop?

+++ b/cpp/kiss_matcher/core/kiss_matcher/FasterPFH.cpp
@@ -203,7 +203,8 @@ void FasterPFH::ComputeFeature(std::vector<Eigen::Vector3f> &points,
   // Setting up the SPFH histogram bins and lookup table, reserve alot in order to avoid
   // growth during parallel processing
   spfh_hist_lookup_.clear();
-  spfh_hist_lookup_.reserve(data_size * 3);
+  //spfh_hist_lookup_.reserve(data_size * 3);
+  spfh_hist_lookup_.reserve(data_size);
 
   static Eigen::VectorXf bin_f1 = Eigen::VectorXf::Zero(nr_bins_f1_);
   static Eigen::VectorXf bin_f2 = Eigen::VectorXf::Zero(nr_bins_f2_);
@@ -245,7 +246,7 @@ void FasterPFH::ComputeFeature(std::vector<Eigen::Vector3f> &points,
 
       for (size_t i = 0; i < indices.size(); ++i) {
         if (is_valid_[indices[i]]) {
-          nn_indices.emplace_back(spfh_hist_lookup_[indices[i]]);
+          nn_indices.emplace_back(spfh_hist_lookup_.at(indices[i]));
           nn_dists.emplace_back(dists[i]);
         }
       }

@mattalvarado
Copy link

I can confirm my example fails if I use the input Vel16 clouds with a resolution of 0.3m. Apologies for being slow to the party here.

@ewak
Copy link
Contributor

ewak commented Mar 13, 2025

I have a patch that fixes it. Just needs a tidy up, maybe tonight (UTC+11 here)

Essentially use find to see if the indices[i] is in spfh_hist_lookup_ before using it. IOW skip if not there.

I am also working on reserving the correct amount of space for spfh_hist_lookup_. essentially data_size / max_load_factor rounded up to next power of two,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants