1. Complete Exercise 1 and 2 steps. Pipeline for filtering, RANSAC plane fitting, segmentation, and clustering implemented.
Voxel downsampling, passthrough filtering, RANSAC plane segmentation, and Euclidean clustering were all performed on the sample PCL point cloud data, in that order. The code can be found in `RoboND-Perception-Exercises/Exercise-1/RANSAC.py`, and the results are shown below.
Voxel downsampling: this is done to reduce the total number of points in the scene, to save on compute time and processing cycles.
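A minimal sketch of this step, assuming the python-pcl bindings used in the exercises; the input filename and leaf size are assumed values:

```python
import pcl

# Hypothetical input file; in the exercise the cloud comes from tabletop.pcd
cloud = pcl.load_XYZRGB('tabletop.pcd')

# Voxel grid filter: keep one representative point per LEAF_SIZE^3 cube
vox = cloud.make_voxel_grid_filter()
LEAF_SIZE = 0.01  # meters per voxel side; assumed value, larger = coarser cloud
vox.set_leaf_size(LEAF_SIZE, LEAF_SIZE, LEAF_SIZE)
cloud_downsampled = vox.filter()
```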
Passthrough filtering: this is done to remove extraneous points that are unnecessary for our purposes. In this instance, we are only interested in the tabletop and the objects found on top of it.
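A sketch of the passthrough step, continuing from the downsampling sketch above; the axis and limits are assumed values to be tuned to the scene geometry:

```python
# Keep only points whose z coordinate falls within the table/object band
passthrough = cloud_downsampled.make_passthrough_filter()
passthrough.set_filter_field_name('z')
passthrough.set_filter_limits(0.6, 1.1)  # assumed band in meters
cloud_passed = passthrough.filter()
```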
RANSAC plane segmentation: this is done to separate the table surface from the objects resting on it. We identify the plane in the scene and extract all the points that comprise it (the table), as well as all the points that do not (which are basically our objects).
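A sketch of the plane-fitting step, continuing from above; the distance threshold is an assumed value:

```python
# RANSAC plane fitting: inliers form the table plane, outliers the objects
seg = cloud_passed.make_segmenter()
seg.set_model_type(pcl.SACMODEL_PLANE)
seg.set_method_type(pcl.SAC_RANSAC)
seg.set_distance_threshold(0.01)  # max point-to-plane distance in meters
inliers, coefficients = seg.segment()

cloud_table = cloud_passed.extract(inliers, negative=False)   # the plane
cloud_objects = cloud_passed.extract(inliers, negative=True)  # the objects
```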
Euclidean clustering: this is done to group points into clusters. The idea is that each point is assigned to the cluster nearest to it, and that each resulting cluster corresponds to an object of interest on top of the table.
We color code each point to show which cluster it belongs to, as in the sketch below.
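A sketch of the clustering and color-coding, again assuming the exercises' python-pcl bindings; the cluster tolerances and the color scheme are assumptions:

```python
# Euclidean clustering runs on spatial data only, so strip the color channel
white_cloud = pcl.PointCloud()
white_cloud.from_list([point[:3] for point in cloud_objects.to_list()])
tree = white_cloud.make_kdtree()

ec = white_cloud.make_EuclideanClusterExtraction()
ec.set_ClusterTolerance(0.02)  # max gap between points in one cluster (m)
ec.set_MinClusterSize(100)
ec.set_MaxClusterSize(3000)
ec.set_SearchMethod(tree)
cluster_indices = ec.Extract()  # one list of point indices per detected object

# One deterministic color per cluster; every point in cluster i gets
# cluster_colors[i] when the colored cloud is rebuilt for visualization.
cluster_colors = [(i * 37 % 256, i * 91 % 256, i * 53 % 256)
                  for i in range(len(cluster_indices))]
```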
2. Complete Exercise 3 Steps. Features are extracted and an SVM is trained. Object recognition is implemented.
For Exercise 3, we:

1. Generate training sets by capturing color and normal histograms of the objects (a histogram sketch follows this list). A copy of the script used here can be found at `~/catkin_ws/src/RoboND-Perception-Project/exercise_3_code/capture_features.py`. In this instance, we iterated 50 times for each object. A copy of the resultant training set can be found at `~/catkin_ws/src/RoboND-Perception-Project/exercise_3_code/training_sets/training_set_complete_50.sav`.
2. Once the training sets have been generated, we proceed to train our model using them. A copy of the training script used can be found at `~/catkin_ws/src/RoboND-Perception-Project/exercise_3_code/train_svm.py`, and the resulting model can be found at `~/catkin_ws/src/RoboND-Perception-Project/exercise_3_code/models/model.sav`. The resultant confusion matrices are shown below.
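As a sketch of the feature extraction in step 1: the exercise's color and normal histogram functions both follow this pattern of binning each channel and concatenating the normalized counts. The bin count and value range here are assumed values:

```python
import numpy as np

def channel_histogram_features(channels, nbins=32, value_range=(0, 256)):
    """Concatenate per-channel histograms into one normalized feature vector."""
    hists = [np.histogram(c, bins=nbins, range=value_range)[0] for c in channels]
    features = np.concatenate(hists).astype(np.float64)
    return features / np.sum(features)  # normalize so cloud size doesn't matter
```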
1. For all three tabletop setups (`test*.world`), perform object recognition, then read in the respective pick list (`pick_list_*.yaml`). Next, construct the messages that would comprise a valid `PickPlace` request and output them to `.yaml` format.
The object recognition code can be found in `RoboND-Perception-Project/pr2_robot/scripts/object_recognition.py`.
The output `.yaml` files are:

- `RoboND-Perception-Project/ros_independent_src/output_yaml/output_1.yaml`
- `RoboND-Perception-Project/ros_independent_src/output_yaml/output_2.yaml`
- `RoboND-Perception-Project/ros_independent_src/output_yaml/output_3.yaml`
- `RoboND-Perception-Project/ros_independent_src/output_yaml/output_4.yaml`
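As a sketch of the serialization step, this mirrors the `send_to_yaml()` helper provided in the project template; each entry in `dict_list` carries the test scene number, arm name, object name, pick pose, and place pose:

```python
import yaml

def send_to_yaml(yaml_filename, dict_list):
    # Wrap the per-object PickPlace parameter dicts in one top-level key
    data_dict = {'object_list': dict_list}
    with open(yaml_filename, 'w') as outfile:
        yaml.dump(data_dict, outfile, default_flow_style=False)
```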
Below are the images of the labeled items in the three different test worlds and the challenge world.
Most of the code follows the exercises. The major departures involve the object recognition code for the challenge world, used to generate output_4.yaml. The most notable difference is the intensive use of passthrough filters in the challenge world, to ensure that there are no anomalous artifacts when segmenting objects. RANSAC plane segmentation was also not used in the challenge world. Instead, to obtain the PCL and ROS clouds of the table surfaces, we created multiple passthrough filters that encompass the table surfaces, which we then combined later on. Below, we restate the different techniques we used, along with some observations.
- Statistical outlier filtering (noise filtering) - for effective noise filtering, it seems best to do the filtering at the very beginning, even before voxel downsampling: the algorithm has an easier time removing point cloud outliers when the non-noise points are denser (a sketch follows this list).
- Voxel downsampling - for better object recognition, it may be best to downsample less, although this has to be balanced against the processing capacity of the host machine; point clouds that are too dense may consume too many processing cycles and take too long to process.
- Passthrough filtering - partitioning point clouds with this technique, and recombining the pieces later on, has proven very handy for isolating areas of interest and removing artifacts (see the table-reconstruction sketch after this list). If the environment is fairly fixed, using passthrough filtering to reduce scene complexity is quite effective. The downside appears when the opposite is true: if the environment is dynamic and there are no static areas, this method may instead "chop up" the point cloud in the wrong places. It may, however, be possible to programmatically and dynamically determine which areas to apply a passthrough filter to, so the technique may still have utility even in a dynamic environment.
- RANSAC plane segmentation - the implementation from the exercises only allows segmenting a single identified plane from a point cloud. For future implementations, we could explore identifying and segmenting multiple planes from a single point cloud, and also explore segmenting other geometric shapes with this technique.
- Euclidean Clustering
- Histogram generation for color and shape (normal) features - an attempt was made to generate training sets based only on the shape or only on the colors of the objects. Training sets using only the object colors were still able to produce a normalized confusion matrix with values greater than 90%. On the other hand, training sets using only normals (shape) failed to produce workable results given some basic settings. This may have been caused by the limited number of histogram "buckets" (bins) used, which was 20. Larger values might produce better results, and could allow normal features to be used for object recognition in future scripts and programs.
- Model generation using a Support Vector Machine - we used the RBF kernel with the setting `probability=True` (a training sketch follows this list). The RBF kernel produced better results than the linear kernel. With `probability=True`, we are also able to check the confidence of each performed object recognition and decide whether or not the identification attempt is sufficiently robust for our purposes.
- Object recognition using the aforementioned model
- Collision map generation - we use the `clear_octomap` service to periodically clear the collision map, and then repopulate it for each object recognition attempt (a service-call sketch follows this list).
- Pick place operation - some config files were adjusted to try to get the arm to actually pick up the objects. It may be that the objects "slip" from the gripper because object orientation data needs to be supplied in addition to the object position. Further refinement of the place pose assignment could also be done.
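A minimal sketch of the statistical outlier filtering step, assuming the python-pcl bindings from the exercises; the input filename, `mean_k`, and the deviation multiplier are assumed values:

```python
import pcl

cloud = pcl.load_XYZRGB('noisy_scene.pcd')  # hypothetical noisy input

# Points whose mean distance to their k nearest neighbors exceeds
# (global mean + multiplier * std dev) are treated as noise and dropped
outlier_filter = cloud.make_statistical_outlier_filter()
outlier_filter.set_mean_k(50)               # neighbors analyzed per point
outlier_filter.set_std_dev_mul_thresh(0.5)  # assumed multiplier; tune per scene
cloud_denoised = outlier_filter.filter()
```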
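For the challenge-world table reconstruction, a sketch of partitioning with several passthrough filters and recombining the pieces, continuing from the denoised cloud above; the bands shown are hypothetical placeholders, not the values actually used:

```python
import numpy as np

def passthrough(cloud, field, lo, hi):
    # One passthrough filter restricted to [lo, hi] along the given axis
    pt = cloud.make_passthrough_filter()
    pt.set_filter_field_name(field)
    pt.set_filter_limits(lo, hi)
    return pt.filter()

# Hypothetical bands covering the separate table surfaces
bands = [('z', 0.55, 0.65), ('z', 0.75, 0.85)]
pieces = [passthrough(cloud_denoised, field, lo, hi) for field, lo, hi in bands]

# Recombine by concatenating the underlying point arrays into one cloud
cloud_table = pcl.PointCloud_PointXYZRGB()
cloud_table.from_array(np.concatenate([p.to_array() for p in pieces]))
```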
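For the SVM step, a sketch of the training, assuming the training set is a pickled list of (feature, label) pairs as produced by the capture script; the confidence check at the end shows how `probability=True` is used downstream:

```python
import pickle
import numpy as np
from sklearn import svm

# Load the training set saved by capture_features.py
training_set = pickle.load(open('training_set_complete_50.sav', 'rb'))
X = np.array([feature for feature, label in training_set])
y = np.array([label for feature, label in training_set])

# The RBF kernel outperformed linear in our runs; probability=True enables
# predict_proba() so weak detections can be rejected before pick place
clf = svm.SVC(kernel='rbf', probability=True)
clf.fit(X, y)

probs = clf.predict_proba(X[:1])[0]
if probs.max() > 0.7:  # assumed confidence threshold
    print(clf.classes_[np.argmax(probs)], probs.max())
```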
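Clearing the collision map is a plain `std_srvs/Empty` service call to MoveIt's `/clear_octomap` service; the node name here is hypothetical:

```python
import rospy
from std_srvs.srv import Empty

rospy.init_node('collision_map_demo')
rospy.wait_for_service('/clear_octomap')
clear_octomap = rospy.ServiceProxy('/clear_octomap', Empty)
clear_octomap()  # wipe the collision map; repopulate it before the next pick
```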
Lastly, the primary thing to improve is to have the robot arm execute pick and place operations in the challenge world, and to improve its performance in the test worlds. It currently succeeds in identifying all the objects in the worlds; however, more work needs to be done to enable it to pick the objects in the challenge world, and to increase the reliability of picking up objects in the test worlds.