{"id":711,"date":"2019-12-08T04:02:05","date_gmt":"2019-12-08T04:02:05","guid":{"rendered":"https:\/\/www.danielparente.net\/en\/2019\/12\/08\/implementing-object-detection-and-instance-segmentation-for-data-scientists\/"},"modified":"2019-12-08T04:02:05","modified_gmt":"2019-12-08T04:02:05","slug":"implementing-object-detection-and-instance-segmentation-for-data-scientists","status":"publish","type":"post","link":"https:\/\/www.danielparente.net\/en\/2019\/12\/08\/implementing-object-detection-and-instance-segmentation-for-data-scientists\/","title":{"rendered":"Implementing Object Detection and Instance Segmentation for Data Scientists"},"content":{"rendered":"<div>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/main.png\" alt=\"\"\/><\/p>\n<p>Object detection is a helpful tool to have in your coding repertoire.<\/p>\n<p>It forms the backbone of many fantastic industrial applications, among them self-driving cars, medical imaging, and face detection.<\/p>\n<p>In my last <a href=\"https:\/\/towardsdatascience.com\/a-hitchhikers-guide-to-object-detection-and-instance-segmentation-ac0146fe8e11\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">post<\/a> on object detection, I talked about how object detection models evolved.<\/p>\n<p>But what good is theory if we can\u2019t implement it?<\/p>\n<p><strong><em>This post is about implementing and training an object detector on our custom dataset of weapons.<\/em><\/strong><\/p>\n<p>The problem we will specifically solve today is that of instance segmentation using Mask-RCNN.<\/p>\n<hr\/>\n<h2 id=\"instance-segmentation\">Instance Segmentation<\/h2>\n<p><em>Can we create<\/em> <strong><em>masks<\/em><\/strong> <em>for each object in the image? Specifically, something like:<\/em><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/0.png\" alt=\"\"\/><\/p>\n<p>The most common way to solve this problem is by using Mask-RCNN. 
The architecture of Mask-RCNN looks like this:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/1.png\" alt=\"Source\"\/><em><a href=\"https:\/\/medium.com\/@jonathan_hui\/image-segmentation-with-mask-r-cnn-ebe6d793272\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Source<\/a><\/em><\/p>\n<p>Essentially, it comprises:<\/p>\n<ul>\n<li>\n<p>A backbone network like ResNet50\/ResNet101<\/p>\n<\/li>\n<li>\n<p>A Region Proposal Network (RPN)<\/p>\n<\/li>\n<li>\n<p>ROI-Align layers<\/p>\n<\/li>\n<li>\n<p>Two output layers: one to predict masks and one to predict the class and bounding box.<\/p>\n<\/li>\n<\/ul>\n<p>There is a lot more to it. If you want to learn more about the theory, read my last post:<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/a-hitchhikers-guide-to-object-detection-and-instance-segmentation-ac0146fe8e11\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Demystifying Object Detection and Instance Segmentation for Data Scientists<\/a><\/p>\n<p>This post is mostly going to be about the <a href=\"https:\/\/github.com\/MLWhiz\/object_detection\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">code<\/a>.<\/p>\n<hr\/>\n<h2 id=\"1-creating-your-custom-dataset-for-instance-segmentation\">1. Creating your Custom Dataset for Instance Segmentation<\/h2>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/2.png\" alt=\"Our Dataset\"\/><\/p>\n<p>The use case we will be working on is a weapon detector. A weapon detector can be used in conjunction with street cameras as well as CCTVs to fight crime. 
So it is pretty nifty.<\/p>\n<p>I started by downloading 40 images each of guns and swords from the <a href=\"https:\/\/storage.googleapis.com\/openimages\/web\/index.html\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Open Images dataset<\/a> and annotated them using the VIA tool. Setting up the annotation project in VIA is pretty important, so I will try to explain it step by step.<\/p>\n<h3 id=\"1-set-up-via\">1. Set up VIA<\/h3>\n<p>VIA (VGG Image Annotator) is an annotation tool with which you can annotate images with both bounding boxes and masks. I found it to be one of the best annotation tools, as it runs entirely in the browser.<\/p>\n<p>To use it, open <a href=\"http:\/\/www.robots.ox.ac.uk\/~vgg\/software\/via\/via.html\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">http:\/\/www.robots.ox.ac.uk\/~vgg\/software\/via\/via.html<\/a><\/p>\n<p>You will see a page like:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/3.png\" alt=\"\"\/><\/p>\n<p>The next thing we want to do is add the different class names in the region_attributes. Here I have added \u2018gun\u2019 and \u2018sword\u2019, as these are the two distinct targets I want to annotate for our use case.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/4.png\" alt=\"\"\/><\/p>\n<h3 id=\"2-annotate-the-images\">2. Annotate the Images<\/h3>\n<p>I have kept all the files in the folder data. The next step is to add the files we want to annotate. We can add the files in the data folder using the \u201cAdd Files\u201d button in the VIA tool. Then, after selecting the polyline tool, we start annotating and labelling the regions as shown below.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/5.png\" alt=\"Click, Click, Enter, Escape, Select\"\/><\/p>\n<h3 id=\"3-download-the-annotation-file\">3. 
Download the annotation file<\/h3>\n<p>Click on Save Project in the top menu of the VIA tool.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/6.png\" alt=\"\"\/><\/p>\n<p>Save the file as via_region_data.json by changing the project name field. This saves the annotations as a VIA project JSON file; the per-image annotations sit under its _via_img_metadata key, which the code below reads.<\/p>\n<h3 id=\"4-set-up-the-data-directory-structure\">4. Set up the data directory structure<\/h3>\n<p>We need to set up the data directories first so that we can train the model. In the code below, I am creating the directory structure required by the model that we are going to use.<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-py\" data-lang=\"py\"><span style=\"color:#f92672\">from<\/span> random <span style=\"color:#f92672\">import<\/span> random\n<span style=\"color:#f92672\">import<\/span> os\n<span style=\"color:#f92672\">from<\/span> glob <span style=\"color:#f92672\">import<\/span> glob\n<span style=\"color:#f92672\">import<\/span> json\n<span style=\"color:#75715e\"># Path to your images<\/span>\nimage_paths <span style=\"color:#f92672\">=<\/span> glob(<span style=\"color:#e6db74\">\"data\/*\"<\/span>)\n<span style=\"color:#75715e\"># Path to your annotations from the VIA tool<\/span>\nannotation_file <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">'via_region_data.json'<\/span>\n<span style=\"color:#75715e\"># Clean up the annotations a little: re-key them by image filename<\/span>\nannotations <span style=\"color:#f92672\">=<\/span> json<span style=\"color:#f92672\">.<\/span>load(open(annotation_file))\ncleaned_annotations <span style=\"color:#f92672\">=<\/span> {}\n<span style=\"color:#66d9ef\">for<\/span> k,v <span style=\"color:#f92672\">in<\/span> annotations[<span style=\"color:#e6db74\">'_via_img_metadata'<\/span>]<span style=\"color:#f92672\">.<\/span>items():\n    cleaned_annotations[v[<span 
style=\"color:#e6db74\">'filename'<\/span>]] <span style=\"color:#f92672\">=<\/span> v\n<span style=\"color:#75715e\"># create train and validation directories<\/span>\n<span style=\"color:#960050;background-color:#1e0010\">!<\/span> mkdir procdata\n<span style=\"color:#960050;background-color:#1e0010\">!<\/span> mkdir procdata<span style=\"color:#f92672\">\/<\/span>val\n<span style=\"color:#960050;background-color:#1e0010\">!<\/span> mkdir procdata<span style=\"color:#f92672\">\/<\/span>train\ntrain_annotations <span style=\"color:#f92672\">=<\/span> {}\nvalid_annotations <span style=\"color:#f92672\">=<\/span> {}\n<span style=\"color:#75715e\"># 20% of images in validation folder<\/span>\n<span style=\"color:#66d9ef\">for<\/span> img <span style=\"color:#f92672\">in<\/span> image_paths:\n    <span style=\"color:#75715e\"># Image goes to Validation folder<\/span>\n    <span style=\"color:#66d9ef\">if<\/span> random()<span style=\"color:#f92672\">&lt;<\/span><span style=\"color:#ae81ff\">0.2<\/span>:\n        os<span style=\"color:#f92672\">.<\/span>system(<span style=\"color:#e6db74\">\"cp \"<\/span><span style=\"color:#f92672\">+<\/span> img <span style=\"color:#f92672\">+<\/span> <span style=\"color:#e6db74\">\" procdata\/val\/\"<\/span>)\n        img <span style=\"color:#f92672\">=<\/span> img<span style=\"color:#f92672\">.<\/span>split(<span style=\"color:#e6db74\">\"\/\"<\/span>)[<span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>]\n        valid_annotations[img] <span style=\"color:#f92672\">=<\/span> cleaned_annotations[img]\n    <span style=\"color:#66d9ef\">else<\/span>:\n        os<span style=\"color:#f92672\">.<\/span>system(<span style=\"color:#e6db74\">\"cp \"<\/span><span style=\"color:#f92672\">+<\/span> img <span style=\"color:#f92672\">+<\/span> <span style=\"color:#e6db74\">\" procdata\/train\/\"<\/span>)\n        img <span style=\"color:#f92672\">=<\/span> img<span style=\"color:#f92672\">.<\/span>split(<span 
style=\"color:#e6db74\">\"\/\"<\/span>)[<span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>]\n        train_annotations[img] <span style=\"color:#f92672\">=<\/span> cleaned_annotations[img]\n<span style=\"color:#75715e\"># put different annotations in different folders<\/span>\n<span style=\"color:#66d9ef\">with<\/span> open(<span style=\"color:#e6db74\">'procdata\/val\/via_region_data.json'<\/span>, <span style=\"color:#e6db74\">'w'<\/span>) <span style=\"color:#66d9ef\">as<\/span> fp:\n    json<span style=\"color:#f92672\">.<\/span>dump(valid_annotations, fp)\n<span style=\"color:#66d9ef\">with<\/span> open(<span style=\"color:#e6db74\">'procdata\/train\/via_region_data.json'<\/span>, <span style=\"color:#e6db74\">'w'<\/span>) <span style=\"color:#66d9ef\">as<\/span> fp:\n    json<span style=\"color:#f92672\">.<\/span>dump(train_annotations, fp)<\/code><\/pre>\n<\/div>\n<p>After running the above code, we will get the data in the below folder structure:<\/p>\n<pre><code>- procdata\n     - train\n         - img1.jpg\n         - img2.jpg\n         - via_region_data.json\n     - val\n         - img3.jpg\n         - img4.jpg\n         - via_region_data.json\n<\/code><\/pre>\n<hr\/>\n<h2 id=\"2-setup-the-coding-environment\">2. Setup the Coding Environment<\/h2>\n<p>We will use the code from the <a href=\"https:\/\/github.com\/matterport\/Mask_RCNN\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">matterport\/Mask_RCNN<\/a> GitHub repository. 
You can start by cloning the repository and installing the required libraries.<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-bash\" data-lang=\"bash\">git clone https:\/\/github.com\/matterport\/Mask_RCNN\ncd Mask_RCNN\npip install -r requirements.txt<\/code><\/pre>\n<\/div>\n<p>Once we are done cloning the repo and installing the dependencies, we can start implementing our project.<\/p>\n<p>We copy the samples\/balloon directory in the Mask_RCNN folder to a new <strong><em>samples\/guns_and_swords<\/em><\/strong> directory, where we will continue our work:<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-bash\" data-lang=\"bash\">cp -r samples\/balloon samples\/guns_and_swords<\/code><\/pre>\n<\/div>\n<h3 id=\"setting-up-the-code\">Setting up the Code<\/h3>\n<p>We start by renaming balloon.py in the <code>samples\/guns_and_swords<\/code> directory to <code>gns.py<\/code> and then modifying it. The <code>balloon.py<\/code> file, as it stands, trains for a single target. I have extended it to handle multiple targets. In this file, we change:<\/p>\n<ol>\n<li>\n<p><code>BalloonConfig<\/code> to <code>gnsConfig<\/code><\/p>\n<\/li>\n<li>\n<p><code>BalloonDataset<\/code> to <code>gnsDataset<\/code>: we changed some code here to get the target names from our annotation data and to support multiple targets.<\/p>\n<\/li>\n<li>\n<p>The train function, which needs a few small changes.<\/p>\n<\/li>\n<\/ol>\n<p>I am showing only the changed <code>gnsConfig<\/code> here to give you an idea. 
You can take a look at the whole <a href=\"https:\/\/github.com\/MLWhiz\/object_detection\/blob\/master\/guns_and_swords\/gns.py\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">gns.py<\/a> code here.<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-py\" data-lang=\"py\"><span style=\"color:#66d9ef\">class<\/span> <span style=\"color:#a6e22e\">gnsConfig<\/span>(Config):\n    <span style=\"color:#e6db74\">\"\"\"Configuration for training on the toy dataset.\n<\/span><span style=\"color:#e6db74\">    Derives from the base Config class and overrides some values.\n<\/span><span style=\"color:#e6db74\">    \"\"\"<\/span>\n    <span style=\"color:#75715e\"># Give the configuration a recognizable name<\/span>\n    NAME <span style=\"color:#f92672\">=<\/span> <span style=\"color:#e6db74\">\"gns\"<\/span>\n    <span style=\"color:#75715e\"># We use a GPU with 16GB memory, which can fit three images.<\/span>\n    <span style=\"color:#75715e\"># Adjust down if you use a smaller GPU.<\/span>\n    IMAGES_PER_GPU <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">3<\/span>\n    <span style=\"color:#75715e\"># Number of classes (including background)<\/span>\n    NUM_CLASSES <span style=\"color:#f92672\">=<\/span> <span style=\"color:#ae81ff\">1<\/span> <span style=\"color:#f92672\">+<\/span> <span style=\"color:#ae81ff\">2<\/span>  <span style=\"color:#75715e\"># Background + sword + gun<\/span>\n    <span style=\"color:#75715e\"># Number of training steps per epoch<\/span><\/code><\/pre>\n<\/div>\n<hr\/>\n<h2 id=\"3-visualizing-images-and-masks\">3. Visualizing Images and Masks<\/h2>\n<p>Once we are done with changing the <code>gns.py<\/code> file, we can visualize our masks and images. 
You can do so simply by following this <a href=\"https:\/\/github.com\/MLWhiz\/object_detection\/blob\/master\/guns_and_swords\/1.%20Visualize%20Dataset.ipynb\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Visualize Dataset.ipynb<\/a> notebook.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/a.png\" alt=\"\"\/><\/p>\n<hr\/>\n<h2 id=\"4-train-the-maskrcnn-model-with-transfer-learning\">4. Train the MaskRCNN Model with Transfer Learning<\/h2>\n<p>To train the MaskRCNN model on the guns and swords dataset, we run one of the following commands on the command line, depending on whether we want to initialise our model with COCO weights or ImageNet weights:<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-bash\" data-lang=\"bash\"><span style=\"color:#75715e\"># Train a new model starting from pre-trained COCO weights<\/span>\n python3 gns.py train --dataset<span style=\"color:#f92672\">=<\/span>\/path\/to\/dataset --weights<span style=\"color:#f92672\">=<\/span>coco\n\n<span style=\"color:#75715e\"># Resume training a model that you had trained earlier<\/span>\n python3 gns.py train --dataset<span style=\"color:#f92672\">=<\/span>\/path\/to\/dataset --weights<span style=\"color:#f92672\">=<\/span>last\n\n<span style=\"color:#75715e\"># Train a new model starting from ImageNet weights<\/span>\n python3 gns.py train --dataset<span style=\"color:#f92672\">=<\/span>\/path\/to\/dataset --weights<span style=\"color:#f92672\">=<\/span>imagenet<\/code><\/pre>\n<\/div>\n<p>The command with --weights=last will resume training from the last epoch. 
The weights are saved in the logs directory in the Mask_RCNN folder.<\/p>\n<p>This is how the loss looks after our final epoch.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/12.png\" alt=\"\"\/><\/p>\n<h3 id=\"visualize-the-losses-using-tensorboard\">Visualize the losses using TensorBoard<\/h3>\n<p>You can take advantage of TensorBoard to visualise how your network is performing. Just run:<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-bash\" data-lang=\"bash\">tensorboard --logdir ~\/objectDetection\/Mask_RCNN\/logs\/gns20191010T1234<\/code><\/pre>\n<\/div>\n<p>You can then open TensorBoard at<\/p>\n<pre><code>http:\/\/localhost:6006\n<\/code><\/pre>\n<p>Here is what our mask loss looks like:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/13.png\" alt=\"\"\/><\/p>\n<p>We can see that the validation loss fluctuates quite abruptly. This is expected, as we have kept only 20 images in the validation set.<\/p>\n<hr\/>\n<h2 id=\"5-prediction-on-new-images\">5. Prediction on New Images<\/h2>\n<p>Predicting on a new image is also pretty easy. Just follow the <a href=\"https:\/\/github.com\/MLWhiz\/object_detection\/blob\/master\/guns_and_swords\/2.%20predict.ipynb\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">prediction.ipynb<\/a> notebook for a minimal example using our trained model. 
Below is the main part of the code.<\/p>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-py\" data-lang=\"py\"><span style=\"color:#75715e\"># Function taken from utils.dataset<\/span>\n<span style=\"color:#66d9ef\">def<\/span> <span style=\"color:#a6e22e\">load_image<\/span>(image_path):\n    <span style=\"color:#e6db74\">\"\"\"Load the specified image and return a [H,W,3] Numpy array.\n<\/span><span style=\"color:#e6db74\">    \"\"\"<\/span>\n    <span style=\"color:#75715e\"># Load image<\/span>\n    image <span style=\"color:#f92672\">=<\/span> skimage<span style=\"color:#f92672\">.<\/span>io<span style=\"color:#f92672\">.<\/span>imread(image_path)\n    <span style=\"color:#75715e\"># If grayscale. Convert to RGB for consistency.<\/span>\n    <span style=\"color:#66d9ef\">if<\/span> image<span style=\"color:#f92672\">.<\/span>ndim <span style=\"color:#f92672\">!=<\/span> <span style=\"color:#ae81ff\">3<\/span>:\n        image <span style=\"color:#f92672\">=<\/span> skimage<span style=\"color:#f92672\">.<\/span>color<span style=\"color:#f92672\">.<\/span>gray2rgb(image)\n    <span style=\"color:#75715e\"># If has an alpha channel, remove it for consistency<\/span>\n    <span style=\"color:#66d9ef\">if<\/span> image<span style=\"color:#f92672\">.<\/span>shape[<span style=\"color:#f92672\">-<\/span><span style=\"color:#ae81ff\">1<\/span>] <span style=\"color:#f92672\">==<\/span> <span style=\"color:#ae81ff\">4<\/span>:\n        image <span style=\"color:#f92672\">=<\/span> image[<span style=\"color:#f92672\">...<\/span>, :<span style=\"color:#ae81ff\">3<\/span>]\n    <span style=\"color:#66d9ef\">return<\/span> image\n<span style=\"color:#75715e\"># path to image to be predicted<\/span>\nimage <span style=\"color:#f92672\">=<\/span> load_image(<span style=\"color:#e6db74\">\"..\/..\/..\/data\/2c8ce42709516c79.jpg\"<\/span>)\n<span style=\"color:#75715e\"># Run object 
detection<\/span>\nresults <span style=\"color:#f92672\">=<\/span> model<span style=\"color:#f92672\">.<\/span>detect([image], verbose<span style=\"color:#f92672\">=<\/span><span style=\"color:#ae81ff\">1<\/span>)\n<span style=\"color:#75715e\"># Display results<\/span>\nax <span style=\"color:#f92672\">=<\/span> get_ax(<span style=\"color:#ae81ff\">1<\/span>)\nr <span style=\"color:#f92672\">=<\/span> results[<span style=\"color:#ae81ff\">0<\/span>]\na <span style=\"color:#f92672\">=<\/span> visualize<span style=\"color:#f92672\">.<\/span>display_instances(image, r[<span style=\"color:#e6db74\">'rois'<\/span>], r[<span style=\"color:#e6db74\">'masks'<\/span>], r[<span style=\"color:#e6db74\">'class_ids'<\/span>], dataset<span style=\"color:#f92672\">.<\/span>class_names, r[<span style=\"color:#e6db74\">'scores'<\/span>], ax<span style=\"color:#f92672\">=<\/span>ax,\n                            title<span style=\"color:#f92672\">=<\/span><span style=\"color:#e6db74\">\"Predictions\"<\/span>)<\/code><\/pre>\n<\/div>\n<p>This is how the result looks for some images in the validation set:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/mlwhiz.com\/images\/weapons\/b.png\" alt=\"\"\/><\/p>\n<hr\/>\n<h2 id=\"improvements\">Improvements<\/h2>\n<p>The results don\u2019t look very promising and leave a lot to be desired, but that is to be expected with so little training data (60 images). One can try the following to improve the model\u2019s performance on this weapon detector.<\/p>\n<ol>\n<li>\n<p>We trained on just 60 images due to time constraints. Even with transfer learning, that is too little data, so annotate more.<\/p>\n<\/li>\n<li>\n<p>Train for more epochs and a longer time, and watch how the validation loss and training loss behave.<\/p>\n<\/li>\n<li>\n<p>Change hyperparameters in the mrcnn\/config.py file in the Mask_RCNN directory. For information on what these hyperparameters mean, take a look at my previous post. 
The main ones you can look at:<\/p>\n<\/li>\n<\/ol>\n<div class=\"highlight\">\n<pre style=\"color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4\"><code class=\"language-py\" data-lang=\"py\"><span style=\"color:#75715e\"># if you want to provide different weights to different losses<\/span>\nLOSS_WEIGHTS <span style=\"color:#f92672\">=<\/span>{<span style=\"color:#e6db74\">'rpn_class_loss'<\/span>: <span style=\"color:#ae81ff\">1.0<\/span>, <span style=\"color:#e6db74\">'rpn_bbox_loss'<\/span>: <span style=\"color:#ae81ff\">1.0<\/span>, <span style=\"color:#e6db74\">'mrcnn_class_loss'<\/span>: <span style=\"color:#ae81ff\">1.0<\/span>, <span style=\"color:#e6db74\">'mrcnn_bbox_loss'<\/span>: <span style=\"color:#ae81ff\">1.0<\/span>, <span style=\"color:#e6db74\">'mrcnn_mask_loss'<\/span>: <span style=\"color:#ae81ff\">1.0<\/span>}\n\n<span style=\"color:#75715e\"># Length of square anchor side in pixels<\/span>\nRPN_ANCHOR_SCALES <span style=\"color:#f92672\">=<\/span> (<span style=\"color:#ae81ff\">32<\/span>, <span style=\"color:#ae81ff\">64<\/span>, <span style=\"color:#ae81ff\">128<\/span>, <span style=\"color:#ae81ff\">256<\/span>, <span style=\"color:#ae81ff\">512<\/span>)\n\n<span style=\"color:#75715e\"># Ratios of anchors at each cell (width\/height)<\/span>\n<span style=\"color:#75715e\"># A value of 1 represents a square anchor, and 0.5 is a wide anchor<\/span>\nRPN_ANCHOR_RATIOS <span style=\"color:#f92672\">=<\/span> [<span style=\"color:#ae81ff\">0.5<\/span>, <span style=\"color:#ae81ff\">1<\/span>, <span style=\"color:#ae81ff\">2<\/span>]<\/code><\/pre>\n<\/div>\n<hr\/>\n<h2 id=\"conclusion\">Conclusion<\/h2>\n<p><strong><em>In this post, I talked about how to implement Instance segmentation using Mask-RCNN for a custom dataset.<\/em><\/strong><\/p>\n<p>I tried to make the coding part as simple as possible and hope you find the code useful. In the next part of this post, I will deploy this model using a web app. 
So stay tuned.<\/p>\n<p>You can download the annotated weapons data as well as the code at <a href=\"https:\/\/github.com\/MLWhiz\/object_detection\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">GitHub<\/a>.<\/p>\n<p>If you want to know more about various <strong><em>object detection techniques, motion estimation, object tracking in video, etc.<\/em><\/strong>, I would like to recommend this awesome course on <a href=\"https:\/\/www.coursera.org\/specializations\/aml?siteID=lVarvwc5BD0-AqkGMb7JzoCMW0Np1uLfCA&amp;utm_content=2&amp;utm_medium=partners&amp;utm_source=linkshare&amp;utm_campaign=lVarvwc5BD0\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Deep Learning in Computer Vision<\/a> in the <a href=\"https:\/\/www.coursera.org\/specializations\/aml?siteID=lVarvwc5BD0-AqkGMb7JzoCMW0Np1uLfCA&amp;utm_content=2&amp;utm_medium=partners&amp;utm_source=linkshare&amp;utm_campaign=lVarvwc5BD0\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">Advanced Machine Learning specialization<\/a>.<\/p>\n<p>Thanks for the read. I am going to be writing more beginner-friendly posts in the future too. Follow me on <a href=\"https:\/\/medium.com\/@rahul_agarwal\" rel=\"nofollow noopener noreferrer\" target=\"_blank\"><strong>Medium<\/strong><\/a> or subscribe to my <a href=\"http:\/\/eepurl.com\/dbQnuX\" rel=\"nofollow noopener noreferrer\" target=\"_blank\"><strong>blog<\/strong><\/a> to be informed about them. 
As always, I welcome feedback and constructive criticism and can be reached on Twitter <a href=\"https:\/\/twitter.com\/MLWhiz\" rel=\"nofollow noopener noreferrer\" target=\"_blank\">@mlwhiz<\/a>.<\/p>\n<p>Also, a small disclaimer \u2014 There might be some affiliate links in this post to relevant resources as sharing knowledge is never a bad idea.<\/p>\n<\/p><\/div>\n<p><a href=\"https:\/\/mlwhiz.com\/blog\/2019\/12\/06\/weapons\/\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Object Detection is a helpful tool to have in your coding repository. It forms the backbone of many fantastic industrial applications. Some of them being self-driving cars, medical imaging and face detection. In my last post on Object detection, I talked about how Object detection models evolved. 
But what good is theory, if we [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":712,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1],"tags":[],"class_list":["post-711","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"blocksy_meta":[],"jetpack_featured_media_url":"https:\/\/e928cfdc7rs.exactdn.com\/info\/uploads\/sites\/3\/2019\/12\/Implementing-Object-Detection-and-Instance-Segmentation-for-Data-Scientists-scaled.png?strip=all","jetpack_shortlink":"https:\/\/wp.me\/p2TFCd-bt","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/posts\/711","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/comments?post=711"}],"version-history":[{"count":0,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/posts\/711\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/media\/712"}],"wp:attachment":[{"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/media?parent=711"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/categories?post=711"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.danielparente.net\/en\/wp-json\/wp\/v2\/tags?post=711"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}