ISSN: 0970-938X (Print) | 0976-1683 (Electronic)

Biomedical Research

An International Journal of Medical Sciences

Research Article - Biomedical Research (2018) Volume 29, Issue 14

Computer vision based automated cell counting pipeline: a case study for HL60 cancer cell on hemocytometer

Akın Özkan1*, Sultan Belgin İşgör2, Gökhan Şengül3 and Yasemin Gülgün İşgör4

1Department of Electrical and Electronics Engineering, Atilim University, Ankara, Turkey

2Department of Chemical Engineering and Applied Chemistry, Atilim University, Ankara, Turkey

3Department of Computer Engineering, Atilim University, Ankara, Turkey

4Medical Laboratory Techniques, Ankara University Vocational School of Health, Ankara, Turkey

*Corresponding Author:
Akin Ozkan
Department of Electrical and Electronics Engineering
Atilim University Faculty of Engineering, Turkey

Accepted date: July 05, 2018

DOI: 10.4066/biomedicalresearch.29-18-575


Abstract

Cell counting provides useful information about cell density and thus about the condition of the cell culture under consideration. Usually, counting is performed manually by domain experts with the help of a microscope and a hemocytometer. The main drawback of the manual procedure is that its reliability depends heavily on the experience and concentration of the examiner. Computer vision based automated cell counting is therefore an essential tool for improving accuracy. Although commercial automated cell counting systems are available, their high cost limits their broader use. In this study, we present a hemocytometer-based cell counting pipeline for light microscope images that can easily be adapted to various cell types. The proposed method is robust to adverse image and cell culture conditions such as cell shape deformations, lighting conditions and brightness differences. In addition, we collect a novel human promyelocytic leukemia (HL60) cancer cell dataset to test our pipeline. The experimental results are reported in three measures: recall, precision and F-measure. By combining a Support Vector Machine (SVM) with Histogram of Oriented Gradients (HOG) features, the method reaches up to 98%, 92% and 95% on these three measures, respectively.

Keywords

Cell counting, Visual feature extraction, Hemocytometer, HL60, Light microscope.

Introduction

The cell counting procedure is an indispensable part of all cell culture experiments [1]. Knowledge of the quantity of the cells under consideration is an important parameter for cell-based experiments. Starting with an accurate cell amount supports and guarantees the reliability of the experimental results; unstable estimates of cell density may accordingly bias them. In addition, accurate counting helps to maintain the density of the cell culture at the optimal growth rate. Errors are common in the manual counting process, arising from the examiner's experience and fatigue. Therefore, an automated cell counting system is needed to improve the accuracy of the counting process. In practice, cell counting is still frequently done manually by experts with the help of a light microscope [2] and a hemocytometer [3].

Briefly, a hemocytometer is a piece of glass on which lines of known dimensions are etched to aid manual counting. The gridded chamber holds a known volume of solution, so the count of cells within the grid can be converted to a cells/ml concentration for the overall cell suspension.
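As a concrete illustration, consider the standard conversion for a Neubauer-type chamber, where each large corner square covers 1 mm × 1 mm at a chamber depth of 0.1 mm, i.e. 10^-4 ml of suspension. The following minimal Python sketch (the function name and the dilution handling are our own illustration, not part of the proposed pipeline) turns per-square counts into a concentration:

    def cells_per_ml(counts_per_square, dilution_factor=1.0):
        """Estimate cell concentration from hemocytometer square counts.

        Each large square of a standard Neubauer chamber covers
        1 mm x 1 mm at a depth of 0.1 mm, i.e. 1e-4 ml, so the mean
        count per square is multiplied by 1e4 (and by any dilution
        applied before loading the chamber).
        """
        mean_count = sum(counts_per_square) / len(counts_per_square)
        return mean_count * 1e4 * dilution_factor

    # Example: four corner squares counted, sample diluted 1:2
    print(cells_per_ml([42, 38, 45, 40], dilution_factor=2.0))  # ~8.25e5 cells/ml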

Light microscopes are, simply put, microscopes in which the light source is placed at the bottom and the objective at the top, with the sample to be examined placed between the two. Microscopes offer several objective magnifications, and researchers can set this value according to their needs. As the objective magnification increases, the level of visible detail increases, but the size of the visible area decreases. The 40X objective setting is most often used during manual cell counting.

HL60 [4], the cell line used throughout this study, is a type of leukemia cancer cell. These cells have the ability to divide indefinitely, which makes them a continuous cell source for cancer research. For this reason, HL60 cells are used as an essential material in various microbiological studies screening new anticancer compounds for leukemia treatment. Morphologically, HL60 cells are rounded, and their diameters vary within a range.

The basis of the proposed method is the adaptation of hemocytometer-based manual counting to an automated procedure by adding a computer vision pipeline that reduces its shortcomings. The proposed pipeline uses a microscope and hemocytometer just as in manual counting, while eliminating the human labor and reliability issues.

Related Work

The automated cell counting studies can be grouped into three major categories: counting of blood cells, cell (or bacteria) colony counting, and hemocytometer-based cell counting. In the first category, studies target blood cells, and two common characteristics emerge in these works [5,6]. First, the methods are adapted to blood cell images with high microscopic magnifications (e.g. 1000X). Second, the studies do not use a hemocytometer; they rely solely on images that contain only cells against a uniformly distributed background, which makes the cells quite easy to distinguish. The second category, cell colony counting [7], is a fundamental way of collecting information about cell cultures; researchers use this information to evaluate the side effects of antibiotics and drug safety.

The third group is hemocytometer-based cell counting. In the study introduced by Brinda et al. [8,9], cells are segmented using a recursive algorithm. They test their algorithm on conventional hemocytometer corners, and the test images are quite limited. Quantitative results for the cell population are reported with an accuracy of 95%-100%.

Claudio and Leonilda [10] proposed counting cells using images captured at two different focal settings, which means the user needs to acquire each sample twice. They claim that the method can detect approximately 97% of the cells.

Yu-Wei et al. [11] present an approach that uses image processing techniques to count the cells in a given image. The corners of the hemocytometer are used as the counting area, and the possibility of cell overlap is not specifically considered. As claimed, the highest hit rate is 100% with a corresponding miss rate of 0%. Note, however, that their results are reported on only six different images, which is a relatively small test set.

In a more recent study, Dong et al. [12] concentrate on an insect cell counting pipeline using bright-field microscopy. They exploit overly focused images to decrease the dominance of the hemocytometer lines relative to the cells. Briefly, the pipeline is based on image filtering. Their final error rate is 2.21% on average, ranging from 0.89% to 3.97%.

To the best of our knowledge, no study in the literature presents a comprehensive pipeline for hemocytometer-based cell counting that accounts for all the adverse conditions the problem exhibits. In addition, there is no publicly available HL60 cancer cell dataset that can be used for academic studies.

Materials and Methods

In this section, the fundamental concepts used in the proposed pipeline are reviewed.

HL60_HEM40X_CC image dataset

In this study, we propose a baseline dataset referred to as HL60_HEM40x_CC, available from the “biochem.atilim.edu.tr/datasets/” web address. The dataset contains unstained HL60 cancer cell images acquired at a magnification factor of 40X on a hemocytometer (HEM40X) for Cell Counting (CC). HL60_HEM40x_CC is composed of three main components: the image sets, the ground truth cell annotations, and the counting area boundary annotations.

Image sets: Images are acquired in different sessions with a Motic B3-Series 2.0 Megapixels Moticam 2000 camera attached to the microscope. The dataset contains 468 Red-Green-Blue (RGB) images at 1200 × 1600 pixel resolution. We divide the dataset into two subsets, Sets 1 and 2, so that each set can serve in either the training or the testing stage, one at a time. Note that the dataset contains challenging samples that reflect realistic conditions; in particular, imperfect visualization, cell shape deformations, varying lighting, clumped cells and impurities are some of the real conditions observed in the dataset.

Ground truth cell annotations: Ground truth cell coordinates for each image are annotated by three experts, who labelled 6890 cells in total. The experts annotate all cell locations as ‘Positive’ and non-cell locations as ‘Negative’ samples; the statistics of the sets are summarized in Table 1. Additionally, sample cell and non-cell images are available on the dataset website.

Labels      Number of cells
            Set 1      Set 2
Cell        2621       4269
Non-cell    3548       4583

Table 1: Statistics of annotated HL60 cells for each set.

The ground truth annotations for each image are stored in Comma Separated Values (CSV) file format. The coordinate annotations take the form of bounding boxes, where x, y, width and height indicate the coordinates of the upper-left corner of the bounding box and its width and height, respectively.

Counting area boundary annotations: A conventional counting rule [13] must be followed to avoid double-counting. More precisely, the cells intersecting the left and top sides of the middle of the triple lines are counted (i.e., they belong to the counting area), while the cells on the right and bottom sides are not.

For each image, the counting area boundary annotations are defined along these boundaries and manually annotated by the experts. Each annotation is served as a separate file in the same format as the cell annotations.

Each file has four entries that define the top, bottom, right and left boundaries of the counting area, respectively.
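To make the rule concrete, the following sketch applies a simple center-based approximation of the counting rule. This is our own simplification rather than the paper's exact procedure (the paper counts by line intersection), and it assumes the four boundary entries are pixel coordinates read from the annotation file:

    def in_counting_area(box, top, bottom, left, right):
        """Center-based approximation: a cell whose center touches the
        top or left boundary is counted, while one touching the bottom
        or right boundary is excluded, mirroring the counting rule."""
        x, y, w, h = box
        cx, cy = x + w / 2.0, y + h / 2.0
        return (left <= cx < right) and (top <= cy < bottom)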

Bounding box and sliding window

A bounding box [14] (i.e., bounding rectangle) is used to describe the convex area that covers a single cell inside the image. Intuitively, bounding box collision refers to the process of detecting the intersection between two bounding boxes. If the collided area is larger than 0, the overlap ratio is calculated to evaluate the degree of collision.

The sliding window approach [15] uses an imaging window of a fixed size (i.e., a bounding box) that slides across the input image with a given step size. At each step, a sub-image patch is cropped from the input image. If we consider the image as a single matrix, the sliding window approach can be viewed as partitioning the large matrix into smaller sub-matrices.
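A minimal sketch of this cropping loop is given below, using the 50 × 50 window and 5 pixel step adopted later in the pipeline (the generator form is our own illustration):

    def sliding_windows(image, win=50, step=5):
        """Yield (x, y, patch) for every window position; a minimal
        sketch of the sliding window crop described above."""
        h, w = image.shape[:2]
        for y in range(0, h - win + 1, step):
            for x in range(0, w - win + 1, step):
                yield x, y, image[y:y + win, x:x + win]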

In the construction of the dataset, the experts use square bounding boxes to annotate the cell locations. In practice, bounding box collision is used to evaluate how similar the expert annotations and the pipeline outputs are. The similarity is a ratio called the minimum type collision rate (i.e., Min type collision rate). Formally, the minimum type collision rate is the area of intersection between bounding box A and bounding box B divided by the smaller of the areas of the two bounding boxes.
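Under the (x, y, width, height) convention described above, the ratio can be computed as follows (a sketch; the function name is ours):

    def min_type_collision_rate(a, b):
        """Intersection area divided by the smaller of the two box
        areas. Boxes are (x, y, width, height) tuples, matching the
        dataset annotation format."""
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        iy = max(0, min(ay + ah, by + bh) - max(ay, by))
        return (ix * iy) / min(aw * ah, bw * bh)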

Feature extractors

Feature extraction is a way of representing an image with a smaller number of descriptors obtained from relations between image pixels. Three promising feature extractors are used throughout the paper: Local Binary Pattern (LBP), Local Phase Quantization (LPQ) and Histogram of Oriented Gradients (HOG).

LBP [16] derives a description from texture features of local neighborhoods. Precisely, it uses a center pixel as an anchor point and labels the N pixels within a circular neighborhood of radius R around it: if the value of a neighborhood pixel is larger than the value of the center pixel, it is labeled 1, otherwise 0. The result for each center pixel is encoded, and a histogram is built from the codes of all pixels in the whole image.
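As a sketch of this descriptor, scikit-image provides a ready-made implementation; the uniform variant and the parameter values below are illustrative choices, not necessarily those used in the paper:

    import numpy as np
    from skimage.feature import local_binary_pattern

    def lbp_histogram(gray, N=8, R=3):
        """Uniform LBP histogram for a grayscale patch with N sample
        points on a circle of radius R (parameter names follow the
        text above)."""
        codes = local_binary_pattern(gray, P=N, R=R, method='uniform')
        n_bins = N + 2  # 'uniform' LBP yields P + 2 distinct codes
        hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
        return hist / hist.sum()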

LPQ [17] uses local phase information to generate blur- and illumination-invariant features. The main idea is to transform the input image to the frequency domain by computing the Short-Time Fourier Transform over rectangular neighborhoods. The phase information is expressed as binary coefficients, and the feature is produced by interpreting the 8 binary values obtained from these coefficients as an integer varying from 0 to 255. As a feature extractor, LPQ is more robust to illumination changes than LBP.
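Since LPQ is less common in standard libraries, the following NumPy/SciPy sketch illustrates the idea with a uniform STFT window at the four lowest non-zero frequencies; it is a simplified rendering of [17], not the authors' exact implementation:

    import numpy as np
    from scipy.signal import convolve2d

    def lpq_histogram(img, win=7):
        """Sketch of LPQ: STFT responses at four low frequencies,
        sign-quantized into an 8-bit code per pixel, then histogrammed."""
        img = img.astype(np.float64)
        r = (win - 1) // 2
        x = np.arange(-r, r + 1)
        a = 1.0 / win                        # lowest non-zero frequency
        w0 = np.ones_like(x, dtype=complex)  # DC window
        w1 = np.exp(-2j * np.pi * a * x)     # frequency a
        w2 = np.conj(w1)                     # frequency -a

        def stft(rowf, colf):
            # separable 2-D convolution with complex exponential filters
            tmp = convolve2d(img, rowf[None, :], mode='valid')
            return convolve2d(tmp, colf[:, None], mode='valid')

        responses = [stft(w1, w0),   # (a, 0)
                     stft(w0, w1),   # (0, a)
                     stft(w1, w1),   # (a, a)
                     stft(w1, w2)]   # (a, -a)

        # Pack the signs of the 8 real/imaginary parts into one byte
        code = np.zeros(responses[0].shape, dtype=np.int32)
        for bit, part in enumerate(p for f in responses
                                   for p in (f.real, f.imag)):
            code |= (part > 0).astype(np.int32) << bit

        hist, _ = np.histogram(code, bins=256, range=(0, 256))
        return hist / hist.sum()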

HOG [18] is a feature descriptor widely used in the pattern recognition community. It counts the occurrences of gradient orientations inside sub-areas of the image. More precisely, the extraction of HOG features begins by dividing the image into small cells (i.e., sub-areas) and calculating the local histogram of gradient directions in each. It then concatenates the local histograms into a single 1-by-N feature vector, where N is the length of the feature. In practice, this feature encodes the local shape information of objects well, and it is robust to illumination changes owing to the gradient operations.
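A sketch using scikit-image's HOG, with the cell size that the cross validation later selects for the RBF-kernel SVM; the remaining parameters are library defaults we assume for illustration, not values reported by the paper:

    from skimage.feature import hog

    def hog_feature(gray, cell=18):
        """HOG descriptor for a 50x50 grayscale patch; pixels_per_cell
        mirrors the CellSize parameter tuned in the experiments."""
        return hog(gray, orientations=9, pixels_per_cell=(cell, cell),
                   cells_per_block=(2, 2), block_norm='L2-Hys',
                   feature_vector=True)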

Classifiers

Learning latent patterns from data (i.e., classification) can be achieved by exploiting human annotations, which is known as supervised learning in the literature. The paper covers several supervised learning techniques: K-Nearest Neighbors (KNN), Support Vector Machine (SVM) and Random Forest (RF).

KNN [19] simply caches all training samples and predicts the label of an unseen input by counting the labels among a certain number (K) of nearest training samples. Even though it is quite simple, KNN achieves successful results on a large number of classification problems. Its highest classification score indicates the confidence that the unseen input belongs to a particular class; the scores can be seen as a probability distribution, corresponding to the ratio of sample labels among the K nearest neighbors of the input.

SVM [20] separates the feature space by learning the hyperplane that maximizes the margin between two classes. In the training stage, a nonlinear mapping can also be exploited by projecting the input samples into a high-dimensional feature space with a kernel function, which promotes the separability of the samples. Several kernel functions are well known in the literature, the Radial Basis Function (RBF) being the most popular. In the test stage, a new input is classified according to the side of the hyperplane onto which it is mapped. SVM is among the best-performing classification techniques owing to its good generalization capacity; even with a small number of training samples, it can model the data effectively. By itself, SVM predicts only class labels without confidence scores, but Platt suggests an extra calculation that extends the SVM model to produce score estimates. These scores correspond to the distances of the samples to the hyperplane, so a higher score indicates higher confidence.

RF [21] is one of the most popular approaches and has shown strong performance compared to other algorithms on various classification problems. It consists of an ensemble of tree classifiers used jointly to classify the samples: the input sample is evaluated by each classification tree in the forest, each tree returns a classification score, and RF decides the final result by aggregating these votes.
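The three classifiers map naturally onto scikit-learn. The sketch below uses synthetic stand-in features (the real pipeline would pass HOG/LBP/LPQ vectors) and shows the configurations used in the experiments, including Platt scaling for SVM confidence scores:

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic stand-in features and labels (1 = cell, 0 = non-cell)
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(200, 36)), rng.integers(0, 2, 200)

    classifiers = {
        'KNN (k=3)': KNeighborsClassifier(n_neighbors=3),
        'SVM rbf': SVC(kernel='rbf', probability=True),  # Platt scaling
        'RF (250)': RandomForestClassifier(n_estimators=250),
    }
    for name, clf in classifiers.items():
        clf.fit(X, y)
        print(name, clf.predict_proba(X[:1]))  # per-class confidences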

K-fold cross validation

It is generally accepted in the literature that no classifier-feature extractor combination consistently outperforms all others across pattern recognition tasks [22]. K-fold cross validation [23] is a commonly used technique for evaluating the predictive accuracy of multiple approaches to find the best combination. Intuitively, it splits the data into k subsets; at each iteration, one of the k subsets is chosen as the test fold and the remaining k-1 subsets are used for training.
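In scikit-learn terms, one candidate combination can be scored as follows; this is a minimal sketch with synthetic data, assuming fold-wise accuracy is averaged to rank the combinations:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(300, 36)), rng.integers(0, 2, 300)

    # One accuracy value per fold; the mean ranks a given
    # feature extractor / classifier / parameter combination.
    scores = cross_val_score(SVC(kernel='rbf'), X, y, cv=5)
    print(scores.mean(), scores.std())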

The proposed pipeline

Although the grid structure of the hemocytometer is critical for manual cell counting, it is a major drawback for computer vision. A simple solution would be to remove the lines by normalizing their values to the background color. However, when this approach is applied, the white lines are eliminated (i.e., their color is converted) but the cells lying on these lines are also deformed, and the counting results become inconsistent. Therefore, we construct our dataset so that it contains training samples that account for this drawback; the classification models can then learn the pattern and become more robust without any extra processing step.

The main idea of our method is to combine the sliding window approach with vision-based pattern recognition algorithms. Precisely, all possible cell locations are investigated iteratively by cropping small sub-image patches with a pre-defined window size. This step must be efficient in order to deploy an effective model. For clarity, the proposed method is described in five steps.

Improving image quality

The success rate of the proposed method is strongly influenced by image quality, so it is important to keep it at an optimum level. In the enhancement step, a raw RGB input image is first converted to a gray level image by forming the weighted sum of the R, G and B components as follows:

Gray level = 0.2989 × R + 0.5870 × G + 0.1140 × B (1)

Second, the pixel intensities are remapped by saturating the lowest and highest 1% of all values, as explained in [24]. The raw images then become more interpretable for further image analysis. Note that this procedure is applied to both the training and the test data.
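A compact sketch of this two-stage enhancement, assuming the 1st and 99th percentiles as saturation points and an output rescaled to [0, 1]:

    import numpy as np

    def enhance(rgb):
        """Grayscale conversion (Equation 1) followed by saturating
        the lowest and highest 1% of intensities."""
        gray = (0.2989 * rgb[..., 0] + 0.5870 * rgb[..., 1]
                + 0.1140 * rgb[..., 2])
        lo, hi = np.percentile(gray, (1, 99))
        return np.clip((gray - lo) / (hi - lo), 0, 1)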

Defining the counting area

In hemocytometer-based cell counting, a rule to avoid double-counting must be followed, as explained in the Materials and Methods section. For this purpose, the counting region has to be determined first: this step estimates the counting area of the input image. Since the counting area boundary locations are provided in the dataset, no further processing is needed here; however, this step could be automated in the future.

Reducing search space

Most images in the dataset contain far fewer cell regions than non-cell (i.e., non-textured) regions. Hence, eliminating the empty areas with a simple algorithm reduces the search space (i.e., the computational complexity), making the approach more efficient in terms of processing time; the precision rate for non-cell regions (i.e., the misclassification rate) can also be improved. Edge detection and the sliding window are used together to reduce the cell search space. For a given image region, the edge density measures the density of edges produced by the edge detector; ultimately, the edge density differs between regions containing high and low edge information.

Therefore, before locating cells in detail, edge detection is performed to determine the possible cell locations to be considered in the counting. The Canny edge detector [25] is applied to each cropped sub-image and the edge density is measured. If the density value is greater than a threshold (i.e., 10%), the approach treats the sub-image as a possible cell location; otherwise it is assessed as empty. We tune the parameters of this process empirically, setting the optimum window size and step size to 50 × 50 and 5 pixels, respectively.
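A sketch of the density test using scikit-image's Canny detector, where we assume "edge density" means the fraction of edge pixels in the window:

    from skimage.feature import canny

    def is_candidate(gray_patch, density_threshold=0.10):
        """Keep a window only if its Canny edge density exceeds the
        10% threshold stated above."""
        edges = canny(gray_patch)   # boolean edge map
        return edges.mean() > density_threshold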

Finding cell locations

After reducing the search space, the remaining parts of the input image may still contain both cell and non-cell regions. To estimate the cell locations, the classification and feature extraction methods explained previously are employed. The cell location estimation stage separates into two steps, training and test. In the training step, a large body of fixed-size cell and non-cell samples is represented by the visual feature extractor, and a classifier is trained on these representations; this yields an ideal visual pattern model for the chosen classifier and feature combination.

Note that this model is computed only once and stored for the test stage. In the test step, our approach takes all of the sub-images produced by the sliding window and calculates probabilistic decisions as posterior probabilities (i.e., confidences) for the possible cells. The detected cell locations are then stored, together with their confidences, in a list for the final step.

Elimination of the nested cell locations

In the final step, the nested cell locations are eliminated from the list. The classification algorithms can yield more than one detection for each cell; for this reason, an additional strategy is needed to select the most suitable response from the set of outcomes.

Non-Maximal Suppression (NMS) [26] is a technique widely used for post-processing in object recognition. It chooses the best of the overlapping bounding boxes according to their confidences. The NMS algorithm selects a bounding box, compares it with all colliding boxes and suppresses (i.e., removes) those with lower confidence; this process repeats until no colliding bounding boxes remain. The collision type and threshold are set to the Min type ratio and 0.25 (i.e., 25%), respectively.
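A greedy sketch of this step, processing boxes in descending confidence (a common NMS variant we assume here) and reusing the min_type_collision_rate helper sketched earlier:

    def nms_min_type(boxes, scores, thresh=0.25):
        """Keep the highest-confidence box, suppress boxes whose Min
        type collision rate with it exceeds thresh, and repeat."""
        order = sorted(range(len(boxes)), key=lambda i: scores[i],
                       reverse=True)
        keep = []
        while order:
            best = order.pop(0)
            keep.append(best)
            order = [i for i in order
                     if min_type_collision_rate(boxes[best],
                                                boxes[i]) <= thresh]
        return [boxes[i] for i in keep]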

Results and Discussion

To assess the performance of the proposed method, a series of analyses is performed on the dataset as explained in [27]. Several statistical metrics are available for evaluating the proposed pipeline [28]; these metrics, namely recall, precision and F-measure, also allow comparison with baselines under rigid standards.
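For reference, the three metrics follow the standard detection definitions, computed from detections matched to the ground truth via the Min type collision threshold:

    def detection_metrics(tp, fp, fn):
        """Recall, precision and F-measure from true positives,
        false positives and false negatives."""
        recall = tp / (tp + fn)
        precision = tp / (tp + fp)
        f_measure = 2 * precision * recall / (precision + recall)
        return recall, precision, f_measure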

First, we use k-fold cross validation to find the best-performing pairs of feature extractors and classifiers by tuning their parameters; the best pairs and parameters are summarized in Table 2. We set the k value to 3, 5, 10 and 100, and the results are highly correlated and support each other.

Classifier     LBP          LPQ                    HOG
               R    N       WinSize   Freqestim    CellSize
KNN (k=3)      3    20      9         1            12
SVM linear     3    4       17        3            12
SVM rbf        3    4       17        2            18
RF (250)       4    20      19        1            14

Table 2: The best parameter configuration for each combination of feature extractors and classifiers using k-fold cross validation.

Second, the overall performance of the best combination parameters is evaluated in Table 3. Since the HL60_HEM40X_CC dataset contains two sets of images, at each step one set is used for training while performance is reported on the other set, and vice versa.

                             Min type ratio, threshold 0.9          Min type ratio, threshold 0.8
                             Set 1              Set 2               Set 1              Set 2
Feature    Classifier        Rec   Prec  F-m    Rec   Prec  F-m     Rec   Prec  F-m    Rec   Prec  F-m
LBP        KNN (k=3)         0.29  0.06  0.10   0.21  0.05  0.08    0.30  0.06  0.10   0.26  0.06  0.10
LBP        SVM linear        0.74  0.50  0.60   0.69  0.44  0.54    0.80  0.53  0.64   0.78  0.49  0.60
LBP        SVM rbf           0.79  0.53  0.63   0.70  0.47  0.56    0.85  0.57  0.68   0.80  0.53  0.64
LBP        RF (250)          0.75  0.48  0.59   0.73  0.48  0.58    0.89  0.58  0.70   0.87  0.57  0.69
LPQ        KNN (k=3)         0.29  0.08  0.13   0.31  0.11  0.16    0.34  0.10  0.15   0.44  0.15  0.22
LPQ        SVM linear        0.84  0.36  0.50   0.58  0.79  0.67    0.86  0.37  0.52   0.60  0.81  0.69
LPQ        SVM rbf           0.85  0.56  0.68   0.74  0.62  0.67    0.86  0.57  0.69   0.75  0.63  0.68
LPQ        RF (250)          0.91  0.75  0.82   0.86  0.76  0.81    0.93  0.78  0.85   0.89  0.78  0.83
HOG        KNN (k=3)         0.54  0.40  0.46   0.67  0.58  0.62    0.67  0.49  0.57   0.80  0.69  0.74
HOG        SVM linear        0.94  0.51  0.66   0.93  0.66  0.77    0.98  0.53  0.69   0.97  0.93  0.95
HOG        SVM rbf           0.96  0.91  0.93   0.95  0.91  0.93    0.98  0.92  0.95   0.98  0.93  0.95
HOG        RF (250)          0.94  0.88  0.91   0.93  0.87  0.90    0.98  0.92  0.95   0.97  0.91  0.94

Table 3: Experimental results for the proposed pipeline. Two thresholds are selected: 0.9 for 90% or more similarity and 0.8 for 80% or more similarity between the expert annotations and the pipeline outputs.

Various parameters for the feature extractors and classifiers are considered throughout the paper. Since LBP is computed using N sample points on a circle of radius R, values from 2 to 8 in steps of 2 and from 4 to 6 in steps of 4 are considered, respectively. Similarly, LPQ is tuned over the size of its local window (WinSize) from 3 to 5 by 6, together with the low frequency estimation method (Freqestim). Lastly, HOG is evaluated by tuning only the cell size (CellSize) from 6 to 32 in steps of 2. KNN is tested by varying k from 1 to 21 in steps of 2, and k=3 is selected as the best configuration. SVM is used with linear and RBF kernels, and RF is tested with 250 trees.

The experimental results are presented in tabular form in Table 3. Recall, precision and F-measure are obtained by comparison with the ground truth for each image set, depending on different collision ratio thresholds. Results are presented using the Min type ratio with thresholds of 0.9 and 0.8: 0.9 indicates that 90% or more bounding box intersection area is obtained between the method output and the ground truth, while 0.8 denotes that the proposed pipeline reaches 80% or more bounding box intersection area.

For readability, the best results for each feature-classifier combination are marked in terms of all performance metrics. The results show that HOG feature extraction is clearly more reliable than the other popular methods, LBP and LPQ. Furthermore, SVM with the RBF kernel and RF yield comparable performance, while KNN shows the worst performance, dropping about 10%. With HOG as the best feature extractor, SVM with the RBF kernel increases the overall recall and precision (i.e., F-measure) scores by up to 3% compared to RF, by up to 27% compared to SVM with the linear kernel, and by up to 47% compared to KNN. In conclusion, the experimental results verify that the combination of HOG and SVM with the RBF kernel performs best on the dataset, achieving 98%, 92% and 95% recall, precision and F-measure scores, respectively.

Conclusion

In this paper, we propose a pipeline for automated hemocytometer-based cell counting that can easily be adapted to counting processes for different cell types. Like manual counting, the proposed method uses a microscope and hemocytometer, while eliminating the shortcomings and reliability issues of human labor.

The experiments are conducted on the well-known cancer cell type HL60. We acquire a novel baseline dataset (HL60_HEM40x_CC), which is publicly available for further research. The dataset contains image samples exhibiting various adverse experimental conditions that practically simulate possible real-life situations. It is released at the “biochem.atilim.edu.tr/datasets/” web address to contribute to the domain of image-based cell counting with hemocytometers. HL60_HEM40x_CC is collected mainly for cell counting and includes 468 raw hemocytometer images acquired with a 40X light microscope objective using unstained HL60 cells. Moreover, the ground truth annotations, comprising 6890 cells in total, are labeled by experts.

In our experiments, the pipeline reaches up to 98%, 92% and 95% in terms of recall, precision and F-measure, respectively, by combining SVM with the RBF kernel and HOG features. Additional observations are made under the guidance of the experimental results.

Lastly, although the current results already show a promising level of success, future studies on the dataset may reduce the classification errors by adapting different computer vision approaches. The dataset may also be extended in the future with new cancer types.

Acknowledgments

This study was extracted from the PhD thesis (Thesis Number 490289) titled "Computer vision and machine learning based adaptable conversion method for any light microscope to automated cell counter by trypan blue dye-exclusion", which can be accessed from https://tez.yok.gov.tr/UlusalTezMerkezi/.

References