Main Content

Detect Nuclei in Large Whole Slide Images Using Cellpose

This example shows how to detect cell nuclei in whole slide images (WSIs) of tissue stained using hematoxylin and eosin (H&E) by using Cellpose.

Cell segmentation is an important step in most digital pathology workflows. The number and distribution of cells in a tissue sample can be a biomarker of cancer or other diseases. Digital pathology uses digital images of microscopy slides, or whole slide images (WSIs), for clinical diagnosis or research. To capture tissue- and cellular-level detail, WSIs have high resolutions, and can have sizes on the order of 200,000-by-100,000 pixels. To facilitate efficient display, navigation, and processing of WSIs, the best practice is to store them in a multiresolution format. This example addresses two challenges to segmenting nuclei in a WSI:

  • Accurately labeling cell nuclei — This example uses a pretrained model from the Cellpose Library [1] [2], which provides deep learning models for cell segmentation. You can configure Cellpose models in MATLAB® by using the cellpose object and its object functions.

  • Managing large images — This examples uses a blocked image approach to process the WSI without loading the full image into system memory. You can manage and process images block-by-block by using the blockedImage object and its object functions.

This example requires the Medical Imaging Toolbox™ Interface for Cellpose Library. You can install the Medical Imaging Toolbox Interface for Cellpose Library from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. The Medical Imaging Toolbox Interface for Cellpose Library requires Deep Learning Toolbox™ and Computer Vision Toolbox™. For a similar example that uses only Image Processing Toolbox, see Detect and Count Cell Nuclei in Whole Slide Images.

Download Sample Image from Camelyon17 Data Set

This example uses one WSI from the Camelyon17 challenge. Run this code to download the tumor_065.zip file from the MathWorks® website, then unzip the file. The size of the data file is approximately 1.3 GB.

zipFile = matlab.internal.examples.downloadSupportFile( ...
    "image","data/camelyon17/tumor_065.zip");
filepath = fileparts(zipFile);
unzip(zipFile,filepath)

Specify the filename of the TIF file.

fileName = fullfile(filepath,"tumor_065.tif");

Create Blocked Image

Create a blockedImage object from the sample image. This image has 11 resolution levels. The finest resolution level, which is the first level, has a size of 220672-by-97792 pixels. The coarsest resolution level, which is the last level, has a size of 512-by-512 pixels and easily fits in memory.

bim = blockedImage(fileName);
disp(bim.NumLevels)
    11
disp(bim.Size)
      220672       97792           3
      110592       49152           3
       55296       24576           3
       27648       12288           3
       13824        6144           3
        7168        3072           3
        1577        3629           3
        3584        1536           3
        2048        1024           3
        1024         512           3
         512         512           3

Set the spatial referencing for the blocked image by using the setSpatialReferencingForCamelyon16 helper function, which is attached to this example as a supporting file. The helper function sets the WorldStart and WorldEnd properties of the blockedImage object using the spatial referencing information from the TIF file metadata.

bim = setSpatialReferencingForCamelyon16(bim);

Display the blocked image using the bigimageshow function.

bigimageshow(bim);

Test Cellpose Model in Representative Subregion

In this example, you use a representative region of interest to test a Cellpose model and tune its parameter values. You then apply the final model to all regions of the image that contain tissue.

Identify and display a representative region in the image.

xRegion = [52700 53000];
yRegion = [98140 98400];
xlim(xRegion)
ylim(yRegion)
title("Region of Interest")

Extract the region of interest at the finest resolution level by using the getRegion function.

imRegion = getRegion(bim,[98140 52700],[98400 53000]);

Measure Average Nuclei Diameter

To ensure accurate results, specify the approximate average diameter of the nuclei, in pixels. For this example, assume a diameter of 20 pixels.

averageCellDiameter = 20;

Alternatively, you can measure the approximate nucleus diameter in the Image Viewer app. Open the app by entering this command.

imageViewer(imRegion)

In the app, measure the nucleus of several cells to estimate the average diameter in the whole image. In the Viewer tab of the app toolstrip, click Measure Distance. In the main display pane, click on the boundary of the nucleus you want to measure and drag to draw a line across the nucleus. When you release the button, the app labels the line with a distance measurement, in pixels. You can reposition the endpoints or move the whole line by dragging the ends or middle portion of the line, respectively. This image shows an example of several distance measurements in the app.

Image Viewer app window showing several sample measurements of the nucleus diameter

Configure Nuclei Cellpose Model

Segment the image region by using the pretrained nuclei model from the Cellpose Library. This model is suitable for detecting nuclei. To learn more about other models and their training data, see the Cellpose Library documentation.

The nuclei model expects input images that have one channel in which the nuclei are bright relative to other areas. To use this model, convert the RGB input image to an inverted grayscale image.

img = im2gray(imRegion);
imgc = imcomplement(img);

Create a cellpose object for the nuclei pretrained model.

cp = cellpose(Model="nuclei");

Segment the input image using the Cellpose model. Specify the average diameter of the nuclei to the segmentCells2D function by using the ImageCellDiameter name-value argument. The function returns a 2-D label image that labels pixels of the background as 0 and the pixels of different nuclei as increasing values, starting at 1.

labels = segmentCells2D(cp,imgc,ImageCellDiameter=averageCellDiameter);

Display the original RGB image, grayscale image, inverted grayscale image, and RGB image overlaid with the predicted labels as a tiled montage. The predicted labels image shows a different color for each detected nucleus, corresponding to the value of the associated pixels in labels.

figure
tiledlayout(1,4,TileSpacing="none",Padding="tight")
nexttile
imshow(imRegion)
title("RGB Image")
nexttile
imshow(img)
title("Grayscale Image")
nexttile
imshow(imgc)
title("Inverted Image")
nexttile
imshow(labeloverlay(imRegion,labels))
title("Predicted Labels")

linkaxes(findobj(gcf,Type="axes"));

Interactively Explore Labels Across WSI

In the previous section, you configured a Cellpose model based on one representative region. Before you apply the model to the full WSI, which takes several minutes, verify that the model works well for other representative regions in the slide.

Copy, paste, and run this code in the Command Window to call the exploreCamelyonBlockedApply helper function, which is attached to this example as a supporting file. A figure window opens, showing the whole slide with a rectangular ROI on the left and zoomed in views of the ROI on the right. The function applies the model to only the current ROI inside the rectangle, which is faster than processing the full image.

exploreCamelyonBlockedApply(bim, ...
    @(bs)labeloverlay(bs.Data, ...
    segmentCells2D(cp,imcomplement(im2gray(bs.Data)), ...
    ImageCellDiameter=averageCellDiameter)), ...
    BlockSize=[256 256],InitialXY=[64511 95287])

In the figure window, use the scroll wheel to zoom in on the slide until the rectangular ROI is clearly visible. Drag the rectangle to move the ROI. When you release the button, the zoomed in views automatically update. For this example, you can see that the model works well because it has labelled the dark purple nuclei in the plane of the microscope, and the majority of touching nuclei have been correctly labeled as separate instances. If the model does not produce accurate results for your data, try tuning the model parameters. For more information, see the Refine Cellpose Segmentation by Tuning Model Parameters example.

Animation showing how to use the interactive window to explore labels across the WSI

Apply Cellpose Segmentation to Full WSI

Once you are satisfied with the Cellpose model in some representative regions, apply the model to the full WSI. For efficiency, apply the Cellpose model only to blocks that contain tissue, and ignore blocks without tissue. You can identify blocks with tissue using a coarse tissue mask.

Create Tissue Mask

In this example, you create a tissue mask by using the createBinaryTissueMask helper function, which is attached to this example as a supporting file. The code for the function has been generated by using the Color Thresholder app. This image shows the app window with the selected RGB color thresholds.

Color Thresholder app window showing tissue with RGB thresholds applied

Optionally, you can segment the tissue using a different color space, or select different thresholds by creating your own createBinaryTissueMask function. For example, to select different thresholds, open a low-resolution version of the image in the Color Thresholder app by entering this code in the Command Window. After you select new thresholds, select Export > Export Function from the app toolstrip. Save the generated function as createBinaryTissueMask.m.

iml9 = gather(bim,Level=9);
colorThresholder(iml9)

Use the apply function to run the createBinaryTissueMask function on each block and assemble a single blockedImage. This example creates the mask at the ninth resolution level, which fits in memory and follows the contours of the tissue better than a mask at the coarsest resolution level.

maskLevel = 9;
tissueMask = apply(bim,@(bs)createBinaryTissueMask(bs.Data),Level=maskLevel);

Display the mask overlaid on the original WSI.

figure
h = bigimageshow(bim);
showlabels(h,tissueMask,Colormap=[0 0 0; 0 1 0])
title("Tissue Mask")

Select Blocks with Tissue

Select the set of blocks inside the tissue mask by using the selectBlockLocations function. Include only blocks that have at least one pixel inside the mask by specifying the InclusionThreshold value as 0.

bls = selectBlockLocations(bim,BlockSize=[512 512], ...
    Masks=tissueMask,InclusionThreshold=0);

Configure Cellpose Model

Create a new cellpose object to process the full WSI using the nuclei pretrained model. Use the canUseGPU function to check whether a supported GPU and Parallel Computing Toolbox™ are available. By default, a cellpose object processes images on a GPU if one is available. If a GPU is not available, enable OpenVINO acceleration for best performance on the CPU.

if canUseGPU
    cp = cellpose(Model="nuclei");
else
    cp = cellpose(Model="nuclei",Acceleration="openvino");
end

Apply Cellpose Model

Use the apply function to run the labelNuclei helper function on all blocks in the tissue mask. The helper function, defined at the end of this example, segments each block using the Cellpose model and returns metrics about the segmentation labels. The helper manually normalizes all blocks to the same range before segmenting them. Specify the limits, gmin and gmax, as the minimum and maximum intensities from the original representative region img, respectively. The helper function uses an overlapping border between blocks to ensure it counts nuclei on the edges of a block only once. Specify a border size equal to 1.2 times the average nucleus diameter.

gmin = min(img,[],"all");
gmax = max(img,[],"all");

borderSize = round(1.2*[averageCellDiameter averageCellDiameter]);

if(exist("labelData","dir"))
    rmdir("labelData","s")
end

[bstats,bcount] = apply(bim, ...
    @(bs)labelNuclei(bs,cp,gmin,gmax,averageCellDiameter), ...
    Level=1, ...
    BorderSize=borderSize, ...
    BlockLocationSet=bls, ...
    PadPartial=true, ...
    OutputLocation="labelData", ...
    Adapter=images.blocked.MATBlocks, ...
    UseParallel=true);
WARNING: no mask pixels found

Explore Cellpose Results for WSI

Explore the results by creating a heatmap and histogram of nuclei statistics, and by verifying the results.

Display Heatmap of Detected Nuclei

View the results as a heat map of the nuclei count in each block. For more efficient display, downsample the count data before displaying it.

bcount.BlockSize = [50 50];
hb = bigimageshow(bim);
showlabels(hb,bcount,Colormap=flipud(hot(550)))
title("Heatmap of Nuclei Count Per Block")

Zoom in to examine a local region. The darker heatmap regions indicate higher nuclei density, indicating more cells per unit of area.

xlim([43000 47000])
ylim([100900 104000])

Display Histogram of Nucleus Area

Load the nuclei statistics into memory.

stats = gather(bstats);

Remove empty blocks that are part of the tissue mask, but do not contain cell nuclei.

emptyInd = arrayfun(@(x)isempty(x.Area),stats);
stats = stats(~emptyInd);

Concatenate the results from all blocks into vectors that contain the area, centroid coordinates, and bounding box for each nucleus.

areaVec = horzcat(stats.Area);
centroidsXY = vertcat(stats.CentroidXY);
boundingBoxedXYWH = vertcat(stats.BoundingBoxXYWH);

Create a histogram of nucleus area values, in square pixels.

figure
histogram(areaVec);
title("Histogram of Nucleus Area, " + "Mean: " + mean(areaVec) + " square pixels")

View Centroids and Bounding Boxes

Verify that nuclei within the overlapping border regions have been counted only one time. First, display a subregion of the WSI at the intersection of multiple blocks.

hb = bigimageshow(bim);
xlim([44970 45140])
ylim([98210 98380])
hold on

Display the grid lines between blocks.

hb.GridVisible = "on";
hb.GridAlpha = 0.5;
hb.GridColor = "g";
hb.GridLineWidth = 5;

Plot a marker at the centroid of each nucleus detected by the model. Make the size of each marker proportional to the area of the corresponding nucleus. The display shows that the Cellpose model successfully detects nuclei near block boundaries and does not double count them. If you observe double detections for your own image data, try increasing the value of the borderSize argument when applying the model using the apply function.

scatter(centroidsXY(:,1),centroidsXY(:,2),areaVec/10,MarkerFaceColor="w")

Add bounding boxes to validate that each Cellpose label correctly spans the full, corresponding nucleus.

xs = xlim;
ys = ylim;
inRegion = centroidsXY(:,1) > xs(1) ... 
    & centroidsXY(:,1) < xs(2) ...
    & centroidsXY(:,2) > ys(1) ...
    & centroidsXY(:,2) < ys(2);
bboxesInRegion = boundingBoxedXYWH(inRegion,:);
for ind = 1:size(bboxesInRegion,1)
    drawrectangle(Position=bboxesInRegion(ind,:));
end

Helper Functions

The labelNuclei function performs these operations to segment nuclei within a block of blocked image data:

  • Converts the input image to grayscale using im2gray, normalizes the intensities to the range [gmin gmax] using the rescale function, and inverts the image using imcomplement.

  • Segments cells using the segmentCells2D function with the cellpose object cp. The helper function disables automatic intensity normalization to avoid applying different normalization limits across blocks.

  • Computes statistics for the nuclei segmentation masks, including area, centroid, and bounding boxes, by using the regionprops function.

  • Removes data for nuclei in the border region by using the clearBorderData helper function, is defined in this section.

function [stats,count] = labelNuclei(bs,cp,gmin,gmax,ndia)
% Convert to grayscale
img = im2gray(bs.Data);
% Normalize
img = rescale(img,InputMin=gmin,InputMax=gmax);
% Invert
imgc = imcomplement(img);

% Detect
labels = segmentCells2D(cp,imgc,ImageCellDiameter=ndia,NormalizeInput=false);

% Compute statistics. Add additional properties if required.
% If you require pixel level masks for each cell, add "Image" to the vector in the second input argument. 
% Note that this increases the output directory size. 
rstats = regionprops(labels,["Centroid","Area","BoundingBox"]);

% Only retain stats and labels whose centroids are within this block
rstats = clearBorderData(rstats,bs);

% Flatten out the data into a scalar structure
stats.Area = [rstats.Area];
stats.CentroidXY = vertcat(rstats.Centroid);
stats.BoundingBoxXYWH = vertcat(rstats.BoundingBox);

if ~isempty(rstats)
    % Move to image coordinates (Start is in YX format, so flip it)
    offset = fliplr(bs.Start(1:2) - 1);
    stats.CentroidXY = stats.CentroidXY + offset;
    stats.BoundingBoxXYWH(:,1:2) = stats.BoundingBoxXYWH(:,1:2) + offset;
end

% Density count
count = numel(rstats);
end

The clearBorderData helper function removes statistics for nuclei in the overlapping border between blocks to prevent double-counting.

function rstats = clearBorderData(rstats,bs)
% Clear labels whose centroids are in the border area

fullBlockSize = size(bs.Data); % includes border pixels

% Go from (x,y) coordinates to (row,column) indices within the block
lowEdgeXY = fliplr(bs.BorderSize(1:2)+1);
highEdgeXY = fliplr(fullBlockSize(1:2)-bs.BorderSize(1:2));

% Find indices of centroids inside the current block (excludes the
% border region)
centroidsXY = reshape([rstats.Centroid],2,[]);
centroidsXY = round(centroidsXY);
inBlock = centroidsXY(1,:) >= lowEdgeXY(1) ... 
    & centroidsXY(1,:) <= highEdgeXY(1) ...
    & centroidsXY(2,:) >= lowEdgeXY(2) ...
    & centroidsXY(2,:) <= highEdgeXY(2);

% Filter stats to retain only inside block detections
rstats = rstats(inBlock);
end

References

[1] Stringer, Carsen, Tim Wang, Michalis Michaelos, and Marius Pachitariu. “Cellpose: A Generalist Algorithm for Cellular Segmentation.” Nature Methods 18, no. 1 (January 2021): 100–106. https://doi.org/10.1038/s41592-020-01018-x.

[2] Pachitariu, Marius, and Carsen Stringer. “Cellpose 2.0: How to Train Your Own Model.” Nature Methods 19, no. 12 (December 2022): 1634–41. https://doi.org/10.1038/s41592-022-01663-4.

See Also

| | | |

Related Topics

External Websites