MATLAB Answers

Is an image a 2D grid or a cube?

sparsh garg on 23 Sep 2021 at 12:16
Answered: Image Analyst on 23 Sep 2021 at 15:14
I am having an argument with my colleague about whether an image is a 3-dimensional object.
He says that, in a geometric sense, an image is a 2D square and not a cube.
But since an image has 3 components (height, width, and channels), I feel that it should be treated as a cube.
The following passage from cs231n supports my point:
For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter). Notice that the extent of the connectivity along the depth axis must be 3, since this is the depth of the input volume.
Moreover, before deep learning we had image segmentation techniques such as mean shift and the HOG feature descriptor, and they all relied on the assumption of an image being a cube. (I am just making that up.)
@Image Analyst, I would appreciate your take on this.
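
The weight count in the cs231n passage quoted above can be checked with a few lines of MATLAB. This is only a sketch: the variable names (inputVolume, filterSize, filterWeights) are made up for illustration, and the zero arrays merely stand in for a real CIFAR-10 image and a real filter.

```matlab
% Stand-in for a 32x32x3 RGB input volume (values do not matter here).
inputVolume = zeros(32, 32, 3);

filterSize = 5;                       % 5x5 receptive field
depth      = size(inputVolume, 3);    % the filter must span the full depth

% One filter is itself a small 3-D array of weights.
filterWeights = zeros(filterSize, filterSize, depth);

numWeights = numel(filterWeights);    % 5*5*3
numParams  = numWeights + 1;          % plus one bias parameter

fprintf('weights: %d, with bias: %d\n', numWeights, numParams);
```

The filter covers 5x5 locations in the 2-D grid and all 3 values along the depth axis, which is where the "volume" wording in the passage comes from.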
  2 Comments
sparsh garg on 23 Sep 2021 at 14:45
Ice comets will fall in the middle of the Sahara before I post a question on Stack Overflow. At least you and the people at other forums are open-minded enough to accept questions. There, a question is accepted only if the moderator likes it; otherwise:
"This question is closed/deleted as it's not related."


Accepted Answer

Jan on 23 Sep 2021 at 13:23
This is a question of taste. It depends on what you want to consider as elements of the array.
  • An image is a 2D object containing pixels with e.g. 8, 16, 24, or 32 bits. These bits can be represented e.g. as 3 bytes for a 24-bit image in UINT8 format.
  • An image can be seen as a 3D array if you consider the color values of e.g. the RGB channels as a position in the color space.
For MATLAB and other programming tools, an image is a pile of bytes plus some information about the structure, without any inherent meaning. The structure for storing the image data is chosen so that it can be processed efficiently. Whether the data are 2D or 3D is a matter of interpretation. Both views are valid.
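
To illustrate the two views, here is a minimal sketch that stores the same color data once as a 3-D UINT8 array and once as a 2-D grid of packed 24-bit pixels. The tiny 4x6 test image and the variable names are assumptions for the example, and the packing order (R in the high byte) is just one possible choice.

```matlab
% Small synthetic 4x6 RGB image, 3 bytes per pixel.
rgb = reshape(uint8(1:72), 4, 6, 3);
disp(ndims(rgb));                     % 3-D view: rows x columns x channels

% 2-D view: pack the three bytes of each pixel into one 24-bit value.
r = uint32(rgb(:, :, 1));
g = uint32(rgb(:, :, 2));
b = uint32(rgb(:, :, 3));
packed = bitor(bitor(bitshift(r, 16), bitshift(g, 8)), b);
disp(ndims(packed));                  % 2-D view: rows x columns

% Unpack again to show that no information was lost.
r2 = uint8(bitshift(packed, -16));
g2 = uint8(bitand(bitshift(packed, -8), uint32(255)));
b2 = uint8(bitand(packed, uint32(255)));
restored = cat(3, r2, g2, b2);
disp(isequal(restored, rgb));
```

Both arrays carry the same information. Only the layout differs, and with it the number of indexes needed to address a single color value.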

More Answers (1)

Image Analyst on 23 Sep 2021 at 15:14
It's a question of semantics. A gray scale image is a 2-D image in that it takes 2 values (x,y) or (column, row) to refer to a single pixel.
A volumetric image, like CT or MRI, is a 3-D volumetric image in that it takes 3 values (x,y,z) or (column, row, slice) to refer to a pixel, both programmatically and intuitively.
A color image is normally thought of (outside of a computer program) as a 2-D "thing" by most people. However, each location has 3 or more color values: three for RGB and more for hyperspectral images. So you can intuitively think of the color image as a single 2-D image, or as a stack of several 2-D images where each 2-D image represents one color channel. Of course, in coding/computer programming, if you have the color image as a single variable (instead of separate variables for each color channel), then you'll need 3 indexes to index into the array to specify a single pixel's value for that particular color at that particular location. So in that "coding" sense, a color image is a 3-D image. Well, it's at least certainly a 3-D variable, regardless of how you want to think about it in a layman's/non-computer sense.
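
The indexing difference described above could look like this in code. It is only a sketch: the small matrix from magic(4) stands in for a real grayscale image, and the color image is built artificially by stacking scaled copies of it as channels.

```matlab
% 2-D stand-in for a grayscale image.
gray = magic(4);

% Color image as a stack of three 2-D channel images.
rgb = cat(3, gray, 2*gray, 3*gray);

row = 2;
col = 3;

% MATLAB indexes as (row, column), so two indexes pick a gray pixel.
grayValue = gray(row, col);

% Three indexes are needed for one color value at one location.
redValue = rgb(row, col, 1);

% All color values at that location, as a 3x1 vector.
pixel = squeeze(rgb(row, col, :));

% A single channel is again an ordinary 2-D image.
greenChannel = rgb(:, :, 2);

disp(size(gray));           % 4 4
disp(size(rgb));            % 4 4 3
disp(size(greenChannel));   % 4 4
```

For a real image, the same indexing applies to the array returned by imread, assuming the file holds a truecolor (RGB) image.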
