A megapixel is 1 million pixels, and when used in terms of digital cameras, represents the maximum number of pixels which can be acquired by a camera’s sensor. In reality it conveys a sense of the image size which is produced, i.e. the image resolution. When looking at digital cameras, this can be somewhat confusing because there are different types of terms used to describe resolution.

For example the Fuji X-H1 has 24.3 megapixels. The maximum image resolution is is 6000×4000 or 24MP. This is sometimes known as the number of effective pixels (or photosites), and represents those pixels within the actual image area. However if we delve deeper into the specifications (e.g. Digital Camera Database), and you will find a term called sensor resolution. This is the total number of pixels, or rather photosites¹, on the sensor. For the X-H1 this is 6058×4012 pixels, which is where the 24.3MP comes from. The sensor resolution is calculated from sensor size and effective megapixels in the following manner:

Calculate the aspect ratio (r) between width and height of the sensor. The X-H1 has a sensor size of 23.5mm×15.6mm so r=23.5/15.6 = 1.51.

Calculate the √(no. pixels / r), so √(24300000/1.51) = 4012. This is the vertical sensor resolution.

Multiply 4012×1.51=6058, to determine the horizontal sensor resolution.

The Fuji X-H1 is said to have a sensor resolution of 24,304,696 (total) pixels, and a maximum image resolution of 24,000,000 (effective) pixels. So effectively 304,696 photosites on the sensor are not recorded as pixels, representing approximately 1%. These remaining pixels form a border to the image on the sensor.

So to sum up there are four terms worth knowing:

effective pixels/megapixels – the number of pixels/megapixels in an image, or “active” photosites on a sensor.

maximum image resolution – another way to describe the effective pixels.

total photosites/pixels – the total number of photosites on a sensor.

sensor resolution – another way to describe the total photosites on a sensor.

¹ Remember, camera sensors have photosites, not pixels. Camera manufacturers use the term pixels because it is easier for people to understand.

A lot of visual technology such as digital cameras, and even TVs are based on megapixels, or rather the millions of pixels in a sensor/screen. What is the resolution of the human eye? It’s not an easy question to answer, because there are a number of facets to the concept of resolution, and the human eye is not analogous to a camera sensor. It might be better to ask how many pixels would be needed to make an image on a “screen” large enough to fill our entire field of view, so that when we look at it we can’t detect pixelation.

Truthfully, we may never really be able to put an exact number on the resolution of the human visual system – the eyes are organic, not digital. Human vision is made possible by the presence of photoreceptors in the retina. These photoreceptors, of which there are over 120 million in each eye, convert electromagnetic radiation into neural signals. The photoreceptors consist of rods and cones. Rods (which are rod shaped) provide scotopicvision, are responsible for low-light vision, and are achromatic. Cones (which are con shaped) provide photopicvision, are active at high levels of illumination, and are capable of colour vision. There are roughly 6-7 million cones, and nearly 120-125 million rods.

But how many [mega] pixels is this equivalent to? An easy guess of pixel resolution might be 125 -130 megapixels. Maybe. But then many rods are attached to bipolar cells providing for a low resolution, whereas cones each have their own bipolar cell. The bipolar cells strive to transmit signals from the photoreceptors to the ganglion cells. So there may be way less than 120 million rods providing actual information (sort-of like taking a bunch of grayscale pixels in an image and averaging their values to create an uber-pixel). So that’s not a fruitful number.

A few years ago Roger M. Clark of Clark Vision performed a calculation, assuming a field of view of 120° by 120°, and an acuity of 0.3 arc minutes. The result? He calculated that the human eye has a resolution of 576 megapixels. The calculation is simple enough:

(120 × 120 × 60 × 60) / (0.3 × 0.3) =576,000,000

The value 60 is the number of arc-minutes per degree, and the 0.3 arcmin² is essentially the “pixel” size. A square degree is then 60×60 arc-minutes, and contains 40,000 “pixels”. Seems like a huge number. But, as Clark notes, the human eye is not a digital camera. We don’t take snapshots (more’s the pity), and our vision system is more like a video stream. We also have two eyes, providing stereoscopic and binocular vision with the ability of depth perception. So there are many more factors than available in a simple sensor. For example, we typically move our eyes around, and our brain probably assembles a higher resolution image than is possible using our photoreceptors (similar I would imagine to how a high-megapixel image is created by a digital camera, slightly moving the sensor, and combining the shifted images).

The issue here may actually be the pixel size. In optimal viewing conditions the human eye can resolve detail as small as 0.59 arc minutes per line pair, which equates to 0.3 arc minutes. This number comes from a study from 1897 – “Die Abhängigkeit der Sehschärfe von der Beleuchtungsintensität”, written by Arthur König (translated roughly to “The Dependence of Visual Acuity on the Illumination Intensity”). A more recent study from 1990 (Curcio90) suggests a value of 77 cycles per degree. To convert this to arc-minutes per cycle, we first divide 1 by 77 and then multiply by 60 = 0.779. Two pixels define a cycle, so 0.779/2 = 0.3895, or 0.39. Now if we use 0.39×0.39 arcmin as the pixel size, we get 6.57 pixels per arcmin², versus 11.11 pixels when the acuity is 0.3. This vastly changes the value calculated to 341megapixels (60% of the previous calculation).

Clark’s calculation using 120° is also conservative, as the eyes field of view is roughly 155° horizontally, and 135° vertically. If we used these constraints we would get 837 megapixels (0.3), or 495 megapixels (0.39). The pixel size of 0.3 arcmin² is optimal viewing – but about 75% of the population have 20/20 vision, both with and without corrective measures. 20/20 vision implies an acuity of 1 arc minute, which means a pixel size of 1×1 arcmin². This could mean a simple 75 megapixels. There are three other factors which complicate this: (i) these calculations assume uniform optimal acuity, which is very rarely the case, (ii) vision is binocular, not monocular, and (iii) the field of view is likely not a rectangle.

For binocular vision, assuming each eye has a horizontal field of view of 155°, and there is an overlap of 120° (120° of vision from each eye is binocular, remaining 35° in each eye is monocular). This results in an overall horizontal field of view of 190°, meaning if we use 190°, and 1 arc minute acuity we get a combined total vision of 92 megapixels. If we change acuity to 0.3 we get over 1 gigapixel. Quite a range.

All these calculations are mere musings – there are far too many variables to consider in trying to calculate a generic number to represent the megapixel equivalent of the human visual system. The numbers I have calculated are approximations only to show the broad range of possibilities based solely on a few simple assumptions. In the next couple of posts we’ll look at some of the complicating factors, such as the concept of uniform acuity.