Abstract
Scientists have long tried to understand the beauty of painting in their own languages. With rapid progress in the digital acquisition of paintings, it is now possible to perform statistical analysis of a large-scale database of artistic paintings and so build a bridge between art and science. Using digital image processing techniques, we investigate three quantitative measures of images: the usage of individual colors, the variety of colors and the roughness of the brightness. We find a difference in color usage between classical paintings and photographs, and a significantly low color variety in the medieval period. Moreover, the roughness exponent increases as painting techniques such as chiaroscuro and sfumato advance, which is consistent with historical circumstances.
Introduction
Humans have expressed physical experiences and abstract ideas in artistic paintings such as cave paintings, frescos in cathedrals and even graffiti on city walls. Such paintings, to convey intended messages, consist of three fundamental building blocks: points, lines and planes. Recent studies have shed light on interesting mathematical patterns between these building blocks in paintings.
Artistic styles have been analyzed with various statistical techniques, such as fractal analysis^{1}, a wavelet-based technique^{2}, the multiresolution hidden Markov method^{3}, a Fisher-kernel-based approach^{4} and the sparse coding model^{5,6}. Recently, similar methods have also been applied to other forms of cultural heritage, such as literature^{7,8,9,10} and music^{11,12,13,14}. Such quantitative analysis is called “stylometry,” a term that originates from the analysis of literature to identify characteristic literary styles^{9}.
In this study, we add a new dimension to the body of stylometry studies by analyzing a large-scale database of artistic paintings. With digital image processing techniques, we quantify the change in the variety of painted colors and their spatial structures over ten historical periods of western painting (medieval, early renaissance, northern renaissance, high renaissance, mannerism, baroque, rococo, neoclassicism, romanticism and realism), from the 11th century to the mid-19th century. Digital images of the paintings were obtained from the Web Gallery of Art^{15}, a searchable database of European paintings and sculptures consisting of over 29,000 pieces dating from 1000 to 1850. Most of the identifiable images include information on schools, periods and artists, and their resolution is good enough for statistical analysis.
Here we focus on three quantities: the usage of each color, the variety of painted colors and the roughness of the brightness of images. First, we count how often a certain color appears in a painting for each period. From the frequency histogram, we find a clear difference between classical paintings and photographs. Next, we measure the fractal dimension of the painted colors of each period in a color space, which can be regarded, by analogy, as reflecting the color ‘palette’ of that period. Interestingly, the fractal dimension of the medieval period is lower than that of the other periods. The detailed results and our inferences are discussed in the Results section. Last, we consider how rough or smooth an image is in terms of its brightness. To quantify the roughness of brightness, we apply the roughness exponent, a well-known measure in statistical physics. We find that the roughness exponent increases gradually over the 10 periods, which is consistent with historical circumstances such as the birth of new painting techniques like chiaroscuro and sfumato^{16,17}. (Chiaroscuro and sfumato are major painting techniques developed and widely used during the Renaissance period. The compound word chiaroscuro is formed from the Italian words chiaro (light) and oscuro (dark) and refers to an artistic technique that delineates tonal contrasts and voluminous objects with a dramatic use of light. Leonardo da Vinci (1452–1519) and Michelangelo Merisi da Caravaggio (1571–1610) are precursors of chiaroscuro, and Rembrandt van Rijn (1606–1669) is a representative artist well known for his use of it. The Italian word sfumato is derived from the Italian term fumo, which literally means “smoke”. Leonardo da Vinci described sfumato as a blending of colors without lines or borders, in the manner of smoke or beyond the focus plane. In other words, sfumato is a painting technique that expresses a gradual fade-out between object and background, avoiding harsh outlines.) Analyzing these three properties, we propose new approaches to quantitatively analyze a large-scale database of paintings. Applying our method to Jackson Pollock's controversial drip paintings, we infer that they are quite different from the works of other painters.
Results
Chromospectroscopy
First we investigate how many different colors appear in a painting and how often each color is used, in a manner similar to Zipf's plot for word frequencies in literature^{18}. We call this analysis “chromospectroscopy”; a color is to a painter what a word is to a writer. As an example of chromospectroscopy, Fig. 1a displays the fraction of each color used in a painting in descending rank order. If each color were chosen from a palette uniformly at random, the frequency of each color would follow a binomial distribution (see the supplement for more detail). Because a rank plot is the inverse of the cumulative distribution function, the rank plot would then follow the inverse of the regularized incomplete beta function^{19} (see black dots in Fig. 1a). Interestingly, however, the rank-ordered color-usage distribution (RCD) shows a long-tailed distribution, which differs from the inverse of the regularized incomplete beta function (see Fig. 1a).
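The random-palette null model described above can be written out explicitly. The following is only a sketch, in notation introduced here for illustration (N pixels, M palette colors, p = 1/M); these symbols are assumptions and do not appear in the original text.

```latex
% Assumed notation: N = number of pixels, M = number of palette colors, p = 1/M.
% Number of pixels K that receive one given color:
\Pr[K = k] = \binom{N}{k}\, p^{k} (1-p)^{N-k}, \qquad p = \frac{1}{M}.
% A color used k times has, on average, a rank equal to the
% number of colors that are used at least k times:
r(k) \approx M \,\Pr[K \ge k] = M\, I_{p}(k,\; N-k+1), \qquad k \ge 1,
```

where I_{p}(a, b) is the regularized incomplete beta function. Inverting r(k) gives the expected usage as a function of rank, which is the curve that a randomly colored image would be expected to follow.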
Figure 1b shows RCDs for the 10 periods of European art history and for photographs. The RCD of a period represents how many colors are used and how often a specific color appears during the period. All periods of painting show a universal distribution curve, but the rank of each color differs from period to period. The RCD of photographs is similar to that of paintings in the initial power-law part, but its exponential tail deviates significantly from that of paintings, as shown in Fig. 1b. To clarify the difference between paintings and photographs in the tail section of the RCD, we analyze RCDs of photographs after applying several painting filters from popular software. The tail of the distribution changes clearly only when the oil painting filter is applied. An oil painting filter usually has two parameters, range and level, which are related to the size of the paint brush and the smearing intensity. These two parameters appear to influence the shape of the exponential tail of the RCD. Another interesting fact is that there is no clear difference between the RCDs of photographs and those of hyperrealism paintings, which are drawn extremely finely with the aid of a microscope and are hard to distinguish from photographs with the unaided eye (see Figure S4b in the supplement). This suggests that paintings are quantitatively distinguished from photographs only by the tail section of the RCD, which represents the frequency of noisy colors or the level of detail in the image.
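As a minimal sketch of how a rank-ordered color-usage distribution could be computed, the following Python snippet counts identical RGB triples and sorts their fractions in descending order. The function name and the toy pixel list are hypothetical and chosen for illustration; they are not taken from the original analysis.

```python
from collections import Counter

def rank_ordered_color_distribution(pixels):
    """Return the fraction of pixels for each distinct color, in descending rank order.

    `pixels` is a flat sequence of (R, G, B) tuples (an assumed input format).
    """
    counts = Counter(pixels)
    total = len(pixels)
    # most_common() sorts the colors from the most to the least frequent.
    return [n / total for _, n in counts.most_common()]

# Toy "image": six pixels drawn from three colors.
toy_pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)] * 2 + [(0, 255, 0)]
rcd = rank_ordered_color_distribution(toy_pixels)
```

Plotting `rcd` against rank 1, 2, 3, ... on double-logarithmic axes would give a curve of the kind described for Fig. 1.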
Fractal pattern and color palette
RCDs for all periods of paintings show quite universal distribution curves. However, the most commonly painted color is different for each period. To characterize the variety of colors more quantitatively, while ignoring their individual frequencies, we investigate the fractal pattern of the painted colors in the RGB color space for each period.
To examine the fractal characteristics of the painted colors for each period, we measure the box-counting dimension^{20} of the paintings in the RGB color space and compare the results with those for two iconoclastic artists: Pieter Bruegel the Elder and Jackson Pollock. Each color used in a painting is plotted as a point in the RGB color space. Following the definition of the box-counting dimension, we iteratively change the side length of the box ε from ε = 1 to ε = 32 and count the number of non-empty boxes. A non-empty box indicates that at least one color within the box is used in the painting. If the distribution of colors in the color space is homogeneous, the box-counting dimension is 3. If the box-counting dimension is less than 3, the distribution in the color space is heterogeneous and fractal, which means that some axes are preferred or that the distribution follows a preferred color scheme. In this sense, the box-counting dimension quantifies the spatial uniformity or fractality of the painted colors for each artistic period.
Figure 2a shows that the box-counting dimensions of paintings from the 10 historic periods lie between 2.6 and 2.8, except for the medieval period. As Fig. 2b shows, only the box-counting dimension of the medieval period is close to that of Jackson Pollock's drip paintings (below 2.4), in which he intentionally used a limited set of colors. In addition, the box-counting dimension for the paintings of Pieter Bruegel the Elder is approximately 2.55. A low box-counting dimension indicates a strong preference for a small number of selected colors in the medieval age. That is, the color palette of the medieval age is significantly different from that of the other periods.
The reason why the box-counting dimensions for the medieval age and for Jackson Pollock differ from the others can be found in historical facts. First, specific rare pigments were preferred for political and religious reasons in the medieval age despite their high cost. Second, physical mixing of different pure colors was not practiced in that period, owing to the tendency to emphasize the purity of colors and of the materials themselves. Instead, artists in the Middle Ages recoated a colored canvas to represent various colors. The drip paintings of Jackson Pollock are likewise formed by layering a dripping pattern of each single color on top of other layers, and the number of colors used is smaller than in other western paintings before the 20th century. Furthermore, oil colors and color mixing techniques were not fully developed until the Renaissance. The introduction of new expressive tools, such as pastels and fingers, and of painting techniques, such as chiaroscuro and sfumato, made much more colorful and natural expression possible after the Renaissance period^{21}. The difference in fractal dimension between the medieval period and the other periods may quantitatively reflect these historical facts and the differences in painting technique in art history.
Spatial renormalization and fixed point analysis
In the RGB color space, each painting has its own set of scattered color points. To analyze the characteristics of color usage while taking the variety of colors in a painting into account, we define three representative points in the RGB color space. The first is the center of usage frequency (CM), the analogue of the center of mass in physics: it is calculated from the usage frequency and the position of each color in the color space, just as the center of mass of a physical object is calculated from masses and positions. The second is the fixed point of the painting (FP), a concept borrowed from real-space renormalization in physics: when a painting is resized repeatedly, it eventually becomes a single pixel, and that pixel is the FP. The third is the fixed point of the randomized painting (SFP), which is obtained in the same way as the FP except that the pixels of the painting are shuffled first. If the spatial information of the scattered colors is irrelevant, FP and SFP would not be significantly different. Note that the center of mass of a shuffled image (SCM) is the same as the original CM. The two vectors d_{1} and d_{2}, pointing from CM to FP and from CM to SFP, respectively, can then be compared to quantify the randomness of the spatial arrangement of the colors in a painting. If d_{1} and d_{2} are similar, either the colors used in the painting are not diverse or their spatial arrangement is close to random. Figure 3c suggests that the color arrangement of Jackson Pollock's drip paintings is quite different from that of other paintings, showing that Pollock's work is quite random, especially in the spatial arrangement of colors. In contrast, the two fixed points of Pieter Bruegel the Elder's paintings are far from each other.
Surface roughness and brightness contrast
In the first two subsections we focused mainly on the usage of colors and ignored their spatial arrangement. However, as the preceding renormalization analysis shows, the spatial correlation of colors is also important for understanding the artistic style of paintings, because a painting is a composition of colors in their proper places. The spatial arrangement of colors makes various artistic effects possible. For example, contrast is an important element for expressing shape and space in the two-dimensional fine arts. Among the various types of contrast, brightness contrast is the most important in art history, owing to the European cultural background, which often adopts the contrast of light and darkness as a metaphorical expression. In this subsection, taking both the color information of pixels and their spatial arrangement into account, we examine the prevalence of brightness contrast in European paintings over the 10 artistic periods.
To quantify brightness contrast, we use the two-point height difference correlation (HDC) function and its roughness exponent α, which is obtained from the slope of the HDC curve in a double-logarithmic plot, as in surface growth models in statistical physics^{22}. First we obtain the grayscale brightness from the RGB color information through a weighted transformation (see Methods) and define the “brightness surface” of an image by taking the brightness of each pixel as the height at that position, as shown in Fig. 4a and b. A three-dimensional surface, like a deep-pile carpet, is thus obtained from the two-dimensional painting, and the HDC is calculated on it as a function of distance r. This method is widely used in condensed matter and statistical physics to analyze the roughness of a growing surface, for example a semiconductor surface grown by chemical deposition^{22}. For comparison, a shuffled image, in which the pixel positions are randomly permuted, is analyzed as well.
As shown in Fig. 4a and b, since the brightness of a point is defined as its height, the height difference between two points represents their brightness difference. The two-point HDC of a randomly shuffled painting is displayed as blue dots in Fig. 4c and d for comparison. The roughness exponent α of a randomized image is 0, since no spatial correlation remains. Figure 4d shows an example of a Jackson Pollock drip painting, which is hard to distinguish from a randomly shuffled painting when only the spatial correlation is considered. The roughness exponent of Jackson Pollock's drip paintings is very small compared to that of other European paintings.
Since the HDC describes the spatial correlation between color pixels on a surface as a function of distance, the slope of the HDC function, i.e., the roughness exponent α, reflects how the average brightness difference, and thus the contrast effect, changes with distance. Figure 5a shows that the roughness exponent α gradually increases over the 10 artistic periods, which is consistent with historical circumstances. First, the increasing tendency of α is related to changes in painting techniques and genres, such as the shift from portraits to landscapes. In the history of western art, many new painting techniques were developed and spread during the Renaissance period. For example, chiaroscuro, one of the canonical painting modes of the Renaissance^{16}, is characterized by strong contrasts between light and shade. The roughness exponent and the HDC capture both the level of brightness and the relative spatial position. Hence, the roughness exponent α of a painting could serve as a quantitative indicator of the chiaroscuro technique, and its increase over the artistic periods reflects the spread of chiaroscuro across the continent^{21}. In addition, the Renaissance art movement led painting genres to become more diverse, and more portraits and landscape paintings were encouraged. Large objects in paintings, such as a torso (the upper body in a portrait) or mountains and sky in a landscape, decrease the brightness difference over short distances but make the HDC increase more strongly as the distance increases^{21}. Therefore, the historical innovation in painting techniques and the diversification of painting genres are clearly captured by the increasing tendency of the roughness exponent α.
As another example, sfumato is a major painting mode developed in the Renaissance period to express vanishing or shading around objects in a painting^{17}. Smoothing the edges of objects decreases the variance of brightness, because it does not allow abrupt changes at boundaries. In this case, image entropy^{23}, which reflects the variance of brightness in a local neighborhood, would be a good measure of the sfumato technique. Since the variance is inversely related to homogeneity, the image entropy describes the level of local homogeneity of brightness in a painting.
Figure 5b shows that the image entropy H increases up to Neoclassicism and then decreases. This behavior is somewhat different from that of the roughness exponent, because the image entropy considers only the local complexity of the color gradient around a pixel, whereas the roughness exponent also takes into account brightness differences over long distances. We think that the different behaviors of these two measures may reflect a tendency in which the chiaroscuro technique was still developing while sfumato declined, perhaps because mysterious expression was being rejected in favor of realistic expression.
Discussion
From the analysis of a large-scale archive of European painting images, we show that the chromospectroscopy of the 10 art historical periods yields a universal distribution curve that distinguishes paintings from photographs. Additionally, fractal analysis allows us to rediscover the expansion of the color palette after the medieval period, which is consistent with the fact that the color palette of the medieval age was relatively narrow compared to other periods because of historical circumstances. Furthermore, we measure the roughness exponent and the image entropy of brightness surfaces over the 10 art historical periods. We find that these mathematical measurements quantitatively describe the birth of new painting techniques and their increasing use. Our approaches thus provide quantitative indicators that reflect historical developments in artistic style. Applying them, we can infer that Jackson Pollock's drip paintings are not typical artworks, although they are, of course, still controversial in the art world.
Our approaches have several limitations, which also suggest directions for future work. First, although the database is quite large, our dataset does not cover all paintings of the 10 art historical periods. For this reason, our results may contain sampling bias that we have not yet identified. For better statistics, analyzing much larger (higher-resolution) images, such as those of the Google Art Project^{24}, should give more concrete insight into artistic style. Another possible source of error is unintended color distortion when original paintings are converted into digital images, which may cause loss of or bias in color information. Although we have checked that our results do not change significantly under artificial reductions of color quality, we could not examine all possible distortion effects. It is also true that the present colors of the paintings differ from the original ones at the time of completion. Old paintings are hard to preserve and usually suffer from degradation of their physical materials, such as oxidation and corrosion. These remain major issues not only for this study but for all stylometric analyses in the arts. Nonetheless, we expect our quantitative study to help bridge the gap between art and science.
Methods
Source of dataset and statistics of paintings
In this study, we analyzed the digital images of European paintings in the Web Gallery of Art, which exhibits artworks ranging from the 11th century to the mid-19th century^{15}. The European paintings are classified into 10 art historical periods: medieval, early renaissance, northern renaissance, high renaissance, mannerism, baroque, rococo, neoclassicism, romanticism and realism. We filtered out non-painting images, such as sculptures, miniatures, illustrations, architecture, pottery, glass paintings and wares. The number of remaining images for each period is summarized in SI Table S1. In total we analyzed 8,798 paintings. As shown in Fig. S1, over 94% of the images are larger than 700 × 700 pixels, and the largest one is 1350 × 1533 pixels. Therefore, the quality of the images is good enough for statistical analysis. Furthermore, to discuss the difference between paintings and photographs, we collected additional datasets of hyperrealism paintings and photographs: 105 hyperrealism images from hyperrealism artists' web sites^{25,26,27,28,29,30,31}, the largest of which is 2974 × 1954 pixels, and two sets of photographs, one from the official Instagram site of National Geographic^{32} and one from the online photo gallery of a Korean portal site^{33}.
Box-counting dimensions
In order to investigate the fractal patterns of painted colors in the RGB color space, we measured box-counting dimensions^{20}. The box-counting dimension is defined as follows:
$$d_{\mathrm{box}} = \lim_{\varepsilon \to 0} \frac{\log N(\varepsilon)}{\log (1/\varepsilon)}$$
where N(ε) is the number of non-empty boxes and ε is the side length of each box. The value of ε represents the color quality in digitized units: for example, ε = 1 corresponds to 256^{3} possible colors in the 24-bit RGB color system, and ε = 32 corresponds to 8^{3} possible colors in a 9-bit RGB color system. In general, each ε value corresponds to a log_{2}(256/ε)^{3}-bit RGB color system. Varying ε over 32, 16, 8, 4, 2 and 1 (see Figure S6 in the supplement) and examining N(ε) for each ε, we measured d_{box}(ε).
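The box-counting procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: colors are given as integer (R, G, B) tuples in the range 0 to 255, and the dimension is estimated as a single least-squares slope of log N(ε) against log(1/ε) over all six ε values, whereas the text measures d_{box}(ε) and may have used a different fitting procedure. The function names are hypothetical.

```python
import math

def box_count(colors, eps):
    """Number of non-empty boxes of side `eps` in the 256 x 256 x 256 RGB cube."""
    return len({(r // eps, g // eps, b // eps) for r, g, b in colors})

def box_counting_dimension(colors, eps_values=(1, 2, 4, 8, 16, 32)):
    """Least-squares slope of log N(eps) versus log(1/eps) (assumed estimator)."""
    xs = [math.log(1.0 / eps) for eps in eps_values]
    ys = [math.log(box_count(colors, eps)) for eps in eps_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Toy check: colors restricted to the gray diagonal form a line in color space.
gray_line = [(v, v, v) for v in range(256)]
d_line = box_counting_dimension(gray_line)
```

For the gray diagonal the estimate is 1, and for a fully filled plane of the cube it is 2; a palette that fills the cube homogeneously would approach 3, the upper limit mentioned in the Results.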
Grayscale transformation
To consider brightness surfaces of images, we converted digital color images into grayscale images using the following weighted filter:
where R, G and B are the red, green and blue intensities of a pixel and I_{grayscale} is the brightness of a certain color, which is interpreted as a height on the image. The weights differ because of the color sensitivity of the human eye^{34}; several other weighting filters for the R, G and B intensities exist for specific purposes. However, the results did not differ significantly between filters.
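A weighted grayscale conversion of this kind might look as follows. The specific weights are an assumption: the sketch uses the widely used ITU-R BT.601 luma coefficients (0.299, 0.587, 0.114), which may not be exactly the values used in the study, and the function name is hypothetical.

```python
def to_grayscale(pixel, weights=(0.299, 0.587, 0.114)):
    """Weighted brightness of an (R, G, B) pixel.

    The default weights are the ITU-R BT.601 luma coefficients, assumed here
    for illustration; green is weighted most because the eye is most sensitive to it.
    """
    r, g, b = pixel
    w_r, w_g, w_b = weights
    return w_r * r + w_g * g + w_b * b

# Pure green appears brighter than pure red, which appears brighter than pure blue.
brightness_green = to_grayscale((0, 255, 0))
brightness_red = to_grayscale((255, 0, 0))
brightness_blue = to_grayscale((0, 0, 255))
```

Passing a different `weights` tuple makes it easy to repeat an analysis with another filter, in the spirit of the robustness check mentioned above.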
Two-point height difference correlation function
To measure the roughness exponents of brightness (height) surfaces, a two-point height difference correlation (HDC) function is calculated^{22}. The definition is
$$G(r) = \overline{[h(\mathbf{x}) - h(\mathbf{x}')]^{2}} = \frac{1}{N_{r}} \sum_{|\mathbf{x}-\mathbf{x}'| = r} [h(\mathbf{x}) - h(\mathbf{x}')]^{2}$$
which follows the simple scaling form G(r) ~ r^{2α} for small r. Here r is the distance between two pixel points, the overbar represents the spatial average over all possible points at a fixed distance r, N_{r} is the number of possible pairs at distance r, h(x) is the height at a point x (0 ≤ h(x) ≤ 255) and α is the roughness exponent. The roughness exponent was measured in a double-logarithmic plot of G versus r, with a fitting range from r_{a} = 10 to r_{b}, where the HDC saturates to the same value for both the original and the randomized paintings. The upper limit r_{b} corresponds approximately to 30% of the image width and to the square root of 9% of the image area.
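The HDC measurement can be illustrated with a short Python sketch. It makes two simplifying assumptions that are not taken from the text: pixel pairs are sampled only along rows and columns rather than in all directions, and the exponent is obtained from a plain least-squares fit of log G against log r over a user-supplied list of distances. The function names and the toy surface are hypothetical.

```python
import math
import random

def hdc(height, r):
    """Mean squared height difference of pixel pairs separated by r along rows or columns."""
    rows, cols = len(height), len(height[0])
    total, pairs = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            if x + r < cols:  # horizontal pair
                total += (height[y][x] - height[y][x + r]) ** 2
                pairs += 1
            if y + r < rows:  # vertical pair
                total += (height[y][x] - height[y + r][x]) ** 2
                pairs += 1
    return total / pairs

def roughness_exponent(height, r_values):
    """alpha from the slope of log G(r) versus log r, using G(r) ~ r^(2 alpha).

    Assumes G(r) > 0 for every r in r_values (a perfectly flat surface would fail).
    """
    xs = [math.log(r) for r in r_values]
    ys = [math.log(hdc(height, r)) for r in r_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope / 2.0

# Toy surface: a smooth brightness ramp, and the same pixels randomly shuffled.
size = 32
ramp = [[x + y for x in range(size)] for y in range(size)]
flat = [h for row in ramp for h in row]
random.seed(0)
random.shuffle(flat)
shuffled = [flat[i * size:(i + 1) * size] for i in range(size)]

alpha_ramp = roughness_exponent(ramp, (1, 2, 4, 8))
alpha_shuffled = roughness_exponent(shuffled, (1, 2, 4, 8))
```

For the linear ramp G(r) = r^{2}, so `alpha_ramp` is 1, while `alpha_shuffled` should be close to 0 because shuffling destroys the spatial correlation, as described for the randomized paintings.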
Image entropy
The entropy of a grayscale image^{23} is given by the following equation:
where p(x) = h(x)/S, h(x) is the height at a point x of the brightness surface (0 ≤ h(x) ≤ 255) and S is the sum of all height values in the image, used for normalization. The weighting factor m(x) is given by m(x) = 1+σ^{2}(x), where the local height variance σ^{2}(x) is calculated over the pixel at position x and its surrounding neighbor pixels. Since this image entropy depends on the image size, all images were resized to 500 × 500 pixels with the Lanczos algorithm before the image entropy was measured.
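A variance-weighted entropy built from the quantities defined above could be sketched as follows. The exact form of the sum is an assumption: the sketch uses H = −Σ m(x) p(x) log_{2} p(x), with the local variance taken over the 3 × 3 neighborhood of each pixel (clipped at the image border). The logarithm base, the neighborhood size and the function names are illustrative choices, not details confirmed by the text.

```python
import math

def local_variance(height, y, x):
    """Variance of the heights in the 3 x 3 neighborhood of (y, x), clipped at borders."""
    rows, cols = len(height), len(height[0])
    values = [height[j][i]
              for j in range(max(0, y - 1), min(rows, y + 2))
              for i in range(max(0, x - 1), min(cols, x + 2))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def image_entropy(height):
    """Assumed form: H = -sum_x m(x) p(x) log2 p(x), with m(x) = 1 + local variance."""
    total_height = sum(sum(row) for row in height)
    entropy = 0.0
    for y, row in enumerate(height):
        for x, h in enumerate(row):
            if h == 0:
                continue  # p log p tends to 0 as p tends to 0
            p = h / total_height
            entropy -= (1.0 + local_variance(height, y, x)) * p * math.log2(p)
    return entropy

# Toy check: a perfectly flat 4 x 4 image has zero local variance everywhere.
flat_image = [[10] * 4 for _ in range(4)]
h_flat = image_entropy(flat_image)
```

For the flat toy image every p(x) equals 1/16 and every m(x) equals 1, so the result reduces to the ordinary Shannon entropy log_{2} 16 = 4. Any local brightness variation raises m(x) and therefore the weighted entropy.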
References
1. Taylor, R., Micolich, A. & Jonas, D. Fractal analysis of Pollock's drip paintings. Nature 399, 422 (1999).
2. Lyu, S., Rockmore, D. N. & Farid, H. A digital technique for art authentication. Proc. Natl. Acad. Sci. U.S.A. 101, 17006–17010 (2004).
3. Johnson, C. R., Jr et al. Image processing for artist identification–computerized analysis of Vincent van Gogh's painting brushstrokes. IEEE Signal Proc. Mag. Special Issue on Visual Cultural Heritage 25, 37–48 (2008).
4. Bressan, M., Cifarelli, C. & Perronnin, F. An analysis of the relationship between painters based on their work. 15th IEEE Int. Conf. Image Proc. 113–116 (2008).
5. Olshausen, B. A. & DeWeese, M. R. Applied mathematics: The statistics of style. Nature 463, 1027–1028 (2010).
6. Hughes, J. M., Graham, D. J. & Rockmore, D. N. Quantification of artistic style through sparse coding analysis in the drawings of Pieter Bruegel the Elder. Proc. Natl. Acad. Sci. U.S.A. 107, 1279–1283 (2010).
7. De Morgan, S. E. Memoir of Augustus de Morgan, by his wife Sophia Elisabeth de Morgan, with Selection of his Letters (Longmans, London, 1882).
8. Lutostowski, W. The Origin and Growth of Plato's Logic (Longmans, Green, London, 1897).
9. Holmes, D. I. & Kardos, J. Who was the author? An introduction to stylometry. Chance 16, 5–8 (2003).
10. Hughes, J. M., Foti, N. J., Krakauer, D. C. & Rockmore, D. N. Quantitative patterns of stylistic influence in the evolution of literature. Proc. Natl. Acad. Sci. U.S.A. 109, 7682–7686 (2012).
11. Manaris, B. et al. Zipf's law, music classification and aesthetics. Comput. Music J. 29, 55–69 (2005).
12. Huron, D. The ramp archetype: A study of musical dynamics in 14 piano composers. Psychol. Music 19, 33–45 (1991).
13. Casey, M., Rhodes, C. & Slaney, M. Analysis of minimum distances in high-dimensional music spaces. IEEE Trans. Speech Audio Proc. 16, 1015–1028 (2008).
14. Sapp, C. Hybrid numeric/rank similarity metrics for musical performances. Proc. ISMIR 99, 501–506 (2008).
15. Krén, E. & Marx, D. Web Gallery of Art, image collection, virtual museum, searchable database of European fine arts (1000-1900) http://www.wga.hu/ (1996) Date of access 09/05/2009.
16. The National Gallery, London: Western European painting 1250–1900 http://www.nationalgallery.org.uk/paintings/glossary/chiaroscuro Date of access 03/04/2013.
17. Earls, I. Renaissance Art: A Topical Dictionary (Greenwood Press, 1987).
18. Zipf, G. K. Human Behaviour and the Principle of Least Effort: An Introduction to Human Ecology (Addison-Wesley Press, Cambridge, 1949).
19. Abramowitz, M. & Stegun, I. A. Handbook of Mathematical Functions: with Formulas, Graphs and Mathematical Tables (Dover Publications, New York, 1965).
20. Gouyet, J. F. & Mandelbrot, B. Physics and Fractal Structures (Springer-Verlag, New York, 1996).
21. Gage, J. Color and Meaning: Art, Science and Symbolism (University of California Press, Berkeley, 2000).
22. Barabási, A. L. & Stanley, H. E. Fractal Concepts in Surface Growth (Cambridge University Press, Cambridge, 1995).
23. Brink, A. D. Using spatial information as an aid to maximum entropy image threshold selection. Pattern Recogn. Lett. 17, 29–36 (1996).
24. Google Inc. Google Cultural Institute http://www.google.com/culturalinstitute/project/artproject (2011) Date of access 14/02/2011.
25. Jacques Bodin gallery http://www.jacquesbodin.com/ Date of access 07/04/2010.
26. Roberto Bernardi http://www.robertobernardi.com/ Date of access 07/04/2010.
27. Raphaella Spence http://www.raphaellaspence.com/ Date of access 07/04/2010.
28. Hubert de Lartigue Accueil http://www.hubertdelartigue.com/ Date of access 07/04/2010.
29. Gus Heinze on artnet http://www.artnet.com/artist/26318/gusheinze.html Date of access 07/04/2010.
30. Bernardo Torrens http://www.bernardotorrens.com/ Date of access 07/04/2010.
31. The Hyperrealism Paintings by Denis Peterson http://www.denispeterson.com/ Date of access 07/04/2010.
32. National Geographic official Instagram. http://instagram.com/natgeo (1999) Date of access 20/08/2014.
33. Naver, officially launched in 1999, was the first Korean portal site to develop its own search engine. http://www.naver.com/ (1999) Date of access 20/07/2009.
34. Pratt, W. K. Digital Image Processing (John Wiley & Sons, New York, 1991).
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Ministry of Science, ICT & Future Planning (No. 20110028908).
Author information
Contributions
D.K. designed and performed research, analyzed data and wrote the paper; S.W.S. designed and performed research and wrote the paper; H.J. designed research and wrote the paper. All authors discussed the results and commented on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/
About this article
Cite this article
Kim, D., Son, S.-W. & Jeong, H. Large-Scale Quantitative Analysis of Painting Arts. Sci Rep 4, 7370 (2014). https://doi.org/10.1038/srep07370