In the long and complex history of thermodynamics, the generalization of the principle of energy conservation to include heat stands as probably the most significant single landmark. In the 1840s James Joule in England carried out a careful series of experiments on what was then called the “mechanical equivalent of heat” and what we would now call the conversion of work to heat. His results went a long way toward establishing that heat and work are not two different things, but two manifestations of the same thing.^{1} Expressing it in terms of modern units, one can say that the calorie and the joule are not units of distinct concepts, but different units for the same concept.

Joule’s work not only led to a broader understanding of energy conservation and to the first law of thermodynamics; it also cleared the way for a full understanding of the second law of thermodynamics, provided a basis for an absolute temperature scale, and laid the groundwork for the submicroscopic mechanics of the kinetic theory. Progress in the half century before Joule’s work had been impeded by a pair of closely related difficulties: an incorrect view of the nature of heat, and an incomplete understanding of the way in which heat engines provide useful work. To be sure, there had been important insights in this period, such as Carnot’s statement of the second law of thermodynamics in 1824. But such progress as there was did not fit together into a single structure, nor did it provide a base on which to build. Not until 1850, when the great significance of the general principle of energy conservation was appreciated by at least a few scientists, was Carnot’s work incorporated into a developing theoretical structure. The way was then cleared for a decade of rapid progress. In the 1850s, the first and second laws of thermodynamics were first stated as general unifying principles, the kinetic theory was rediscovered and refined, the concepts of heat and temperature were given submicroscopic as well as macroscopic definitions, and the full significance of the ideal-gas law was understood. The great names of the period were James Joule, William Thomson (Lord Kelvin), and James Clerk Maxwell in Britain, and Rudolf Clausius and August Krönig in Germany.

One way to give structure to the historical development of a major theory is to follow the evolution of its key concepts. This is particularly instructive for the study of thermodynamics, because its basic concepts—heat, temperature, and entropy—exist on two levels, the macroscopic and the submicroscopic. The refinement of these concepts led both to a theoretical structure for understanding a great part of nature and to a bridge between two worlds, the large and the small. Here I address the entropy concept.

Like heat and temperature, entropy was given first a macroscopic definition, later a molecular definition. Being a much subtler concept than either heat or temperature (in that it does not directly impinge on our senses), entropy was defined only after its need in the developing theory of thermodynamics became obvious. Heat and temperature were familiar ideas refined and revised for the needs of quantitative understanding. Entropy was a wholly new idea, formally introduced and arbitrarily named when it proved to be useful in expressing the second law of thermodynamics in quantitative form. As a useful but unnamed quantity, entropy entered the writings of both Kelvin and Clausius in the early 1850s. Finally in 1865, it was formally recognized and christened “entropy” by Clausius, after a Greek word for transformation. Entropy, as he saw it, measured the potentiality of a system for transformation.

The proportionality of entropy to the logarithm of an intrinsic probability for the arrangement of a system (see Essay T5) was stated first by Ludwig Boltzmann in 1877. This pinnacle of achievement in what had come to be called statistical mechanics provided the last great thermodynamic link between the large-scale and small-scale worlds. Although we now regard Boltzmann’s definition based on the molecular viewpoint as the more fundamental, we must not overlook the earlier macroscopic definition of entropy given by Clausius (which in most applications is easier to use). Interestingly, Clausius expressed entropy simply and directly in terms of the two already familiar basic concepts, heat and temperature. He stated that a change of entropy of any part of a system is equal to the increment of heat added to that part of the system divided by its temperature at the moment the heat is added, provided the change is from one equilibrium state to another:

∆*S* = ∆*H* / *T*

Here *S* denotes entropy, *H* denotes heat, and *T* denotes the absolute temperature. For heat gain, ∆*H* is positive and entropy increases. For heat loss, ∆*H* is negative and entropy decreases. How much entropy change is produced by adding or subtracting heat depends on the temperature. Since the temperature *T* appears in the denominator in this equation, a lower temperature enables a given increment of heat to produce a greater entropy change.

There are several reasons why Clausius defined not the entropy itself, but the change of entropy. For one thing, the absolute value of entropy is irrelevant, much as the absolute value of potential energy is irrelevant. Only the change of either of these quantities from one state to another matters. Another, more important reason is that there is no such thing as “total heat.” Since heat is energy transfer (by molecular collisions), it is a dynamic quantity measured only in processes of change. An increment of heat ∆*H* can be gained or lost by part of a system, but it is meaningless to refer to the total heat *H* stored in that part. (This was the great insight about heat afforded by the discovery of the general principle of energy conservation in the 1840s.) What is stored is internal energy, a quantity that can be increased by mechanical work as well as by heat flow. Finally, it should be remarked that Clausius’ definition refers not merely to change, but to *small* change. When an otherwise inactive system gains heat, its temperature rises. Since the symbol *T* in the equation above refers to the temperature at which heat is added, the equation applies strictly only to increments so small that the temperature does not change appreciably as the heat is added. If a large amount of heat is added, this equation must be applied over and over to the successive small increments, each at slightly higher temperature (a procedure facilitated by integral calculus).
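The "over and over" procedure can be sketched numerically. The example below is only an illustration under assumptions not made in the text: the system has a constant heat capacity (here called `heat_capacity`, a hypothetical parameter), so each small temperature rise `dT` requires a heat increment `heat_capacity * dT`, and no work is done. Under those assumptions, the sum of many small terms (heat increment divided by temperature) should approach the result of the integral, `heat_capacity * ln(t_end / t_start)`.

```python
import math


def entropy_change_stepwise(heat_capacity, t_start, t_end, steps=100_000):
    """Sum (heat increment / temperature) over many small increments.

    Assumes a constant heat capacity (an assumption of this sketch), so
    each temperature step dT corresponds to a heat increment C * dT.
    """
    dT = (t_end - t_start) / steps
    total = 0.0
    for i in range(steps):
        t_mid = t_start + (i + 0.5) * dT   # temperature during this increment
        delta_h = heat_capacity * dT        # heat added in this increment
        total += delta_h / t_mid            # Clausius: heat / temperature
    return total


def entropy_change_integral(heat_capacity, t_start, t_end):
    """Closed-form result of integrating C dT / T for constant C."""
    return heat_capacity * math.log(t_end / t_start)


if __name__ == "__main__":
    C = 10.0                      # illustrative heat capacity, joules per kelvin
    t1, t2 = 300.0, 400.0         # illustrative absolute temperatures, kelvin
    print("stepwise sum:", entropy_change_stepwise(C, t1, t2))
    print("integral    :", entropy_change_integral(C, t1, t2))
```

With a large number of steps the two printed values should agree closely; with very few steps they differ, which is the reason the definition is restricted to small increments.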

To explain how the macroscopic definition of entropy given by Clausius (the equation just above) and the submicroscopic definition of entropy given by Boltzmann (the equation involving a logarithm in Essay T5) fit together is a task beyond the scope of this discussion. Nevertheless I can, through an idealized example, make it reasonable that these two definitions, so different in appearance, are closely related. To give the Clausius definition a probability interpretation I need to discuss two facts: (1) Addition of heat to a system increases its disorder and therefore its entropy. (2) The disordering influence of heat is greater at low temperature than at high temperature. The first of these facts is related to the appearance of the factor ∆*H* on the right side of the equation above. The second is related to the inverse proportionality of entropy change to temperature. Not to prove these facts but to make them seem reasonable, I consider an idealized simple system consisting of just three identical molecules, each one capable of existing in any one of a number of equally spaced energy states. The overall state of this system can be represented by the triple-ladder diagram below, in which each rung corresponds to a molecular energy state. Dots on the lowest rung of each of the three ladders would indicate that the system possesses no internal energy. The pictured dots, on the second and third rungs above the bottom and on the bottom rung, indicate that the system has a total of five units of internal energy, two units possessed by the first molecule, three by the second, and none by the third.

The intrinsic probability associated with any given total energy is proportional to the number of different ways in which that energy can be divided. This is now a probability of *energy* distribution, not a probability of spatial distribution. However, the reasoning is much the same for these two kinds of distribution. The intrinsic (a priori) probability for a distribution of molecules in space is proportional to the number of different ways in which that distribution can be obtained. To give a related example, the probability of throwing 7 with a pair of dice is greater than the probability of throwing 2 because there are more ways to get a total of 7 than to get a total of 2.
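The dice comparison can be checked by direct counting. The short sketch below (the function name `ways_to_roll` is illustrative) enumerates all 36 equally likely outcomes for a pair of six-sided dice and counts how many give each total.

```python
from collections import Counter
from itertools import product


def ways_to_roll(total, dice=2, sides=6):
    """Count the ordered outcomes of `dice` dice that add up to `total`."""
    counts = Counter(sum(roll) for roll in product(range(1, sides + 1), repeat=dice))
    return counts[total]


if __name__ == "__main__":
    outcomes = 6 ** 2  # 36 equally likely ordered outcomes for two dice
    for total in (2, 7):
        n = ways_to_roll(total)
        print(f"total {total}: {n} way(s), probability {n}/{outcomes}")
```

A total of 7 can be obtained in six ways and a total of 2 in only one, so 7 is six times as probable.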

The table above enumerates all the ways in which up to five units of energy can be divided among our three idealized molecules. The triplets of numbers in the second column indicate the occupied rungs of the three energy ladders. It is an interesting and instructive problem to deduce a formula for the numbers in the last column. (Hint: The number of ways to distribute 6 units of energy is 28.) However, since this is a highly idealized picture of very few molecules, precise numerical details are less important than are the qualitative features of the overall pattern. The first evident feature is that the greater the energy, the more different ways there are to divide the energy. Thus a higher probability is associated with greater internal energy. This does not mean that the system, if isolated and left alone, will spontaneously tend toward a higher probability state, for that would violate the law of energy conservation. Nevertheless, we associate with the higher energy state a greater probability and a greater disorder. When energy is added from outside via heat flow, the entropy increase is made possible. This makes reasonable the appearance of the heat increment factor, ∆*H*, in the Clausius formula above.
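The enumeration described here can be reproduced by brute force. The sketch below lists every triplet of rung numbers whose sum equals the total energy and counts them. It also compares the counts with one candidate formula, the binomial coefficient C(E + 2, 2); that formula is an assumption of this sketch (a standard counting argument), not something stated in the text, and the code only checks that it agrees with the enumeration.

```python
from itertools import product
from math import comb


def distributions(total_energy, molecules=3):
    """List every way to divide `total_energy` units among the molecules.

    Each result is a tuple of rung numbers, one per molecule, with rung 0
    as the bottom rung (zero energy).
    """
    return [
        levels
        for levels in product(range(total_energy + 1), repeat=molecules)
        if sum(levels) == total_energy
    ]


def count_formula(total_energy, molecules=3):
    """Candidate closed form (an assumption, checked against enumeration)."""
    return comb(total_energy + molecules - 1, molecules - 1)


if __name__ == "__main__":
    for energy in range(7):
        ways = distributions(energy)
        print(energy, len(ways), count_formula(energy))
    # The arrangement in the ladder example: 2, 3, and 0 units.
    print((2, 3, 0) in distributions(5))
```

For 6 units the enumeration gives 28, in agreement with the hint.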

Looking further at the table, we ask whether the addition of heat produces a greater disordering effect at low temperature than at high temperature. For simplicity we can assume that temperature is proportional to total internal energy, as it is for a simple gas, so that the question can be rephrased: Does adding a unit of heat at low energy increase the entropy of the system more than adding the same unit of heat at higher energy? Answering this question requires a little care, because of the logarithm that connects probability to entropy (the Boltzmann formula). The relative probability accelerates upward in the table. In going from 1 to 2 units of energy, the number of ways to distribute the energy increases by three; from 2 to 3 units it increases by four; from 3 to 4 units it increases by five; and so on. However, the entropy, proportional to the logarithm of the probability, increases more slowly at higher energy. The relevant measure for the increase of a logarithm is the *factor* of growth. From 0 to 1 unit of energy, the probability trebles; from 1 to 2 units it doubles; from 2 to 3 units it grows by 67%; and so on, by ever decreasing factors of increase. Therefore the entropy grows most rapidly at low internal energy (low temperature). This makes reasonable the appearance of the temperature factor “downstairs” on the right side of the Clausius equation.
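The argument about factors of growth can be made concrete with a few lines of arithmetic. The sketch below assumes that the number of ways to divide E units among three molecules is the binomial coefficient C(E + 2, 2); this closed form is an assumption of the sketch, chosen because it reproduces the counts quoted above (1, 3, 6, 10, ...). The entropy step is taken as the natural logarithm of the growth factor, that is, entropy measured in units of Boltzmann's constant.

```python
from math import comb, log


def ways(energy_units, molecules=3):
    """Number of ways to divide the energy (assumed closed form)."""
    return comb(energy_units + molecules - 1, molecules - 1)


def growth_factor(energy_units):
    """Factor by which the number of ways grows when one unit is added."""
    return ways(energy_units + 1) / ways(energy_units)


def entropy_step(energy_units):
    """Entropy gain, in units of Boltzmann's constant, from one added unit."""
    return log(growth_factor(energy_units))


if __name__ == "__main__":
    for e in range(6):
        print(
            f"{e} -> {e + 1} units: ways {ways(e)} -> {ways(e + 1)}, "
            f"factor {growth_factor(e):.3f}, entropy step {entropy_step(e):.3f}"
        )
```

The number of ways rises by ever larger amounts, but the factor of growth, and with it the entropy step, shrinks as the energy increases.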

^{1} Recall that heat and work are not really *forms* of energy but are modes of energy *transfer.*